
We provide this functionality in Lantern cloud via our Lantern Extras extension: <https://github.com/lanterndata/lantern_extras>

You can generate CLIP embeddings locally on the DB server via:

  SELECT abstract,
         introduction,
         figure1,
         clip_text(abstract) AS abstract_ai,
         clip_text(introduction) AS introduction_ai,
         clip_image(figure1) AS figure1_ai
  INTO papers_augmented
  FROM papers;
Then you can search for embeddings via:

  SELECT abstract, introduction
  FROM papers_augmented
  ORDER BY clip_text(query) <=> abstract_ai
  LIMIT 10;
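On larger tables, the ORDER BY ... <=> ... above would run as a sequential scan, so you would normally add an approximate-nearest-neighbour index on the embedding column. A sketch, assuming Lantern's HNSW index support (the access-method and operator-class names, lantern_hnsw and dist_cos_ops, are from memory and worth checking against the Lantern docs):

  CREATE INDEX ON papers_augmented
  USING lantern_hnsw (abstract_ai dist_cos_ops);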
This approach significantly decreases search latency and results in cleaner code. As an added bonus, EXPLAIN ANALYZE can now show what percentage of query time is spent on embedding generation versus search.
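For example, prefixing the search with EXPLAIN ANALYZE surfaces the per-node timings, so the cost of the clip_text() call is visible separately from the ordering and scan (the query string here is a made-up placeholder):

  EXPLAIN ANALYZE
  SELECT abstract, introduction
  FROM papers_augmented
  ORDER BY clip_text('transformer scaling laws') <=> abstract_ai
  LIMIT 10;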

The linked library enables embedding generation for a dozen open-source models and proprietary APIs (list here: <https://lantern.dev/docs/develop/generate>), and adding new ones is straightforward.




Lantern seems really cool! Interestingly, we did try CLIP (OpenCLIP) image embeddings, but the results were poor for 24px-by-24px icons. Any ideas?

Charlie @ v0.app


I have tried CLIP on my personal photo album collection and it worked really well there: I could write detailed scene descriptions of past road trips, and the photos I had in mind would pop up. The model is probably better suited to everyday photos than to icons.



