
Yep, that's the way it's currently implemented in langchain.

The 4 is a hyperparameter you can change, though, so you could set it to 10 as well.

The way it works is that it first looks up the N most relevant documents (N being 4 by default) in the FAISS store for the question, using the distance between embedding vectors for that lookup.
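
Roughly, that lookup looks like this in langchain (just a sketch; the module paths may differ by version and the texts are placeholders):

    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import FAISS

    texts = ["first document ...", "second document ..."]  # placeholder corpus
    db = FAISS.from_texts(texts, OpenAIEmbeddings())

    # N (called k here) is the hyperparameter: 4 by default, set it higher if you like
    docs = db.similarity_search("your question", k=4)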

Then it uses GPT-3 to summarize each of those N entries with respect to the question, and finally all the summaries together with the question produce the answer.

Because of this, you can trace which source the answer came from and also point to that URL at the end.
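
One way to get that summarize-then-answer behaviour with source tracking in langchain is the map_reduce chain type (a sketch, not necessarily how this project wires it up; it reuses the FAISS store from above):

    from langchain.llms import OpenAI
    from langchain.chains import RetrievalQAWithSourcesChain

    chain = RetrievalQAWithSourcesChain.from_chain_type(
        llm=OpenAI(temperature=0),
        chain_type="map_reduce",  # summarize each retrieved doc, then combine
        retriever=db.as_retriever(search_kwargs={"k": 4}),
    )

    result = chain({"question": "your question"})
    print(result["answer"])
    print(result["sources"])  # lets you point back to the source URL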

When you make N larger it just gets more expensive in terms of your API costs.




Looks interesting! Have you considered a proper vector database like Qdrant (https://qdrant.tech)? FAISS runs on a single machine, but if you want to scale things up, then a real database makes it a lot easier. And with a free 1GB cluster on Qdrant Cloud (https://cloud.qdrant.io), you can store quite a lot of vectors. Qdrant is also already integrated with Langchain.
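
If you want to try it, the switch is mostly a different vector store class (a sketch; the URL, API key and collection name are placeholders for your own Qdrant Cloud setup, and it needs the qdrant-client package installed):

    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import Qdrant

    db = Qdrant.from_texts(
        texts,
        OpenAIEmbeddings(),
        url="https://your-cluster.cloud.qdrant.io",
        api_key="YOUR_QDRANT_API_KEY",
        collection_name="docs",
    )
    docs = db.similarity_search("your question", k=4)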


Probably not very helpful at the scale most people would run this. Even brute forcing the search on CPU gives results in a few ms on small datasets.
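
For a sense of scale: brute force is just one matrix-vector product over normalised embeddings (a sketch with made-up sizes):

    import numpy as np

    rng = np.random.default_rng(0)
    vectors = rng.standard_normal((10_000, 1536)).astype(np.float32)  # fake corpus embeddings
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

    query = rng.standard_normal(1536).astype(np.float32)
    query /= np.linalg.norm(query)

    scores = vectors @ query              # cosine similarity against all 10k docs at once
    top4 = np.argsort(scores)[-4:][::-1]  # indices of the 4 closest documents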


Using something like Weaviate, which can be started in Docker with a one-liner, gives you the ability to move toward or away from dense vectors by concept. While computing the dot products manually is fairly easy, letting Weaviate do the lifting (for the embeddings as well) makes things super simple; see the sketch after the link.

https://github.com/FeatureBaseDB/slothbot/blob/slothbot-work...
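
A rough sketch of the same lookup against a local Weaviate instance via langchain (assumes Weaviate is already running on localhost:8080, e.g. started via Docker; the exact kwargs depend on the langchain and weaviate-client versions):

    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import Weaviate

    db = Weaviate.from_texts(
        texts,
        OpenAIEmbeddings(),
        weaviate_url="http://localhost:8080",
    )
    docs = db.similarity_search("your question", k=4)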


That means you need Docker running, and the dependencies explode if you take this approach. I really like the tight dependency tree.


Thanks for the suggestion, but for my fun small experiment, FAISS was more than enough.



