
The problem is asking for facts. LLMs are not databases: they know stuff, but it is compressed, so expect wrong facts, wrong names, wrong dates, wrong anything.

We will need an LLM as a front end that generates a query to fetch the facts from the internet or a database, then maybe formats the facts for your consumption.




This is called Retrieval-Augmented Generation (RAG). The LLM driver recognizes a query, sends it to a vector database or an external system (could be another LLM...), and places the answer in the context. It's a common strategy to work around limited context length, but it tends to be brittle. Look for survey papers.
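To make the flow concrete, here is a minimal sketch of the retrieve-then-prompt loop. It is a toy under stated assumptions, not how any particular RAG system works: `embed`, `cosine`, `retrieve`, `build_prompt`, and `DOCS` are all made-up names, the "embedding" is just a word count standing in for a real embedding model, and the final call to an actual LLM is left out.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query (hypothetical helper)."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=1):
    """Place the retrieved text in the context ahead of the question."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Illustrative document store; a real system would use a vector database.
DOCS = [
    "The Eiffel Tower is in Paris.",
    "Python was first released in 1991.",
    "Mount Everest is the highest mountain above sea level.",
]

if __name__ == "__main__":
    print(build_prompt("When was Python first released?", DOCS))
```

The brittleness shows up even here: the retrieval only matches on shared words, so a question phrased differently from the stored text can pull in the wrong document, and the model then answers from bad context.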


That's exactly it. It's OK for LLMs to not know everything, because they _should_ have a means to look up information. What are some projects where this obvious approach is implemented/tried?


But then you need an LLM that can separate grammar from facts. Current LLMs don't know the difference, and that is the main source of these issues. These models treat facts like grammar, which worked well enough to excite people but probably won't get us to a good state.



