Semantic search without LLMs is already making a dent. It still returns traditional results that a human has to process, but the search results are "better".

And with that there is a body of work on "groundedness" that basically post-processes output to compare it against its source material. It can still result in logic errors and has a base error rate itself, but it can ensure you at least have clear citations for factual claims that match real documents. It doesn't fully ensure those documents are being referenced correctly (though that is already the case even with real papers produced by humans).
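To make the post-processing idea concrete, here is a rough sketch of what a groundedness pass could look like. It is not any particular system's method: the function names, the word-overlap score, and the 0.6 threshold are all assumptions for illustration, and real systems would more likely use an entailment model or embeddings in place of word overlap.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric word tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim, passage):
    """Fraction of the claim's tokens that also appear in the passage.

    A crude lexical stand-in (assumption) for a real entailment check.
    """
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(passage)) / len(claim_tokens)


def check_groundedness(claims, sources, threshold=0.6):
    """Attach the best-matching source id to each claim.

    claims: list of generated sentences.
    sources: dict mapping a source id to its passage text.
    threshold: hypothetical cutoff below which a claim is left uncited.
    Returns a list of (claim, source_id_or_None, score) tuples.
    """
    report = []
    for claim in claims:
        best_id, best_score = None, 0.0
        for source_id, passage in sources.items():
            score = support_score(claim, passage)
            if score > best_score:
                best_id, best_score = source_id, score
        citation = best_id if best_score >= threshold else None
        report.append((claim, citation, round(best_score, 2)))
    return report


if __name__ == "__main__":
    sources = {"doc1": "The Eiffel Tower is 330 metres tall and located in Paris."}
    claims = ["The Eiffel Tower is in Paris.", "Bananas grow on Mars."]
    for row in check_groundedness(claims, sources):
        print(row)
```

Note that a check like this only shows a claim overlaps with a real document. It says nothing about whether the document is being read correctly, which is the same gap described above.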

Also consider that the baseline isn't perfection; it is a benchmark against real humans. Accuracy is getting much better in certain domains where we have good corpora. Part of assessing the accuracy of a system is going to be about determining whether the generated content is "in distribution" for its training data. There is progress being made in this direction, so we could perhaps do a better job at the application level of making use of a "confidence" score of some kind, maybe even taking that into account in a chain-of-thought-like reasoning step.

People keep finding "obviously wrong" hallucinations that seem like proof things are still crap. But these systems keep getting better on benchmarks looking at retrieval accuracy. And the benchmarks keep getting better as people point out deficiencies in them. Perfection might not be possible, but consistently better than the average human seems within reach, and better than that seems feasible too. The challenge is that the class of mistakes might look different even if the overall error rate is lower.