Show HN: Alerting in realtime RAG: spot changes to LLM answers, using few tokens (github.com/pathwaycom)
8 points by janchorowski on Nov 17, 2023 | 5 comments
Hi, I'm Jan, CTO @ Pathway.

A use case we have been working on with LLMs is letting people know when the answer to their query changes due to revisions of the source documents. Obviously, we want to avoid periodically re-running all queries through the LLM.

Why I think it's cool:

- We don't spin in a loop, repeatedly re-querying the LLM.

- Alerts are LLM-deduplicated, so users aren't spammed over typo fixes.

- Best of all, our framework, Pathway, takes care of handling the updates; the example looks nearly like a regular, static RAG chatbot.
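
To make the first point concrete, here is a conceptual sketch of the flow in plain Python. It is not the Pathway implementation; all the names (register_alert, on_documents_changed, the callables passed in) are illustrative stand-ins. The idea is simply to react to document changes, recompute only the affected answers, and notify only when an answer meaningfully changes:

    # Conceptual sketch only, not the Pathway code. Keep a registry of
    # alerted queries with their last answers; when source documents change,
    # recompute only the affected queries and alert only on real changes.

    tracked = {}  # query -> (retrieved_doc_ids, last_answer)

    def register_alert(query, retrieve, answer):
        """Answer the query once and remember which documents it used."""
        doc_ids, context = retrieve(query)
        tracked[query] = (set(doc_ids), answer(query, context))

    def on_documents_changed(changed_doc_ids, retrieve, answer,
                             answers_differ, notify):
        """Called on a document update event (not on a timer)."""
        for query, (doc_ids, old_answer) in list(tracked.items()):
            if not doc_ids & set(changed_doc_ids):
                continue  # this query's context is untouched; no LLM call
            new_doc_ids, context = retrieve(query)
            new_answer = answer(query, context)
            tracked[query] = (set(new_doc_ids), new_answer)
            if answers_differ(old_answer, new_answer):
                notify(query, old_answer, new_answer)  # alert on real changes only
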

More context + GIF of how it works for Google Drive document alerts: https://pathway.com/developers/showcases/llm-alert-pathway

Happy to hear your thoughts!




Thanks for sharing, Jan!

This real-time alerting use case can also be useful in many other areas. I am thinking of fraud detection, customer support, medical diagnosis and treatment, or manufacturing, to predict when equipment will fail and alert when maintenance is needed. Or even monitoring model performance, since LLMs can occasionally produce unexpected or undesirable outputs.


Jan, can you explain briefly how the deduplicator checks if the new answer is significantly different? Is there code in the repository we can take a look at?


Sure: when a new response is produced because some source documents have changed, we ask an LLM to compare the two responses and tell us whether they are significantly different. Even a simplistic prompt, like the one used in the example, will do:

    Are the two following responses deviating?
    Answer with Yes or No.

    First response: "{old}"

    Second response: "{new}"
(used in https://github.com/pathwaycom/llm-app/blob/69709a2cf58cdf6ea...)
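A minimal sketch of that comparison step, calling the OpenAI chat API directly with the prompt above (the repo wires this through Pathway; the model choice and exact wiring here are illustrative, not the code at the link):

    from openai import OpenAI

    client = OpenAI()

    PROMPT = (
        "Are the two following responses deviating?\n"
        "Answer with Yes or No.\n\n"
        'First response: "{old}"\n\n'
        'Second response: "{new}"'
    )

    def answers_differ(old: str, new: str) -> bool:
        """Ask the LLM whether the new answer meaningfully deviates."""
        reply = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": PROMPT.format(old=old, new=new)}],
            temperature=0,
        )
        return reply.choices[0].message.content.strip().lower().startswith("yes")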


Couldn't you just compare the similarity of the embeddings? I imagine that would work in the vast majority of cases and save a lot of LLM calls.


That's a good idea. The deduplication criterion is easy to change; using an LLM is faster to get started, but after a while a corpus of decisions builds up and can be used either to select another mechanism or, e.g., to train one on top of BERT embeddings.
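
For reference, a sketch of the embedding-based variant the commenter suggests: flag a change only when the cosine similarity of the two answers' embeddings drops below a threshold. The embedding model and the threshold value are illustrative assumptions, not part of the example in the repo:

    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    def answers_differ_by_embedding(old: str, new: str,
                                    threshold: float = 0.90) -> bool:
        """Treat the answer as changed when the embeddings are dissimilar."""
        resp = client.embeddings.create(
            model="text-embedding-ada-002", input=[old, new]
        )
        a, b = (np.array(d.embedding) for d in resp.data)
        cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        return cosine < threshold  # low similarity -> alert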



