"Mayo’s LLM split the summaries it generated into individual facts, then matched those back to source documents. A second LLM then scored how well the facts aligned with those sources, specifically if there was a causal relationship between the two."
It doesn't sound novel based on the article. I built something similar over a year ago. Here's a related example from LangChain, "How to get a RAG application to add citations": https://python.langchain.com/docs/how_to/qa_citations/
I don't think you're getting it, it's not traditional RAG citations.
They are checking the _generated_ text by trying to find documents containing the facts, then rating how relevant (causally related) those facts are. This is different from looking up documents to generate an answer for a prompt; it's the reverse. Once the answer has been generated, they essentially fact-check it.
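Roughly, in code (my own sketch, not Mayo's actual pipeline; the model name and prompts are placeholders):

    # "Reverse RAG" sketch: the answer is already generated; now verify it
    # against the source corpus. Everything here is illustrative.
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return resp.choices[0].message.content

    def extract_facts(answer: str) -> list[str]:
        # Step 1: split the generated answer into individual, checkable facts.
        out = ask("Split the following text into individual factual claims, "
                  "one per line, no numbering:\n\n" + answer)
        return [line.strip() for line in out.splitlines() if line.strip()]

    def find_candidate_sources(fact: str, sources: list[str]) -> list[str]:
        # Step 2: match each fact back to documents that might contain it.
        # (Naive keyword overlap here; a real system would reuse its retriever.)
        terms = {w.lower() for w in fact.split() if len(w) > 4}
        return [s for s in sources if terms & {w.lower() for w in s.split()}]

    def grade_support(fact: str, passages: list[str]) -> str:
        # Step 3: a second LLM rates how well the fact is grounded in the sources.
        return ask("Fact: " + fact + "\n\nSource passages:\n" + "\n---\n".join(passages)
                   + "\n\nReply SUPPORTED, PARTIAL, or UNSUPPORTED with one sentence of reasoning.")

    def fact_check(answer: str, sources: list[str]) -> list[tuple[str, str]]:
        results = []
        for fact in extract_facts(answer):
            passages = find_candidate_sources(fact, sources) or ["(no matching source found)"]
            results.append((fact, grade_support(fact, passages)))
        return results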
In the publishing industry we call that "cooking a press release": the "news" article was entirely written and mailed in by the PR department of the subject (the Mayo Clinic here), and the "journalist" just copies and pastes it. At most they will reword a couple of paragraphs, not for fear of looking bad, but just to make it fit the word count required for the column they are publishing under.
> where the model extracts relevant information, then links every data point back to its original source content.
I use ChatGPT. When I ask it something 'real/actual' (non-dev), I ask it to give me references in the same prompt. So when I ask it to tell me about "the battle of XYZ", I also ask for websites/sources, which I then click and check whether the quote is actually from there (a quick Ctrl+F will bring up the name/date/etc.).
Since I've done this I get near-zero hallucinations. They did the same.
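If you want to do that check programmatically rather than by hand, it's only a few lines; rough sketch (you still have to parse the URL/quote pairs out of the model's answer yourself, and the example pair at the bottom is just an illustration):

    # A scripted version of "open the cited page and Ctrl+F the quote".
    import re
    import urllib.request

    def page_text(url: str) -> str:
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
        return re.sub(r"<[^>]+>", " ", html)  # crude tag stripping, fine for a spot check

    def verify_citations(pairs: list[tuple[str, str]]) -> None:
        for url, quote in pairs:
            try:
                found = quote.lower() in page_text(url).lower()
            except OSError:
                found = False  # unreachable page counts as a failed check
            print(("FOUND  " if found else "MISSING") + f" {quote!r} @ {url}")

    verify_citations([
        ("https://en.wikipedia.org/wiki/Battle_of_Hastings", "14 October 1066"),
    ])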
I was waiting so long for this to finally arrive in Firefox (and now I can't seem to unsubscribe from Bugzilla for some reason -- I guess "because Bugzilla"). However, in true FF fashion, I'm sure it'll be another 10 years before the "Copy link to selection" arrives like its Chrome friend, so I have an extension to tide me over :-/
TBH, I also previously just popped open dev-tools and pasted the copied text into console.log("#:~:text="+encodeURIComponent(...)), which was annoying, for sure, but I didn't do it often enough to enrage me. I believe there's a DOM method to retrieve the selected text which would have made that much, much easier, but I didn't bother looking it up.
I have an application that does this. When the AI response comes back, there's code that checks the citation pointers to ensure they were part of the request and flags the response as problematic if any of the citation pointers are invalid.
The idea is that, hopefully, requests that end up with invalid citations have something in common and we can make changes to minimize them.
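A hypothetical version of that check, assuming the request carries numbered source snippets and the model is told to cite them as [n]:

    import re

    def validate_citations(response_text: str, provided_ids: set[int]) -> dict:
        # Pull out every "[n]"-style citation pointer the model emitted.
        cited = {int(m) for m in re.findall(r"\[(\d+)\]", response_text)}
        invalid = cited - provided_ids       # pointers to snippets we never sent
        return {
            "cited": sorted(cited),
            "invalid": sorted(invalid),
            "problematic": bool(invalid),    # flag for review/logging
        }

    # We sent snippets [1], [2], [3]; the model cited [2] and a non-existent [7].
    print(validate_citations("Stated in the discharge note [2] and the lab report [7].", {1, 2, 3}))
    # -> {'cited': [2, 7], 'invalid': [7], 'problematic': True}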
This sounds like a good technique that can be fully automated. I wonder why this isn't the default behavior or at least something you could easily request.
There was an article about Sam Altman that stated that ex/other OAI employees called him some bad_names and that he was a psychopath...
So I had GPT take on the role of an NSA cybersecurity and crypto profiler, read the thread and the article, and put together a profile dossier of Altman, citing its sources...
And it posted a great list of the deep-psychology and other books it used to make its claims.
Those claims, basically, were that Altman is a deep opportunist and shows certain psychopathological tendencies.
Frankly, the conclusion wasn't as interesting as how it cited the expert sources and the books it used in the analysis.
However, after this, OpenAI's newer models were less capable of producing this type of report, which was interesting.
It sounds like they go further by doing output fact extraction & matching back to the RAG snippets. Presumably this is in addition to matching back the citations. I've seen papers describe doing that with knowledge graphs, but at least for our workloads, it's easy to verify directly.
As a team that has done similar things for louie.ai (think real-time reporting, alerting, chat, and BI on news, social media, threat intel, operational databases, etc.), I find this interesting less for breaking new ground than for confirming the quality benefit as these techniques get used more broadly in serious contexts. Likewise, hospitals are quite political internally about this stuff, so seeing which use cases got the green light to go all the way through is also interesting.
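For the "verify directly" part, one lightweight option (not claiming this is what Mayo or anyone else ships) is to fuzzy-match each extracted fact against the snippets that were actually in the prompt, no second LLM call needed:

    # Fuzzy-match each extracted fact against the RAG snippets from the prompt.
    # Window size and threshold are arbitrary examples, not tuned values.
    from difflib import SequenceMatcher

    def best_snippet_score(fact: str, snippets: list[str], window: int = 200) -> float:
        best = 0.0
        for snip in snippets:
            # Slide a window so a long snippet doesn't dilute a short matching span.
            step = max(1, window // 2)
            for i in range(0, max(1, len(snip) - window + 1), step):
                ratio = SequenceMatcher(None, fact.lower(), snip[i:i + window].lower()).ratio()
                best = max(best, ratio)
        return best

    def verify_facts(facts: list[str], snippets: list[str], threshold: float = 0.55):
        results = []
        for fact in facts:
            score = best_snippet_score(fact, snippets)
            results.append((fact, round(score, 2), score >= threshold))
        return results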
It doesn’t solve the biggest problem with RAG, which is retrieving the correct sources in the first place.
It sounds like they just use a secondary LLM to check whether everything that was generated can be grounded in the provided sources. It might help with hallucinations, but it won't improve the underlying retrieval performance.
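If all you want is that grounding check, it can be a single extra pass over the already-retrieved sources; bare-bones sketch (placeholder model and prompt, not anyone's production setup):

    # One secondary-LLM pass that checks whether the generated answer is
    # grounded in the sources that were in the prompt.
    from openai import OpenAI

    client = OpenAI()

    def grounding_check(answer: str, sources: list[str]) -> str:
        prompt = ("You are a verifier. For each sentence of the ANSWER, say whether it is "
                  "fully supported by the SOURCES, and list any sentence that is not.\n\n"
                  "SOURCES:\n" + "\n---\n".join(sources) + "\n\nANSWER:\n" + answer)
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return resp.choices[0].message.content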
This is just standard practice AFAICT. I've done it. Everybody I know who's built apps for unstructured document retrieval etc. is doing it. It works better than the naïve approach, but there are plenty of issues and plenty of tuning with this approach too.
What they're describing as "reverse RAG" sounds a lot to me like "RAG with citations", which is a common technique. Am I misunderstanding?