Hallucinations can be mostly eliminated with RAG and tools. I use NotebookLM all the time to research our internal artifacts, and it includes citations/references back to your documents.
Even with ChatGPT, you can ask it to find web citations, and if it uses the Python runtime to find answers, you can look at the code.
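For example, here's the kind of throwaway script the Python runtime produces for a numeric question (the figures here are made up for illustration); rerunning it yourself confirms the answer wasn't hallucinated:

    # Hypothetical question: compound growth of $10,000 at 5% for 30 years.
    # Rerun this locally to verify the model's stated answer.
    principal, rate, years = 10_000, 0.05, 30
    final = principal * (1 + rate) ** years
    print(f"${final:,.2f}")  # ~$43,219.42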
And to head off the typical responses: my company uses GSuite, so Google already has our IP; NotebookLM is specifically approved by my company; and no, Google doesn't train on your documents.
Facts can be checked with RAG, but the real value of AI isn't as a search replacement; it's in reasoning/problem-solving where the answer isn't already out there.
How do you, in general, fact-check a chain of reasoning?
I can't tell a search engine to summarize a text for a technical audience and then produce another summary for a non-technical audience.
I recently came into the middle of a cloud consulting project where a lot of artifacts (transcripts of discovery sessions, requirements docs, etc.) had already been created.
I asked NotebookLM all of the questions I would have asked a customer at the beginning of a project.
What it couldn’t answer, I then went back and asked the customer.
I was even able to get it to create a project plan with work streams and epics. Yes, it wouldn't have been effective if I didn't already have a project management background, AWS expertise, and two-plus decades of development experience.
Despite what people think, LLMs can also do a pretty good job at coding when they're well trained on the APIs. Fortunately, ChatGPT is well trained on the AWS CLI and the SDKs in various languages, and you can ask it to verify the SDK functions on the web.
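To give a flavor of the kind of SDK code it handles well (the bucket name here is hypothetical; always verify the calls against the boto3 docs):

    # List every object in an S3 bucket with boto3.
    # The list_objects_v2 paginator handles buckets with more than 1,000 keys.
    import boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket="example-bucket"):
        for obj in page.get("Contents", []):
            print(obj["Key"], obj["Size"])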
I've been deep into AWS-based development for as long as LLMs have been a thing. My opinion may change if I get back into more traditional development.
> I can't tell a search engine to summarize a text for a technical audience and then produce another summary for a non-technical audience.
No, but as amazing as that is, don't put too much trust in those summaries!
It's not summarizing based on grokking the key points of the text, but rather based on text-vs-summary examples found in the training set. The summary may pass a surface-level comparison to the source material while failing to capture or emphasize the key points that would come from actually understanding it.
I either wrote the original content or was in the meeting that produced the transcript I'm giving it. I know what points I need to get across to both audiences.
Likewise, I'm not blindly depending on it to produce an Amazon-style PRFAQ (I was indoctrinated as an Amazon employee for 3.5 years), create a project plan, etc., without being a subject matter expert in those areas. It's a tool for an experienced writer, halfway-decent project manager, AWS cloud application architect, and developer.