Perhaps the solutions(s) needs to be less focusing on output quality, and more o...

_bin_ · 2025-03-31T16:30:00 1743438600

The biggest issue I’ve seen is “context window poisoning”, for lack of a better term. If it screws something up it’s highly prone to repeating that mistake. It then makes a bad fix that propagates two more errors, the says, “Sure! Let me address that,” repeating to poorly fix those rather than the underlying issue (say, restructuring code to mitigate.)

It is almost impossible to produce a useful result, far as I’ve seen, unless one eliminates that mistake from the context window.

instakill · 2025-03-31T16:55:44 1743440144

I really really wish that LLMs had an "eject" function - as in I could click on any message in a chat, and it would basically start a new clone chat with the current chat's thread history.

There are so many times where I get to a point where the conversation is finally flowing in the way that I want and I would love to "fork" into several directions from that one specific part of the conversation.

Instead I have to rely on a prompt that requests the LLM to compress the entire conversation into a non-prose format that attempts to be as semantically lossless as possible; this sadly never works as in ten did [sic].

mvdtnz · 2025-03-31T19:51:22 1743450682

This is precisely what the poorly named Edit button does in Claude.

tough · 2025-03-31T17:52:20 1743443540

Google UI supports branching and delete someone recently made a blog post about how great it is

marlott · 2025-03-31T19:47:16 1743450436

which Google UI?

tough · 2025-04-01T03:40:03 1743478803

ai.dev AI studio sorry

genewitch · 2025-04-01T15:16:16 1743520576

LM studio has a fork button on every chat part. Sorry, can't think of a better word - you can fork on any human or ai part. You can also edit, but editing isn't, it essentially creates a copy of the context with the edit, and sends the whole thing to the AI. This can overflow your context window, so it isn't recommended. Forking of course does the same thing, but it is obvious that it is doing so, whereas people are surprised to learn editing sends everything.

theblazehen · 2025-03-31T17:50:13 1743443413

You can use LibreChat which allows you to fork messages: https://www.librechat.ai/docs/features/fork

PeterStuer · 2025-04-01T14:52:07 1743519127

"If it screws something up it’s highly prone to repeating that mistake"

Certainly true, but coaching it past sometimes helps (not always).

- roll back to the point before the mistake.

- add instructions so as to avoid the same path. "Do not try X. We tried X it does not work as it leads to Y.

- add resources that could aid a misunderstanding (api documentation, library code)

- rerun the request (improve/reword with observed details or insights)

I feel like some of the agentic frameworks are already including some of these heuristics, but a helping hand still can work to your benefit

bongodongobob · 2025-03-31T17:28:15 1743442095

I think this is one of the core issues people have when trying to program with them. If you have a long conversation with a bunch of edits, it will start to get unreliable. I frequently start new chats to get around this and it seems to work well for me.

_bin_ · 2025-03-31T23:15:49 1743462949

Yes, this definitely helps. It's just incredibly annoying because you have to dump context back into it, re-type stuff, consolidate stuff from the prior conversation, etc.

dr_kiszonka · 2025-04-01T03:08:34 1743476914

Have the AI maintain a document (a local file or in canvas) with project goals, structure, setup instructions, current state, change log, todos, caveats, etc. You might need to remind it to keep it up-to-date, but I find this approach quite useful.

donmcronald · 2025-03-31T20:45:11 1743453911

This is what I find. If it makes a mistake, trying to get it to fix the mistake is futile and you can't "teach" it to avoid that mistake in the future.

johnisgood · 2025-04-01T12:39:29 1743511169

It depends, I ran into this a lot with GPT, but less so with Claude.

But then again, I know how it could avoid the mistake, so I point that out, from that point onwards it seems fine (in that chat).

ModernMech · 2025-03-31T19:19:35 1743448775

> Perhaps the solutions(s) needs to be less focusing on output quality, and more on having a solid process for dealing with errors. Think undo, containers, git, CRDTs

LLMs are supposed to save us from the toils of software engineering, but it looks like we're going to reinvent software engineering to make AI useful.

Problem: Programming languages are too hard.

Solution: AI!

Problem: AI is not reliable, it's hard to specify problems precisely so that it understands what I mean unambiguously.

Solution: Programming languages!

Workaccount2 · 2025-03-31T19:58:07 1743451087

With pretty much every new technology, society has bent towards the tech too.

When smartphones first popped up, browsing the web on them was a pain. Now pretty much the whole web has phone versions that make it easier*.

*I recognize the folly of stating this on HN.

LtWorf · 2025-03-31T22:35:05 1743460505

No it's still a pain.

There's apps that open links in their embedded browser where ads aren't blocked. So I need to copy the link and open them in my real browser.

mdaniel · 2025-04-01T01:42:12 1743471732

Or my other favorite trap: an embedded browser where I'm not authenticated. Great, now I have to roll the dice about pasting a password in your "trust me, bro" looking login page because I cannot see the URL and the autofill is all "nope"

otabdeveloper4 · 2025-04-02T05:58:56 1743573536

> LLMs are supposed to save us from the toils of software engineering

Well, cryptocurrency was supposed to save us from the inefficiences of the centralized banking system.

There's a lesson to be learned here, but alas our sociiety's collective context window is less than five years.

techpineapple · 2025-03-31T15:49:48 1743436188

But, assuming this is a general thing not just focused on say software development, can you make the tooling around creating this easier than defining the process itself? Everyone loosely speaking sees the value in test driven development, but often I think with complex processes, writing the test is harder than writing the process.

RicoElectrico · 2025-03-31T15:56:05 1743436565

I want to make a simple solution where data is parsed by a vision model and "engineer for the unhappy path" is my assumption from the get-go. Changing the prompt or swapping the model is cheap.

herval · 2025-03-31T20:40:13 1743453613

vision models are also faulty, and some times all paths are unhappy paths, so there's really no viable solution. Most of the times, swapping the model completely randomizes the problem space (unless you measure every single corner case, it's impossible to tell if everything got better or if some things got worse...