I feel like it almost always starts well, given the full picture, but then for non-trivial stuff, gets stuck towards the end. The longer the conversation goes, the more wheel-spinning occurs, and before you know it you have spent an hour chasing that last mile of connectivity.
For complex questions, I now only use it to get the broad picture and, once the output is good enough to be a foundation, I build the rest of it myself. I have noticed that the net time spent using this approach still yields big savings over a) doing it all myself or b) pushing it to do the entire thing. I guess 80/20 etc.
- sure, here's how we can do "xyz" (gets some small part of the error handling for xyz slightly wrong)
- can you add onto this with "abc"
- sure. in order to do "abc" we'll need to add "lmn" to our error handling. this also means that you need "ijk" and "qrs" too, and since "lmn" doesn't support "qrs" out of the box, we'll also need a design solution to bridge the two. Let me spend 600 more tokens sketching that out.
- what if you just use the language's built-in feature here in "xyz"? doesn't that mean we can do it with just one line of code?
- yes, you're absolutely right. I'm sorry for making this over complicated.
If you don't hit that kill switch, it just keeps doubling down on absurdly complex/incorrect/hallucinatory stuff. Even one small error early in the chain propagates. That's why I end up very frequently restarting the conversation in a new chat or rewriting my chat questions to remove bad stuff from the context. Without the ability to do that, it's nearly worthless. It's also why I think we'll be seeing absurdly, wildly wrong chains of thought coming out of o1. Because "thinking" for 20s may well cause it to just go totally off the rails half the time.
> If you don't hit that kill switch, it just keeps doubling down on absurdly complex/incorrect/hallucinatory stuff.
If you think about it, that's probably the most difficult problem conversational LLMs need to overcome -- balancing sticking to conversational history vs abandoning it.
Humans do this intuitively.
But it seems really difficult to simultaneously (a) stick to previous statements sufficiently to avoid seeming ADD in a conveSQUIRREL and (b) know when to legitimately bail on a previous misstatement or something that was demonstrably false.
What's SOTA in how this is being handled in current models, as conversations go deeper and situations like the one referenced above arise? (false statement, user correction, user expectation of a subsequent corrected statement that still follows the rest of the conversational history)
If you talk for a while and the facts don't add up and make sense, an intelligent human will notice that, and get upset, and will revisit and dig in and propose experiments and make edits to make all the facts logically consistent. An LLM will just happily go in circles respinning the garbage.
I want to hang out with the humans you've been hanging out with. I know so many people who can't process basic logic or evidence that for my pandemic project a few years ago I did a year-long podcast about it, and even made up a new word to describe people who can't process evidence: "Dysevidentia".
People who have been taught by various forms of news/social media that any evidence presented is fabricated to support only one side of a discussion... and that there's no such thing as impartial, factually based reality, only one that someone is trying to present to them.
Some good suggestions here. I have also had success asking things like, “is this a standard/accepted approach for solving this problem?”, “is there a cleaner, simpler way to do this?”, “can you suggest a simpler approach that does not rely on X library?”, etc.
Yes, I’ve seen that too. One reason it will spin its wheels is because it “prefers” patterns in transcripts and will try to continue them. If it gets something wrong several times, it picks up on the “wrong answers” pattern.
It’s better not to keep wrong answers in the transcript. Edit the question and try again, or maybe start a new chat.
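A rough sketch of what "editing the question" looks like if you're driving the model through a chat-completions-style API rather than a web UI: keep the message list yourself, and when an answer goes wrong, drop the bad assistant turn (and everything after it) before re-asking, instead of appending a correction for the model to argue with. `call_model` below is a hypothetical stand-in for whatever client call you actually use, not a real library function.

```python
# Sketch: prune bad turns from the transcript instead of correcting them in place.
# `call_model` is a hypothetical placeholder for your actual chat-completion client.

def call_model(messages: list[dict]) -> str:
    raise NotImplementedError("swap in your LLM client call here")

def retry_without_bad_turns(messages: list[dict], bad_index: int,
                            revised_question: str) -> str:
    """Drop everything from the first bad assistant answer onward,
    replace the question that triggered it, and ask again with a clean context."""
    pruned = messages[:bad_index]                      # keep only the good prefix
    pruned.append({"role": "user", "content": revised_question})
    return call_model(pruned)

# Usage: the assistant's first answer (index 1) already had the wrong error
# handling, so don't keep piling corrections on top of it -- rewind to before
# that turn and re-ask with a sharper question.
history = [
    {"role": "user", "content": "How do I do xyz?"},
    {"role": "assistant", "content": "Here's xyz ... (subtly wrong error handling)"},
    {"role": "user", "content": "Now add abc."},
    {"role": "assistant", "content": "To add abc we also need lmn, ijk, qrs ..."},
]
answer = retry_without_bad_turns(
    history, bad_index=1,
    revised_question="How do I do xyz using the language's built-in feature?")
```

The point of the design is the same as the advice above: the "wrong answers" pattern never makes it into the context you send, so the model has nothing to double down on.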