
I believe you misunderstood the comment you're replying to. It says the internal state can be recreated because the computation is deterministic, and this is correct.

Assuming you control the parameters (seed, etc.), you can reproduce the exact internal states by resubmitting the same tokens with the same parameters, step by step.

With GPT-4 you can do this in the OpenAI Playground or via the API.
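For example, via the API (a minimal sketch, assuming the official `openai` Python client and a model that honors the `seed` parameter; the model name is just illustrative):

    # Reproduce a generation by resubmitting the same tokens with the
    # same parameters. Assumes OPENAI_API_KEY is set in the environment.
    from openai import OpenAI

    client = OpenAI()

    def generate(messages):
        resp = client.chat.completions.create(
            model="gpt-4",    # illustrative model name
            messages=messages,
            temperature=0,    # remove sampling randomness
            seed=42,          # fixed seed for reproducibility
        )
        return resp.choices[0].message.content

    history = [{"role": "user", "content": "Name three prime numbers."}]
    first = generate(history)
    second = generate(history)  # same tokens, same parameters
    print(first == second)      # True when the backend is deterministic

Note that `seed` is best-effort: if the backend changes between calls (OpenAI exposes a system_fingerprint so you can detect this), results can still differ.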



> I believe you misunderstood the comment you're replying to.

No, did you even read my quote? How did I misunderstand this:

> At the end of the chat session, this internal state is not persisted

It's misinformed at best, misleading at worst. The accurate statement is that at the end of each message, the internal state is not persisted.

I didn't disagree about how you can reproduce things. You have to do that to continue a conversation.


I agree that chatbots offering APIs are most likely implemented as stateless services at the moment.

Meaning they take as input the last "context window" tokens from the client, use them to recompute the internal state, and then start generating token by token. But after generation, no memory needs to be kept on the server (except, at most, the small "context window" tokens themselves).
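From the client's side, that statelessness looks like this (a rough sketch; call_model is a hypothetical stand-in for any stateless completion API):

    # Stateless chat: the full conversation is resent on every turn, and
    # the server rebuilds its internal state from scratch each time.
    def call_model(context_tokens: list[str]) -> str:
        """Hypothetical stateless API: recomputes internal state from
        the tokens it is given, replies, then discards everything."""
        ...

    history: list[str] = []

    def chat_turn(user_message: str) -> str:
        history.append(f"User: {user_message}")
        # The only thing surviving between turns is this token list; the
        # server keeps no per-conversation memory of its own.
        reply = call_model(history)
        history.append(f"Assistant: {reply}")
        return reply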

Chatbots like llama.cpp in interactive mode don't have to recompute this internal state at every interaction, because it stays resident in memory between turns.

You can view the last "context window" tokens as a compressed representation of the internal state.

This becomes more pertinent as the "context window" gets bigger: the bigger the context window, the more you have to recompute at each interaction.
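To make the recompute cost concrete (a conceptual sketch only; forward_one_token is hypothetical and stands in for a single-position transformer forward pass):

    # The internal state is one cache entry per processed token. Keeping
    # it around means each new turn only pays for the new tokens.
    kv_cache: list = []

    def forward_one_token(token, cache):
        """Hypothetical single-token forward pass attending over cache."""
        ...

    def process(tokens):
        # Only tokens beyond the cached prefix need a forward pass.
        for token in tokens[len(kv_cache):]:
            kv_cache.append(forward_one_token(token, kv_cache))

    # Interactive mode (llama.cpp style): the cache persists, so a turn
    # that adds one token computes one position.
    process(["Hello", ",", " world"])       # computes 3 positions
    process(["Hello", ",", " world", "!"])  # computes only 1 new position

A stateless API drops the cache after every request, so the whole context window gets recomputed each time, and that cost grows with the window.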

The transformer architecture can also be trained differently, so as to generate "context vectors" of fixed size that summarize all the previous messages of the conversation (an encoder-decoder architecture). This context vector can be kept on the server more easily, and it will contain the gist and the important details of the conversation, but it won't be able to quote things from the past exactly. The context vector is then used to condition the generation of the reply. Once the chatbot has replied and received a new prompt, you update the fixed-size context vector (with a distinct neural network) to incorporate the latest information, and use it to condition the next generation, ad infinitum.
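A toy version of that update loop (a sketch only, using a GRU cell as the "distinct neural network"; all dimensions and names are illustrative):

    # Fold each new message into a fixed-size context vector that then
    # conditions generation. Requires torch.
    import torch
    import torch.nn as nn

    CTX_DIM, MSG_DIM = 256, 512

    encoder = nn.Linear(MSG_DIM, CTX_DIM)   # stand-in message encoder
    updater = nn.GRUCell(CTX_DIM, CTX_DIM)  # the "distinct neural network"

    context = torch.zeros(1, CTX_DIM)  # fixed size, however long the chat

    def incorporate(message_embedding: torch.Tensor) -> torch.Tensor:
        """Fold one message into the running context vector."""
        global context
        context = updater(encoder(message_embedding), context)
        return context

    # Each turn: fold in the user prompt, condition generation on
    # `context`, then fold in the model's own reply, ad infinitum.
    incorporate(torch.randn(1, MSG_DIM))  # user prompt
    incorporate(torch.randn(1, MSG_DIM))  # assistant reply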


>I didn't disagree about how you can reproduce things. You have to do that to continue a conversation.

You do not have to reproduce the internal states to continue a conversation. When prior parts of the conversation are loaded into context, the hidden states which generated those prior tokens are not reproduced. They are only reproduced if resubmitted in a piecemeal fashion, which does not happen in normal conversation continuation.

You seem to not understand what the internal state is or how it is differentiated from the external state of the overall conversation.


Ok, I agree! That one sentence of mine was wrong. You don't have to reproduce state to continue a conversation. I was just trying to throw a bone, but even that bone was bad.

But then... doesn't that just point out that I was right on everything else? That there is absolutely no state between chat messages?




