
> LLMs have no short-term memory. Their internal state never changes.

Many have interpreted recent advances in LLMs as a sign that strong AI is just around the corner. Yet their static nature is an obvious blocker to ever achieving that – humans, even animals, learn continuously, unlike LLMs, where training and runtime are two completely separate processes.

Which makes me wonder – how difficult would it be to build something like an LLM that learns continuously, just as humans do? Could it be done by modifying existing LLM architectures such as Transformers, or would it require a completely different architecture?




It's actually really easy. Recurrent neural networks have been shown to be fully capable of language modelling. For example, RWKV has pretty good recall, on par with human short-term memory (in my opinion). The issue is training: if you train RWKV like an RNN, it's really expensive and time-consuming. If you train it like a GPT, it's tractable.

The math works out the same either way. Keep in mind that LLMs learn orders of magnitude faster than humans, thanks to the parallel nature of transformers.
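
To make the RNN-vs-GPT training point concrete, here's a toy sketch in Python (this is not the actual RWKV formulation, just the simplest linear recurrence that shows the trick): the same state update can be computed step by step like an RNN, or for all timesteps at once like a transformer, and the parallel form is what makes training tractable.

    import numpy as np

    # Toy linear recurrence: s[t] = decay * s[t-1] + x[t]
    decay = 0.9
    x = np.random.randn(8)          # a short input sequence

    # 1) RNN-style: sequential, one timestep at a time (slow to train)
    s, seq = 0.0, []
    for t in range(len(x)):
        s = decay * s + x[t]
        seq.append(s)

    # 2) GPT-style: all timesteps at once as a causal weighted sum
    #    over the past (parallelisable, so training is tractable)
    t = np.arange(len(x))
    w = np.tril(decay ** (t[:, None] - t[None, :]))   # decay^(t-i), i <= t
    par = w @ x

    assert np.allclose(seq, par)    # same math either way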


I don't think that's it though – in recurrent neural networks, there is still a distinction between internal state (dynamic) and weights (static).

Whereas, in biological brains, the weights are updated continuously.
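
In code, the distinction looks like this: a minimal NumPy RNN step where the hidden state h changes with every input, but the weight matrices never do outside of training.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((4, 4))   # weights: fixed after training
    U = rng.standard_normal((4, 3))   # weights: fixed after training
    h = np.zeros(4)                   # internal state: updated at runtime

    for x in rng.standard_normal((5, 3)):   # a few input steps
        h = np.tanh(W @ h + U @ x)          # h changes; W and U do not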


There is also a distinction in the brain between long-term and short-term memories. It is not so beyond belief that the brain consolidates short-term memories into long-term memories in a separate process (perhaps sleep, the lack of which is known to be linked with memory-recall issues).

Recurrent neural networks 'learn' continuously by (in effect) changing their weights based on the previous state. There are papers showing that attention mechanisms in transformers basically provide a 'weight update' function, so that models can 'learn' to accomplish a task from a few examples. In other words, transformer networks 'learn' to train themselves on the examples given. It is not so beyond belief that recurrent neural networks do the same thing: they learn to set up their internal states so that future tasks can be affected by previous inputs and patterns. In fact, playing around with models like RWKV, you soon learn that this is a fundamental part of the model. If you start talking to it in a certain way, it will start echoing that back. Clearly it's 'learning'.
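
One way to picture the 'attention as weight update' idea is the fast-weights view of linear attention: each in-context example gets absorbed into a matrix via an outer-product update, which is exactly the shape of a one-step weight update. A toy sketch (an illustration of that view, not any particular paper's exact method):

    import numpy as np

    d = 4
    rng = np.random.default_rng(1)
    W_fast = np.zeros((d, d))            # 'weights' written by the context

    for _ in range(6):                   # six in-context examples
        k = rng.standard_normal(d)       # key   (from an input token)
        v = rng.standard_normal(d)       # value (from the same token)
        W_fast += np.outer(v, k)         # outer-product 'weight update'

    q = rng.standard_normal(d)           # a new query
    out = W_fast @ q                     # read out what the context 'taught'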


> Whereas, in biological brains, the weights are updated continuously.

My personal impression is that many "weights" are updated during sleep. For example, when practicing juggling, I will make no progress at all over hours of practice, but after a night of sleep I will see a large and instant improvement.


If I learn something new in the morning, very often I still remember it in the afternoon, even though I haven't been to sleep yet.

Neuroscientists/psychologists/etc believe [0] humans have four tiers of memory: sensory memory (stores what you are experiencing right now, lasts for less than a second); working memory (lasts up to 30 seconds); intermediate-term memory (lasts 2-3 hours); long-term memory (anything from 30 minutes ago until the end of your life).

We don't need to sleep to form new long-term memories – if at dinner time you can still remember what you ate for breakfast (I usually can if I think about it), that's your long-term memory at work. What we need sleep for is pruning our long-term memory – each night the brain basically runs a compression process, deciding which long-term memories to keep and which to throw away (forget), and how much detail to keep for each memory.

Regarding your juggling example – most neuroscientists believe that the brain stores different types of memories differently. How to perform a task is a procedural memory, and new or improved motor skills such as juggling are a particular form of procedural memory. How the brain processes them is likely quite different from how it processes episodic memories (events of your life) or semantic memories (facts, general knowledge, etc). Sleep may play a somewhat different role for each different memory type, so what's true for learning juggling may not be true for learning facts.

[0] https://en.wikipedia.org/wiki/Intermediate-term_memory


If performance and computing cost weren't an issue (which they are!), then you could just have a page of code that continuously runs the LLM in a loop, outputting both the actual output and some "short-term memory state" that gets fed into the next iteration, and that IMHO might get you at least halfway there.
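
Something like this, roughly. `call_llm` is a hypothetical stand-in for whatever model or API you use; the prompt asks the model to emit both a reply and an updated memory blob, which gets fed back in on the next turn:

    def call_llm(prompt: str) -> str:
        raise NotImplementedError   # plug in your model here

    memory = ""
    while True:
        user_input = input("> ")
        prompt = (f"Memory so far:\n{memory}\n\n"
                  f"User: {user_input}\n\n"
                  "Reply, then on a final line starting with MEMORY: "
                  "write an updated summary of everything worth remembering.")
        response = call_llm(prompt)
        # sketch only: assumes the model actually follows the format
        reply, _, memory = response.rpartition("MEMORY:")
        print(reply.strip())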


In humans, sleep is also required for learning, so it's not fully continuous. An AI that occasionally retrains on the new knowledge would still be very interesting.


From what I understand, there are two different processes – a continuous process which runs while we are awake, building new connections between neurons (acquiring new "weights"); and a pruning process which runs while we are asleep, sorting through those new connections and deciding which to keep and which to discard. That's very different from an AI that occasionally retrains.
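
As a loose analogy in code (not a training recipe, just the shape of the two processes): connections accumulate strength continuously while 'awake', and a separate 'sleep' pass keeps only the strongest ones.

    import numpy as np

    rng = np.random.default_rng(2)
    weights = {}                        # connection -> strength

    for step in range(1000):            # 'awake': grow connections all day
        conn = tuple(rng.integers(0, 20, size=2))
        weights[conn] = weights.get(conn, 0.0) + abs(rng.standard_normal())

        if step % 250 == 249:           # 'sleep': prune the weakest half
            cutoff = np.median(list(weights.values()))
            weights = {c: w for c, w in weights.items() if w >= cutoff}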


But what if you could run a newly acquired neuron "weight" through another LLM, which could, based on its provided context (guidelines), determine the importance of this new neuron? That loop could create more accurate and higher-quality data that could be used to re-train (fine-tune) a new version of the LLM.
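
A sketch of that loop, with hypothetical stand-ins for both models: `judge_llm` scores each candidate against the guidelines, and only the high-scoring ones make it into the fine-tuning set.

    def judge_llm(guidelines: str, item: str) -> float:
        raise NotImplementedError   # e.g. prompt a second model for a 0-1 score

    def collect_finetune_data(guidelines, candidates, threshold=0.8):
        # keep only what the judge considers important and accurate
        return [c for c in candidates if judge_llm(guidelines, c) >= threshold]

    # the surviving examples would then feed a fine-tuning run of the base model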


Not quite the same though - if you teach me how to operate a drill press, I'll be able to do it today; I don't have to spend 8 hours in bed before the knowledge is available to me.



