
With the introduction of plugins, is it feasible to give ChatGPT some kind of long-term and short-term memory model?



OpenAI is actually thinking about this too. It's buried in their open-source repo, and it's not clear exactly how ChatGPT knows to make use of it. But evidently we're already there.

https://github.com/openai/chatgpt-retrieval-plugin#memory-fe...
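
The repo doesn't spell out the mechanism, but the underlying pattern is presumably the usual one: embed each saved snippet, keep it in a vector index, and pull back the nearest snippets at query time. A minimal Python sketch of that pattern (not the plugin's actual code; the embedding model and in-memory store here are my assumptions):

    import numpy as np
    import openai  # 2023-era SDK, where openai.Embedding.create exists

    def embed(text):
        resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
        return np.array(resp["data"][0]["embedding"])

    memory = []  # list of (vector, text) pairs -- the long-term store

    def remember(text):
        memory.append((embed(text), text))

    def recall(query, k=3):
        q = embed(query)
        # ada embeddings are unit-norm, so dot product == cosine similarity
        ranked = sorted(memory, key=lambda item: -np.dot(item[0], q))
        return [text for _, text in ranked[:k]]

The model "saves" a fact by triggering remember(...) through the plugin, and the top recall(...) hits get injected back into its prompt on later turns.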


LangChain is a great workaround for that. [1]

> how to work with a memory module that remembers things about specific entities. It extracts information on entities (using LLMs) and builds up its knowledge about that entity over time (also using LLMs).

[1] https://python.langchain.com/en/latest/modules/memory/types/...
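
From memory, the docs example boils down to roughly this (2023-era LangChain import paths, which may have moved since):

    from langchain.llms import OpenAI
    from langchain.chains import ConversationChain
    from langchain.memory import ConversationEntityMemory
    from langchain.memory.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE

    llm = OpenAI(temperature=0)
    conversation = ConversationChain(
        llm=llm,
        prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,
        # extracts entities with the LLM and accumulates facts about each one
        memory=ConversationEntityMemory(llm=llm),
    )

    conversation.predict(input="Deven and Sam are working on a hackathon project.")
    conversation.predict(input="What do you know about Deven?")  # answered from the entity store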


There are attempts via LangChain [0]. Depending on how much context is required, I could see a summary step where the history is compressed and used to carry forward progress (rough sketch below).

An alternative could be a vector store, injecting small snippets of relevant text as a step.

0 - https://python.langchain.com/en/latest/modules/memory/key_co...
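
A rough sketch of that summary step with the plain 2023-era OpenAI chat API (prompt wording is mine; a rolling summary stands in for the full transcript):

    import openai

    summary = ""  # compressed history carried forward between turns

    def chat(user_msg):
        global summary
        reply = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "Summary of the conversation so far: " + summary},
                {"role": "user", "content": user_msg},
            ],
        )
        answer = reply["choices"][0]["message"]["content"]

        # Fold the new exchange back into the running summary instead of
        # keeping the full transcript around.
        fold = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{
                "role": "user",
                "content": "Update this summary:\n" + summary +
                           "\n\nwith this exchange:\nUser: " + user_msg +
                           "\nAssistant: " + answer,
            }],
        )
        summary = fold["choices"][0]["message"]["content"]
        return answer

The vector-store variant is the same idea with embeddings: store each exchange, then retrieve and inject only the snippets most relevant to the current question.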


Maybe you could give it a combination of both. We'll call it long short-term memory.


The reason I ask is that I feel a memory model is one of the major bottlenecks on the path to AGI.


On a more serious note, I do agree with you that memory and self-excitation seem like they are the last push that's needed to get to something more akin to "AGI". But I don't think that Rubicon will be crossed with plugins.


> I do agree with you that memory and self-excitation seem like they are the last push that's needed to get to something more akin to "AGI"

"We show that transformer-based large language models are computationally universal when augmented with an external memory. Any deterministic language model that conditions on strings of bounded length is equivalent to a finite automaton, hence computationally limited. However, augmenting such models with a read-write memory creates the possibility of processing arbitrarily large inputs and, potentially, simulating any algorithm."

From "Memory Augmented Large Language Models are Computationally Universal"

https://deepai.org/publication/memory-augmented-large-langua...
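
A toy illustration of the read/write loop that result relies on: the model emits simple commands, a driver executes them against an external store, and the result is fed back as the next prompt. The command syntax here is made up, not the paper's:

    import re

    memory = {}  # the external read/write store (maps slot name -> value)

    def step(model_output):
        """Run one model-emitted command and return what gets fed back in."""
        write = re.match(r"WRITE (\w+) (.*)", model_output)
        if write:
            memory[write.group(1)] = write.group(2)
            return "ok"
        read = re.match(r"READ (\w+)", model_output)
        if read:
            return memory.get(read.group(1), "")
        return model_output  # anything else is treated as a final answer

    # The model only ever sees a bounded prompt, but looping
    # prompt -> command -> memory -> next prompt lets it carry state
    # across arbitrarily many steps.
    step("WRITE counter 41")
    step("WRITE counter 42")
    print(step("READ counter"))  # -> 42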


Why? Short- and long-term memory is really easy to do. Even my own basic assistant has it (running on a fine-tuned Curie model).


I suspect with a 'window' of 32k tokens, OpenAI has already done similar memory tricks.

I suspect that if you filled the context window with "1 1 1 1 1 1 1 1 1 1", and then asked "How many 1's did I just show you?", it probably wouldn't know, simply because whatever tricks they use to have such an apparently large context window don't allow it to 'see' all of it at any given moment.
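
That's easy enough to probe, something along these lines (the model name and token count are guesses; I haven't run this against the 32k model):

    import openai

    # "1 " repeated tokenizes to roughly one token each, so this is ~15k tokens of filler
    prompt = "1 " * 15000 + "\nHow many 1's did I just show you? Answer with a number only."

    resp = openai.ChatCompletion.create(
        model="gpt-4-32k",  # assumed name for the 32k-window model
        messages=[{"role": "user", "content": prompt}],
    )
    print(resp["choices"][0]["message"]["content"])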


Ah, so you think the 32k context window works differently than, e.g., the 4k davinci context window? They didn't just increase ${hyperparam}?


Training compute goes up with approximately the 3rd power of the window size.

So turning a 4k window to a 32k window means a 512x increase in compute they'd need (just to maintain similar output quality).
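
Spelled out, under that cubic assumption:

    old_window, new_window = 4_000, 32_000
    scale = new_window / old_window  # 8x longer window
    print(scale ** 3)                # 512.0 -- hence the ~512x figure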

I suspect they must have found a better solution to be able to scale the window so big. They haven't announced what it is.


Very interesting, thanks



