PMET: Precise Model Editing in a Transformer (arxiv.org)
119 points by PaulHoule on Aug 27, 2023 | 13 comments



FYI, Meng et al. (2022) [1] is pretty much required reading in order to understand this paper.

[1] https://arxiv.org/abs/2202.05262


Yannic did a great interview with the authors some time ago https://youtu.be/_NMQyOu2HTo


This may drop the cost and significantly increase the feasibility of government- or court-mandated changes, censorship, or edits to models.



The PRC would doubtless have an interest in precisely removing all knowledge of certain historical facts from LLMs within China.


That's just one application.

One of the worst problems of LLMs at this point in time is keeping them updated.

For instance, ChatGPT should be able to talk about the Super Bowl in 1984 when the Chicago Bears trounced the New England Patriots (I remember it well because I grew up in New England!), but I couldn't expect it to have anything to say about the (other kind of football) game I saw yesterday where West Ham beat Brighton, because nothing about the latter game is in the training set.

This problem just gets worse as time passes and the world continues to change. Bing's chatbot works around this, for my soccer example, by running a conventional query and then having the LLM summarize the results, which gave a pretty good summary of the game. But when I asked it pointed questions about this particular game, such as "Who had the most possession?" (relevant because possession was really lopsided in the direction of the losing team), it fell down. It seemed to be working off structured statistics that didn't have this data, as opposed to media reports of the game, which surely would have noticed it.

With current technology they will need to rebuild the whole thing one day, which will (1) be crazy expensive and (2) break all the document vectors that people have saved from the model, which will be a big problem for anybody using systems like LangChain or doing embedding-based similarity search.

There's a lot of need for some way to update an LLM incrementally without wrecking its performance, and this kind of research points to one path toward that.


The most promising work along these lines centers on augmenting LLMs with an external data store ("retrieval-augmented LLMs"). I think this started with Facebook's kNN-LM ( https://arxiv.org/pdf/1911.00172.pdf ). Legal conflict may force the industry to move toward vector DBs as the predominant place where facts are "stored," rather than model parameters ( https://arxiv.org/pdf/2308.04430.pdf ), with the happy side effect of updateability over time.
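
For anyone who hasn't read the kNN-LM paper, the core idea is simple enough to sketch: at each step you look up the current context vector in a datastore of (context vector, next token) pairs and interpolate the resulting distribution with the base LM's. A rough, illustrative Python sketch follows; the function and parameter names, the interpolation weight, and the distance-to-weight scheme are mine, not the paper's code:

    # Rough sketch of the kNN-LM idea: mix the base LM's next-token
    # distribution with one built from the k nearest stored contexts.
    import numpy as np

    def knn_lm_next_token_probs(context_vec, p_lm, datastore_keys,
                                datastore_tokens, vocab_size,
                                k=8, lam=0.25, temperature=1.0):
        # Find the k stored contexts closest to the current one (L2 distance).
        dists = np.linalg.norm(datastore_keys - context_vec, axis=1)
        nearest = np.argsort(dists)[:k]

        # Turn distances into a distribution over the tokens that followed them.
        weights = np.exp(-dists[nearest] / temperature)
        weights /= weights.sum()
        p_knn = np.zeros(vocab_size)
        for w, idx in zip(weights, nearest):
            p_knn[datastore_tokens[idx]] += w

        # Interpolate: facts live in the datastore, fluency in the LM.
        return lam * p_knn + (1 - lam) * p_lm

The appeal for the updating problem above is that adding or removing facts only touches the datastore, not the model weights.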



Crap, I got the year wrong... it was 1986.

https://en.wikipedia.org/wiki/Super_Bowl_XX


How do you save a document vector and do similarity search with it?


There is this:

https://github.com/openai/chatgpt-retrieval-plugin

I just use SBERT, which has models I can run locally:

https://sbert.net/
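
For what it's worth, the local route is only a few lines. A minimal sketch, where the model name is just one commonly used small checkpoint, not a recommendation:

    # Minimal local embedding with sentence-transformers; "all-MiniLM-L6-v2"
    # is an illustrative small checkpoint that runs fine on CPU.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    vecs = model.encode(["West Ham beat Brighton yesterday.",
                         "The Bears won Super Bowl XX."])
    print(vecs.shape)  # (2, 384) for this checkpoint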


You encode your document with some kind of embedding, e.g. HuggingFace Sentence Transformers: https://www.sbert.net/ (probably the most commonly used) or OpenAI Embeddings: https://platform.openai.com/docs/guides/embeddings/what-are-..., and then use a vector database (Elastic, Postgres, FAISS, or whatever) to do a similarity search.
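
As a concrete (hedged) example of that flow, here's a sketch using SBERT for the embeddings and FAISS as the store; with normalized vectors, inner product is cosine similarity. The documents and query are just placeholders:

    # Sketch: embed documents, index them in FAISS, retrieve by cosine
    # similarity (inner product over normalized vectors).
    import numpy as np
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    docs = [
        "West Ham beat Brighton in yesterday's match.",
        "The Chicago Bears won Super Bowl XX in January 1986.",
    ]
    doc_vecs = np.asarray(model.encode(docs, normalize_embeddings=True),
                          dtype="float32")

    index = faiss.IndexFlatIP(doc_vecs.shape[1])  # exact inner-product search
    index.add(doc_vecs)

    query = np.asarray(model.encode(["Who won the Super Bowl?"],
                                    normalize_embeddings=True), dtype="float32")
    scores, ids = index.search(query, 1)
    print(docs[ids[0][0]], float(scores[0][0]))

Saving doc_vecs (and the mapping back to documents) is what "saving a document vector" amounts to; any of the stores mentioned above works the same way in principle.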


They could just use it without publishing the paper… I wonder what the reason could be…



