One quick plug I want to have the memory part of langchain down, vector store + ...

santiagobasulto · on May 22, 2023

Sorry for my ignorance. But memory refers to the process of using embeddings for QA right?

The process roughly is:

Ingestion:

- Process embeddings for your documents (from text to array of numbers)

- Store your documents in a Vector DB

Query time:

- Process embeddings for the query

- Find documents similar to the query using distance from other docs in the Vector db

- Construct prompt with format:

""" Answer question using this context: {DOCUMENTS RETRIEVED}

Question: {question} Answer: """

Is that correct? Now, my question is, can the models be swapped easily? Or that requires a complete recalculation of the embedding (and new ingestion)?

bigfudge · on May 22, 2023

The embeddings can be based on a different model to the one you pass them as context to. So you could upgrade the summmariser model without upgrading the embeddings.

santiagobasulto · on May 22, 2023

But you'd need to keep both models in parallel, right? Using M1 to keep computing embeddings and using M2 for completions.