
One quick plug

I want to have the memory part of LangChain down: a vector store + local database + a client to chat with an LLM (the gpt4all model can be swapped for the OpenAI API just by switching the base URL).

https://github.com/aldarisbm/memory

It's still got a ways to go; if someone wants to help, let me know :)
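
For the base-URL swap, a minimal sketch (endpoint, key, and model handling here are my assumptions, not taken from the repo): the same OpenAI-style client talks either to a local gpt4all-compatible server or to the hosted API, and only the constructor arguments change.

    # Same calling code, two endpoints; only base_url / api_key differ.
    from openai import OpenAI

    # Local OpenAI-compatible server (e.g. gpt4all); URL is a placeholder.
    local = OpenAI(base_url="http://localhost:4891/v1", api_key="not-needed")

    # Hosted OpenAI; key is a placeholder.
    hosted = OpenAI(api_key="sk-...")

    def complete(client, model, prompt):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content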




Sorry for my ignorance, but "memory" refers to the process of using embeddings for QA, right?

The process is roughly as follows (a code sketch of both phases follows the prompt format below):

Ingestion:

- Compute embeddings for your documents (from text to arrays of numbers)

- Store your documents in a Vector DB

Query time:

- Compute the embedding for the query

- Find documents similar to the query by embedding distance in the vector DB

- Construct prompt with format:

""" Answer question using this context: {DOCUMENTS RETRIEVED}

Question: {question} Answer: """
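
In code, my mental model of those two phases is roughly this (sentence-transformers for the embeddings, a plain in-memory list standing in for the vector DB; model name and prompt wording are just illustrative):

    import numpy as np
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")

    # Ingestion: text -> arrays of numbers, stored alongside the documents.
    docs = ["LangChain memory keeps past conversation turns.",
            "A vector DB indexes embeddings for similarity search."]
    doc_vecs = embedder.encode(docs, normalize_embeddings=True)

    # Query time: embed the question, rank documents by similarity,
    # and paste the best matches into the prompt as context.
    def build_prompt(question, k=2):
        q_vec = embedder.encode([question], normalize_embeddings=True)[0]
        scores = doc_vecs @ q_vec  # cosine similarity, since vectors are normalized
        top = [docs[i] for i in np.argsort(scores)[::-1][:k]]
        context = "\n".join(top)
        return ("Answer question using this context: " + context +
                "\n\nQuestion: " + question + " Answer:")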

Is that correct? Now, my question is: can the models be swapped easily, or does that require a complete recalculation of the embeddings (and a new ingestion)?


The embeddings can come from a different model than the one you pass them to as context. So you could upgrade the summariser model without touching the embeddings.
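
As a rough sketch (model names are placeholders): the embedding model stays pinned to whatever the documents were ingested with, while the completion model can be swapped without touching the vector store.

    from openai import OpenAI

    client = OpenAI()

    EMBEDDING_MODEL = "text-embedding-3-small"  # must match what the stored docs were embedded with
    COMPLETION_MODEL = "gpt-4o-mini"            # free to upgrade; stored vectors are unaffected

    def embed(text):
        # Query embeddings have to come from the same model as the ingested ones,
        # otherwise they don't live in the same vector space.
        return client.embeddings.create(model=EMBEDDING_MODEL, input=text).data[0].embedding

    def answer(prompt):
        resp = client.chat.completions.create(
            model=COMPLETION_MODEL,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content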


But you'd need to keep both models in parallel, right? Using M1 to keep computing embeddings and using M2 for completions.



