I am interested in making LLMs aware of the contents of my project, so they know what classes/functions/variables exist outside the current file/prompt. My first idea for "adding" knowledge of the code base (assuming it is too large to fit into the prompt) would be to fine-tune the LLM on the code. Has anyone tried this, or does anyone know of work on it?
Try embedding, semantic search, retrieval, and plugging the relevant parts into the prompt.
You may need:
- a summarizer prompt to summarize your project structure, main functions, and methods
- a vector store/database to store and retrieve the relevant code from your code base
- a coder prompt to write code based on the retrieved parts (a rough sketch of this flow is below)
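Here is a minimal sketch of that embed → search → retrieve → prompt loop, assuming sentence-transformers as the embedding model and a plain in-memory array as the "vector store"; the chunking (one chunk per file), model choice, and prompt wording are all illustrative, not recommendations:

```python
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# 1. Split the code base into chunks (here: one chunk per .py file, for simplicity).
paths = list(Path("my_project").rglob("*.py"))
chunks = [p.read_text() for p in paths]

# 2. Embed every chunk once; the matrix of vectors acts as a crude vector store.
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query (cosine similarity)."""
    query_vector = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vectors @ query_vector
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

# 3. Plug the retrieved chunks into the coder prompt.
question = "Where is the User class defined and what methods does it have?"
context = "\n\n".join(retrieve(question))
prompt = f"Relevant code from the project:\n{context}\n\nTask: {question}"
```

In practice you would chunk at the function/class level and persist the index, but the overall shape stays the same.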
Check out langchain: https://langchain.readthedocs.io/
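LangChain packages roughly the same pipeline. A sketch, with the caveat that module paths and class names have shifted across LangChain versions, so treat the imports below as approximate rather than exact API:

```python
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Load and chunk one source file (repeat or glob over the whole project).
docs = TextLoader("my_project/models.py").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Embed the chunks and store them in a local FAISS index.
db = FAISS.from_documents(chunks, OpenAIEmbeddings())

# Retrieve the chunks most relevant to a question and feed them to the coder prompt.
hits = db.similarity_search("Where is the User class defined?", k=3)
context = "\n\n".join(doc.page_content for doc in hits)
```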