Yes, we hope that the YAMLs will provide a clear separation between configuration and code, allowing easier deployment of many apps from the same codebase, as well as more principled experiments.
Thanks for the hint, we'll do a large summary of all options.
An index that combines multiple indexing techniques, e.g. vector search with more classical information-retrieval techniques such as BM25. We found that they have different strengths: vector indexes are very good at catching synonyms and indirect matches, while classical term-based indexes are better for direct queries. Hybridization gives you the best of both worlds.
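To make the hybridization concrete, here is a minimal sketch of one common recipe: reciprocal rank fusion over a BM25 ranking and a vector ranking. The rank_bm25 package is real; the embed() function is a stand-in for whatever embedding model you use, and the RRF constant is the conventional default, not something specific to our implementation.

    import numpy as np
    from rank_bm25 import BM25Okapi  # pip install rank-bm25

    def embed(text):
        # Stand-in for a real embedding model (e.g. a sentence transformer).
        raise NotImplementedError

    def hybrid_search(query, docs, k=5):
        # Term-based ranking: BM25 over whitespace-tokenized documents.
        bm25 = BM25Okapi([d.split() for d in docs])
        bm25_rank = np.argsort(-bm25.get_scores(query.split()))

        # Vector ranking: cosine similarity between query and document embeddings.
        doc_vecs = np.stack([embed(d) for d in docs])
        q = embed(query)
        sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
        vec_rank = np.argsort(-sims)

        # Reciprocal rank fusion: a document scores well if either index ranks it highly.
        rrf = np.zeros(len(docs))
        for ranking in (bm25_rank, vec_rank):
            for position, doc_id in enumerate(ranking):
                rrf[doc_id] += 1.0 / (60 + position)  # 60 is the usual RRF constant

        return [docs[i] for i in np.argsort(-rrf)[:k]]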
Hey, I'm Jan, a Machine Learning researcher, and CTO of Pathway. I've been working on RAG for the last year and I'm excited to share a new RAG optimization strategy that adapts the number of supporting documents to the LLM behavior on a given question.
The approach builds on the ability of LLMs to know when they don't know how to answer. With proper LLM confidence calibration, adaptive RAG is as accurate as large-context RAG, while being much cheaper to run.
What was really interesting for us here is that the basic idea is "geometric doubling", but it needs to be put in place with a lot of care, because of the counter-intuitive correlations between the mistakes LLMs make on different prompts.
We provide runnable code examples; you will also find a reference implementation of the strategy in the Pathway LLM expansion pack.
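For readers who want the gist before opening the repo, here is a minimal sketch of the doubling loop, not the reference implementation. The retrieve() and ask_llm() helpers are hypothetical placeholders for your retriever and LLM client, and the refusal check is exactly the calibration point mentioned above.

    def adaptive_answer(question, retrieve, ask_llm, max_docs=64):
        # Geometric doubling: start with a small context, grow it only when needed.
        k = 2  # initial number of supporting documents
        while True:
            docs = retrieve(question, k)  # hypothetical: top-k documents for the question
            answer = ask_llm(             # hypothetical: any chat-completion wrapper
                "Answer using only the context below. If the context is "
                "insufficient, reply exactly: I don't know.\n\n"
                "Context:\n" + "\n".join(docs) + "\n\nQuestion: " + question
            )
            # If the (calibrated) model admits it cannot answer, double the context.
            if "i don't know" not in answer.lower() or k >= max_docs:
                return answer
            k *= 2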
Now you can use Airbyte source connectors to process data in memory with Python.
We integrated Airbyte connectors with Pathway, a Python stream processing framework, using the airbyte-serverless project. We believe ETL pipelines are coming back, with many use cases in AI: RAG pipelines, ETL for unstructured data, and pipelines that deal with PII data. In this article, we show how to stream data from GitHub using Airbyte and remove PII data with Pathway. We are curious about your feedback on the implementation, and about other use cases you can think of for decoupling the extract and load steps.
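To give a flavor of the integration, here is a rough sketch; the config path, stream name, and the data column are illustrative assumptions, the article has the exact setup.

    import re
    import pathway as pw

    # Stream GitHub commits through an Airbyte source connector.
    # The YAML path and stream name below are placeholders.
    commits = pw.io.airbyte.read("./github-commits.yaml", streams=["commits"])

    @pw.udf
    def scrub_pii(record: pw.Json) -> str:
        # Toy PII removal: mask anything that looks like an email address.
        # A real pipeline would plug in a proper PII detector here.
        return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "<email>", str(record))

    # We assume the connector exposes the raw record in a `data` column.
    scrubbed = commits.select(clean=scrub_pii(pw.this.data))

    pw.io.jsonlines.write(scrubbed, "commits_scrubbed.jsonl")
    pw.run()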
I had no idea it was originally talking about a Pascal book, thanks for the link! Fun to read old versions and see what changed over time (language recommendations, for example).
Sure: when a new response is produced because some source documents have changed, we ask an LLM to compare the responses and tell us whether they are significantly different. Even a simplistic prompt, like the one used in the example, would do:
Are the two following responses deviating?
Answer with Yes or No.
First response: "{old}"
Second response: "{new}"
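Wired up, the check is just a few lines. A sketch with the OpenAI client, using the prompt quoted above (the model name is an arbitrary choice):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def responses_deviate(old, new):
        # Ask an LLM whether a regenerated response differs significantly.
        prompt = (
            "Are the two following responses deviating?\n"
            "Answer with Yes or No.\n"
            f'First response: "{old}"\n'
            f'Second response: "{new}"'
        )
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # arbitrary model choice
            messages=[{"role": "user", "content": prompt}],
        )
        return reply.choices[0].message.content.strip().lower().startswith("yes")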
That's a good idea. The deduplication criterion is easy to change: using an LLM is faster to get started, but after a while a corpus of decisions builds up, and it can be used either to select another mechanism or, e.g., to train one on top of BERT embeddings.
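For example, once a log of (old, new, verdict) triples exists, a small classifier over sentence embeddings is one plausible replacement. A sketch under those assumptions (the embedding model is an arbitrary small one):

    import numpy as np
    from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
    from sklearn.linear_model import LogisticRegression

    model = SentenceTransformer("all-MiniLM-L6-v2")  # arbitrary small embedding model

    def pair_features(old, new):
        # Represent a response pair by its two embeddings and their difference.
        a, b = model.encode([old, new])
        return np.concatenate([a, b, np.abs(a - b)])

    def train_deviation_classifier(decisions):
        # `decisions` is the logged corpus: (old, new, deviated?) triples.
        X = np.stack([pair_features(o, n) for o, n, _ in decisions])
        y = np.array([int(v) for _, _, v in decisions])
        return LogisticRegression(max_iter=1000).fit(X, y)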
An index is a software building block, which becomes a database when wrapped in a data management system. We will see more and more traditional databases add a vector-search index; for instance, pgvector makes a vector database out of PostgreSQL.
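As a small illustration of the pgvector point, a sketch against a local PostgreSQL with the extension installed (connection string and table layout are made up for the example):

    import psycopg2  # pip install psycopg2-binary

    conn = psycopg2.connect("dbname=demo")  # placeholder connection string
    cur = conn.cursor()

    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute(
        "CREATE TABLE IF NOT EXISTS docs "
        "(id bigserial PRIMARY KEY, body text, embedding vector(3));"
    )
    cur.execute(
        "INSERT INTO docs (body, embedding) VALUES (%s, %s::vector)",
        ("hello world", "[0.1, 0.2, 0.3]"),
    )

    # Nearest-neighbor query: <-> is pgvector's Euclidean distance operator.
    cur.execute(
        "SELECT body FROM docs ORDER BY embedding <-> %s::vector LIMIT 5",
        ("[0.1, 0.2, 0.3]",),
    )
    print(cur.fetchall())
    conn.commit()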
The LLM App is meant to be self-sufficient and takes a "batteries included" approach to system development: rather than combining several separate applications (databases, orchestrators, ETL pipelines) into a large deployment, it combines software components such as connectors and indexes into a single app which can be deployed directly with no extra dependencies.
Such an approach should make deployments easier (there are fewer moving parts to monitor and service), while also being more hackable: e.g., adding some extra logic on top of nearest-neighbor retrieval is easy and adds only a few statements to the code.
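For instance, a recency boost on top of plain nearest-neighbor results really does fit in a few statements; the (doc, similarity, age_days) shape of the hits is made up for the example:

    def rerank_with_recency(hits, half_life_days=30.0, k=3):
        # Downweight stale documents on top of a nearest-neighbor result list.
        # `hits` is assumed to be (doc, similarity, age_days) triples from the index.
        rescored = [
            (doc, sim * 0.5 ** (age / half_life_days), age)
            for doc, sim, age in hits
        ]
        rescored.sort(key=lambda t: t[1], reverse=True)
        return [doc for doc, _, _ in rescored[:k]]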
This depends on the data source used. Some track updatable collections, while others are more append-only in nature. For instance, tracking a database table using CDC with Debezium supports reacting to all document changes out of the box.
For file sources, we are working on supporting file versioning and integration with S3's native object versioning. Then simply deleting a file or uploading a new version would be enough to trigger re-indexing of the affected documents.