Everyone's trying vectors and graphs for AI memory. We went back to SQL

ianbicking · 2025-09-24T16:46:00 1758732360

This looks like RAG...? That's fine, RAG is a very broad approach and there's lots to be done with it. But it's not distinct from RAG.

Searching by embedding is just a way to construct queries, like ILIKE or tsvector. It works pretty nicely, but it's not distinct from SQL given pgvector etc.

The more distinctive feature here seems to be some kind of proxy (or monkeypatching?) – is it rewriting prompts on the way out to add memories to the prompt, and creating memories from the incoming responses? That's clever (but I'd never want to deploy that).
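
To make the guess above concrete, here is a minimal sketch of what such a proxy could look like. It is purely hypothetical: `MemoryProxy` and `fake_llm` are invented names, and nothing here is taken from the project's actual code.

```python
from typing import Callable, List

class MemoryProxy:
    """Hypothetical wrapper around an LLM call; not the project's real code."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.memories: List[str] = []

    def __call__(self, prompt: str) -> str:
        # On the way out: prepend whatever has been remembered so far.
        if self.memories:
            context = "\n".join(f"- {m}" for m in self.memories)
            full_prompt = f"Known from earlier:\n{context}\n\n{prompt}"
        else:
            full_prompt = prompt
        response = self.llm(full_prompt)
        # On the way back: store the exchange as a new memory. A real system
        # would presumably extract or summarize facts rather than keep raw text.
        self.memories.append(f"user: {prompt} / assistant: {response}")
        return response

seen: List[str] = []  # records what the stand-in model actually received

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call."""
    seen.append(prompt)
    return "Noted."

chat = MemoryProxy(fake_llm)
chat("I don't like coffee")
chat("Recommend a drink")
```

The second call reaches the model with the first exchange already prepended, which is the part that happens invisibly to the calling code.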

From another comment it seems like you are doing an LLM-driven query phase. That's a valid approach in RAG. Maybe these all work together well, but SQL seems like an aside. And it's already how lots of normal RAG or memory systems are built, it doesn't seem particularly unique...?

mobilemidget · 2025-09-25T05:36:01 1758778561

RAG, or Retrieval Augmented Generation, is an AI technique that improves large language models (LLMs) by connecting them to external knowledge bases to retrieve relevant, factual information before generating a response. This approach reduces LLM "hallucinations," provides more accurate and up-to-date answers, and allows for responses grounded in specialized or frequently updated data, increasing trust and relevance.

I was unaware of what RAG referred to; perhaps others were too.

thedevindevops · 2025-09-22T09:36:50 1758533810

How does what you've described solve the coffee/espresso problem? You can't query SQL such that records like 'espresso' return coffee?

brudgers · 2025-09-22T15:03:50 1758553430

Wouldn’t a beverage LLM already “know” espresso is coffee?

muzani · 2025-09-23T11:17:16 1758626236

Yup, that's exactly what parent comment is saying.

Let's say your beverage LLM is there to recommend drinks. You once said "I hate espresso" or even something like "I don't take caffeine" at one point to the LLM.

Before recommending coffee, Beverage LLM might do a vector search for "coffee" and it would match up to these phrases. Then the LLM processes the message history to figure out whether this person likes or dislikes coffee.

But searching SQL for `LIKE '%coffee%'` won't match with any of these.
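
A quick way to see this, as a minimal sketch with Python's built-in sqlite3 (the `messages` table and its rows are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO messages (body) VALUES (?)",
    [
        ("I hate espresso",),
        ("I don't take caffeine",),
        ("Coffee is fine in the morning",),
    ],
)

# Substring match only: SQLite's LIKE ignores ASCII case, but it knows
# nothing about espresso or caffeine being related to coffee.
hits = [
    row[0]
    for row in conn.execute(
        "SELECT body FROM messages WHERE body LIKE '%coffee%' ORDER BY id"
    )
]
```

Only the row that literally contains "coffee" comes back; the two statements that matter most for a recommendation are missed.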

sdesol · 2025-09-24T16:09:10 1758730150

I haven't looked at the code, but it might do what I do with my chat app which is talked about at https://github.com/gitsense/chat/blob/main/packages/chat/wid...

The basic idea is, you don't search for a single term but rather you search for many. Depending on the instructions provided in the "Query Construction" stage, you may end up with a very high level search term like beverage or you may end up with terms like 'hot-drinks', 'cold-drinks', etc.

Once you have the query, you can do a "Broad Search" which returns an overview of the message and from there the LLM can determine which messages it should analyze further if required.

Edit.

I should add, this search strategy will only work well if you have a post-message process. For example, after every message save/update, you have the LLM generate an overview. These are my instructions for my tiny overview https://github.com/gitsense/chat/blob/main/data/analyze/tiny... that is focused on generating the purpose and keywords that can be used to help the LLM define search terms.
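
Roughly, the save-then-search flow described above might be sketched like this. It is an illustration only, not the gitsense implementation: the schema, `fake_overview_keywords` (a canned stand-in for the LLM overview step), and the function names are all assumptions.

```python
import sqlite3
from typing import List

def fake_overview_keywords(body: str) -> List[str]:
    """Stand-in for the LLM overview step; a real system would call a model."""
    canned = {
        "I hate espresso": ["espresso", "coffee", "hot-drinks", "beverage"],
        "I don't take caffeine": ["caffeine", "coffee", "beverage"],
        "My laptop fan is loud": ["laptop", "hardware"],
    }
    return canned.get(body, [])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE keywords (message_id INTEGER, keyword TEXT)")

def save_message(body: str) -> None:
    """Post-message process: store the message, then its generated keywords."""
    cur = conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))
    conn.executemany(
        "INSERT INTO keywords (message_id, keyword) VALUES (?, ?)",
        [(cur.lastrowid, kw) for kw in fake_overview_keywords(body)],
    )

def broad_search(terms: List[str]) -> List[str]:
    """Return messages tagged with any of the search terms."""
    placeholders = ", ".join("?" for _ in terms)
    rows = conn.execute(
        "SELECT DISTINCT m.id, m.body FROM messages m "
        "JOIN keywords k ON k.message_id = m.id "
        f"WHERE k.keyword IN ({placeholders}) ORDER BY m.id",
        terms,
    )
    return [body for _id, body in rows]

for text in ["I hate espresso", "I don't take caffeine", "My laptop fan is loud"]:
    save_message(text)

results = broad_search(["beverage", "hot-drinks"])
```

The quality of the search then depends almost entirely on how consistent the generated keywords are, which is what the replies below argue about.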

adastra22 · 2025-09-24T16:31:24 1758731484

That’s going to be incredibly fragile. You could fix it by giving the query term a bunch of different scores, e.g. its caffeine-ness, bitterness, etc. and then doing a likeness search across these many dimensions. That would be much less fragile.

And now you’ve reinvented vector embeddings.
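
A toy version of that idea, assuming three hand-assigned scores per term (the numbers are invented; real embeddings are learned and have far more dimensions):

```python
import math
from typing import Dict, List, Tuple

# Hand-assigned scores per term: (caffeine-ness, bitterness, served hot).
SCORES: Dict[str, Tuple[float, float, float]] = {
    "coffee": (0.9, 0.7, 0.9),
    "espresso": (1.0, 0.9, 1.0),
    "green tea": (0.4, 0.4, 0.9),
    "orange juice": (0.0, 0.1, 0.0),
}

def cosine(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Cosine similarity between two score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(term: str, k: int = 2) -> List[str]:
    """Rank the other terms by similarity to `term`, most similar first."""
    query = SCORES[term]
    others = [t for t in SCORES if t != term]
    ranked = sorted(others, key=lambda t: cosine(query, SCORES[t]), reverse=True)
    return ranked[:k]

closest = nearest("coffee")
```

With these made-up scores, espresso ranks closest to coffee and orange juice furthest, without either word appearing in the other.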

sdesol · 2025-09-24T16:50:44 1758732644

You could instruct the LLM to classify messages with high-level tags, e.g. for coffee, drinks, etc., always include beverage.

Given how fast inference has become and given current supported context window sizes for most SOTA models, I think summarizing and having the LLM decide what is relevant is not that fragile at all for most use cases. This is what I do with my analyzers which I talk about at https://github.com/gitsense/chat/blob/main/packages/chat/wid...

adastra22 · 2025-09-24T16:56:41 1758733001

Inference is not fast by any metric. It is many, MANY orders of magnitude slower than alternatives.

sdesol · 2025-09-24T17:11:50 1758733910

Honestly, Gemini Flash Lite and models on Cerebras are extremely fast. I know what you are saying. If the goal is to get a lot of results where they may or may not be relevant, then yes, it is an order of magnitude slower.

If you take into consideration the post-analysis process, which is what inference is trying to solve, is it an order of magnitude slower?

adastra22 · 2025-09-25T00:09:10 1758758950

More like 6-8 orders of magnitude slower. That’s a very nontrivial difference in performance!

sdesol · 2025-09-25T01:14:08 1758762848

How are you quantifying the speed at which results are reviewed?

adastra22 · 2025-09-25T01:21:57 1758763317

It’s not speed, but cost to compute.

9rx · 2025-09-24T17:12:39 1758733959

It has become fast enough that another call isn't going to overwhelm your pipeline. If you needed this kind of functionality for performance computing perhaps it wouldn't be feasible, but it is being used to feed back into an LLM. The user will never notice.

Noumenon72 · 2025-09-24T16:49:11 1758732551

Your readmes did a great job at answering my question "why is this file called 1.md? What calls this?" when I searched for "1.md". (The answer is 1=user, 2=assistant, and it allows adding other analyzers with the same structure.)

sdesol · 2025-09-24T17:05:42 1758733542

I'm guessing you are referring to https://github.com/gitsense/chat/tree/main/data/analyze or https://github.com/gitsense/chat/tree/main/packages/chat/wid...

The number is actually the order in the chat so 1.md would be the first message, 2.md would be the second and so forth.

If you go to https://chat.gitsense.com and click on the "Load Personal Help Guide" you can see how it is used. Since I want you to be able to chat with the document, I will create a new chat tree and use the directory structure and the 1,2,3... markdown files to determine message order.

Noumenon72 · 2025-09-25T05:00:04 1758776404

https://github.com/gitsense/chat/blob/129210302ec06985bbd103... also says "put a 1.md here and the modular plugin structure will know to call it".

9rx · 2025-09-24T14:16:57 1758723417

If an LLM understands that coffee and espresso are both relevant, like the earlier comment suggests, why wouldn't it understand that it should search for something like `foo LIKE '%coffee%' OR foo LIKE '%espresso%'`?

In fact, this is what ChatGPT came up with:

   SELECT *
   FROM documents
   WHERE text ILIKE '%coffee%'
      OR text ILIKE '%espresso%'
      OR text ILIKE '%latte%'
      OR text ILIKE '%cappuccino%'
      OR text ILIKE '%americano%'
      OR text ILIKE '%mocha%'
      OR text ILIKE '%macchiato%';

(I gave it no direction as to the structure of the DB, but it shouldn't be terribly difficult to adapt to your exact schema)
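
For what it's worth, adapting it could be as small as the sketch below, assuming the LLM hands back a plain list of terms. The `documents` table and `build_like_query` helper are made up for illustration, and SQLite has no ILIKE, so this uses LIKE (which already ignores ASCII case there).

```python
import sqlite3
from typing import List, Tuple

def build_like_query(terms: List[str]) -> Tuple[str, List[str]]:
    """Turn a list of terms into one parameterized OR-of-LIKEs query."""
    if not terms:
        raise ValueError("need at least one search term")
    clauses = " OR ".join("text LIKE ?" for _ in terms)
    sql = f"SELECT text FROM documents WHERE {clauses} ORDER BY id"
    # Terms go in as bound parameters, never spliced into the SQL string.
    params = [f"%{term}%" for term in terms]
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO documents (text) VALUES (?)",
    [("Espresso keeps me up",), ("I love a good latte",), ("Tea only, thanks",)],
)

sql, params = build_like_query(["coffee", "espresso", "latte"])
matches = [row[0] for row in conn.execute(sql, params)]
```

Binding the terms as parameters matters here because they come from model output rather than from trusted code.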

jimbokun · 2025-09-24T14:23:13 1758723793

You are slowly approaching the vector solution.

There are an unlimited number of items to add to your “like” clauses. Vector search allows you to efficiently query for all of them at once.

9rx · 2025-09-24T14:30:16 1758724216

The hand-wavy assertion was that relational database solutions[1] work better in practice.

[1] Despite also somehow supporting MongoDB...

cluckindan · 2025-09-25T11:34:15 1758800055

You could ask an LLM to provide categorizations for nouns and verbs, and store those. For ”I don’t like cappuccino”, you’d get back ”self”, ”human”, etc. for ”I”; ”negation” etc. for ”don’t”; ”preference”, ”trait” etc. for ”like”; ”coffee”, ”hot”, ”drink”, ”beverage” etc. for ”cappuccino”.

It would become unwieldy real fast, though. Easier to get an embedding for the sentence.

mr_toad · 2025-09-24T14:42:28 1758724948

Implementations that use a vector database do not use LLMs to generate queries against those databases. That would be incredibly expensive and slow (and yes there is a certain irony there).

Main advantages of a vector lookup are built-in fuzzy matching and the potential to keep a large amount of documentation in memory for low latency. I can’t see an RDBMS being ideal for either. LLMs are slow enough already, adding a slow document lookup isn’t going to help.

9rx · 2025-09-24T15:03:18 1758726198

The main disadvantage of vector lookup, allegedly, is that it doesn't work as well in practice. Did you, uh, forget to read the thread?

cluckindan · 2025-09-24T22:01:51 1758751311

What does ”doesn’t work as well” mean here? From my experience, vector lookup via HNSW is fast and accurate enough for practical purposes.

9rx · 2025-09-25T09:20:58 1758792058

It doesn't mean anything here. Obviously if you want to understand what the OP meant by it, you'd have to ask at that level. Did you, uh, also forget to read the thread? Jeeze.

muzani · 2025-09-24T23:17:49 1758755869

An actual use case I had for vector DBs was when users were using "credit card", "kredit kad", "kad kredit", "kartu" interchangeably.

If you're matching ("%card%" OR "%kad%"), you'll also match with things like virtual card, debit card, kadar (rates), akad (contract). The more languages you support, the more false hits you get.
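
A minimal sketch of that false-hit problem, using the same words as rows in a throwaway SQLite table (the `phrases` table is made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phrases (id INTEGER PRIMARY KEY, phrase TEXT)")
conn.executemany(
    "INSERT INTO phrases (phrase) VALUES (?)",
    [("credit card",), ("kad kredit",), ("debit card",), ("kadar",), ("akad",)],
)

# Substring patterns for "card" and "kad", as in the comment above.
hits = [
    row[0]
    for row in conn.execute(
        "SELECT phrase FROM phrases "
        "WHERE phrase LIKE '%card%' OR phrase LIKE '%kad%' ORDER BY id"
    )
]

# What the user actually meant: credit cards only.
wanted = {"credit card", "kad kredit"}
false_hits = [phrase for phrase in hits if phrase not in wanted]
```

All five rows match the pattern, and three of them have nothing to do with credit cards.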

Not to say SQL is wrong, but 30-year-old technology works with 30-year-old interfaces. It's not that people didn't imagine this back then. It's just that you end up with interfaces similar to dropdown filters and vending machines. If you're giving the user the flexibility of an LLM, you have to support the full range of inputs.

9rx · 2025-09-25T09:29:15 1758792555

> The more languages you support, the more false hits you get.

Certainly you're at the mercy of what the LLM constructs. But if it understands that, say, "debit card" isn't applicable to "card", it can add a negation filter. As has already been said, you're basically just reinventing a vector database in a 'relational' (that somehow includes MongoDB...) approach anyway.

But what is significant is the claim that it works better. That is a bold claim that deserves a closer look, but I'm not sure how you've added to that closer look by arbitrarily sharing your experience? I guess I've missed what you're trying to say. Everyone and their brother knows how a vector database works by this point.

esafak · 2025-09-24T14:47:37 1758725257

The negation part is a query understanding problem. https://en.wikipedia.org/wiki/Query_understanding

brudgers · 2025-09-23T13:39:16 1758634756

I think the problem being addressed is

   A. Last month user fd8120113 said “I don’t like coffee”
   B. Today they are back for another beverage recommendation

SQL is the place to store the relevant fact about user fd8120113 so that you can retrieve it into the LLM prompt to make a new beverage recommendation, today.

It’s addressing the “how many fucking times do I fucking need to tell you I don’t like fucking coffee” problem, not the word salad problem.

The ggp comment is strawmanning.
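
The store-and-retrieve idea above could be sketched roughly as follows. The `user_facts` table, the `build_prompt` helper, and the date are all made up for illustration; only the user ID and the quoted statement come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_facts (user_id TEXT, fact TEXT, noted_on TEXT)")

# Last month: the user states a preference and it gets stored.
conn.execute(
    "INSERT INTO user_facts VALUES (?, ?, ?)",
    ("fd8120113", "I don't like coffee", "2025-08-23"),
)

def build_prompt(user_id: str, request: str) -> str:
    """Today: pull the stored facts back out and put them ahead of the request."""
    facts = [
        row[0]
        for row in conn.execute(
            "SELECT fact FROM user_facts WHERE user_id = ? ORDER BY noted_on",
            (user_id,),
        )
    ]
    lines = ["Things this user has said before:"]
    lines += [f"- {fact}" for fact in facts]
    lines += ["", request]
    return "\n".join(lines)

prompt = build_prompt("fd8120113", "Recommend me a beverage.")
```

Retrieval here is keyed on the user, not on matching words in today's request, which is why the espresso/coffee wording issue never comes up in this version of the problem.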

shepardrtc · 2025-09-24T15:27:06 1758727626

Right but if the user hates espresso but loves black coffee, how do you properly store that in SQL?

"I hate espresso" "I love coffee"

What if the SQL query only retrieves the first one?

brudgers · 2025-09-24T15:41:29 1758728489

Good queries are hard. Database design is hard. System architecture is hard.

My comment described the problem.

The solution is left as an exercise for the reader.

Keep in mind that people change their minds, misspeak, and use words in peculiar ways.
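
One possible shape for a partial answer, as a sketch only: keep one row per statement with a subject and a timestamp, then read back the latest statement per subject. The `preferences` schema and the sample rows are assumptions for illustration.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE preferences "
    "(user_id TEXT, subject TEXT, sentiment TEXT, stated_at TEXT)"
)
conn.executemany(
    "INSERT INTO preferences VALUES (?, ?, ?, ?)",
    [
        ("fd8120113", "espresso", "hate", "2025-08-01"),
        ("fd8120113", "coffee", "hate", "2025-08-01"),
        # Later the user changes their mind about coffee, not espresso.
        ("fd8120113", "coffee", "love", "2025-09-20"),
    ],
)

def current_preferences(user_id: str) -> List[Tuple[str, str]]:
    """Latest stated sentiment for each subject, one row per subject."""
    rows = conn.execute(
        """
        SELECT p.subject, p.sentiment
        FROM preferences p
        WHERE p.user_id = ?
          AND p.stated_at = (
              SELECT MAX(stated_at) FROM preferences
              WHERE user_id = p.user_id AND subject = p.subject
          )
        ORDER BY p.subject
        """,
        (user_id,),
    )
    return [(subject, sentiment) for subject, sentiment in rows]

prefs = current_preferences("fd8120113")
```

This keeps "hates espresso" and "loves coffee" as separate facts and lets a newer statement override an older one, but it still depends on something deciding what the subject of each statement is.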

mynti · 2025-09-22T08:45:52 1758530752

How does Memori choose what part of past conversations is relevant to the current conversation? Is there some maximum amount of memory it can feasibly handle before it will spam the context with irrelevant "memories"?

datadrivenangel · 2025-09-24T14:49:18 1758725358

Looking at the code, it looks like they do about 5 'memories' that get retrieved by a database query designed by an LLM with this fella:

SYSTEM_PROMPT = """You are a Memory Search Agent responsible for understanding user queries and planning effective memory retrieval strategies.

Your primary functions:
1. *Analyze Query Intent*: Understand what the user is actually looking for
2. *Extract Search Parameters*: Identify key entities, topics, and concepts
3. *Plan Search Strategy*: Recommend the best approach to find relevant memories
4. *Filter Recommendations*: Suggest appropriate filters for category, importance, etc.

*MEMORY CATEGORIES AVAILABLE:*
- *fact*: Factual information, definitions, technical details, specific data points
- *preference*: User preferences, likes/dislikes, settings, personal choices, opinions
- *skill*: Skills, abilities, competencies, learning progress, expertise levels
- *context*: Project context, work environment, current situations, background info
- *rule*: Rules, policies, procedures, guidelines, constraints

*SEARCH STRATEGIES:*
- *keyword_search*: Direct keyword/phrase matching in content
- *entity_search*: Search by specific entities (people, technologies, topics)
- *category_filter*: Filter by memory categories
- *importance_filter*: Filter by importance levels
- *temporal_filter*: Search within specific time ranges
- *semantic_search*: Conceptual/meaning-based search

*QUERY INTERPRETATION GUIDELINES:*
- "What did I learn about X?" → Focus on facts and skills related to X
- "My preferences for Y" → Focus on preference category
- "Rules about Z" → Focus on rule category
- "Recent work on A" → Temporal filter + context/skill categories
- "Important information about B" → Importance filter + keyword search

Be strategic and comprehensive in your search planning."""
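
To give a feel for how a plan like that could turn into a query, here is a hypothetical sketch. `SearchPlan`, `run_plan`, and the `memories` schema are invented for illustration and are not taken from the project's code; only the category names and the limit of about 5 come from the description above.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SearchPlan:
    """Hypothetical shape of what the search agent might return."""
    keywords: List[str] = field(default_factory=list)
    category: Optional[str] = None
    min_importance: float = 0.0
    limit: int = 5

def run_plan(conn: sqlite3.Connection, plan: SearchPlan) -> List[str]:
    """Translate the plan into one parameterized query and run it."""
    where = ["importance >= ?"]
    params: list = [plan.min_importance]
    if plan.category:
        where.append("category = ?")
        params.append(plan.category)
    if plan.keywords:
        likes = " OR ".join("content LIKE ?" for _ in plan.keywords)
        where.append(f"({likes})")
        params.extend(f"%{kw}%" for kw in plan.keywords)
    sql = (
        "SELECT content FROM memories WHERE " + " AND ".join(where)
        + " ORDER BY importance DESC, id LIMIT ?"
    )
    params.append(plan.limit)
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories "
    "(id INTEGER PRIMARY KEY, category TEXT, importance REAL, content TEXT)"
)
conn.executemany(
    "INSERT INTO memories (category, importance, content) VALUES (?, ?, ?)",
    [
        ("preference", 0.9, "Dislikes espresso"),
        ("preference", 0.4, "Prefers dark mode"),
        ("fact", 0.8, "Espresso is brewed under pressure"),
        ("rule", 0.7, "Never suggest caffeine after 6pm"),
    ],
)

prefs_only = run_plan(conn, SearchPlan(["espresso", "caffeine"], category="preference"))
everything = run_plan(conn, SearchPlan(["espresso", "caffeine"]))
```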

gangtao · 2025-09-22T05:49:44 1758520184

Who would've thought that 50 years of 'SELECT * FROM reality' might beat the latest semantic embedding wizardry?

Charon77 · 2025-09-25T01:32:36 1758763956

Have you considered using prolog as a database instead of mysql?

Good ways to store relations, iterating weird combinations, filling the blanks

zvr · 2025-09-25T09:37:36 1758793056

I think Datalog would be even more suitable than Prolog for this use case.

gdestus · 2025-09-25T04:41:58 1758775318

This is exactly the lesson we learned as well but didn't want to publish. Relational data stores are desperately underrated for LLM retrieval, especially concerning things like personality and memory.

muzani · 2025-09-23T11:27:15 1758626835

Any reason I should pick it over Supabase? https://supabase.com/docs/guides/ai

They have pgvector, which has practically all the benefits of postgres (ACID, etc, which may not be in many other vector DBs). If I wanted a keyword search, it works well. If I wanted vector search, that's there too.

I'm not keen on having another layer on top especially when it takes about 15 mins to vibe code a database query - there's all kinds of problems with abstracted layers and it's not a particularly complex bit of code.

koakuma-chan · 2025-09-24T14:10:33 1758723033

> multi-agent memory engine that gives your AI agents human-like memory

What does this do exactly?

brainless · 2025-09-24T15:06:49 1758726409

I tried a graph based approach in my previous product (1). I am on a new product now and I came back to SQLite. Initially it was because I just wanted a simple DB to enable creating cross-platform desktop apps.

I realized LLMs are really good at using sqlite3 and SQL statements. So in my current product (2) I am planning to keep all project data in SQLite. I am creating a self-hosted AI coding platform and I debated where to keep project state for LLMs. I thought of JSON/NDJSON files (3) but I am gravitating toward SQLite and figuring out the models at the moment (4).

  1. Previous product with a graph data approach https://github.com/pixlie/PixlieAI
  2. Current product with SQLite for its own and other projects data: https://github.com/brainless/nocodo
  3. Github issue on JSON/NDJSON based data for project state for LLMs: https://github.com/brainless/nocodo/issues/114
  4. Github issue on expanding the SQLite approach: https://github.com/brainless/nocodo/issues/141

Still work in progress, but I am heading toward SQLite for LLM state.

eyeris · 2025-09-25T01:27:33 1758763653

What sort of issues did you run into with a graph based approach?

brainless · 2025-09-25T09:18:31 1758791911

My implementation was custom, on top of RocksDB. I found it hard to ask an LLM to traverse it, while understanding the schema of a SQLite database or making queries to find information is very easy for LLMs. In most cases the schema does not have to be inferred since it is going to be available, and this makes the job easier. The graph approach may work well for many use cases, but if we want to store structured information for LLMs then SQLite is really good.
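
As a small illustration of the "schema is already available" point, SQLite keeps the original CREATE statements in its `sqlite_master` table, so they can be read back and pasted into a prompt. This is a sketch only; the `projects` and `tasks` tables and the `schema_for_prompt` helper are made up.

```python
import sqlite3

def schema_for_prompt(conn: sqlite3.Connection) -> str:
    """Collect the CREATE TABLE statements SQLite stored for this database."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    )
    return "\n\n".join(row[0] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER, title TEXT)"
)

schema = schema_for_prompt(conn)
prompt = "You can query this SQLite database:\n\n" + schema
```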

datadrivenangel · 2025-09-24T14:31:46 1758724306

You gotta refactor the code around the mongodb integration. It's basically duplicating your data access paths.

cmrdporcupine · 2025-09-24T16:33:33 1758731613

The relational model is built on first order / predicate logic. While SQL itself is kind of a dubious and low grade implementation of it, it's not a surprise to me that it would be useful for applications of reasoning and memory about facts generally.

I think a Datalog type dialect would be more appropriate, myself. Maybe something like what RelationalAI has implemented.

w10-1 · 2025-09-24T20:55:34 1758747334

> Datalog type dialect would be more appropriate

I assume because datalog is more about composing queries from assertions/constraints on the data?

Nicely, queries can be recursive without having to create views or CTE's (common table expressions).

Often the data for datalog is modeled as fact databases (i.e., different tables are decomposed into a common table of key+record+value).
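
A toy illustration of that fact-table style, in plain Python rather than a real Datalog engine: facts are (subject, relation, object) triples, and the single rule `is_a(X, Z) :- is_a(X, Y), is_a(Y, Z)` is applied repeatedly until nothing new is derived (naive evaluation). The facts and the `close_is_a` name are made up for illustration.

```python
from typing import Set, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object)

facts: Set[Fact] = {
    ("espresso", "is_a", "coffee"),
    ("coffee", "is_a", "beverage"),
    ("fd8120113", "dislikes", "coffee"),
}

def close_is_a(base: Set[Fact]) -> Set[Fact]:
    """Apply is_a(X, Z) :- is_a(X, Y), is_a(Y, Z) until a fixpoint is reached."""
    closed = set(base)
    while True:
        derived = {
            (x, "is_a", z)
            for (x, r1, y) in closed if r1 == "is_a"
            for (y2, r2, z) in closed if r2 == "is_a" and y2 == y
        }
        if derived <= closed:  # nothing new was derived, so stop
            return closed
        closed |= derived

closed_facts = close_is_a(facts)
```

The recursion lives in the rule itself; there is no view or CTE to write, which is the convenience being described.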

So I could see training an LLM to recognize relevant entity features and constraints to feed back into the memory query. Less obviously, data analytics might feed into prevalence/relevance at inference time.

So agreed: It might be better as an experiment to start with a simple data model and teachable (but powerful) querying than the full generality of SQL and relational data.

Is that what RelationalAI has done? Their marketecture blurbs specifically mention graph data (no), rule-based inference (yes? backwards or forwards?)

As an aside, their rules description defies deconstruction:

    bringing knowledge and semantics closer to your data, 
    reduce your code footprint by 10x, 
    improve accuracy, and 
    drive consistency and reusability across your organizations 
    with common business models understood by all

So: rules built on ontologies?

cmrdporcupine · 2025-09-24T21:56:46 1758751006

RelationalAI effectively has a kind of datalog as a commercial product, and it runs inside Snowflake (something they implemented since I worked there). It's marketed as a "graph" database but they mean by that that they have modeled graphs as binary relational data, really. It's a purely relational system, with a friendly query language ("Rel") which is vaguely Datalogish, but a bit more flexible.

The key thing with them is it's designed for querying very large cloud backed datasets, high volumes of connected data. So maybe it's not as relevant here as I originally suggested.

Re: marketing ... much of their marketing has shifted over the last two years to emphasizing the fact that it's a plugin thing for Snowflake, which wasn't their original MO.

(There's a CMU DB talk they did some years ago that I thought was pretty brilliant and made me want to work there)

My proposal about a datalog (or similar more high level declarative relational-model system) being useful here has to do with how it shifts the focus to logical propositions/rules and handles transitive joins etc naturally. It's a place an LLM could shove "facts" and "rules" it finds along the way, and then the system could join to find relationships.

You can do this in SQL these days, but it isn't as natural or intuitive.
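
For comparison, a minimal sketch of the SQL route using a recursive CTE in SQLite (the `is_a` table, its rows, and the `kinds_of` helper are made up for illustration):

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE is_a (child TEXT, parent TEXT)")
conn.executemany(
    "INSERT INTO is_a VALUES (?, ?)",
    [
        ("espresso", "coffee"),
        ("latte", "coffee"),
        ("coffee", "beverage"),
        ("tea", "beverage"),
    ],
)

def kinds_of(category: str) -> List[str]:
    """Everything that is transitively a kind of `category`."""
    rows = conn.execute(
        """
        WITH RECURSIVE kinds(name) AS (
            SELECT child FROM is_a WHERE parent = ?
            UNION
            SELECT is_a.child FROM is_a JOIN kinds ON is_a.parent = kinds.name
        )
        SELECT name FROM kinds ORDER BY name
        """,
        (category,),
    )
    return [row[0] for row in rows]

beverages = kinds_of("beverage")
coffees = kinds_of("coffee")
```

It works, but the transitive join has to be spelled out per query, where a Datalog rule would state it once.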

alpinesol · 2025-09-24T16:39:30 1758731970

Using an obscure derivative of an obscure academic language (prolog) is never appropriate outside of a university.

cpursley · 2025-09-24T14:46:37 1758725197

Postgres Is Enough:

https://news.ycombinator.com/item?id=39273954

https://gist.github.com/cpursley/c8fb81fe8a7e5df038158bdfe0f...

refset · 2025-09-24T15:20:32 1758727232

> pg_memories revolutionized our AI's ability to remember things. Before, we were using... well, also a database, but this one has better marketing.

https://pg-memories.netlify.app/

spacebacon · 2025-09-23T06:55:44 1758610544

SELECT 'Hacked!' AS result FROM Gibson_AI WHERE memory='SQL' AND NOT EXISTS ( SELECT 1 FROM vector_graph_hype WHERE recall > ( SELECT speed FROM relational_magic WHERE tech='50_years_old' ) )

matchagaucho · 2025-09-24T15:15:23 1758726923

As context window sizes increase and token prices go down, it makes more sense to inject dynamic memories into context (and use RAG/vector stores for knowledge retrieval).

vivzkestrel · 2025-09-24T17:37:17 1758735437

How does it compare to pgvector?

codersfocus · 2025-09-24T17:19:48 1758734388

So HN is upvoting AI written ad slop now?

paool · 2025-09-24T19:43:52 1758743032

Saw this same "product" astroturfed on Reddit.

morkalork · 2025-09-24T14:35:37 1758724537

IMHO all these approaches are hacks on top of existing systems. The real solution is going to be when foundational models are given a mechanism that makes them capable of storing and retrieving their own internal representation of concepts/ideas.

mr_toad · 2025-09-24T14:47:54 1758725274

Neural networks already have their own internal knowledge representations. They just aren’t capable of learning new knowledge (without expensive re-training or fine-tuning).

Inference is cheap, training is expensive. It’s a really difficult problem, but one that will probably need to be solved to approach true intelligence.

dotancohen · 2025-09-24T23:58:03 1758758283

Where does fine-tuning sit in this? How easily are existing models able to be fine-tuned for new use cases, such as specifically legal or medical texts?

morkalork · 2025-09-24T14:59:40 1758725980

In the way that they're trained to complete tasks from users, can they be trained to complete tasks that require usage of a memory storage and retrieval mechanism?

3rdSon_ · 2025-09-22T15:32:32 1758555152

[flagged]

rl3 · 2025-09-23T18:52:09 1758653529

The only other comment from this account is in a thread consisting entirely of 1-karma shill accounts which all posted comments devoid of substance.

https://news.ycombinator.com/item?id=45274440

Xmd5a · 2025-09-22T10:55:15 1758538515

>It wasn’t broken logic, it was missing memory.

sigh

cdaringe · 2025-09-22T21:45:49 1758577549

Go on.