
Just one question for the author: why does this system feel more like a relational system than a graph database system? You ask users to define a schema (which I think Neo4j doesn't do), and I think the concept of the rel table I found in your blog is not present in any other graph DB.



I think it's a mistake for any DBMS to not support a schema. I bet this hurts Neo4j a lot, and there is no question that this is a mistake. In fact, some GDBMSs, including Kùzu and TigerGraph, support a schema. I think Memgraph does too, though I might be wrong. A schema allows systems to do many core computations efficiently, most importantly scans of data, which are the backbone operation of DBMSs. In fact, because of this every semi-structured graph model historically has been extended with a schema.

In practice, if you want DBMSs to be performant, you need to structure your data. It's one thing to optionally support a semi-structured model, which is great, for example, when building initial proofs of concept and you want to develop something quickly. It's another thing to not support putting a structure on the data, which you'll want when you finally take your application to production and care about performance.


I realized I forgot to complete my sentence here: "...because of this every semi-structured graph model historically has been extended with a schema." Examples include XML and RDF. More relevant to this discussion: there is an ongoing effort to define a graph schema in GQL, which is an industry-academia effort including, I think, all major players: Neo, Tiger, Oracle, etc. (https://www.gqlstandards.org/home).

You can search for this on the link: "GQL will incorporate this prior work, as part of an expanded set of features including regular path queries, graph compositional queries (enabling views) and schema support."


A schema isn’t only for performance, but also for constraints. In Neo4j, people often add redundant data for performance (for example, if they often need the number of nodes up to 3 hops away), or they want a certain type of link not to have cycles, but there’s little to no help from the system in keeping those constraints.

Whenever I hear the claim that any “noSQL” store is so much simpler than a SQL database, I mention performance (efficient storage and querying) and constraints and then say “give it 50 years”.
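
To make the "no cycles on a certain type of link" constraint concrete, here is a minimal sketch of what an application ends up maintaining by hand when the store does not enforce it. The `AcyclicLinks` class and its method names are hypothetical, not part of Neo4j or any other graph database API; it just keeps an in-memory adjacency map and rejects any new link that would close a cycle.

```python
from collections import defaultdict


class AcyclicLinks:
    """Hypothetical application-side guard for one link type that must stay acyclic."""

    def __init__(self):
        # Adjacency map: node -> set of nodes it links to.
        self.out = defaultdict(set)

    def _reachable(self, start, target):
        """Depth-first search: is `target` reachable from `start` via existing links?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            # .get avoids creating empty entries in the defaultdict while searching.
            stack.extend(self.out.get(node, ()))
        return False

    def add_link(self, src, dst):
        """Add src -> dst, refusing the link if it would create a cycle."""
        # A new link src -> dst closes a cycle exactly when src is already
        # reachable from dst (this also rejects self-loops).
        if self._reachable(dst, src):
            raise ValueError(f"link {src!r} -> {dst!r} would create a cycle")
        self.out[src].add(dst)


links = AcyclicLinks()
links.add_link("a", "b")
links.add_link("b", "c")
try:
    links.add_link("c", "a")
except ValueError as err:
    print("rejected:", err)
```

Every writer has to go through a check like this, and nothing stops a query that bypasses it, which is the kind of help a declared constraint would provide.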


The lack of a schema does hurt Neo4j performance. Properties are stored "willy nilly" on a linked list of bytes per node/relationship. There is no order, and an "age" property can be any of: 45, 38.5, "adult", [18,19], false... That makes a terrible mess when aggregating, sorting, filtering, searching, etc.
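
A small Python sketch of that mess, using the example "age" values from above. This is only an illustration in plain Python, not how Neo4j stores or processes properties, and the helper names `try_sort` and `sum_numeric` are made up for the example.

```python
# The "age" values from the comment above, as they might appear across nodes
# when no schema constrains the property type.
schemaless_ages = [45, 38.5, "adult", [18, 19], False]


def try_sort(values):
    """Return the sorted list, or None if the values are not mutually comparable."""
    try:
        return sorted(values)
    except TypeError:
        return None


def sum_numeric(values):
    """Sum only real numbers, checking each value's type at run time."""
    total = 0.0
    for v in values:
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(v, (int, float)) and not isinstance(v, bool):
            total += v
    return total


# With a declared integer type, every value is known to be an int up front,
# so sorting and aggregation need no per-value checks.
typed_ages = [45, 38, 18, 19]

print(try_sort(schemaless_ages))     # None: str and list do not compare with numbers
print(sum_numeric(schemaless_ages))  # 83.5, after silently skipping three values
print(try_sort(typed_ages))          # [18, 19, 38, 45]
print(sum(typed_ages))               # 120
```

The untyped path has to inspect every value and decide what to do with the ones that do not fit, while the typed path can run straight through the data.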



