
NoSQL is driven more by scaling issues than anything else. Joins and strong consistency are awesome but run head first into the CAP theorem and other concerns like single node performance on a sharded cluster.

There is also the fact that no programming language lets you deal with relational data sanely in code, so you have the well-known impedance mismatch headache. All popular languages I've seen offer hierarchical (maps of maps etc.) and more primitive structures only.




Joins really have nothing to do with the CAP theorem, except insofar as multi-key read/write transactions do. Consistent multi-key read-only transactions and write-only transactions are actually not terribly difficult to do under serializability across partitions. Additionally, from a sheer performance and data perspective, centralized relational databases work just fine for real workloads.

Unfortunately, almost every real-world app needs multi-key read/write transactions, and modern businesses expect uptime that is unrealistic for a centralized system, so people are forced to replicate across datacenters.

Ultimately, what most people end up doing (regardless of database) is partitioning into key groups small enough that they can afford the high latency cost of staying highly available across multiple datacenters. They keep read/write transactions consistent within each group and give up on consistency across partitions for such operations (while usually keeping it for read-only and write-only transactions). NoSQL can sometimes give you a clearer understanding of the cost/consistency tradeoff you're making, and hierarchical keys can make it much easier to partition, but joins hardly enter into it.
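As a rough illustration of why hierarchical keys make partitioning easier, here is a minimal Python sketch. The key layout (`"user42/orders/7"`), the helper names, and the hash-based placement are all assumptions made for the example, not any particular database's scheme. The idea is only that the top-level key component names the group, so everything under it lands on one partition and a transaction can be checked for staying inside a single group.

```python
import hashlib
from typing import Iterable


def key_group(key: str) -> str:
    """Return the top-level component of a hierarchical key (assumed '/'-separated)."""
    return key.split("/", 1)[0]


def partition_for(key: str, num_partitions: int) -> int:
    """Place a key by hashing only its group, so a whole group shares one partition."""
    digest = hashlib.sha256(key_group(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def is_single_group(keys: Iterable[str]) -> bool:
    """True if every key belongs to the same group, i.e. the transaction
    could be kept consistent without coordinating across partitions."""
    return len({key_group(k) for k in keys}) == 1
```

With this layout, `"user42/profile"` and `"user42/orders/7"` always map to the same partition, while a transaction touching both `user42/...` and `user7/...` is exactly the cross-group case where consistency is given up.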


> Joins and strong consistency are awesome but run head first into the CAP theorem and other concerns like single node performance on a sharded cluster.

The CAP theorem doesn't force you to give up consistency within a single node. NoSQL databases often do, though.

> All popular languages I've seen offer hierarchical (maps of maps etc.) and more primitive structures only.

So, instead of fixing programming languages, let's cripple databases?


I meant to suggest instead that we fix languages, and now that I think of it, .NET's LINQ is not bad.


LINQ makes querying somewhat more pleasant, but it doesn't fix the real problem: relations are inexpressible as first-class values (not the same thing as first-class objects!) in most object-oriented languages.
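To make "relations as first-class values" a bit more concrete, here is one possible reading sketched in Python: a relation is an immutable set of rows, compared by content rather than identity, and combined with ordinary functions such as a natural join. The representation and the names `relation` and `natural_join` are assumptions for illustration; attribute values are assumed to be hashable.

```python
from typing import Any, Dict, FrozenSet, Tuple

# A row is an immutable set of (attribute, value) pairs;
# a relation is an immutable set of rows.
Row = FrozenSet[Tuple[str, Any]]
Relation = FrozenSet[Row]


def relation(*rows: Dict[str, Any]) -> Relation:
    """Build a relation value from plain dicts (duplicates collapse, order is irrelevant)."""
    return frozenset(frozenset(r.items()) for r in rows)


def natural_join(left: Relation, right: Relation) -> Relation:
    """Combine rows that agree on every attribute they share."""
    out = set()
    for lrow in left:
        ld = dict(lrow)
        for rrow in right:
            rd = dict(rrow)
            shared = ld.keys() & rd.keys()
            if all(ld[k] == rd[k] for k in shared):
                out.add(frozenset({**ld, **rd}.items()))
    return frozenset(out)


authors = relation({"author_id": 1, "name": "Codd"}, {"author_id": 2, "name": "Date"})
papers = relation({"author_id": 1, "title": "A Relational Model"})
joined = natural_join(authors, papers)
```

Here `joined` is itself a relation that can be passed around, compared with `==`, or joined again, which is the property a query-only facility does not give you by itself.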


Mongo doesn't solve the impedance mismatch either. If you have two classes with a many-to-many relationship, you still need to think about it when you model your data, and you get no guarantee of consistency. MongoDB barely makes queries easier to write, and it lacks the power of SQL. How would you persist a "recursive" model with MongoDB while still being able to run aggregation operations on the collection, like counting the number of models? If you use a single collection for all the models, you then have to load all of them into the application and count them in code, where a simple COUNT would do the trick. With Mongo you're constantly trying to reinvent SQL in code. SQL, on the other hand, lets you write recursive queries for tree-like structures.

I understand the trade-offs needed in order to scale, but you can give up on joins WHEN it's time to scale. Mongo doesn't give you that choice in the first place.
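For what it's worth, the COUNT and recursive-query points can be sketched with Python's built-in sqlite3 module. The `node` table, its columns, and the sample rows are made up for the example; the recursive part uses a standard `WITH RECURSIVE` common table expression, which SQLite supports.

```python
import sqlite3

# Hypothetical self-referencing table: each row points at its parent, forming a tree.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    "id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES node(id), "
    "name TEXT)"
)
conn.executemany(
    "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "a"),
        (3, 1, "b"),
        (4, 2, "a1"),
        (5, 4, "a1x"),
    ],
)


def count_all(conn: sqlite3.Connection) -> int:
    """Count every node in the database without loading rows into the application."""
    return conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]


def subtree_names(conn: sqlite3.Connection, root_id: int) -> list:
    """Return the names of root_id and all its descendants via a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, name) AS (
            SELECT id, name FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.name
            FROM node AS n
            JOIN subtree AS s ON n.parent_id = s.id
        )
        SELECT name FROM subtree ORDER BY id
        """,
        (root_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

With the sample rows above, `count_all(conn)` returns 5 and `subtree_names(conn, 2)` returns `["a", "a1", "a1x"]`, both computed entirely inside the database.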


SQL is a relatively popular programming language that lets you deal quite sanely with relational data.

Then there's Prolog, Kanren, Mercury… which are admittedly not especially popular.

The so-called “impedance mismatch” comes mostly from people who don't understand all that SQL can do and inevitably end up replicating it in some other language.


> There is also the fact that no programming language lets you deal with relational data sanely in code, so you have the well-known impedance mismatch headache.

The "Object-Relational impedance mismatch" is not a result of the supposed fact that "no programming language lets you deal with relational data sanely in code"; it's a result of the fact that the industrially popular object-oriented languages of the time when the term was coined did not align well with the data model supported by then-existing relational databases, and vice versa.


The thing is, for the majority of deployments out there, developers don't even know what the CAP theorem is, nor do they need the theoretical performance advantage NoSQL databases have over an RDBMS in its default configuration.

Just like all those big data deployments that can fit on a USB key and be processed by plain UNIX tools.


I think NoSQL is mainly driven by the fact that Mongo became the de facto standard in Node-land (e.g., MEAN).



