You Might Be A Data Radical

emin_gun_sirer · on March 8, 2013

I'm the "radical" author of this post. It expands on this other post (http://hackingdistributed.com/2013/03/07/partition-tolerance...), which busts some commonly propagated myths about partition tolerance in data stores.

But the part that I really wanted to emphasize, and that I worry might get lost in the surrounding discussion about partition-obliviousness, is the following:

  NoSQL has the potential to supplant RDBMSs and capture the  
  bulk of the database market. The "daddy knows best" 
  attitude that RDBMSs bring to data management, the way they
  strip all information that the developer had about her data 
  and force her to write a declarative specification of what 
  she wants, only to try to then come up with an efficient
  evaluation plan in the query optimizer, all reflect a 
  klunky aesthetic to system design that the lean and mean 
  NoSQL movement can and will supplant. NoSQL is to RDBMSs 
  what Unix was to Multics.

My group is working on bridging the gap between NoSQL and traditional RDBMSs (with a second-generation NoSQL store called HyperDex that provides strong guarantees), and so are others. Meanwhile, RDBMSs are trying to retrofit NoSQL features. Overall, there is a big revolution afoot in data management, and it's an exciting time for system builders and application developers alike.

nissimk · on March 8, 2013

There are two primary categories of applications that were traditionally developed on SQL platforms: operational and analytical. Operational applications want the ACID and transaction features of the RDBMS, but analytical applications benefit tremendously from the SQL language itself. If there is a question that my user wants me to answer and I can answer it by writing a SQL query, but in the NoSQL world I have to write a program, I just lost one of my super powers. SQL gives me the ability to rapidly answer complex questions about my data. Please recognize that ACID and transactions are not your only hurdles to successfully competing with RDBMSs.

emin_gun_sirer · on March 8, 2013

Absolutely spot on. But there is no reason why the translation from the user's query into a series of data accesses cannot happen in the front end. In many ways, NoSQL is a reaction to the poor job RDBMSs were doing in the front end. Everyone has seen the cases where AWK programs on desktops could outperform expensive Oracle servers, and that's mostly because they enable the user to put to use what they know of their specific data and their access patterns.

So, in between "writing a program for each query" (which I agree is a non-starter in some settings) and "SQL is the one and only interface" lies an exciting space of opportunities.
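
To make the "translation in the front end" idea concrete, here is a minimal sketch in Python. It is not how HyperDex or any particular store works: `KVStore`, `lookup`, and `run_query` are hypothetical names invented for illustration, and the "query" is just a dict of equality conditions rather than SQL. The front end turns that declarative filter into index lookups followed by plain gets.

```python
class KVStore:
    """Hypothetical in-memory key-value store with a secondary index.

    Assumes each key is written once; overwrites would leave stale
    index entries in this simplified sketch.
    """

    def __init__(self):
        self.rows = {}    # key -> row (a dict of field -> value)
        self.index = {}   # (field, value) -> set of keys

    def put(self, key, row):
        self.rows[key] = row
        for field, value in row.items():
            self.index.setdefault((field, value), set()).add(key)

    def get(self, key):
        return self.rows.get(key)

    def lookup(self, field, value):
        # Keys of all rows whose `field` equals `value`.
        return sorted(self.index.get((field, value), ()))


def run_query(store, where):
    """Front end: translate {'field': value, ...} into data accesses."""
    if not where:
        keys = sorted(store.rows)  # no filter: full scan
    else:
        # One index lookup per condition, then intersect the key sets.
        key_sets = [set(store.lookup(f, v)) for f, v in where.items()]
        keys = sorted(set.intersection(*key_sets))
    return [store.get(k) for k in keys]


store = KVStore()
store.put("u1", {"name": "ada", "city": "Ithaca", "plan": "free"})
store.put("u2", {"name": "bob", "city": "Ithaca", "plan": "pro"})
store.put("u3", {"name": "cy", "city": "Boston", "plan": "pro"})

ithaca_pro = run_query(store, {"city": "Ithaca", "plan": "pro"})
```

The user still states what they want declaratively; the difference is that the access plan (which index to hit, in what order) lives in a thin layer the application developer can see and change.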

derefr · on March 9, 2013

I don't see why there's a competition. SQL is great for analytics. NoSQL is great for operations. When you want to do analytics on the data from your operations--which are presumably Big Data and far too much to pack into a single node--just query your operations data using statistical sampling methods to generate a representative data set, stuff that "analytic snapshot" into any old RDBMS, and query it to your heart's content.

(And if they're not Big Data--well, then, the solution is even more obvious. :)
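
A rough sketch of that workflow in Python, under some loud assumptions: `fake_operational_scan` is a hypothetical stand-in for scanning whatever NoSQL store you actually run, the `orders` schema is made up, and uniform reservoir sampling (Algorithm R) is just one possible choice of "statistical sampling method". SQLite from the standard library plays the part of "any old RDBMS".

```python
import random
import sqlite3


def reservoir_sample(records, k, seed=0):
    """Uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = rec
    return sample


def fake_operational_scan(n):
    """Hypothetical stand-in for a full scan of the operational store."""
    regions = ("asia", "eu", "us")
    for i in range(n):
        yield (i, regions[i % 3], float(i % 100))


def load_snapshot(rows):
    """Stuff the sampled rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (user_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


snapshot = reservoir_sample(fake_operational_scan(100_000), k=1_000)
conn = load_snapshot(snapshot)

# Ad hoc analytics on the snapshot, in plain SQL.
per_region = conn.execute(
    "SELECT region, COUNT(*), AVG(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()
```

Results computed on the snapshot are estimates of the full data set, so anything that depends on rare rows or exact counts would still need a pass over the operational store itself.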

derleth · on March 8, 2013

> NoSQL is to RDBMSs what Unix was to Multics.

A simplified version that will eventually have to re-invent a lot of what it stripped away to become so much simpler?

emin_gun_sirer · on March 8, 2013

:-) Maybe. Let's not forget the part where the new thing, with its limber, simpler architecture, completely displaces the old klunker and provides an extensible, easily customized platform to its users.