Sqlite and couchdb creators announce unQL. SQL for NoSQL (unqlspec.org)
99 points by buster on July 29, 2011 | 65 comments



As I have said multiple times, in my humble opinion this is a completely useless effort. There are two cases:

1) Two databases are semantically very similar. In such a case, even if the query language is different, porting your application from one to the other is trivial, especially if you have a minimal layer of abstraction between your program and your data, things like fetchUser(ID) or the like.

2) If the data model is different, using the same query language does not help at all.

So the question is WHY? :)
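To make case 1 concrete, here is a minimal sketch of such a thin layer in Python. Every name in it (`UserStore`, `SqliteUserStore`, `DictUserStore`, `fetch_user`, `greeting`) is hypothetical, and the dict-backed class is only a stand-in for a key-value or document store, not a real client.

```python
import sqlite3
from typing import Optional, Protocol


class UserStore(Protocol):
    """Hypothetical interface: the application only ever calls fetch_user."""

    def fetch_user(self, user_id: int) -> Optional[dict]: ...


class SqliteUserStore:
    """Relational backend: one row per user."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

    def add_user(self, user_id: int, name: str) -> None:
        self.conn.execute(
            "INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name)
        )

    def fetch_user(self, user_id: int) -> Optional[dict]:
        row = self.conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


class DictUserStore:
    """Stand-in for a key-value or document store: one dict entry per user."""

    def __init__(self) -> None:
        self.docs: dict = {}

    def add_user(self, user_id: int, name: str) -> None:
        self.docs[user_id] = {"id": user_id, "name": name}

    def fetch_user(self, user_id: int) -> Optional[dict]:
        return self.docs.get(user_id)


def greeting(store: UserStore, user_id: int) -> str:
    """Application code: unaware of which backend sits behind the store."""
    user = store.fetch_user(user_id)
    return "unknown user" if user is None else f"hello, {user['name']}"
```

With a layer like this, swapping backends means writing one new class; `greeting` and the rest of the application stay untouched, whatever query language the store speaks.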


A gazillion already-made programs (for example data visualization tools, analytics software, etc.) speak only SQL. If the choice is between reinventing the wheel for each NoSQL DB vs. using a SQL DB (such as http://luciddb.org/ [1] and soon postgres with their recent introduction of foreign data wrappers[2]) as an intermediate layer or even a full data mart that does what you want query-wise, a lot of people want to save time and just use the SQL DB. (Why do you think Hive exists?) There's more to DBs than storing user data for cat picture sharing sites...

[1] Disclaimer: I wrote a preliminary CouchDB connector for Lucid that'll get released separately with the upcoming 0.9.4 release.

[2] I believe one of the first postgres foreign data wrappers was around Twitter. So you move some API accessing code from an application layer that realistically only your application can talk to, to the DB layer, and now everything that can talk to the DB layer can access the data. (Unless they want it raw, then they'll get it from Twitter.)


In the same vein, I have been thinking about a mysql wrapper around various nosql products.

Same issue, installed base and connectivity.

Composition, interfaces, abstraction.


The argument in the article that inspired this work[1] (and it may be an academic one) is that data models are all semantically very similar. The authors argue that category theory provides mathematical formalism for a generic abstraction that would bridge both the relational and K/V worlds.

The ideas all seem very formative to me. There are a bunch of assumptions, logic gaps, etc. But it's not all bad.

[1] http://cacm.acm.org/magazines/2011/4/106584-a-co-relational-...


Obviously the only benefit is for developers who are not familiar with NoSQL technologies. This is a nice attempt to unify the model language. Unfortunately, the additional load generated by the query parsing is a no-go for me.


that is why i think that the "universal query language" should be on a different layer. SPARQL is one answer to this very complex problem. something similar to sparql with map/reduce style distributed algorithms embedded could be an interesting approach.


SPARQL is substantially less powerful than SQL. No delete/update/insert statements, no subselects, and limited aggregation and functions. I've used SPARQL a lot in the past year and I am constantly running into limitations.

Performance has also been very poor (we use Jena SDB). I don't know if poor performance is an implementation problem or an overall weakness.


well there are efforts for updates etc. but yes, you have limitations with sparql, that's why i think it is not the final solution. but it is a first approach to push the possibility to query data to a different layer.


What's the point in adding an abstraction layer above NoSQL datastores when most developers are already interacting with the datastore using another level of abstraction (e.g., an ORM in their language/framework of choice)?


I'm at the CouchConf -- the demo of UnQL was fantastic.

The idea is that you shouldn't be locked into a NoSQL database, and that one language that would work across all NoSQL databases will grow the community. How each database manages their implementation of the spec is up to them.

INSERT into c1 value {"foo":1, "bar":["a","b","c"]};

Now ignore the datastore behind that-- it could be Mongo, Couch, etc etc, but you know that your query won't need to be rewritten in case you decide to switch databases. This can only be a good thing.


In 2009 I wrote in "The Dark Side of NoSQL" (1) about problems with NoSQL:

"1. ad hoc data fixing – either no query language available or no skills 2. ad hoc reporting – either no query language available or no in-house skills"

(1) http://codemonkeyism.com/dark-side-nosql/


I don't know if I like it or not. If you know SQL well and want to switch to a NoSQL database, what's easier to learn? The "proprietary" API of the database (like Redis, or MongoDB) or the limitations of unQL?

I can't speak for other NoSQL databases, but unQL doesn't seem to expose most of Redis' features, like lists & sets.


"If people invested as much in learning to tune MySQL or Postgres as they did in working around MongoDB flaws they wouldn't need MongoDB." ~Benjamin Black


This is false. The writer of THE book on MySQL tuning (Jeremy Zawodny) is also the guy who is converting, or has converted, Craigslist to MongoDB from MySQL.

But let me interrupt your propaganda. We wouldn't want to address the reality that not all data and work sets can fit well in a relational database.

Example: http://blog.zawodny.com/2010/05/22/mongodb-early-impressions...


Calm down a bit, man. I've got a MongoDB mug sitting on my desk. I think you'd agree that even if the data structure fits into a key-value model more so than a relational one, that still doesn't mean you need NoSQL, and if you take a common definition of "big data" to mean "your data needs exceed the capabilities of a single machine", then a lot of people don't need all that scalability. (And if it did, you'd use Hadoop anyway. =P ) (There's also CAP Theorem considerations for added fun. http://en.wikipedia.org/wiki/CAP_theorem )

There was a presentation up here a few months ago on how the guys at http://wordsquared.com/ used MongoDB; they basically made the choice since they knew it already, instead of using postgres with their great geo libraries. And that's fine. What's stupid is when people who know one or the other pretty well spend a lot of time learning about the other for a use case that's most likely not really necessary anyway, or their current choice could handle with tweaks.

Of course, once public CS starts moving forward into innovative big analytics rather than just managing big data storage (such as the theta-join paper I linked elsewhere on this page), things may start shifting in favor of one of the NoSQL systems and the above quote would be equally suitable when comparing the Hadoop ecosystem or Mongo with some fancy new relational DB.


I wish people would stop linking CAP theorem, as if it proves something about one database or the other.

It doesn't.

It expresses some useful things about trade-offs but they aren't necessarily binary properties and it doesn't say anything about the underlying data structures or features of a database.

> then a lot of people don't need all that scalability. (And if it did, you'd use Hadoop anyway. =P

Maybe, not necessarily.


> what's easier to learn? The "proprietary" API of the database (like Redis, or MongoDB) or the limitations of unQL?

That's what bothers me about this, too.

While I can understand the goal of making NoSQL databases more accessible to people who already know SQL, as well as the need to unify the commands across all the different flavors of NoSQL databases, there's something fundamentally flawed about accessing unstructured data via logic intended for structured data.


The data is not necessarily unstructured, it just doesn't have a strictly enforced schema. So I'm very intrigued by this if they're adding it as a layer on top of CouchDB views. If they are, you could use your views to selectively filter on documents with a known structure, and then safely operate across a subset of your docs with UnQL. We'll see where they go with this though.
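A rough sketch of that idea in plain Python, not CouchDB or UnQL code: a predicate plays the role of the view and keeps only documents with a known structure, and a hypothetical `select` helper then queries that subset without defensive checks. The sample documents and both function names are assumptions made up for illustration.

```python
docs = [
    {"type": "user", "name": "ann", "age": 31},
    {"type": "user", "name": "bob"},           # no "age" field
    {"type": "post", "title": "hello"},        # different kind of document
    {"type": "user", "name": "cy", "age": 25},
]


def view_users_with_age(doc):
    """Plays the role of a view: keep only docs with the expected fields."""
    return doc.get("type") == "user" and isinstance(doc.get("age"), int)


def select(rows, where, fields):
    """Hypothetical SELECT-like helper over rows known to share a structure."""
    return [{f: r[f] for f in fields} for r in rows if where(r)]


# Step 1: the "view" narrows the documents to a known shape.
users = [d for d in docs if view_users_with_age(d)]

# Step 2: the query can now touch "age" and "name" without guarding for gaps.
over_30 = select(users, where=lambda r: r["age"] > 30, fields=["name"])
```

Running the query directly against `docs` would raise a `KeyError` on the documents that lack `age`, which is the situation the filtering step is meant to avoid.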


To clarify for the downvoters, I'm talking about this in the context of CouchDB, which I believe was a fair assumption to make seeing as Damien Katz is one of the principal people involved and it was announced today at CouchConf.

In CouchDB, running a filter on top of view results is something you can only do in list functions or client side, so I am very curious to see how they incorporate this into CouchDB.


The data is not unstructured, though. It is in JSON format which can be taken advantage of.


I think a good programmer should aim to know both SQL & NoSQL.

The backend should be chosen because of the data characteristics, not because of someone's experience.


I'm kind of wondering, what next? An optimizer?

I think there's a lack of clarity on what's being reinvented here, why, and where this is all headed. And when we get there, is it really something fundamentally new, or is it something that existing SQL products can absorb along the way?


I know SQL moderately well, but does anyone share my view that moving AWAY from using SQL syntax is a GOOD thing? I personally really dislike writing SQL, and I'm sure lots of other people do too; that's why people write query builders etc.


Actually I find SQL a great thing, and one thing that bothers me with all those "other" databases is that you have to start from scratch with every one of them. A common query language would be awesome!


You might be interested in a project I've been working on lately: http://htsql.org/ -- it's a high-level query language that compiles into SQL. The syntax was motivated by URLs and XPath; the semantics are based on a navigational model. As opposed to many other non-SQL query languages, it supports analytical queries including aggregates and projections.


I'm now a user of both MongoDB and SQL-based ORMs, and I find that using a NoSQL-type syntax makes it much easier to work with data programmatically, i.e., turn a UI query into a db query.

But as a human, say when I'm using the Mongo client, I miss using the much more readable SQL. Typing JSON queries can be quite painful. And driving it through a map-reduce JavaScript function just to summarize a column is horrendous.
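To illustrate the contrast, here is a small Python sketch that totals one column twice: once with a SQL aggregate (via the standard `sqlite3` module) and once in map/reduce style. The `orders` data and the `map_fn`/`reduce_fn` names are made up for the example; this is not MongoDB's actual map-reduce API, only the same shape of computation.

```python
import sqlite3
from functools import reduce

orders = [
    {"customer": "ann", "amount": 10},
    {"customer": "bob", "amount": 5},
    {"customer": "ann", "amount": 7},
]

# SQL: one declarative statement does the summarizing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (:customer, :amount)", orders)
sql_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]


# Map/reduce style: emit one value per document, then fold the values together.
def map_fn(doc):
    return doc["amount"]


def reduce_fn(acc, value):
    return acc + value


mr_total = reduce(reduce_fn, map(map_fn, orders), 0)
```

Both totals agree; the difference is that the map/reduce version makes you write two functions for what SQL expresses as `SUM(amount)`.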


Title is misleading. NoSQL is an umbrella term for key-value, document, object, graph, column, etc. databases. The APIs these provide are probably using the paradigm most natural to the habitat the data already lives in. Is this "no SQL for NoSQL" really a problem that needs solving, or merely a flaky abstraction that's super interesting to implement?

Unql seems to only focus on json based document stores so it might actually make a decent abstraction.

Too easy to dismiss as impractical, and I hope these guys pull off something amazing, if only for JSON document stores.


The title doesn't state that it's for every NoSQL database.


We've had great success using ElasticSearch to index our CouchDB databases in near real time into its Lucene indexes, and then querying those.

It allows each tool to focus on what it does well.


Looking into something very similar right now. What's your tradeoff been in terms of, for example, disk taken up by Elastic for indexing? How do you find Elastic in terms of how well it keeps up with indexing the data recently inserted into couch?


I would be greatly interested in finding out more... what problems did you run into? Where did you spend most of the time?


Uncle or Uncool. The hipster query language.


It can't get much simpler than this can it?

Redis: redis.set "foo", "bar"; redis.get "foo"

Mongo: db.stuff.save({json: "object"}); db.stuff.find({json: "object"})


It will get simpler when your statement looks like:

Redis, Mongo, <insert your own zoo> ... : db.set(), db.find()
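A minimal sketch of what such a shared `set()`/`find()` surface could look like, in Python. `FakeRedis` and `FakeMongoCollection` are in-memory stand-ins written for this example, not the real client libraries, and the two adapter classes are hypothetical.

```python
class FakeRedis:
    """Stand-in with a Redis-like surface: set(key, value) / get(key)."""

    def __init__(self):
        self._kv = {}

    def set(self, key, value):
        self._kv[key] = value

    def get(self, key):
        return self._kv.get(key)


class FakeMongoCollection:
    """Stand-in with a Mongo-like surface: save(doc) / find(query)."""

    def __init__(self):
        self._docs = []

    def save(self, doc):
        self._docs.append(dict(doc))

    def find(self, query):
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in query.items())]


class KeyValueAdapter:
    """Maps the common set/find calls onto the key-value surface."""

    def __init__(self, client):
        self.client = client

    def set(self, key, doc):
        self.client.set(key, doc)

    def find(self, key):
        return self.client.get(key)


class DocumentAdapter:
    """Maps the common set/find calls onto the document surface."""

    def __init__(self, collection):
        self.collection = collection

    def set(self, key, doc):
        self.collection.save({"_id": key, **doc})

    def find(self, key):
        hits = self.collection.find({"_id": key})
        if not hits:
            return None
        # Hide the storage-level "_id" so both adapters return the same shape.
        return {k: v for k, v in hits[0].items() if k != "_id"}
```

Note what the shared surface leaves out: only lookup by key survives. Field queries on the document side and lists or sets on the key-value side fall outside it, which is the lowest-common-denominator concern raised elsewhere in this thread.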


It'll be interesting to see how much this gets adopted. Ad-hoc query languages like SQL are great for operations and reporting, but if your NoSQL store needs to conform to the limitations of the unQL language, aren't you losing that "polyglot persistence" quality of using NoSQL?


It looks like a SQL syntax to access noSQL databases.

I'm not sure if that's a step forward or backward.


SQL is a fantastic set based language. Using SQL to query MongoDB would be about 7 steps forward for just about everything I do, particularly if "SQL" magically included joins.


Also would be cool if everything magically supported the 1-Bucket-Theta algorithm. :) http://www.ccs.neu.edu/home/mirek/papers/2011-SIGMOD-Paralle...


I'd say forward for sure. This breaks down the barriers to entry for a lot of folks, especially in enterprise land. Definitely looking forward to how this develops.


I think it makes a lot of sense for the document databases. It would make much less sense applied to something like Redis though.


Yeah... Redis could probably use a subset of SQL, though, but I don't know if it would be practical.


I think the whole SQL vs NoSQL stuff is a bit misguided.

Relational databases are not dinosaurs. They solve lots of problems. It's the database schema which is a PITA, so let's just have a relational engine (with reasoning, like Prolog) without the schema and move on.

Most people don't need NoSQL for scalability, they choose it because they dislike SQL. So maybe it's SQL (as a syntax) we should ditch, together with the schema thing, not the relational model as it is. In this context, the effort with unQL, while being a remedy for some short-term situation, is actually not so cool in the long term, because it will keep the legacy undying.


In my ignorance I see "...an open query language for JSON..." and I think "Cool. Now I'll be able to store a complex database in plaintext." Am I wrong or just missing the point?

I like the idea of SQLite but it would be nice to not have to jump through hoops to access the databases of programs (I recall having some trouble with Pidgin in the past that I gave up on because I was too much of a newb to know how to interact with SQLite databases).


Don't repeat the same mistakes as OQL[1]?

[1] http://www.odbms.org/ODMG/OG/


Is that pronounced "uncle", or "uncool"?


SQL is pronounced as sequel, so I'd definitely go with uncool. Even though this actually can be quite cool.

Looking forward to explaining to a client that yes, we just need to run this one uncool query to get the report ;-)


SQL is not pronounced as "sequel". The ANSI standard declared that it is pronounced S-Q-L.

http://en.wikipedia.org/wiki/SQL#Standardization


it's pronounced uncle


Yeah, they pronounced it "uncle" at the CouchConf presentation.


While my first thought is that I would rather use native APIs for each datastore, I do use SPARQL and SQL a lot so there is some sense to a standard query language.

The problem is having too abstract of an interface that does not allow easy access to functionality of different datastores.


Anything that makes Couch more powerful is great! I'm interested to see how they integrate this into the viewserver, it seems like it allows a bit more expressibility than the current JS system does.


You could also claim that Sparql is already "a standard query language for NoSQL"

http://decentralyze.com/2010/03/09/rdf-meets-nosql/


unrelated to databases, but does anyone know how they made those charts for the grammar? is there a tool for that (ie something that would generate those charts automatically given a grammar).


Those are called "railroad diagrams". I don't have a specific recommendation, but this should get you started:

http://stackoverflow.com/questions/773371/what-is-a-good-too...


It seems all of the examples are coming soon. Where is my friend SCOTT/TIGER?


This deserves an Xzibit meme.


Yo dawg, I heard you like NoSQL but missed yo SQL so I made UnQL for yo NoSQL.


Eminem, is that you?


Ohh, the irony: NoSql - because SQL is bad and we thought of fixing it! Fast forward some years into the future, current times - NoSql gets its, you guessed it, SQL back.

Can someone help me stop laughing at this, please?

Nevermind, it looks like a waste of time.


Non-relational databases have nothing to do with SQL, they have to do with trade-offs in data structures, consistency, availability, fault-tolerance, and distribution.

You need to understand how ignorant you are, because until you do you're going to have a calcified and incomplete understanding of databases.


Quite the opposite, I clearly recall some people from the NoSQL camp (read non-relational data stores, non SQL enabled) saying that SQL itself is bad and something else is needed.

I should have mentioned this before the zealots kicked in.


> I clearly recall some people from the NoSQL camp (read non-relational data stores, non SQL enabled) saying that SQL itself is bad and something else is needed

No, wrong, irrelevant, who said this?


I know it's wrong, you'll have to kill me to make me stop using SQL.

I wish I still had the article from 2009 - 2010 which said that SQL is too difficult and other things like that.

Anyway, this is what I was talking about when I originally replied.


I'm fond of SQL, you're either intentionally misinterpreting me or I'm not being clear. Non-relational databases are not about not using SQL, to believe so is dense and willfully ignorant.

I asked for specifics, you didn't provide them. I have to assume you're a troll at this point. I'm done here.


Ok, maybe I didn't use the right words or something: There was an article from someone in the NoSQL camp which said that SQL is bad.

I DO and DID understand that the querying language isn't restricted to relational databases.

What I was saying is that I REMEMBER reading something a while ago on the website of a NoSQL store or in an article which said something along the lines of: "SQL is old and should be deprecated anyway, it was time for something new."

THAT was why I was amused.

If you think I was trolling or anything, ok, I'll just stop commenting on this site. It's not worth the trouble at all.

This is my last comment.

edit: I think this was it: http://gigaom.com/cloud/facebook-trapped-in-mysql-fate-worse...



