I don't understand why databases like RethinkDB and MongoDB don't just provide a SQL interface. I get it -- they market themselves as NoSQL, but they are in fact providing almost exactly the same set of features that, say, MySQL provided back in the day: a simple, "fast" subset of SQL. Instead they use the lack of a SQL interface as a marketing gimmick... when in reality we have to learn a new query language... :(
> I don't understand why databases like RethinkDB and MongoDB don't just provide a SQL interface.
Actually, it's a really good question. I'm one of the ReQL designers at Rethink, and I was the one pushing for no SQL compatibility. Here is some of my reasoning (we could talk about this for days, though):
* Even SQL designers would tell you SQL isn't a very good
programming language. It even looks like Cobol! Imagine if
every language after Cobol decided to be backwards compatible
-- what sort of world would we live in?
* SQL is really bad for querying hierarchical data with lots of
empty columns. The quality of experience you get from learning
a new language designed for JSON, far outweighs the downsides
of learning it. We're assuming people will be using ReQL
fifteen years from now.
* Designing a language that embeds into your programming language
wasn't possible before -- but it is now. That means no more SQL
injection, no more string manipulation, no more heavy ORMs (see
the sketch after this list). Well worth it in my opinion.
* The chaining paradigm (largely pioneered and proven by jQuery)
is magical for getting people to intuitively understand how to
write complex queries. No more StackOverflow questions of "How
do I do X in SQL?" because there is now an intuitive
consistency to the language.
* SQL compatibility is very, very difficult. The standard is hard
to implement, and has lots of grey area around the edges. You
can go for full bug-by-bug compatibility, which would take
decades. Or you can go for basic compatibility, which confuses
people. They try to port their application, it works for a
while, and then breaks in some grey area in production, which
results in a terrible user experience.
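To make the embedding and chaining points concrete, here's a minimal sketch using the Python driver (the `posts` table and its fields are invented for illustration):

    import rethinkdb as r

    conn = r.connect('localhost', 28015)

    # the query is ordinary method chaining -- no strings to build,
    # escape, or re-parse, so SQL-style injection can't happen
    top_posts = r.table('posts') \
        .filter(r.row['author'] == 'alice') \
        .order_by(r.desc('num_comments')) \
        .limit(10) \
        .run(conn)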
As an example, I'd ponder why SQL has both the `where` keyword and the `having` keyword. This alone wouldn't be enough of a reason to design a new language, of course, but this sort of thing permeates SQL. It's 2014. We can do better.
I can tell at a glance what `where` and `having` mean - the former is a filter over a result set, and the latter is a filter over a grouping.
You're probably right that SQL should have defined this sort of composition earlier, so that you could sequence groupings and filters more easily, but don't fret: that future is here with common table expressions. Recursion is even supported!
I'm reminded of this abomination of a query I wrote in 2010 that produced a numbers table:
    DECLARE @N int = 1000000;
    WITH RecursiveRowGenerator (Row#, Iteration) AS (
        SELECT 1, 1
        UNION ALL
        SELECT Row# + Iteration, Iteration * 2
        FROM RecursiveRowGenerator
        WHERE Iteration * 2 < CEILING(SQRT(@N))
        UNION ALL
        SELECT Row# + (Iteration * 2), Iteration * 2
        FROM RecursiveRowGenerator
        WHERE Iteration * 2 < CEILING(SQRT(@N))
    ),
    SqrtNRows AS (
        SELECT *
        FROM RecursiveRowGenerator
        UNION ALL
        SELECT 0, 0
    )
    SELECT TOP(@N) 1 + A.Row# * POWER(2, CEILING(LOG(SQRT(@N)) / LOG(2))) + B.Row# AS Row#
    FROM SqrtNRows A, SqrtNRows B
    ORDER BY A.Row#, B.Row#;
Do you consider Rethink relational (in the Codd sense)? From a quick look at the API I would guess that it qualifies.
What I couldn't see at a quick glance was a way to constrain the data in a table or enforce foreign key constraints. I guess those are the trade offs against something like Postgres for schema flexibility and easy distribution. Is that a correct way to view it?
> Do you consider Rethink relational (in the Codd sense)?
Under the hood, yes. How much of that is exposed to the user is debatable.
> What I couldn't see at a quick glance was a way to constrain the data in a table or enforce foreign key constraints. I guess those are the trade offs against something like Postgres for schema flexibility and easy distribution. Is that a correct way to view it?
That's about right. Some of this is by design (lack of schema enforcement), and we might add features to do that later. There are no architectural constraints preventing us from doing that.
Foreign key constraints will probably never be added to Rethink (or at least not for a long, long time) because doing those efficiently in distributed systems is anywhere from hard to impossible.
Good arguments, though only the last one seems like a true objection.
I think, if I were implementing a database from the ground up, I'd indeed make my own query language -- SQL does suck -- but I would probably strive to accept SQL as an option, with the proviso that it's "just SQL syntax with ThisDB's semantics."
This would let people point their old applications, written under "the database is just an SQL-speaking dumb store for simple CRUD operations" assumptions (e.g. any ActiveRecord app), at the new DB and get somewhat-sensible results out, while still letting them write new applications against the new API.
The alternative is forcing any user with existing services to hack up some sort of application-level back-and-forth replication between their old DB and your new DB during the long-and-possibly-indefinite migration period, rather than just migrating in one fell swoop.
Background: I've done a fair bit of hiring for analysts and have worked in most facets of analytics/DW including vendor evaluation/selection and implementation.
I think that the lack of a SQL interface is a big negative. It's really hard to find analysts (not programmers) with good technical skills, or to train them to proficiency in new technologies. I would have a very hard time selling a platform that existing analysts wouldn't be able to use right away and that few people on the market could work with.
Also, designing a new language makes integration with other applications that much harder.
That's a great point. Our hypothesis is that great querying capabilities for analysts and great querying capabilities for programmers should look completely different, and be two totally different technologies. We'll eventually ship a solution for analysts, but I don't think a great database product should shoot for a lowest common denominator.
Although initially it seemed otherwise, your point about a distinct interface avoiding the confusion between complete SQL compatibility and the subset being offered makes a lot of sense. Also, personally I would prefer an imperative language for tasks where efficiency is the primary concern, compared to declarative SQL.
A properly designed language shouldn't have two different keywords for something that does ostensibly the same thing in different contexts. That's a mark of bad language design.
I don't think they are? Your ReQL to me looks like it's applying WHERE num_comments > 7 and then aggregating that.
I mean, regardless, your example SQL should be applying the > 7 against an aggregate (e.g. sum(num_comments)), not a field; that SQL does not work as written.
Whoops, that's a great catch -- it's a bug in the docs. Will fix momentarily. Thanks!! (that is indeed embarrassing)
In ReQL any command you call after `group` runs on each group. So once you've called `group`, you can run anything you could run on a table on each group and that just works.
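For example, here's a sketch in the Python driver (the `posts` table and its fields are hypothetical) of the same `filter` command doing duty both before and after `group`:

    import rethinkdb as r

    # pre-group filter -- plays the role of SQL's WHERE
    r.table('posts').filter(r.row['num_comments'] > 7).group('category').count()

    # post-group filter -- plays the role of HAVING: sum per group,
    # then ungroup and keep only groups whose total exceeds 7
    r.table('posts').group('category').sum('num_comments') \
        .ungroup().filter(r.row['reduction'] > 7)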
So, to be clear here, you have created two fundamental things (both called "filter") -- A pre-group and a post-group filter. Users must still understand the difference and when to utilize them.
SQL just happens to call those WHERE and HAVING instead of "filter" both times.
You're right, but it's not just `filter`. Any command that can run on a group can run on a full table and vice versa. For example:
    # get a sample of 3 elements from a table
    r.table('foo').sample(3)

    # get a sample of 3 elements from each group
    r.table('foo').group('category').sample(3)
You could say that we created two versions of `filter`, and `sample`, and every other command. But another way to say it is that we use polymorphism, which is widely considered an advantage in modern programming languages.
Yeah, it's definitely more expressive and terse; I just don't think your argument against WHERE/HAVING is incredibly strong. Someone could also just use subqueries and only WHERE, and achieve the exact same behavior that you offer (albeit with a ton more typing).
HAVING exists because it was created prior to subqueries/dynamic tables; otherwise we'd likely have just used:
    SELECT *
    FROM (SELECT category, SUM(num_comments) AS comments
          FROM posts
          GROUP BY category) AS temp
    WHERE comments > 7;
I don't think this is a "good language" / "bad language" dichotomy. SQL just sits at a different level of abstraction.
From the examples you have posted, it appears that ReQL is a lower-level abstraction than SQL. In SQL you specify what you want logically, and the DB turns this into a query plan. ReQL appears to be more like a query plan itself, where you explicitly specify the data flow from stage to stage of query evaluation.
As a more specific example of this, it appears that in many cases in ReQL the user specifies which index should be used in the query itself. SQL is more abstract than this; the idea is for the query planner to figure out which index(es) should be used.
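A hedged illustration (the `users` table and `last_name` index are invented): in ReQL the index is named right in the query, whereas SQL leaves that decision to the planner:

    import rethinkdb as r

    # ReQL: the index to use is part of the query itself
    r.table('users').get_all('smith', index='last_name')

    # the rough SQL equivalent leaves index selection to the planner:
    #   SELECT * FROM users WHERE last_name = 'smith';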
In theory, yes. In practice, the best language is one that maps the developer's thought process into working code in the most natural way. I don't know the community that ReQL is targeting, but for the current crop of data scientists, SQL is a more natural language in my opinion. (There is a reason why Hive is so popular among big companies, where you'd expect to find big data sets.)
I'm totally the opposite, and blame you entirely for provoking what follows! SQL at this point is past expiry.
The simple reason is we now have plenty of decent scripting languages to run as interactive prompts, all of which are far better at interfacing with the rest of the system than SQL is. Exposing a sane, direct API in those languages gets you much more than SQL, along with removing multiple layers of confusion: string generation, escaping, reparsing, then wondering whether it's actually doing what you want, etc.
I've been using LevelDB (via plyvel) a lot lately for data storage, and every time I end up having to use SQL (even indirectly via an ORM) it's painful by comparison, because you can just feel the control being taken away from you. Somewhere along the way you end up having to fire up the DB prompt for no good reason, such as adding strange DB-specific indexing flags to columns or setting up authorization and DB creation, making it yet another thing to go wrong during deployment.
Protocol buffers stored in LevelDB prove so easy to use compared with something like SQLite or PgSQL, just at an API level. The resulting code is simpler, cleaner, and much easier to reason about. If I need to move the data to another format, the code, again, is amazingly small.
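For a flavor of what that looks like, here's a rough sketch (the `Record` message, its `records_pb2` module, and the paths are made up; assumes plyvel and the protobuf runtime are installed):

    import plyvel
    from records_pb2 import Record  # hypothetical compiled .proto message

    db = plyvel.DB('/tmp/records', create_if_missing=True)

    # store: serialize the message and write it under a byte key
    rec = Record(id=1, name='example')
    db.put(b'record:1', rec.SerializeToString())

    # load: read the raw bytes back and parse them into a message
    loaded = Record()
    loaded.ParseFromString(db.get(b'record:1'))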
As someone who cares deeply about my app's data structure, seeing the acceptance of a world beyond SQL is one of the best developments of my career, and experience means I simply don't trust any SQL-based abstraction to give you the control to get it right.
>I'm totally the opposite, and blame you entirely for provoking what follows! SQL at this point is past expiry.
Actually it's the very opposite.
SQL is a higher level abstraction for all this ad-hoc junk, based on actual mathematical principles (relational algebra), and it's also declarative, instead of imperative.
Furthermore, all these NoSQL query systems now emerging are nothing new. They were tried in the '70s and early '80s, and people found out that they sucked. Before SQL, what we had was, well, NoSQL.
Wanting a flexible schema-less db or one that's denormalised for certain needs (like Google's or Facebook's) makes sense.
Replacing SQL and RDBMSs for the common tasks they are used for (company logistics and accounting, etc.) with NoSQL is a regression to the primitive past. I guess people not knowing (computing) history are doomed to repeat it.
One reason is that SQL only lets you work with flat relational tables. MongoDB and RethinkDB let you store complex JSON-like objects as values, and extending SQL to query and manipulate those kinds of structures is hard. The closest attempt I've seen is Google's BigQuery.
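For instance, here's a sketch of querying a nested document in ReQL's Python driver (the `users` table and its shape are hypothetical) -- something plain SQL has no natural syntax for:

    import rethinkdb as r

    # reach directly into a nested object; nothing is flattened into columns
    r.table('users').filter(r.row['address']['city'] == 'Mountain View')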
Another reason is that SQL does its best to decouple the query (what you're asking for) from the execution plan (how it actually gets evaluated). But for operations like joins, the precise way the query is evaluated can have a huge effect on performance. So it makes sense to design your query language to explicitly expose those knobs to the programmer.
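As a sketch of what "exposing the knobs" means (table and field names invented), ReQL's `eq_join` tells you exactly which join strategy runs -- an indexed equijoin -- rather than leaving it to a planner:

    import rethinkdb as r

    # join posts to users on users' primary key; the strategy is explicit
    r.table('posts').eq_join('author_id', r.table('users')).zip()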
I have no experience with RethinkDB, so maybe they did it right, but I have quite a bit with Mongo.
MongoDB's querying is nowhere near as powerful as SQL's. And understanding its limitations and pain points is essential when structuring your data. Otherwise you'll never really scale unless you've got data that's embarrassingly easy to query (e.g. filtering on one or two indexed fields).
Being able to write JS map-reduce queries is fine when you need to do stuff ad hoc, but none of it can be used in production at scale. Then there's the aggregation framework, which helps quite a bit but still doesn't offer the level of performance I'd expect from Postgres, MSSQL, or Oracle. Things get really yucky when you have to start unpacking arrays -- especially since there's a memory limit on $sort and $group. There are further issues with skipping being slow on large collections, because skip must step through every preceding document.
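For example, a typical "unpack an array" aggregation in PyMongo looks something like this (collection and field names are invented); $group buffers its state in memory, which is where that limit bites:

    from pymongo import MongoClient

    db = MongoClient().blog

    pipeline = [
        {'$unwind': '$tags'},                                # one doc per array element
        {'$group': {'_id': '$tags', 'count': {'$sum': 1}}},  # count posts per tag
        {'$sort': {'count': -1}},                            # order by count, descending
    ]
    # allowDiskUse lets $sort/$group spill to disk to dodge the memory cap
    results = db.posts.aggregate(pipeline, allowDiskUse=True)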
And to cope with these limitations you need to invest a ton of time into coding around them. The freedom of the database being schemaless? Gone. You must carefully structure data and decide up-front how it needs to be queried - or cope with poor performance.
And those complex queries, when written in ugly JSON, aren't the most human-readable things in the world. Certainly not any better than SQL.
I'd rather just invest the time in learning SQL than partake in the mental gymnastics necessary to squeeze high-performance non-trivial queries out of MongoDB.
I understand MongoDB is a big hit with people who use it for low-volume internal tools or people building MVPs. I can totally understand how it's an awesome tool for those. But rusty old SQL starts looking better when your app scales and the business demands change.
You're right, you haven't tried ReQL =]. As a developer of a RethinkDB driver and a fairly active user of the database, and also a previous fan of Mongo, I can tell you that ReQL kicks the shit out of MongoDB's querying.
SQL really is well-geared towards relational, flat database systems. Mongo's querying is an obtuse command-based language that always felt like it was adding hacks on hacks to get the data you wanted.
ReQL is almost like having your data in memory and running programmatic expressions on it that seamlessly melt into your native language. The lack of a query optimizer almost makes it better, because you have to think about your query plans and indexes instead of just firing a query off at the server and crossing your fingers. You're not running commands, you're processing data.
I don't have a lot of experience scaling Rethink, but I do know from reading the docs and architecture that it scales out better than most SQL servers will, and certainly scales up better than Mongo. I've tried to scale both MySQL and Mongo. Both are difficult and painful.
You can't conclude that because of Mongo's failures, SQL beats Rethink. Rethink is light years ahead of Mongo.
I'm not really sure how you could wedge SQL into a system that doesn't really have the concept of columns without it being, at the least, a distinctly different dialect.
And, really, if you've worked with some of the "higher level" interfaces to SQL like SQLAlchemy or Laravel's QueryBuilder, ReQL doesn't seem that alien; they all tend to work by method chaining as well. (But they usually don't support everything SQL does, unlike ReQL's interfaces.)
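For comparison, a quick sketch of that chaining style in SQLAlchemy (1.4+ Core style; the table and names are invented):

    from sqlalchemy import MetaData, Table, Column, Integer, String, select

    metadata = MetaData()
    users = Table('users', metadata,
                  Column('id', Integer, primary_key=True),
                  Column('name', String))

    # the same chaining feel as ReQL, but it compiles down to a SQL string
    query = select(users).where(users.c.name == 'alice').limit(10)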
Also: unlike my (admittedly limited) exposure to other NoSQL systems, RethinkDB supports relations and joins -- it's actually a lot easier for an SQL fiend to move to Rethink than any of its competitors.
I much prefer ReQL over SQL. It is much nicer, flows naturally in my opinion, and is composable. I'm using this composability in my rethink-miner tool to help me work with data and figure it out: I can write generic queries that I can layer on top of every other sub-query, which is very, very nice.
For now it probably keeps away a number of users they don't want to support anyway, but it makes sense to at least have an automatic translation layer in the long run.
RethinkDB is my favorite piece of my current tech stack. I tell everybody I know about it, and I'm soon to release some blog posts and speak at a Meetup showing off what it is good at and how to get started - I really just can't say enough good stuff about these guys.
A high level overview from someone who has used it in production since 1.10 (about six months):
Pros:
* ReQL is a beautiful DSL that makes querying and using my data simpler than anything I've ever had before.
* An amazing UI lets you quickly do the things you do the most (verify a query, cherry pick some results, add/drop tables, indexes, shards, etc).
* Unparalleled support. During almost any reasonable hour, a RethinkDB employee will quickly field any question you have on #rethinkdb (freenode), their User Group, or email.
* Quick releases. These guys ship major releases every 60 to 90 days. Each release offers huge features, performance improvements, and bug fixes. They do a great job of listening to what people want and implementing it fast.
Cons:
* Database is still technically "beta" - great for side projects and prototyping, but be mindful if you intend to use it in production.
* Works great on tables of a few GB, but performance really degrades at the next order of magnitude. It looks like there were major changes in 1.12 addressing this.
* Only three officially supported clients (JS, Python, Ruby). That's a good start for their target market, but it is limiting for some.
RethinkDB is an archetype for startups -- building what people want, shipping fast, always talking to customers, and clearly passionate about what they do. Even if you don't use their product, we can all learn from these guys. I hope they do well.
Ditto. Also appreciated is the upfront honesty about what their tech can and cannot do currently, and what they plan to do to address limitations / bugs.
I really love their ability to design sensible APIs, paired with their honesty; it gives me the confidence to start using this software right now, with the belief that it will eventually get to a very stable point.
Would not use it for any multi-million$ business just yet, but it's perfect for personal projects that require quick iteration and might become big in the future.
In the spirit of making what people want, we're planning 1.13 right now. If you have feedback on what features you want, shoot me an e-mail -- slava@rethinkdb.com, and I'll work hard to try and get them in.
Thanks for the great feedback! If you're speaking at a meetup, let me know (mike@rethinkdb.com) -- we'd be happy to send over T-shirts (and stickers, etc.), as well as spread the word about the event.
That's something I've been searching for, too. I actually opened this page wishing that someone had linked to a benchmark for large data sets. Large datasets are the reason I'm trying to move away from MongoDB completely; I've been very happy with Postgres, but I'm willing to give others a try.
We'll be publishing benchmarks and case studies for large data sets in the coming months. You might want to wait for those, because there will be tons of info, and lots of bugs that I'm sure are still lurking will have been ironed out. However, if you want to be one of the early adopters and give it a try, we'd absolutely love your feedback and will work hard to incorporate it into the product.
You can use our regular channels for feedback (rethinkdb.com/community), or shoot me an e-mail any time -- slava@rethinkdb.com.
I like their API and the webadmin interface, but I tried 1.10 with the node.js client (using the cpp protobuf backend) and in my test it was slow compared to MySQL (InnoDB). A simple select query returning 100 items took around 70 ms, while the same query took 3 ms with MySQL. I'm not sure whether I did something wrong or it's just the early stage. Also, I cannot get versions above 1.10 from the repo on CentOS.
Latency for range queries is a surprisingly tricky issue; there were a number of problems around it that we solved. Check out this issue for example -- https://github.com/rethinkdb/rethinkdb/issues/1766. Most of the problems have been resolved in 1.12, so we'd encourage you to give it a try again!
As far as the outdated CentOS build -- I haven't heard of this problem before, but I'll check with @atnnn who's in charge of packaging -- see https://github.com/rethinkdb/rethinkdb/issues/2176. We'll get this resolved ASAP.
EDIT: ok, @atnnn confirmed that 1.12 works on CentOS. If it doesn't for you, could you post more details on the GitHub issue? It would help immensely!
It's not trivial to do performance comparisons correctly; you'd have to provide a lot more detail about the tests to get proper feedback on them. Ideally the test would be something that anybody can download and run locally themselves.
As far as the package repository for your distro not having the latest version of something, this is very very common. Your best bet is to build your own package from source, or find an alternate repository to use where somebody is building packages of the newer versions.
I love the elegance of ReQL and how well it works in dynamic languages, but I still can't figure out a good way to port the lambda syntax used in Python into the Java driver.
What about lambda expressions in Java 8? I haven't programmed in Java in years, so unless there are limitations I'm not aware of, they seem like a great fit.
Without those, doing an elegant driver would be pretty hard, which unfortunately probably means we'll never get good ReQL support into legacy Java.
So the way all underlying communication is done is through protobufs. In Python, I know they hijack the expression, and they can then construct the protobuf from the passed-in Python lambda. In Java land we can't really do that, since lambdas are desugared at compile time to just be functions. In Java I can't even overload operators like you can in Scala or C++ to make it a little nicer. The result is you're pretty much left rolling the protobuf by hand.
The upcoming 1.13 release will introduce a pure JSON interface for queries (which official drivers will switch to). You'll be able to construct JSON directly, and send it to the server -- no protobufs. Would that make things easier?
It will make it on par with the interface for other document-oriented databases. I think it will be a lot more user-friendly than the protobuf, and it might be the best that Java can do.
To clarify, in 1.13 JSON will be a communication protocol for queries. It's not intended to be used by the end user, but by the client drivers. You can use it as an end user, but it isn't nearly as nice as native language drivers.
The change is meant to simplify driver development and packaging, and to improve performance. Protobuf is worse than JSON in almost all of these categories.
You may be able to utilize a lot of their code for converting the lambda bytecode... or not. Anyway, it's interesting to see how they're dealing with the problem you've run into.
> you no longer have to manually specify cache sizes for tables to prevent running over memory and into swap
I'm really glad you're addressing this. Does this auto-sizing apply to RethinkDB's memory usage in general? The last time I tried using it on a small VPS, I set my table's cache size really low, but I still ran out of memory whenever I ran queries on a large table.
Definitely looking forward to secondary index export/migration and to upgrades without the export-import cycle.
The autosizing only applies to the cache itself. There is a separate issue for a memory limit on query data structures (https://github.com/rethinkdb/rethinkdb/issues/1375). We'll try to address it soon, but the cache autosizing should take care of most problems people have seen before.
I'm also looking forward to stable formats and seamless migration, but it's a really hard problem. I think for the time being we'll introduce a stable branch, and users will have to pick between new features or stability. It's not ideal, but will give most people most of what they want.
RethinkDB is pretty awesome. It's a blast working with it, and the Python driver is quite well written, imo.
As the author of PyRethinkORM for Python, it was ridiculously easy to write compared to an ORM for, say, SQL, which was a major selling point for using RethinkDB behind my last several projects.
I'm fairly excited about the ARM port as I've been wanting to use Rethink on a few projects on my BeagleBone White/Black. The new map/reduce changes are pretty cool too.
Hey, sorry you (and everyone else) ran into this! This will go away in the next version (1.13), because we're removing the requirement for protobufs and supporting a pure JSON transport interface. It turns out to be faster, easier to set up, and paradoxically more space-efficient.
I know it's not an official driver, but I maintain the community Java driver here[1]. If it's missing something you need, feel free to open an issue and I can add it for you. I have been using RethinkDB with Java for a few projects, and it works well enough for me.
1. If you're dealing with JSON, a native JSON database provides a 100x better programming experience than a relational system.
2. Scalability out of the box. You can just add a node, reshard, and keeping going. It's much harder in relational systems.
If you have lots of null columns, hierarchical relationships, or eventual need for scale out, give RethinkDB a try.
If you have traditional rigid data or ACID requirements, stick with a relational DB.