Jepsen: MongoDB 3.6.4 (jepsen.io)
231 points by aphyr on Oct 23, 2018 | 70 comments



> This interpretation hinges on interpreting successful sub-majority writes as not necessarily successful: rather, a successful response is merely a suggestion that the write has probably occurred, or might later occur, or perhaps will occur, be visible to some clients, then un-occur, or perhaps nothing will happen whatsoever.

> We note that this remains MongoDB’s default level of write safety.

This sounds pretty scary. How does it compare to other distributed dbs, like Riak? My understanding is that Riak lets you specify how many nodes a write must succeed on to be considered successful. Are its responses more reliable? Is this just a "distributed computing is hard" situation?
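
For anyone wondering what the knob actually looks like: here's a minimal pymongo sketch (host names, database, and replica-set name are placeholders) contrasting MongoDB's default w=1 acknowledgement with a majority write concern, which is roughly analogous to raising W in Riak:

  from pymongo import MongoClient
  from pymongo.write_concern import WriteConcern

  # Placeholder connection string for a hypothetical three-node replica set.
  client = MongoClient("mongodb://n1,n2,n3/?replicaSet=rs0")

  # Default (w=1): acknowledged as soon as the primary applies it; a later
  # election can roll the write back even though the client saw success.
  client.testdb.events.insert_one({"k": "v"})

  # w="majority" + journaling: acknowledged only once a majority of nodes
  # have the write, so it survives the loss of any minority of nodes.
  safe = client.testdb.events.with_options(
      write_concern=WriteConcern(w="majority", j=True))
  safe.insert_one({"k": "v"})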


> Is this just a "distributed computing is hard" situation?

I think this is "gaming performance benchmarks and being right is hard".

MongoDB has spent a lot of effort on making sure it looks good in benchmarks, and that includes using worse defaults than any other distributed data-store I know of.

As long as MongoDB keeps trying to "win" performance benchmarks by such wide margins, they'll have trouble providing a correct distributed system by default.


This isn't about distribution, but someone once wrote about getting PostgreSQL upsert performance to be better than MongoDB's by disabling some of PostgreSQL's safety features.

https://markandruth.co.uk/2016/01/08/how-we-tweaked-postgres...

This made me laugh. Snarky interpretation: "yeah, we can go fast and sloppy, too. We just usually don't."
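
(I don't remember exactly which knobs that article turns, but the classic go-fast-and-sloppy one in Postgres is synchronous_commit. A psycopg2 sketch, assuming a simple kv table:)

  import psycopg2

  conn = psycopg2.connect("dbname=test")  # placeholder DSN
  cur = conn.cursor()

  # Acknowledge commits before the WAL reaches disk: faster, but a crash
  # can lose recently acknowledged transactions (Mongo-default-style risk).
  cur.execute("SET synchronous_commit = off")

  cur.execute(
      """INSERT INTO kv (k, v) VALUES (%s, %s)
         ON CONFLICT (k) DO UPDATE SET v = EXCLUDED.v""",
      ("user:42", "some-value"))
  conn.commit()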


And this is why I appreciate the analysis by aphyr (and kit!) so much... it really shows what's actually happening.

Just sad that it takes so long to do this depth of analysis...


It's not just that distributed computing and being right are hard.

Most consumers of computing are happier with value, and with computers being mostly right/useful/available.

Evidence: the entire history of computing.


Basho went bankrupt over a year ago. Riak is still around, but I encourage checking out the level of support for it. Redis, Cassandra, Couch, Influx might be good alternatives depending on your use case.


It's the same for every DB; e.g., you can tell MySQL to ack a write even when there was no fsync(). All those settings are usually tunable for whatever scenario/performance tradeoff you want.


It is, in fact, not the same for every DB. Databases vary significantly in their default and maximum guarantees for successfully acknowledged writes; Mongo chooses fairly weak ones. By contrast, consider systems like VoltDB, which aims for strict serializability by default, or etcd, which offers linearizability/sequential reads by default.


> etcd, which offers linearizability/sequential reads by default

Not quite true - you have to opt in to quorum reads in etcd. Otherwise it just returns the local node's view (which used to be the default in k8s and caused a lot of grief).


Ah, yes, sorry! I forgot exactly how that all shook out. I thiiink that prevents dirty reads, and with client monotonicity you can recover sequential consistency? Not sure how many clients actually do that; maybe they just do regular reads at RC...


That is true. etcd v3 provides linearizable reads by default; you have to opt out of that to improve latency.
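
Concretely, a sketch against etcd's v3 JSON/gRPC gateway (the /v3 path assumes etcd 3.4+; older releases used /v3beta):

  import base64
  import requests

  def b64(s: str) -> str:
      return base64.b64encode(s.encode()).decode()

  ETCD = "http://127.0.0.1:2379"  # placeholder endpoint

  # Default: linearizable read, confirmed against a quorum before answering.
  requests.post(ETCD + "/v3/kv/range", json={"key": b64("/registry/foo")})

  # Opt-out: serializable read, answered from the local member's state.
  # Lower latency, but may return stale data.
  requests.post(ETCD + "/v3/kv/range",
                json={"key": b64("/registry/foo"), "serializable": True})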


No. You are wrong. etcdv3 enables l-read by default.


The MySQL default has always been the safe, not the fast, option.

https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.ht...


Correct me if I'm wrong, but an ack-without-fsync will not "unapply" a write while keeping writes that happened after it intact? And it won't "unapply" those writes without something catastrophic happening, e.g. a dead disk or power outage. This isn't a general failure mode in a distributed system. MySQL with synchronous replication won't ack until all/a majority of nodes ack the write.

Also, just because you fsync doesn't mean your data will hit the disk. It's a fairly large problem that has nothing to do with the consistency model of the database replication.
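
For the curious, the layering involved (a sketch; the caveat above is why even this isn't bulletproof):

  import os

  def durable_write(path: str, data: bytes) -> None:
      with open(path, "wb") as f:
          f.write(data)         # data sits in the process's buffer
          f.flush()             # buffer -> kernel page cache
          os.fsync(f.fileno())  # page cache -> storage device
      # Even after fsync, a device (or hypervisor) with a volatile write
      # cache can still lose the data on power failure; fsync is
      # necessary but not sufficient.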


Abruptly terminating a VM is quite common in cloud services and can "unapply" a write, because it's equivalent to a power outage. The data lives in the VM guest's cache until fsync sends it to the host.

With VM termination in the cloud being not exactly uncommon, I would say that's a general enough failure mode in a distributed system that it's an unacceptable risk to run MongoDB in its default configuration (or any database in the equivalent configuration).


> Thus far, causal consistency has generally been limited to research projects ... MongoDB is one of the first commercial databases we know of which provides an implementation.

Cosmos DB has provided session consistency (which looks like another name for causal consistency) since at least 2014 [1].

Cosmos DB's session guarantees [2]: consistent prefix, monotonic reads, monotonic writes, read-your-writes, write-follows-reads.

MongoDB's causal consistency guarantees [3]: monotonic reads, monotonic writes, read-your-writes, write-follows-reads.

I doubt that, four years later, it still qualifies as "one of the first."

[1] https://www.infoq.com/news/2014/08/microsoft-azure-documentd...

[2] https://docs.microsoft.com/en-us/azure/cosmos-db/consistency...

[3] https://docs.mongodb.com/manual/core/read-isolation-consiste...
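
For reference, here's what MongoDB's version looks like from a client (a pymongo sketch with placeholder names; per the report, you need majority read/write concerns for the causal guarantees to actually hold):

  from pymongo import MongoClient
  from pymongo.read_concern import ReadConcern
  from pymongo.write_concern import WriteConcern

  client = MongoClient("mongodb://localhost/?replicaSet=rs0")  # placeholder
  coll = client.testdb.orders

  # Operations in the session carry a cluster time, so a read after a
  # write observes it (read-your-writes, monotonic reads/writes,
  # writes-follow-reads).
  with client.start_session(causal_consistency=True) as s:
      coll.with_options(write_concern=WriteConcern("majority")).insert_one(
          {"_id": 1, "status": "new"}, session=s)
      doc = coll.with_options(read_concern=ReadConcern("majority")).find_one(
          {"_id": 1}, session=s)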


Causal and session are definitely similar, but I'm not entirely sure if causal implies consistent prefix, and conversely, I think causal miiight have stronger implications than just the intersection of MR, MW, RYW, and WFR. Because we weren't entirely certain whether we could make that claim regarding Cosmos, we opted to be conservative.


I agree; it's hard for me too to be precise about naming in the academic sense. But the published paper "Writes: the dirty secret of causal consistency" says that both Cosmos DB and MongoDB have causal consistency, so I don't know... At least Cosmos DB and MongoDB provide the same guarantees for session/causal.


.. but the post didn't say it was first. Not even the part you quoted.


Well, the quote you reference says, "one of the first," doesn't it?


"we know of"


Was Cosmos DB tested by Jepsen?


It's off-topic. But yes, Cosmos DB has rigorous tests[1] including Jepsen (a tool).

[1] https://twitter.com/dharmashukla/status/869104163510034432


To clarify: no, Jepsen, as an organization, has not worked with CosmosDB.

I'm delighted they have rigorous tests, and I'm glad our tool has been helpful for them! We just can't say anything about those tests, because we haven't looked yet. Maybe someday!


I understand they used TLA+ to guide their design and test consistency primitives.


I thought "Jepsen.io" was just Kyle Kingsbury. Interesting that there's new author for this analysis. (Also might explain the lack of memes, which I always liked.)


Hi, new author here. :) Kyle was clear that I could write it however I wanted to, but I opted for the more formal tone used in recent analyses.


Nice job!


Yeah! Kit's done a bunch of work on Jepsen's core and Knossos, the linearizability checker. This is her first analysis, and I'm excited to see more. :)


These analyses are a great way to get educated about distributed systems, even if you aren't in a position to evaluate choosing one. Thanks for growing and creating Jepsen!


Kyle got so well known for Jepsen that it became a full time thing for him. He got such a backlog that he started bringing in people to help him do the testing.


I think Kyle recently said that he's been making analyses a bit more formal, so even when he writes analyses, there's less of what we're used to.


Completely unrelated to the core of the article, but does anyone know which program was used for the sketches/diagrams?


You might also enjoy this time lapse of the network topology graph.

https://www.youtube.com/watch?v=SQ8bbuqTVEw


I used Procreate on an iPad Pro.


Thank you for both comments! I'll look into that app.


This is off topic, and I might get downvoted, but I realize I am teed up waiting for the "MongoDB hate" comments to roll in... there seems to be not a lot of love on HN for MongoDB.

I wonder what positive use cases people have used Mongo for? I've used it for a few small/medium sized projects without problem myself.


The dislike for MongoDB usually came from people who were responsible for maintaining medium-to-large-scale deployments of its earlier versions.

IMO it all boils down to startup culture and growth, and to those goals being incompatible with building a database system responsibly. It broke trust with a lot of people and never really regained it. Not that it mattered much in the long run.

The lesson is that you can get your hands dirty while growing. If you don't grow you're dead anyway. If you do grow then you can throw money at the trust problem until it's fixed.

EDIT: Unless you're in the medical industry :P


MongoDB was a subpar product that rode the "nosql" document-store hype, but people following hype don't know about technical quality, and this led to it being used in a lot of places where it shouldn't have been, with marketing plastering over all the downsides. They've slowly made it better, and now it's a rather smooth experience, but overturning a poor reputation is extremely hard.

Meanwhile other database systems have developed further, and the need for a pure document store with weak aggregation and not much else just isn't very enticing. It's still good for data whose schema is on-read or in your application and that needs complete flexibility: document and media management, low-volume logging, user profiles and sessions, etc.


Compare this Jepsen analysis with that of MongoDB 2.4.3 from 2013[1] and you'll see that MongoDB has come a long, long way in the last few years. Note that the linked report on 3.6.4 was actually funded by MongoDB. I think a lot of the hate in the past was caused by their marketing writing checks that the technology couldn't cash. I think that has changed somewhat, but I'm not sure whether popular opinion has come around.

1: https://aphyr.com/posts/284-call-me-maybe-mongodb


It's actually a completely different database between versions 2 and 3.

They acquired WiredTiger in 2014 and MongoDB has never looked back since. Sure there is a lot of hate on HN but they are doing very well out in the real world. And actually it's a pretty good database if your domain model fits.


How many versions existed prior to 2.4.3? That's a lot of versions that don't work as advertised.


I agree. Their marketing was way ahead of their technology and that caught up to them. That being said, they’ve been rapidly improving the tech and it’s looking like a really solid option now.


That's still within the limited set of uses that not having proper SQL semantics allows.

I've had to deal with a key-value JSON database on another technology. Lots of different JSON shapes, even when they were supposed to represent the same concept, with every new feature bringing its own tweak.

Low performance, too, as ad hoc materialized scans + hash joins had to be built in the app. I hated every second of it.

In the end, I came away with the following exhortation for people tempted by the easiness of MongoDB-likes:

The world is not a tree. It is a graph. Each entity in the database exists in real life and interacts with the others freely, each in a specific manner. This is why they cannot be represented as trees (which is what JSON is). You need a graph, types, constraint checks, and ACID semantics; boring Postgres tables + SQL will provide that.

Sure, trees are a great way to represent and transport a piece of state: self-contained and succinct. This is why I'm a big advocate of GraphQL on top of an RDBMS. Out of the graph of properly stored entities, you can extract the tree that is relevant for the app view you're working with.


> The world is not a tree. It is a graph.

What's your opinion of ember and its sideloading semantics for JSON? I found it interesting... until I had to write my own REST endpoints. I found that none of the database tools I had available were up to doing it. Having to write all those sideloads by hand ended up being too tedious for a non-paying gig and I dropped the experiment.


MongoDB Inc. itself published bullshit benchmarks claiming they were faster than the competition, omitting the fact that DB writes could fail without returning an error to the client. That seriously damaged their reputation. It was also advertised as a replacement for RDBMSes, which it never was. You'll notice that no other NoSQL store has these reputation issues, no matter how bad they are; it's solely the fault of MongoDB Inc. and their questionable marketing.


I will personally never use MongoDB because when they first started their philosophy was speed over durability, which is a bad look for a database.

Even though they have apparently fixed those problems, it will take them a long time to win that trust back.

Also so far Elasticsearch seems to solve all my document storage problems.


As far as I can remember, the Elasticsearch test that Jepsen did years ago showed that at the time it was a very, very bad db. It was good for indexing content but not for storage.


> Also so far Elasticsearch seems to solve all my document storage problems.

Do you miss joins?


Not really, but that's because I've been using "nosql" so long that I've gotten used to writing joins in my apps.

But that is a fair point, things would be easier with joins. Usually when I need joins I use Postgres though.


Speaking as a sideliner:

MongoDB seems to have experienced a rather exaggerated version of the hype cycle. But it also seems like, at this point, the technology is well into the "slope of enlightenment" phase, and may even have reached the "plateau of productivity". A lot of that is fueled by MongoDB's own efforts - they took the complaints about reliability quite seriously. Case in point: this series of Jepsen tests they've been funding.


Hey, this is the author. Yes, it can't be overstated that MongoDB solicited this work directly to check their sharding system and their new causal consistency (CC) feature, and they were helpful in getting the work done.


Mongo works well as a straight-up JSON store for store-and-retrieve use cases (with no analytics). It is horizontally scalable, avoids the overhead of relational databases, has indexing capabilities, and provides a strong consistency model. The big improvement came with WiredTiger, which addressed many of the issues that plagued earlier versions of Mongo.

I've seen high-speed machine data stored in Mongo for logging and visualization purposes. It's an improvement over writing csv files to disk.

However, if you ever need to perform non-trivial analytics, Mongo's weaknesses quickly become obvious. For machine learning, typically you would want to first ETL the data into a dataframe-like structure (which is a structure native to SQL databases).


  It's an improvement over writing csv files to disk.
This is not a high bar.


Actually MongoDB is quite popular in the analytics space. It has a unique trick with Spark/Hadoop where its data gets represented as a single wide table. This allows you to use it as an analytical/ML feature store which is not possible with anything other than Cassandra.

Also not sure where you get the idea dataframes are unique to SQL databases because that's completely wrong. HBase and Cassandra were even the original big data databases and they aren't relational. And Spark can manifest almost any database as a dataframe.


> not sure where you get the idea dataframes are unique to SQL databases

I'm not sure I said this.

> HBase and Cassandra were even the original big data databases and they aren't relational.

They also had trouble doing joins and many other query operations which are common in analytics. Presto addresses this somewhat.

> And Spark can manifest almost any database as a dataframe.

Which entails a translation layer from whatever non-tabular form the data was in (e.g. JSON) into a dataframe-like structure, rather than keeping it in its native form, which reinforces my point. You still need to somehow transform the data into tabular form. (ETL is just a batch way of doing this transformation; you can have live transformations, of course, with accompanying overheads.)

BI tools also generally require data to be in tabular form, which entails the use of a translation layer. The Mongo BI connector is one such translator.

> Actually MongoDB is quite popular in the analytics space.

I work in this space, interact regularly with vendors, and monitor the space actively for strategic developments. This does not track with my observations.


I've had lots of fun using Mongo with Spark and have found it pleasant, though most Spark libraries are good to use, including the JDBC ones. I haven't had much luck with the kind of viz where you don't first do ETL from Mongo to some tabular format.


But why not memcached or Redis? You are describing a situation those are designed for, instead of something that for years pretended to be what it never was.


> you are describing a situation that those are designed for

I'm curious: how so?

memcached and Redis are commonly deployed for caching, but not for persistence (Redis has optional persistence). That is not to say you can't use them, but I'm not sure what makes them a necessarily better choice than Mongo in this situation (persisted high-speed machine data).

Edit: OK, I see you mean the "for store-and-retrieve use cases" part. You have a point, though Mongo seems to be ok for that use case too.


My life has been saner when we use the KV store for search or fast lookup and not as the system of record (for that we use a traditional database).

It tends not to occur to me that durability is a feature people are looking for in nosql, but if you're trying to avoid having three to five copies (SQL, nosql, search, reporting, ...?) of your data, I can understand. But as the team gets bigger it gets harder to maintain that, figuratively and literally.


Do you ever make fun of people for getting fooled twice?

The haters that have lasted this long have been fooled once and very painfully. You don't forgive someone technical for straight up lying to you and your peers. For years. That's vendetta territory.


I have used it for a monitoring solution that we developed in-house and rented to outside customers. In retrospect we could have used PostgreSQL too, and would have had an easier time. Having something enforce a schema on us wouldn't have hurt either. But apart from a few gotchas (like not releasing storage on deletes - this should be fixed now) it worked quite well, so I am not in the hate camp (as opposed to HBase). I did, however, learn to appreciate the strictness of SQL and would need very strong reasons to pick anything other than a relational DB for future projects.


  db.webhooks.find({ "events": { $exists: true, $ne: [], $elemMatch: { "links.cancelInvoice": {$exists: true}} } })

What is this, I don't even. Mongo is a huge pain when it comes to searching nested arrays. And the above is just an array in an object, not multiple levels of nesting. I have to re-learn the syntax every time I pick it up.

Other than that, it worked great for allowing users to build completely custom forms.
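
If it helps anyone else parse that: the same filter in pymongo (with a placeholder database name), with each operator annotated:

  from pymongo import MongoClient

  db = MongoClient().mydb  # placeholder database name

  filt = {
      "events": {
          "$exists": True,  # the field is present...
          "$ne": [],        # ...and is not an empty array...
          "$elemMatch": {   # ...and at least one element of it
              # has a nested links.cancelInvoice field:
              "links.cancelInvoice": {"$exists": True},
          },
      },
  }
  docs = db.webhooks.find(filt)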


Honestly that doesn't look that bad. SQL is a lot weirder in many ways, and I say that as someone who has written a lot of SQL.


I have to agree. Basic search-and-replace queries are very readable in Mongo and very easily constructed. IMO the advantage of SQL shows best when you have to do complex aggregation in Mongo.


That was actually a very smart decision, as SQL parsers are hard to write, so they saved a ton of dev time.


There are a couple of Apache projects, like Calcite, that are being used by new DB companies like MapD.


It's not really a big deal either way. If you have some experience with parsers/compilers, SQL parsers are pretty trivial. If you don't, there is one in SQLite.


Calcite looks like a really useful project


Code is data. I'm surprised to see developers willing to adopt "there is no custom grammar, just create this data structure" after how poorly Lisp was received.



