The AWS and MongoDB Infrastructure of Parse

edejong · on Feb 9, 2017

"our idea is that developers should know about details such as schemas and indexes (the Parse engineers strongly agreed in hindsight)."

Many engineers I have worked with like to throw around terms like: "CQRS", "Event sourcing", "no schema's", "document-based storage", "denormalize everything" and more. However, when pushed, I often see that they lack a basic understanding of DBMSes, and fill up this gap by basically running away from it. For 95% of the jobs, a simple, non-replicated (but backed-up) DBMS will do just fine. The remaining 3% might benefit from some sharding and the final 2% are for specialized databases, such as column-oriented DBMSes, document-oriented DBMSes or graph-based databases.

At this very moment, I'm working on a classical DBMS, searching through around 200 million tuples within the order of 2 seconds (partially stored columnar using psql array types and GIN indexes). Not for a second did I consider using any kind of specialized database, nor optimized algorithms, except for some application-level caching. Some problems are caused by DBMS to backend and backend-to-frontend network utilization, which is generally not something a specialized DBMS can solve.

zzzcpan · on Feb 9, 2017

You are assuming that 95% of the jobs will remain the same throughout their entire lifetimes. Which is not true and I'm sure many would prefer for a database not to become a liability in case they happen to attract a lot of users, which requires more availability than a classical DBMSes can provide.

The problem is more about constrains required to scale and achieve high-availabily and low-latency, rather than a database choice. But classical DBMSes lack those constrains and you may and probably will end up with a design that just cannot be scaled without removing and redesigning a lot of things that users rely on, and also with engineers that never learned how to do that, so you may end up in an urgent need of a distributed systems specialist too. Scary. You can avoid that by using systems designed for high-availability from the very beginning, like Riak. Same goes for blobs and filesystems, using POSIX API instead of an object storage is going to haunt you later on, because in your designs you may end up relying on things that ultimately cannot scale well and guaranty latency, as gitlab learned the hard way.

dhd415 · on Feb 9, 2017

I'd say YAGNI applies here since 95% of systems will never need to scale beyond the capabilities of a properly-configured RDBMS. Ensuring consistency and durability with a non-relational, distributed datastore generally requires a lot more detailed understanding of its internals than with a RDBMS, so choosing one of those rather than a RDBMS is going to impose costs that should be deferred until they're absolutely necessary.

oblio · on Feb 9, 2017

Plus it's waaay more likely that bugs in application code will haunt you even more. Especially the lack of things such as foreign keys and other constraints.

Schemas can be straight jackets, but in many situations they're actually closer to life jackets.

gerbilly · on Feb 9, 2017

Totally agree that schemas are lifejackets. Carfully modelled data can outlive the application it was originally written for.

Poorly structured data, on the other hand, where the schema and consistency is enforced in code is much harder to migrate.

Maybe our problem is that we write systems with a disposable point of view? We don't even think the system will be around long enough to require migration.

luhn · on Feb 9, 2017

I think people underestimate just how far a properly-configured RDMS can scale. A machine with a couple CPUs and a few hundred gigs of memory can do a lot. Couple that with some tastefully-applied caching and query from a read replica when you can tolerate stale data... You can scale pretty far.

brianwawok · on Feb 9, 2017

And don't forget sharding. If your load is very much user focused.. i.e. user A only cares about the items for user A within the DB... you shard by user_id, and can now host 10 billion users on a farm of RDBMS.

On the other hand, it gets more and more operationally complex as you add shards and failovers and such. If you KNOW you are going to hit that scale (say you are an established company with 100 million customers), it may not be the right choice.

luhn · on Feb 9, 2017

Right. I'll remember that next time I'm an infrastructure lead at a company with a 100 million customers ;)

But thanks for pointing out sharding. It can be powerful strategy.

user5994461 · on Feb 9, 2017

Once you are at a 100 millions customers company, the urgency becomes to ban all these MySQL and postgre for production systems. Or you need to figure out how to seriously handle a multi master setup in production.

There are the 80% bottom of companies who are just toying around doing small products and MVP, that should stick to what's easy.

There are the 5% at the top who are guaranteed to get 1 million customers no matter what they ship, that should stick to doing things right and sustainable from the start.

Contrary to what the other comment state: sharding and failover are ALWAYS the EASIER choice at scale. The alternative is to be unable to change hardware/systems and to be forced into 15 hours downtime window for every maintenance/upgrade operation, which is a huge liability when you have real users.

The 2 types of companies have opposite priorities.

sbov · on Feb 9, 2017

Use cases will vary, but where I work we've scaled a simple MySQL master/slave setup + memcached for some caching up to around 5 million active users per day. The largest table we regularly touched (no caching) had just over 1 billion rows. Most projects will never reach this scale.

dahauns · on Feb 13, 2017

You may be surprised how both extensive and fine grained the constraints you mention can be with current high-end RDBMS. And yes, that includes stuff like high availability and blobs/filesystems. (via naked POSIX API? Why the hell would you do that?) Though - cheap shots be damned - in my experience "high-end RDBMS" don't include (Non-Enterprise) MySQL or - to a certain extent - Postgres.

DivineTraube · on Feb 9, 2017

> Some problems are caused by DBMS to backend and backend-to-frontend network utilization, which is generally not something a specialized DBMS can solve.

I absolutely agree. RDBMSs are the kings of one-size-fits-most. In the NoSQL ecosystem, MongoDB starts growing into this role. Of course, many performance problems are not caused by the backend but rather the network (an average website makes 108 HTTP requests) or the frontend (Angular2 without AOT, I'm looking at you). We took an end-to-end approach to tackle this problem:

1. As objects, files and queries are fetched via a REST API, we can make them cachable at the HTTP level to transparently use browser caches, forward proxies, ISP interception proxies, CDNs and reverse proxies.

2. In order to guarantee cache coherence, we invalidate queries and objects in our CDNs and reverse proxy caches when the change using an Apache Storm-based real-time query matching pipeline.

3. To keep browser up-to-date we ship a Bloom filter to the client indicating any data that has a TTL that is non-expired but was recently written. This happens inside the SDK. Optionally client can subscribe to queries via websockets (as in Firebase or Meteor).

The above scheme is the core of Baqend and quite effective in optimizing end-to-end web performance. Coupled with HTTP/2 this gives us a performance edge of >10x compared to Firebase and others. The central lesson learned for us is that it's not enough to focus on optimizing one thing (e.g. database performance) when the actual user-perceived problems arise at a different level of the stack.

akanet · on Feb 9, 2017

I think the comparison to Firebase is a bit disingenuous. Their core offering is real-time synchronization of large, arbitrary data structures. From perusing Baqend briefly, it seems like your architecture is much more poll-based.

wolle · on Feb 9, 2017

Hi, I'm the real-time guy at Baqend and I'd like to object.

Baqend has streaming queries whose results stay up-to-date over time; purely push-based, no pulling involved. You just have to register your query as streaming query and receive relevant events or the updated result every time anything of interest is happening. In our opinion, this is not only comparable, but goes beyond the synchronization feature of Firebase, because our real-time queries can be pretty complex. A Baqend streaming query can look something like this: SELECT * FROM table WHERE forename LIKE 'Joh%' AND age > 23 ORDER BY surname, job DESC

We are going to publish this feature during the next weeks. But you can already read the API docs here to get a feeling for the expressiveness of the feature: https://www.baqend.com/guide/#streaming-queries

ako · on Feb 9, 2017

The streaming queries functionality seems to fill this need: https://www.baqend.com/guide/#streaming-queries ?

inlined · on Feb 9, 2017

I work on Firebase now, though not the DB team anymore. I'd love more data on how the DB is being outperformed if you're willing to share. Was it write throughout? Read throughout? Latency? Were you using small or large object sizes? Were indexes and ACLs used in both cases? Are you sure your firebase layout used best practices? Have you run the test lately to verify whether the findings are still true?

DivineTraube · on Feb 9, 2017

Sure, the benchmark and the application under test are public: http://benchmark.baqend.com/

We also do have some benchmarks comparing real-time performance, which are under submission at a scientific conference, so we cannot share them yet.

Unfortunately, testing against Firebase using established benchmarks such as YCSB is not very useful, as it's unclear which limits are enforced on purpose and which limits are inherent to the database design. Would you be open to providing a Firebase setup for heavy load testing w.r.t. to throughput, latency and consistency?

inlined · on Feb 9, 2017

Wow, the presentation here is really polished. Kudos!

For any readers, the test AFAICT is time to last byte for a single reader. It's a bit unfortunate that it seems local caching is affecting the bagend result by the time you switch the comparison to Firebase (though in full disclosure, Firebase's local cache isn't available on JS so that's a bit on us too).

I could give a few perf tips on the data layout for Firebase but it isn't clear whether the toy app has intentional indirection to model something more real.

philtar · on Feb 9, 2017

I wanna say he's actually talking about you when he talks mentions developers who over complicate tjings

sandGorgon · on Feb 9, 2017

> At this very moment, I'm working on a classical DBMS, searching through around 200 million tuples within the order of 2 seconds (partially stored columnar using psql array types and GIN indexes).

hi, this comment was timely. I have been asking myself what to use to persist a 10 million node graph. There has been a lot of marketing around using Hbase (obscene to setup and maintain) and Cassandra (much better.. but paywalled).

My main db is postgresql and we use jsonb quite extensively. The problem is that there is a lot of statements that "a single huge jsonb table in postgresql will never scale. you will have made a horrible mistake".

Seems you have ample experience on this front - whats your take?

mhluongo · on Feb 9, 2017

Depends heavily on your data access patterns. If you need eg recursive queries or pattern matching, you might be better served by a graph database.

edejong · on Feb 9, 2017

hi sandGorgon. I agree with mhluongo on this. If you want to do things like transitive closure or connected components, you fall in the 2% section above. However, if you want to encode graphs for message systems, a simple table with parent-id might do just fine. Wrt jsonb: if you only do key->jsonb tables, you might want to consider another kind of system. Nevertheless, we've got millions of trajectories, connected to millions of annotations (links to trajectories with semi-structured jsonb), and combined with some GIN indexes, it works pretty well.

wsh91 · on Feb 9, 2017

It seems like you're suggesting that "classical" DBMS are inherently relational. Pardon me if I've misread you but, respectfully, as awesome as relational databases are, you haven't offered any reason why they should be your default versus some kind of NoSQL store. Insofar as your concern is simplicity, which you say you prefer (understandably!), it's actually just as likely the simplest option will be non-relational. There's also nothing inherently specialized about a non-relational store; in particular, hierarchical and key-value stores have at least as much maturity in production as relational ones (cf. IMS and dbm, respectively).

(Sorry for the pedantry, but I both work on a cloud document-oriented database and recently watched a bunch of collegiate developers struggle to set up MySQL when their queries could easily have been handled by a simpler system.)

edejong · on Feb 9, 2017

The following reasons come to mind wrt RDBMS:

- When properly normalized, future modifications to the relational model will often be much simpler. Transforming hierarchical data stores or document-based ones often involves changing much of the code which calls the database

- Query optimizers are often performing work which is difficult to implement yourself.

- When properly called, query composition allows more computation to be done close to the data.

- Knowledge of RDBMs is widespread allowing a low barrier to entry (is also a huge disadvantage, since it brings inexperienced people to use a sophisticated tool, leading to overgeneralized bad press)

- Index management in relational model is often simpler and logical

- Many RDBMs have a proven track-record of stability, code quality and feature completeness. Take a look at PostgreSQL, for example: GIN / GIST / PostGIS / date-time / windowing functions / streaming results. Also, the extensibility of postgres is astonishing, the community is warm and welcoming to beginners.

Obviously, I could make a similar list of disadvantages of RDBMs. It's just my experience that the advantages mostly outweigh the disadvantages.

fabian2k · on Feb 9, 2017

I'd argue that the power of a classical relational database makes it easier to handle for someone without in-depth knowledge of databases than many NoSQL databases.

Of course, if all you need is to look up values by keys, a key-value store is a much simpler solution. But the moment you need more complicated or just different kinds of queries than you expected, NoSQL can be rather difficult. You can get a lot of complicated queries done with mediocre SQL knowledge and a general-purpose RDBMS, but trying to use a more specialized database outside the intended use can be pretty painful.

I don't have all that much experience with NoSQL databases, but my impression was always that it is more important there to understand the limitations of the database than with something like Postgres that is more forgiving. They might be simpler to get started with, but much harder to use correctly.

takeda · on Feb 9, 2017

The reason most of the time RDBMS are a fit most of the time is because majority of data is inherently relational. It seems that the problem for people with using relational databases is that there's a learning curve to use it. This makes NoSQL much more appealing at first, but once you start using document store and unless your data is trivial (everything can be referenced with a single key) you will realize how hard it is. Sure, NoSQL is easy because you don't have schemas, relations, transactions etc, but because it doesn't have all of that now you have to implement this in your application which is not an easy task.

Another thing (which often overlooked) is that because relational database typically matches data model better your data will take less of space than when stored in NoSQL. With Postgres there's also the extensibility.

I learned this myself, we had a database that stored mapping of IP to a zip code and latitude/longitude to a zip code. The database was on MongoDB, but it wasn't too big (~30GB) the thing is that to maintain a performance all of the data needed to fit in RAM.

I was a bit curious and decided to see how Postgres would deal with this. So I loaded ip4r extension (to store IP ranges) and PostGIS to enable lookups on latitude/longitude. All the data fit in ~600MB, and with proper indexes postgres returned all queries instantly (sub milisecond).

The Mongo was running on 3 c3.4xlarge instances, but postgres would be fine on smallest available one and would still be able to cache whole database in RAM.

falcolas · on Feb 9, 2017

IMO, because you can use a RDB as a K/V store, but you can't use a K/V store as a RDB. And I have yet to personally work at a scale where a sharded RDB was incapable of handling the load.

The big relational DBs scale well (near-terabyte with thousands of queries per second) with little maintenance. They scale ridiculously well (terabyte+ with a million+ queries per second) with more maintenance. Most loads simply don't require the scale at which RDBs start to actually falter at.

nodamage · on Feb 9, 2017

> At this very moment, I'm working on a classical DBMS, searching through around 200 million tuples within the order of 2 seconds

Just curious, does this kind of performance require dedicated hardware or is it possible on platforms like AWS?

edejong · on Feb 9, 2017

This is on a AWS RDS instance, everything is in (DBMS) cache, db.m4.large.

geocar · on Feb 9, 2017

You should revisit your schema. That sounds much too slow.

    q)a:(200000000?200000000),-42 /pick some random numbers
    q)\t a?last a
    303

303msec for a linear scan of 200m ints on an i5 laptop means you should be able to do better than 2sec for a "smart" query.

edejong · on Feb 9, 2017

Well, the schema is a bit more complicated than that. I'd say the 100M / s query rate is sufficient. Here we search for trajectories (road segments + time) going through certain regions within a certain repeating time-interval (say, 2 AM on a Sunday), in a 3 month time interval.

philliphaydon · on Feb 9, 2017

It seems slow to me too...

We have tables with > billion records In RDS / SQLServer and return results in <1 second...

reissbaker · on Feb 9, 2017

Although it sounds like the OP might have some tricky queries, just want to echo that relational databases can be (very) fast if you know how to index them: one of our RDS MySQL tables has 3+ billion rows and queries on it average ~50 milliseconds.

felixge · on Feb 9, 2017

And that's for a sequential scan over all of those tuples?

That'd be very impressive. Does SQLServer support parallel execution? Or how come it be so fast?

philliphaydon · on Feb 9, 2017

> Does SQLServer support parallel execution?

For a long time as far as I know?

felixge · on Feb 10, 2017

Cool. I'm not familiar with it, that's why I ask :).

Anyway, is the performance you mentioned for a scan over the entire table? E.g. to do an aggregate?

philliphaydon · on Feb 10, 2017

Honestly don't remember, I'm working on new stuff in PostgreSQL now.

I know a count of the whole table takes about a minute tho. But the filter > group by was ~1s as the grouping was done on about < 50 rows of the filtered result.

Without knowing the original query above was using it's speculation on why it was slow and mine was fast. My argument is just that SQL Server isn't magically slow. MySQL/PostgreSQL/SQL Server are super fast. And if you don't massage the database it can be super slow too.

felixge · on Feb 10, 2017

Thanks.

> I know a count of the whole table takes about a minute tho. But the filter > group by was ~1s as the grouping was done on about < 50 rows of the filtered result.

Yeah, that makes sense.

I'm running some queries in production that take a 150 million row table, filter it down to ~100-300k rows which are then aggregated/grouped. This usually takes < 1s. However, if I'd try to do a count(*) on the table, that'd be around a minute as well.

> Without knowing the original query above was using it's speculation on why it was slow and mine was fast. My argument is just that SQL Server isn't magically slow. MySQL/PostgreSQL/SQL Server are super fast. And if you don't massage the database it can be super slow too.

Yeah. The query / query plan will be needed to go more in-depth on these kind of discussions. The amount of disk vs memory hits during execution obviously as well.

Anyway, I just wanted to understand if SQL Server was e.g. an order of magnitude faster than Postgres when scanning a large number of tuples. But I guess the answer to that is: probably not.

philliphaydon · on Feb 10, 2017

PostgreSQL has a BRIN index's as of 9.6 which may make some forms of aggregation faster than SQL Server I believe. Would need to do real-world tests to verify that tho.

xref · on Feb 10, 2017

Just to clarify, the whole db fits into 8gb of ram you're saying?

pvg · on Feb 9, 2017

Why non-replicated? Do you mean non-sharded?

edejong · on Feb 9, 2017

I actually meant clustered. Replication is of course a good idea, but distributed query processing, esp. in combination with transaction processing is often too complicated to get the job done.

jo909 · on Feb 9, 2017

I agree, but how do you argue for that?

I work with a customer where every application has to have a dedicated environment, running on VMs in a seperate VPC on a private cloud. The default DB is a galera cluster with 3 nodes, even for the most simple application.

The argument for it is that cloud VMs are expected to fail, so everything has to be setup to tolerate instance failure without downtime.

All I have to counter it is the supposedly higher risk of data loss in a complicated split brain, the unspecific "more complicated" etc. While the risk of instance failure is very real, that feels a bit theoretical and vague.

falcolas · on Feb 9, 2017

> how do you argue for that

0 downtime and 0 data loss is impossible, both theoretically and practically. Even a Galara cluster is going to have downtime if a node fails (timeouts expiring, etc.).

Really, it comes down to what trade-offs is the company willing to accept: the potential for stale reads for high availability, slow/aborted writes for high consistency, downtime for reduced complexity, etc.

A few notes about Galara: its performance going to become severely degraded if one of the nodes decides it needs to be rebuilt - both that node and the donor node will effectively be out of the loop, with the network link between them saturated for the duration. That degraded state doesn't require the node to go down, it can happen spontaneously.

Galara also imposes limits on write frequency and location - if you're doing many writes, you don't want to split those writes between nodes, since every write has to look at every other pending transaction on every node before it can occur.

An automated RDB master failover can occur in under 30 seconds, easily. You can also typically run master-master without a risk of split brain by specifying even and odd primary keys.

> cloud VMs are expected to fail

Yup, but that timeframe is usually on the order of months or years. And if you use externally backed disk stores (at the cost of read/write speed) you can be back up in a matter of minutes with a single master. Even a full restore from backups can typically be done in 10-15 minutes; constrained mostly by your network speed.

pvg · on Feb 9, 2017

Close enough.

pmelendez · on Feb 9, 2017

"searching through around 200 million tuples within the order of 2 seconds (partially stored columnar using psql array types and GIN indexes). Not for a second did I consider using any kind of specialized database, nor optimized algorithms, except for some application-level caching. "

--

That's cool and all, but for some real-time applications, retrieving in the order of seconds is unacceptable.

DBMS has their earned place in the backend stack but I found that statement of covering 98% of the cases a bit of a stretch.

edejong · on Feb 9, 2017

Of course, this is an extreme example. Most of our front-end is able to serve quite complicated data within 500 milliseconds. Some queries are over a longer time-span (3 months, 0.7 million truck-movement-legs, +/- 200M links), so a 2 second wait-time (including a spinner) is sufficient.

schmichael · on Feb 9, 2017

As an Urban Airship employee from 2010-2013 this is eerily familiar stuff. I feel like we should all have a big "we tried and failed to use mongodb as a distributed oltp database" support group.

One point of disagreement:

> However, they had an unspoken value system of not trusting their users to deal with complex database and architectural problems.

At least for enterprise customers (who actually pay the bills) this is the correct value system. It's not that enterprise developers are bad. It's that enterprise development moves incredibly slowly, so the more freedom and options and implementation details you give them to work with the longer it will take them to be successful.

What you as a startup engineer may feel like is a powerful and expressive API just added 6-12 months to the development cycle of every enterprise customer... and thus many will give up on the integration.

powerbook5300CS · on Feb 9, 2017

Parse's MongoDB deployment was definitely not the largest by a longshot. Not even top 10.

They had the largest number of databases on a single replica set for sure, but largest in terms of cluster size or data size? No way.

DivineTraube · on Feb 9, 2017

Do you have a source for that? It is what the Parse engineers told me, but that kind of information is notoriously hard to corroborate. Any time I asked this question to MongoDB engineers they were evading concrete numbers for deployment sizes.

powerbook5300CS · on Feb 9, 2017

I can assure you that you are incorrect. I am the primary source and very aware of Parse's scale.

I worked on building a feature that allowed Parse to list that many databases (it had response that does not fit in the document size limit of 16mb -- requiring multiple cursors, command cursors).

I am a collaborator on the parse-server project, love the Parse team, respect what they built, and have no incentive to deceive you.

I worked at MongoDB for 4 years in various roles, consulting, engineering, etc and saw all sorts of deployments of various shapes and sizes.

DivineTraube · on Feb 9, 2017

I do believe you, but can your share some more hard facts?

I've changed the passage to "one of the greatest". As in the disclaimer above the article, I'm mainly presenting assertions by Parse engineers, who were very sure about being the largest MongoDB user in terms of cluster size.

jasondc · on Feb 9, 2017

Baidu is one: https://www.mongodb.com/blog/post/mongodb-at-baidu-powering-...

160 shards = 160 replica sets, and is already larger than Parse.

bogomipz · on Feb 9, 2017

I would like to hear more about the introduction of LSM-based RocksDB storage into the Parse architecture. Specifically how long before they felt MMAP was no longer meeting their needs?

I was also curious about this? >"Frequent (daily) master reelections on AWS EC2. Rollback files were discarded and let to data loss."

Was the velocity of rollbacks sufficiently high enough that they were't able to process them?

I thought Parse was a great company with a great product that served an important market. I'm still not sure why FB acquired them only to retire it.

mkania · on Feb 9, 2017

We had issues with the MMAPv1 storage engine from the beginning, mostly due to it's lack of document level locking and storage bloat.

One of the biggest benefits from being acquired by Facebook was working with the RocksDB team to come up with MongoDB +RocksDB, but this was also happening around the time when Parse was starting to wind down.

Being the only company running this new storage engine freaked us the fuck out. So we tried to blog and give talks as much as possible to get other folks interested and willing to test it out.

Parse now running MongoDB on RocksDB: http://blog.parse.com/announcements/mongodb-rocksdb-parse/

Strata: Open Source Library for Efficient MongoDB Backup: http://blog.parse.com/learn/engineering/strata-open-source-l...

MongoDB + RocksDB: Writing so Fast it Makes Your Head Spin: http://blog.parse.com/learn/engineering/mongodb-rocksdb-writ...

MongoDB + RocksDB: Benchmark Setup & Compression: http://blog.parse.com/learn/engineering/mongodb-rocksdb-benc...

bogomipz · on Feb 9, 2017

I thought you might say that. If you have an app that does a lot of deletes that bloat becomes noticeable quickly. I think they might actually have document level finally. Rocks is interesting as it deals with the write amplification problem and its tuneable. TokuMX was another interesting storage and had good compression. I would curious if you ever evaluated that.

Are you with Baqend now? Is this your medium post?

mkania · on Feb 9, 2017

Before even looking into RocksDB we tried TokuMX and even got pretty well aquatinted with their dev team. We ran into the same issues when testing Wired Tiger. Neither could handle millions of mongo collections. Since we were involved in the RockDB storage engine from the beginning, we made sure the implementation could handle that many collections.

This isn't my post and I don't work at Baqend, but I will say that the author comes off as a presumptuous asshat.

bogomipz · on Feb 9, 2017

Yeah this reads like this person at Baqend is going over "lessons learned" while at Parse. That's why I thought it might be someone from Parse. It seems like bad form to write a blog post on what other people should have done differently and use that to publicize your own company. Parse was a success.

DivineTraube · on Feb 9, 2017

> So here are some facts and trivia that are not so well-known or published that I collected by talking to Parse engineers that now work at Facebook. As I am unsure about whether they were allowed to share this information, I will not mention them by name.

It's both lessons learned by Parse engineers and us, so I think the intended ambiguity is okay.

bogomipz · on Feb 9, 2017

>"It's both lessons learned by Parse engineers and us"

All of those bullet points are from Parse's history. Where is Baquend's contribution to the "lessons learned" in this blog post then?

DivineTraube · on Feb 9, 2017

Author of the post here. If you have additional key insights about the Parse infrastructure, please post them here and I will directly add them to the article.

220 · on Feb 9, 2017

We were a parse user for (many) apps, and tried to run parse server briefly before just letting all the features built on it die.

I think one of the major instabilities not mentioned the request fanout. On startup the parse ios client code could fan out to upwards of 10 requests, generally all individual pieces of installation data. Your global rate limit was 30 by default. Even barely used apps would get constantly throttled.

It was even worse running ParseServer, since on Mongodb those 10 writes per user end up as a sitting in a collection write queue blocking reads until everything is processed. It was easier to kill the entire feature set it was based on.

I know there were a ton of other issues but I've forced myself to forget them.

koolba · on Feb 9, 2017

What was the throughput (per server) after the Go rewrite? I'd imagine an async rewrite would be a lot more efficent than the 15-30 req/s that the Rails version was getting.

How often (if ever) did you experience data loss due to the MongoDB write concern = 1?

inlined · on Feb 9, 2017

I was the lead for Parse Push. I reverse engineered Resque on Go so we could let any async stage be switched to Go cleanly. I added a few features like multiple workers per Redis connection and soft acks that let the next job be pulled by the framework but still kept the server alive during a soft shutdown until the job completed.

After moving the APNs (v2) server to Go I was able to map the wonky networking to Resque much better. My show & tell that week was literally "this MacBook is now running higher throughout than our cluster of >100 push servers". For APNs, the rewrite was a boon of over 500x per server.

As a great fringe benefit, fewer Resque servers meant less Redis polling. CPU dropped from 97% to <10 IIRC, which helped save us the next Black Friday where Redis falling over from Push load often caused problems.

I don't recall how often data loss was a problem. One of the points that hadn't gone into a lot of detail was when certain ops tricks happened. For example, I recall that reading from secondaries was reserved for larger push customers only. These queries could be wildly inefficient, have results in the 10s of millions, and due to mongo bugs would lock up an entire database if they had a geo component in the query. In later versions of Mongo where the bugs were fixed and after another generation of the auto indexer we were able to send all traffic back to primaries.

inlined · on Feb 9, 2017

Oh. And our ResqueGo outperformed Ruby so badly that we couldn't do a reasonable X% server transition from Ruby to Go by rebalancing the number of workers. It had to be done my making a new Go queue and doing % rollout of which queue work was sent to. We learned that the hard way once (though luckily there was no regression IIRC).

chatmasta · on Feb 9, 2017

I'm surprised vendor lock-in was not mentioned as a problem. I suppose it's a fundamental problem with offering "BaaS" and since the author is pushing his own BaaS offering, any critique of vendor lock-in may be unlikely.

I first used parse for a mobile app that grew to 600k users. I was totally against adopting parse, for a number of reasons including the unpredictable billing, failing background jobs, and the need for retry logic in every single write query. But my biggest issue was with vendor lock-in. When your app succeeds, and the backend is dependent on parse being in business, parse becomes a liability. Eventually you will need to take on a project to "migrate off parse." And you know what? That sucks for parse, because they were counting on power users to pay the big bills. But in reality, once a customer became a "power user," they started the process of migrating off parse.

When parse shutdown, I initially felt vindicated - ha! Vendor lock-in. But they handled the shut down and transition to open source extremely well. As a result, parse-server is now an open sourced, self hosted product with all the benefits of Parse.com but without the vendor lock-in!

I've been using parse-server exclusively for new projects. I'm very happy with it and it is always improving (though backwards compatibility has been hit or miss... but that's the price of rapid releases). It's very easy to setup, and does a lot of crud for you (user registration, email confirmation, login, forgot password, sessions). You can call the API from an open source SDK in almost any language on any platform (note: I'm the maintainer of ParsePy). Also, because you have direct access to mongo, you can optimize query performance outside of parse. For example you can create unique indexes, where with parse you had to query for duplicate objects at application level. There's even initial support for Postgres instead of mongo. Also, the dashboard is nice for non-technical people on your team to understand your data structures and ask more informed questions.

I'm not sure I would ever use another BaaS. It just seems like such a dumb decision to offload the entire backend of your app to a proprietary service. If the service was totally open source from the beginning, with a self hosted option, then I would consider it. At least that eliminates the threat of vendor lock-in. I get the feeling that the excellent handling of parse.com shutdown was an exception to the rule. I don't want to take unnecessary risks with the most important code of a project.

erikwitt · on Feb 9, 2017

I agree, the parse shutdown was organized extremely well. The open source parse server, one year of migration time and a ton of new vendors that now offer to host your parse app, all made it much easier to handle the shutdown. It's also great to see the community still working on the open source server.

That said, there are a lot of upsides to having a company work full-time on your proprietary cloud solution and ensure its quality and availability. If an open source project dies or becomes poorly maintained you are in trouble too. Your team might not have the capacity to maintain this complex project on top of their actual tasks.

Also open sourcing your platform is a big risk for a company. Take RethinkDB for example: Great database, outstanding team but without a working business model and most recently without a team working full time, it is doomed to die eventually.

Nevertheless, we try to make migrating from and to Baqend as smooth as possible. You can import and export all your data and schemas, your custom business logic is written in Node.js and can be executed everywhere. You can also download a community server edition (single server setup) to host it by yourself.

Still a lot of users even require proprietary solutions and the maintenance and support that comes with it. And often they have good reasons, from requiring a maintenance free platform to to warranties or license issues. After all, a lot of people are happy to lock into AWS even though solutions based on OpenStack, Eucalyptus etc. are available.

CurlyBraces · on Feb 9, 2017

"applications were often inconsistent" - I've heared this about parse before. Always thought this is due to using MongoDB. If you use the same database how can you enforce more consistency?

erikwitt · on Feb 9, 2017

Although MongoDB has its limits regarding consistency, there are things that we do differently from parse to ensure consistency:

- The first thing is that we do not read from slaves. Replicas are only used for fault tolerance as it's the default in MongoDB. This means you always get the newest object version from the server.

- Our default update operation compares object versions and rejects writes if the object was updated concurrently. This ensures consistency for single object read-modify-write use cases. There is also an operation called "optimisticSave" the retries your updates until no concurrent modification comes in the way. This approach is called optimistic concurrency control. With forced updates, however, you can override whatever version is in the database, in this case, the last writer wins.

- We also expose MongoDBs partial update operators to our clients (https://docs.mongodb.com/manual/reference/operator/update/). With this, one can increase counters, push items into arrays, add elements into sets and let MongoDB handle concurrent updates. With these operations, we do not have to rely on optimistic retries.

- The last and most powerful tool we are currently working on is a mechanism for full ACID transactions on top of MongoDB. I've been working on this at Baqend for the last two years and also wrote my master thesis on it. It works roughly like this:

   1. The client starts the transaction, reads objects from the server (or even from the cache using our Bloom filter strategy) and buffers all writes locally.

   2. On transaction commit all read version and updated objects are sent to the server to be validated.

   3. The server validates the transaction and ensures the isolation using optimistic concurrency control. In essence, if there were concurrent updates, the transaction is aborted.

   4. Once the transaction is successfully validated, updates are persisted in MongoDB.

There is a lot more in the details to ensure isolation, recovery as well as scalability and also to make it work with our caching infrastructure. The implementation is currently in our testing stage. If you are interested in the technical details, this is my master thesis: https://vsis-www.informatik.uni-hamburg.de/getDoc.php/thesis...

esseti · on Feb 9, 2017

would be intresting understing more about "Static pricing model measured in guaranteed requests per second did not work well." . what happened and what would have been a better solution afterwards.

DivineTraube · on Feb 9, 2017

The core problem was that the sustained number of requests did not really cause any bottlenecks. Actual problems were:

- At the Parse side: expensive database queries on the shared cluster that could not be optimized since developers had no control over indexing and sharding.

- At the customer side: any load peaks (e.g. a front page story on hacker news) caused the Parse API to run into rate limits and drop your traffic.

chrischen · on Feb 9, 2017

I feel like for indexing it could be done dynamically, based on on performance analysis from performance and query profiling. This way you can also do it without really understanding the application logic at all.

erikwitt · on Feb 9, 2017

That's actually exactly what parse did. They used a slow query log to automatically create up to 5 indexes per collection. Unfortunately this did not work that well especially for larger apps.

I guess 5 indexes might be a little short for some apps. On the other hand too many or too large indexes can get a bottleneck too. In essence, you want to be quite careful when choosing indexes for large applications.

Also some queries tend to get complicated and choosing the best indexes to speed up these queries can be extremely difficult especially if you want your algorithms to choose it automatically.

simscitizen · on Feb 9, 2017

We created more than 5 indices per collection if necessary. But fundamentally some queries can't be indexed, and if you allow your customers to make unindexable queries, they'll run them. Think of queries with inequality as the primary predicate, or queries where an index can only satisfy one of the constraints like SELECT * FROM Foo WHERE x > ? ORDER BY y DESC LIMIT 100, etc.

erikwitt · on Feb 9, 2017

That is absolutely right. You can easily write queries that can never be executed efficiently even with great indexing. Especially in MongoDB if you think about what people can do with the $where operator.

What would in retrospect be your preferred approach to prevent users from executing inefficient queries?

We are currently investigating whether deep reinforcement learning is a good approach for detecting slow queries and making them more efficient by trying different combinations of indices.

inlined · on Feb 9, 2017

It's hard to say. Most customers want to do the right thing (though some just don't feel that provable tradeoffs in design are their problem because they outsourced).

I did some deep diving in large customer performance near the end of my tenure at parse to help some case studies. Frankly it took the full power of Facebook's observability tools (Scuba) to catch some big issues. My top two lessons were

1. Fix a bug in our indexer for queries like {a:X, B:{$in: Y}}. The naive assumption says you can index a or b first in a compound index and there's no problem. The truth is that a before b had a 40x boost in read performance due to locality

2. The mongo query engine uses probers to pick the best index per query. If the same query is used in different populations then the selected index would bounce and each population would get preferred treatment for the next several thousand queries. If data analysis shows you have multiple populations you can add fake terms to your query to split the index strategy.

inlined · on Feb 9, 2017

Fwiw, the Google model is to just cut unindexable queries from the feature set. You can only have one sort or range field in your query IIRC in DataStore

DivineTraube · on Feb 9, 2017

The Google Datastore is built on Megastore. Megastore's data model is based on entity groups, that represent fine-grained, application-defined partitions (e.g. a user's message inbox). Transactions are supported per co-located entity group, each of which is mapped to a single row in BigTable that offers row-level atomicity. Transactions spanning multiple entity groups are not encouraged, as they require expensive two-phase commits. Megastore uses synchronous wide area replication. The replication protocol is based on Paxos consensus over positions in a shared write-ahead log.

The reason for the Datastore only allowing very limited queries is that they seek to target each query to an entity group in order to be efficient. Queries using the entity group are fast, auto-indexed and consistent. Global indexes, on the other hand, are explicitly defined and only eventually consistent (similar to DynamoDB). Any query on unindexed properties simply returns empty results and each query can only have one inequality condition [1].

[1] https://cloud.google.com/datastore/docs/concepts/queries#ine...

advisory5739f2 · on Feb 9, 2017

What is the reasoning behind not using sharding and homing each customer to a replica set?

How did you deal with customers growing large or spiking in usage? How did you manage such hot spots?

Sharding is one of the key benefits of MongoDB (and frankly most NoSQL solutions). Of course, you have to pick a good shard key.

One reason I can think of is that IMO sharding on MongoDB before version 2.4 had too many issues to be production reliable. If Parse started with MongoDB 2.2 or earlier, then I can see how they would avoid sharding.

anilgulecha · on Feb 9, 2017

Interesting new take on baas with Baqend. Looks like gomix?

DivineTraube · on Feb 9, 2017

There are definitely some similarities. Like in Gomix, the idea of Baqend is to let developers easily build websites, bots and APIs. However, Baqend has a stronger focus on how data is stored and queried, while Gomix is more focused on a rapid prototyping experience inside the browser.

In my opinion, the whole movement about BaaS is all about making things as smooth as possible for developers and shorten the time-to-market to its minimum. What some providers like Parse lost on the way, are the capabilities for small prototypes to grow to large systems. That requires being scalable at both the level of API servers, user-submitted backend code and the database system. And at some point, it also requires letting the user tune things like indices, shard keys and execution timeouts. This is the route we took at Baqend. We do not want to be the simplest solution out there, but we aim to provide the fastest (we use new web caching tricks) and most scalable one (we expose how the database is distributed).

csmajorfive · on Feb 9, 2017

Can I suggest you edit your title? I thought someone on our team had written this but, in this context, "lessons learned" is kind of misleading.

DivineTraube · on Feb 9, 2017

Could you share what your lessons learned were? That would be a very interesting perspective. Do you disagree with some of the points?

simscitizen · on Feb 9, 2017

Clash of Kings used Parse, not Clash of Clans.

DivineTraube · on Feb 9, 2017

You're right of course, I just changed that.

marknadal · on Feb 9, 2017

If the database is the bottleneck, why are you using the same (MongoDB) database as Parse (and Firebase) were?

The problems described in the article are quite literally are my pitch deck, which I have successfully raised from billionaires Tim Draper and Marc Benioff of Salesforce, for http://gun.js.org/ . So why did you decide to stick with MongoDB when many other databases have tackled solving those problems?

mdekkers · on Feb 9, 2017

var gun = Gun('https://gunjs.herokuapp.com/gun');

gun.put({hello: "world"}).key('random/WVNqluP4q');

and

Databases can be scary

Who exactly are your target market?

marknadal · on Feb 9, 2017

People who are fed up about databases not working. That is primarily developers, but also happens to include some government clients as well.

mdekkers · on Feb 10, 2017

If you are in this business professionally as some form of engineer, thinking that "databases are scary" is like your car mechanic thinking "engines are scary"

DivineTraube · on Feb 9, 2017

The database was not the bottleneck for Parse, how they used it was. Actually that is the case for most databases systems, you have to choose a set of trade-offs your application can live with. We wrote about this in more detail here: https://medium.baqend.com/nosql-databases-a-survey-and-decis...

marknadal · on Feb 9, 2017

Ah, thank you, that is a great answer and gives context.

redwood · on Feb 9, 2017

Word of advice, this post comes across as immature. What are you mentioning "billionaires" for in this context? And careful, with this audience the Theranos-enraged mob is a real liability

marknadal · on Feb 9, 2017

I used to use MongoDB (and I actually like it, unfortunately), however the problems I had with it (that are also problems mentioned in the article) are exactly what caused me to build an alternative.

Some people may think a "NIH" mentality is the "mature" route, but there are legitimate problems that exist that even billionaires can recognize (mentioning them gives credence to that fact). Those are also the same problems that governments face, which is why we (and other new databases that try) provide a solution.

Open Source software is testable, angry mobs won't be angry if you can prove a system works. Which, if you want, here are some examples: A prototype of ours scaling to 100M+ writes for $10/day ( https://www.youtube.com/watch?v=x_WqBuEA7s8 ), recovering from complete server failure and loss ( https://youtu.be/-i-11T5ZI9o ), managing live migrations with auto fail over ( https://youtu.be/-FN_J3etdvY ).

I'm sorry it sounded immature, and I also hope you can see the legitimacy of my question given my experience with MongoDB.