> This is basically just as much work as developing a new database from scratch, i.e. more work than something like migrating to Postgres. This completely debunks the "changing is basically impractical" claim.
I disagree for three reasons:
1. The long tail of code using MySQL at the company, like at any large software company, is prohibitive. You would have to maintain MySQL and PostgreSQL in parallel for years. A new storage engine, on the other hand, is controlled by one team.
2. Migrating from InnoDB to MyRocks consists of successively adding MyRocks replicas, letting them catch up, and removing InnoDB replicas. That is a dramatically easier proposition than migrating tiers to PostgreSQL.
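For anyone who hasn't seen this kind of rollout, here is a toy Python sketch of the replica swap in point 2. Everything in it is made up for illustration: the `Replica` class, the `rolling_engine_swap` function, and the lag counter are hypothetical stand-ins, not anyone's real automation. It only models the add / catch up / remove loop and ignores everything that has to exist before that loop can run.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    engine: str            # "innodb" or "myrocks"
    lag_seconds: int = 0   # hypothetical replication lag behind the primary


def rolling_engine_swap(replicas, new_engine="myrocks", start_lag=3):
    """Replace old-engine replicas one at a time and return the final set."""
    replicas = list(replicas)
    step = 0
    while any(r.engine != new_engine for r in replicas):
        step += 1
        # 1. Add a replica on the new engine; it starts out behind.
        fresh = Replica(f"{new_engine}-{step}", new_engine, lag_seconds=start_lag)
        replicas.append(fresh)
        # 2. Let it catch up (a countdown stands in for waiting on replication).
        while fresh.lag_seconds > 0:
            fresh.lag_seconds -= 1
        # 3. Only now remove one old-engine replica, so capacity never drops.
        old = next(r for r in replicas if r.engine != new_engine)
        replicas.remove(old)
    return replicas


replica_set = [Replica(f"innodb-{i}", "innodb") for i in range(1, 4)]
result = rolling_engine_swap(replica_set)
```

The point of the ordering is that an old replica is removed only after its replacement has caught up, so the replica count never falls below where it started.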
The fact that RocksDB was a hard technical project is kind of irrelevant. The new storage engine provided major wins and could be done within a team, while migrating to PostgreSQL would provide at most small improvements and demand changes to huge amounts of code and massive data migration projects. That makes the former project deeply practical and the latter impractical. If the usual stack back in the day had been the LAPP stack instead of the LAMP stack, we would be having this discussion the other way.
> calling it a "key-value store" is a gross oversimplification at best
That's fair. The right thing to have said would be that the query patterns that are used are extremely simple selects over a single table, which is a place that MySQL has traditionally shone. MySQL's query planner still does strange things on complex queries from time to time. I had a case about six months ago where one shard decided it was going to reorder indexes in a query and load everything in the database's core tables before filtering it down instead of using the proper index order like the other nine hundred something shards. Easily fixed once we realized it (we forced the index order in the query), but the fact that we had to... I have heard that this has all gotten much better in MySQL 8.0.
You're severely underestimating the amount of effort that went into MyRocks. The development and deployment was a 3+ year effort spanning quite a few different teams.
Automating the physical rollout (as you correctly described) is the easy part. That doesn't account for all the many difficult spots that occurred prior to it: the massive complexity of mysql storage engine development in general; huge numbers of various performance edge-cases; converting replication-related system tables to MyRocks in order to achieve crash-safe replication; developing online hot-copy for MyRocks from scratch; schema change support; adding tons of new status vars and system vars; fast bulk import to MyRocks which is necessary for the replica migration to even be possible; updating hundreds of thousands of lines of automation code written under the assumption of InnoDB being the only storage engine in use and using various InnoDBisms...
The MyRocks migration wasn't a project I personally worked on, but I'm very aware of what was involved. It appears you joined FB PE in 2017 and therefore missed much of this historical context? I'm not really sure why you would have such strong opinions about it.
You say that FB is using MySQL because "changing is basically impractical", but also say MyRocks "provided major wins", which seems to be a contradiction. In any case, I'm not aware of any pg feature that provides compression ratios anywhere near that of MyRocks, and pg is only recently even adopting an arch that supports pluggable storage engines at all. In combination it's really hard to make a case that FB is using MySQL just due to historical investment and inability to change.
Honestly I would also not be surprised if FB moves some core tiers off of MySQL to a pure-RocksDB solution at some point in the future. The number of intermediate data services and proxies makes this sort of thing absolutely possible. For the same reason, in theory a move to another db like pg would be completely possible without needing to run both in parallel for years (again, just talking in theory; moving to pg just doesn't make practical sense).
> The right thing to have said would be that the query patterns that are used are extremely simple selects over a single table
For UDB, sure. What about all the other MySQL tiers? The non-UDB MySQL footprint at FB, despite being a minority of FB's MySQL fleet, is still larger than the vast majority of other companies' relational database fleets worldwide. The range of use-cases in the DBaaS (XDB) tier alone spans a massive combination of different database features and access patterns.
> You say that FB is using MySQL because "changing is basically impractical", but also say MyRocks "provided major wins", which seems to be a contradiction.
I think I must not be expressing myself clearly. 3+ year projects involving a large number of teams to get back to where you started are impractical. That's what migrating to PostgreSQL would be. Perhaps I should have written "switching from MySQL to PostgreSQL would be impractical"?
Apologies if I'm misunderstanding. To take a step back and paraphrase this subthread, as I understand it:
* dezzeus said MySQL was better for read-intensive workloads, Postgres better for mixed read/write
* I replied saying there are a number of huge social networks with insane write rates, which is evidence against that claim. (Having personally spent most of the past decade working on massive-scale MySQL at several social networks / UGC sites, this topic is near and dear to my heart...)
* You replied saying, iiuc, that FB is only using MySQL for historical reasons and difficulty of switching. (IMO, your initial comment was tangential to the original topic of comparative read vs write perf anyway. Regardless of why FB is using MySQL, factually they are an example of extremely high write rate, previously via InnoDB for many years. That said, I wasn't the person who downvoted your comment.)
* I replied saying that's inaccurate, as FB demonstrably does have the resources and talent to switch to another DB if there was a compelling reason, and furthermore MySQL+MyRocks provides a combination of feature set + compression levels that other databases (including Postgres) simply cannot match at this time. At FB's scale, this translates to absolutely massive cost savings, meaning that MySQL+MyRocks is a better choice for FB for technical and business reasons rather than just historical reasons or difficulty of switching.
I may have misunderstood, but it definitely felt like your original comments were throwing shade at MySQL, and/or publicly stating historically inaccurate reasons for why FB is currently using MySQL.