If you don't know what PlanetScale is, it is a serverless database offering. It is compatible with MySQL and offers lots of tooling on top of it. More here: https://docs.planetscale.com/
When I looked at it a few months ago, it had some limitations (lack of referential integrity, MySQL version compatibility) but I'm not sure if they've been addressed (or, in some cases, if they are possible to address).
Anyway, always nice to see more competition in such a fundamentally important space, so congrats!
To the best of my knowledge, the scaling behind distributed ranged KV stores (which CockroachDB uses as its storage engine) is easier to manage overall. Whether that's important for "serverless" is another question.
I'm sure PlanetScale will help with rebalancing and re-sharding but, just like FoundationDB, CockroachDB "promises" to automatically re-shard hot keys and ranges for you under the hood, and so _in theory_ scales well for a distributed read/write heavy system over many nodes with potential hotspots. My CockroachDB knowledge is based off of research into FoundationDB, but I think it shares many of the same underpinnings (even the record layer concept). And FoundationDB scales very well.
Vitess, on the other hand, manages the cluster for you but I don't think that the rebalancing and sharding is as easy to manage. Would love to learn more as I don't know much about Vitess.
Since I feel an injustice has been done here, I'd like to point out that harshit164 is a Vitess maintainer who is my go-to person of reference for these exact topics, and who is authoritative about this database technology.
The referential integrity piece (otherwise known as Foreign Keys) will not be addressed and is a limitation - some might say feature - of Vitess. They wrote a pretty good piece about it here, https://docs.planetscale.com/tutorials/operating-without-for...
To me, this means that planetscale (and vitess) immediately become less useful for a large class of applications: existing apps that use an RDBMS that you want to scale. Because most existing apps will expect FKs. I haven't tested my employer's application (which has FKs) to see what will happen, but I doubt it will be smooth sailing.
But there's probably plenty of greenfield development where teams would prefer to stick with standards like SQL rather than a proprietary noSQL language, and that seems like a great fit for this solution.
There is not and will never be a solution to the problem of horizontally scaling an application and database designed jointly for a single-box ACID RDBMS. You have to compromise something to get the scaling.
Depends on your definition of horizontally scaling and the inherent size limitations you would be willing to accept.
CitusDB for example handles this by allowing FKs between tables that share the same partition key. However, it's also not -as- scalable as Vitess or as georeplication-friendly, but for many uses that need "more than a single box" it's probably enough scalability.
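For illustration, here's roughly what that looks like (a minimal sketch assuming Citus' create_distributed_table() API; the table and column names are made up):

    -- create both tables, then shard them on the same column
    CREATE TABLE customers (customer_id bigint PRIMARY KEY);
    CREATE TABLE orders (
      order_id    bigint,
      customer_id bigint,
      PRIMARY KEY (customer_id, order_id)
    );
    SELECT create_distributed_table('customers', 'customer_id');
    SELECT create_distributed_table('orders',    'customer_id');
    -- because both tables shard on customer_id, every FK pair lives on
    -- the same node and the constraint can be enforced locally
    ALTER TABLE orders ADD FOREIGN KEY (customer_id)
      REFERENCES customers (customer_id);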
Heavy toll on performance, and it's not really Postgres. For the same workload, pg can be up to 30x faster or more than CockroachDB. At some point, of course, pg won't be able to keep up and the horizontal scaling of CRDB will beat it.
I believe they use some statistical analysis and declared locality tags (from config) to group related cells of data on the same nodes and regions, but I don't know how effective that proves. Pure performance isn't a strong suit of CRDB as far as I can tell, but sometimes that works fine, and it does scale well horizontally.
It's not entirely accurate to say that Vitess "doesn't have" foreign keys. It doesn't enforce FK constraints cross-shard, but it can use FKs to group referential data together within a shard, and the MySQL backing that shard can enforce FK constraints.
I still don't understand how you can work safely with this kind of pricing model. You pay $1.50 per 10 million rows inspected by the database.
A relational database like MySQL has a query planner, so you're not fully in control over how exactly the database accesses your data. How many rows are read not only depends on your specific SQL query but also on the contents of the tables you read and the parameters supplied along with the query.
A statistics update that triggers a bad query plan on a table with 10 million rows could cause a full table scan instead of retrieving a few rows using an index scan. Suddenly your query is a million times more expensive at $1.50 per query. Depending on how quickly and widely this scales, that could get very expensive very fast.
10 million rows is also not a lot for a relational database. I know my comparison now isn't fair as you get a lot more with the hosted version, but a quick and dirty experiment with Postgres and a small table of two ints and a text column with 10 million rows took ~1.5 seconds to query on my desktop, using only 1 CPU core. Translated into this pricing model that's something like $60 per minute.
Basically if you forget to query over an index and incur a full table scan, things are going to get very expensive, and it'll probably be a surprise, since devs aren't going to bother checking the charges they're running up.
imo at least with DynamoDB, it's pretty straightforward what any given call is capable of doing, and you can get back capacity usage on every request to track it. With these MySQL/Postgres-compatible APIs, the actual query is abstracted, as is its cost.
Yeah the pricing model for SQL based queries is not optimal at all considering you don't control how the query optimizer will use the indexes, and (not trying to assign malice) they aren't very incentivized to improve it.
MySQL’s “explain <query>” does exactly what you described and better. It shows the indexes used, the estimated number of rows scanned, etc. It’s the first thing I do when I suspect a query is taking time. It’s a pity that not too many backend engineers know about it nowadays.
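For anyone who hasn't used it, a quick illustration (hypothetical orders table with an index on customer_id; the numbers are made up, but this is the shape of the output):

    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
    -- key: idx_customer_id, rows: ~12    (index lookup, cheap)
    EXPLAIN SELECT * FROM orders WHERE amount > 100;
    -- key: NULL, rows: ~10000000         (no usable index: full table scan)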
That only explains the particular query plan chosen for the query at that time. This might vary depending on the table statistics, query parameter or moon phase if you're unlucky. It is a very important tool, but doesn't solve the problem here that in the end the database is in control and decides how many rows to read, not you.
This is a good point. You can control for some variables (like query parameters), via testing, however as table statistics change, so does query planning. For instance, your table gets bigger (or smaller) and MySQL chooses a different query plan.
There's almost no way to completely predict pricing in all but the simplest of apps. Overall, the pricing model seems untenable.
I'm reading MySQL docs [1] and I can't figure out which value in the example output corresponds to the "estimated number of rows scanned".
> It’s a pity that not too many backend engineers know about it nowadays.
I use EXPLAIN with PostgreSQL when needed but mostly focus on the execution time, not the number of rows scanned (reading a large row could be orders of magnitude slower than reading a row in a JOIN table of two integers in a row based storage engine).
Let's just say that MySQL docs are not super helpful when you want to quickly refer to something.
Here's[1] an explain output that I ran just now. The row labeled "rows" is the estimated number of rows scanned, which as you can guess is bad because it's doing a full table scan.
I completely ignore the actual execution time and focus mostly on two things 1) indexes used; 2) number of rows scanned. If rows scanned is high it means I need to do something about it now even if the absolute query time is low. Either an index is missing or someone in the application code is querying the wrong way (random example, using a "like" on a full table).
> Every time a query retrieves a row from the database, it is counted as a row read. Every time a row is written to the database, it is counted as a row written.
> Reads: Retrieving or inspecting rows during a query or mutation of any kind to your PlanetScale databases
That's what the detailed pricing page says, which is somewhat different. The pricing is actually per row inspected, not per row returned to the client. So in aggregations you'd pay for every single row you aggregate. So I'd argue that the FAQ is actually misleading.
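Concretely, by my reading of that pricing page:

    -- returns a single row to the client, but inspects every row in the
    -- table, so under per-row-read pricing you'd pay for all of them
    SELECT COUNT(*), AVG(amount) FROM orders;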
So does that make an accidental full table scan an expensive operation from a billing standpoint?
Like others say, the end user doesn’t have a lot of control over how the query planner does its magic. They do have control over what goes into and out of the system though.
That being said things like Google BigQuery bill by compute time and data examined so I guess that works as long as the price per “unit” is on the order of millicents or something.
I am having trouble understanding what a "Serverless Database" is. When I do a search of the term, I get hype, not a definition. For example:
"What is Serverless Database?
Serverless Database is a prerequisite for Serverless Computing. These are specially designed for the workloads which are unpredictable and can change rapidly. What’s more? This allows you to pay only for the database resources you use, on a second-by-second basis. ..."
I would really appreciate it if someone would give us a clear definition of this term.
Basically I understand the idea of serverless in this context as abstracting away everything related to managing & maintaining the database - i.e. anything like VMs, containers, OS, DB processes etc.
So you can set up a PlanetScale MySQL DB and use it just like a normal instance of MySQL, but also keep adding data from one small set of records all the way up until you have gigantic petabyte volumes of data without having to do anything beyond sending the data through your MySQL connector.
In theory it should just keep working in a performant way from 100 user records for your new startup to the scale of running parts of Slack. No choosing bigger and bigger AWS RDS instances as you scale, no need for autoscale strategies in case of traffic surges or worrying about replicas for perf etc. etc.
As someone who is honestly quite resistant to parts of the serverless paradigm this offering actually appeals to me. I prefer running my own fleet of VMs and traditional PHP/Nginx type stack but have already moved to AWS RDS to abstract away some of the replication complexity required to achieve high availability DB with minimum hassle. This seems like the logical next step and despite being allergic to kind of hype you mention finding this is something I'd definitely try out before moving other parts of my infrastructure onto anything like Lambda.
Now we need separate access controls, separate networking tools, separate monitoring and diagnostics. It's becoming apparent to me that this kind of stuff is the scam of the century.
There is no clear definition with universal agreement. It's a hype-y term applied with... varying levels of rigor.
However, roughly speaking, "serverless" rolls together 3 features:
1. Fine grained pay-per-use (e.g. pay for a query by the number of rows scanned)
2. The pricing dial goes down to zero when usage is small enough.
3. You generally don't control VM/instance-level scaling but something closer to the abstraction level of the product being claimed as "serverless". For example in planetscale you get no control over how many mysql instances actually run your queries. This is great for reducing operational complexity but not so great for controlling performance. Performance tends to be quite opaque -- for example there's nothing I can find in Planetscale's docs about latency and throughput. The operational benefits are real, though. It's a tradeoff.
This is good info, thanks. I have some cloud infra experience, so I am interested in knowing how they keep the data stored and remove the "query" servers when not in use.
Possibly some kind of EBS-equivalent storage which is attached to the VM when it's booted up? I wonder if that creates more failure modes as the price of operational simplicity?
It's just a database service. You don't run the servers. You pay someone to do that and you just connect to it and use it and someone else maintains the servers, storage, scaling, backups, patches, etc., according to some SLA and other terms.
There are indeed servers somewhere. "Serverless" is misleading cloud speak for the otherwise easily understood concept of a service.
They almost never shut down the actual VMs, they are simply re-allocated. (Semi)Auto scaling exists behind the scenes but once a provider becomes popular enough the VMs become more expensive to stop.
Amazon Aurora has an HTTP endpoint too, for its serverless offering.
I use S3 with duckdb files in 'em as a sharded OLAP database of sorts. No transactions or joins though, but my workload is write once, read plenty. Hoping Databricks' serverless SQL is a good fit once it emerges out of beta.
You don't provision and manage the infrastructure associated with your DB. Someone else does it for you and you just focus on querying/storing the data and whatever else your business requires you to do.
Not provisioning & managing is part of it, but the pricing model is important too. e.g. RDS and Mongo Atlas are managed database services so you're not provisioning and managing the infra, but you are paying for dedicated machines and their sizes, etc.
>but you are paying for dedicated machines and their sizes
I see this as provision & managing as well, but yes I see your point. Ultimately it's about the user only caring about its data and nothing else (to the extent possible).
With a dedicated Mongo DB cluster (server-full), you are paying for a certain cluster size, per hour. It doesn't matter if you read or write any data to it. You're paying for a machine with a specific amount of storage capacity and cpu. Use it or lose it.
DynamoDB (considered serverless), you're charged based on the read and write throughput, and how much you have stored (GB-month). If you don't store any data, you're not being charged for unused storage capacity, like a dedicated cluster. You don't think about the instance size, amount of memory, etc.
Serverless means infrastructure as code where you're given as much control of the system as they want you to have, which can be near 0, and they'll tell you it's for your benefit to have no control. Why use a bank vault when you can have a crypto wallet.
I'm curious about what kind of workloads people would be using a serverless DB for?
If I understand it correctly, let's say my DB is only being used to process 1M transactions from 9am-1pm, I'm basically only paying for the load during that time, versus a managed DB where it's being paid to be on 24/7. With most serverless there is a penalty though - cold startup.
So is it purely an economic play for esoteric DB workloads? If my DB is realistically going to be churning for 24/7 anyway, why would I ever use serverless DBs?
Lower operational burden - if you're using a serverless db maintaining servers is one less thing to think about. Same with any infrastructure - you don't have to worry about orchestration, load balancing, authentication, etc.
As I understand it, they've extracted the storage and compute parts of the database and are running them in a scalable way such they can automatically add more compute or storage as needed.
PlanetScale is probably the only database vendor I like, and I don’t just like ‘em. I love ‘em. It works extremely well (Vitess) and scales easily using PlanetScale’s UI. The thing that I like about ‘em so much is their great customer service… there have been a few hiccups but they’ve always been quick to remedy and above all honest about it.
Vitess, for what it’s worth, is probably still under the radar for most people but is a really nice piece of technology (distributed database built on MySQL).
Open source, serverless, developer experience, really going for SEO with this blog. But it doesn't say anything about what the product does or that it's MySQL compatible. Don't waste your time reading it.
This seems cool. I am currently shopping around for a cheap managed database for an app I am launching soon (and bootstrapping - so cost is a concern right now). We are starting small but want room to grow. The free tier looks really promising! Especially compared to some of the entry level plans on other cloud providers.
Can anyone who has tried Planetscale and compared it to other managed database solutions tell me about their experience?
I've tried Aurora and Firebase. I think PlanetScale wins on 2 areas at 2 different scales: branching when you're small, and connections when you're large.
With branching it's a feature where every dev can run a command and get their own database version to mess with. They can submit a pull request and as the tech lead I can check it and publish it to prod. Or integrate it with our CI.
On connections, it is interesting how serverless architectures totally flip the model. My node servers used to make a few connections to Postgres while serving thousands of users. Now, each NextJS instance makes its own connection and I might have thousands of database connections for thousands of users. PlanetScale is the only one I know of with effectively no limit on the number of connections. I think Aurora tops out at 16,000.
Does the branching system have an anonymization layer? At pretty much every company that is beyond "tiny startup with < 10 employees" you really don't want to give every developer read access to all customer data.
> The branching feature is a really good idea, it allows you to confidently run migrations with no downtime
So this branching feature is one I still don't understand properly.
Let's imagine my startup is big enough to warrant a PlanetScale database. Hundreds of rows are written every second. When I branch, are those writes applied to the original and the branch? If so, what happens with migrations that mean the incoming writes no longer work? Or is the "confidence" it gives not about full migrations but about knowing that if your migration fails the original DB is still working?
Every branch in PlanetScale has its own separate MySQL connection credentials first off (and conceptually just looks like a separate DB), so no. You'll be using your main production database as usual, and you'll have separate credentials for a branch where you can make schema changes like ALTER TABLE, and you "connect" to that branch with... whatever. Deploy a canary version of your app, use your SQL IDE, whatever. The main branch always prohibits changes like ALTER TABLE or CREATE DATABASE. Merging a branch where the CREATE/ALTER happened is the only way to bring changes to the production database that serves your users queries.
When you merge a branch in PlanetScale, you can still write to the database while the migration is happening, online. This is done through "Online Schema Change", but the TL;DR is that a system in the background creates the new database, reads the mysql binlogs from your original database, including all the writes, and propagates those writes to the new database with the changes applied. It does this continuously, in real time. Once the new database is caught up, the system switches over transparently to make the changes atomically appear live.[1]
The requirement is that the running application server has to be able to handle both the old and new schema at the same time. This is easy if you do something like "add a column that isn't yet used and will be soon", but harder for "dropping a column that is in use". No database alone can solve this however, you have to coordinate your application layer correctly so that it can handle the switch over when you apply the changes. But this means you can have very high uptime with e.g. blue/green deployments and staged rollouts, and you'll never have to take your DB offline for it.
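To sketch the expand/contract pattern this implies (hypothetical column names):

    -- expand: add the new column; currently-deployed code just ignores it
    ALTER TABLE users ADD COLUMN display_name VARCHAR(255) NULL;
    -- ...deploy app code that reads/writes the new column, backfill data...
    -- contract: once no running code references the old column, drop it
    ALTER TABLE users DROP COLUMN nickname;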
It's also trite but "hundreds of writes a second" is nothing. Vitess can handle 100x that write load (probably much more) and deal with schema changes to petabyte-scale databases completely online. PlanetScale currently doesn't let you control the Vitess keyspace sharding configuration (yet) which is crucial to this but I suspect it will happen eventually. But they'll also let you run Vitess in your own K8S environment while they provide a control plane, so you wouldn't pay list price, anyway.
I do not have any affiliation with PlanetScale but it's a pretty cool product I could see myself using for my low-volume needs, just because of the pay-as-you-go model where I don't pay for e.g. RDS allocation. And I say that as a really big PostgreSQL fan.
I think going for a speciality database at this stage is premature optimization. Go with the safe choice (Existing tooling, knowledge, free tiers) and use Postgres or something else established. Later on you can always move workloads to other databases if needed.
It's not really specialized, it's a fancy MySQL until you need more. And then, it becomes 'more' at the click of a button. IMO if I were starting a cheap MySQL-powered project today, I'd use their free tier. The branching stuff is genius, and they're really focused on offering a clean user experience.
This seems dodgy. Is planetscale behind Vitess? If you press the "Learn more about Vitess" button you get to a page that gives you the impression that it is the homepage of Vitess (and that it is their product). But the actual website of vitess (https://vitess.io) mentions nothing of planetscale and the people behind it. Additionally, planetscale are displaying users of Vitess which makes it look like they are also users of planetscale.
PlanetScale engineer and Vitess contributor here. As a brief history, Vitess was originally developed as an open source project in YouTube, and then handed over to CNCF.
PlanetScale was created by authors of Vitess, and employs a team of Vitess developers and contributors. There are Vitess contributors outside PlanetScale, from companies such as Slack, Square, Nozzle, Pinterest and others (apologies to people/companies not mentioned).
Just to explain how we work, if you ever come by to KubeCon, you're likely to find our Vitess booth. It is normally staffed by PlanetScale engineers. But in that booth we wear the Vitess hat, not the PlanetScale hat. We want to respect Vitess for being its own open source project, which we share with our community friends from different companies.
Then again, as we employ many Vitess contributors, including the Vitess project leader Deepthi Sigireddi, we do take pride in driving this OSS project from within PlanetScale. As we wear our PlanetScale hat, we give credit to the Vitess technology our product is based on.
I'm a bit confused with the "branching" [0] and "non-blocking schema changes" [1] features. I'm confused as they sound like "the next big thing" and I don't see anything special here. Not saying they are bad concepts or ideas, on the contrary. But not really useful either. Surely I'm missing something, so I would love to hear from the PlanetScale team here if possible.
I have a strong and long Postgres operational background, so I may be also here with assumptions that might be different in MySQL/Vitess/PlanetScale. My main concerns/questions are:
* I can't imagine testing DDL changes without data. Having data there is so important to understand the change and its impact, that I won't do them without data. And unless I'm mistaken, these branches only contain DDL, no data at all ("data from the main database is not copied to development branches").
* While it sounds neat, has a web UI and a CLI, managing branches of a schema and using CI and approval lifecycle... is something that sounds like I could do, and possibly better (as it is more integrated with tooling and workflows) from Git platforms themselves, isn't it? I could do branches, merges, CI, comments on MRs, approval... I could even easily build a deploy queue ("promote") with a CI. Doesn't sound like too hard.
* I don't understand how the "safeness" and the non-blocking nature of changes are ensured. Many DDL changes will take different amounts of locks on rows or tables, which may cause some queuing and even lock storms in the presence of incoming traffic. Without incoming traffic, they may run fine. In other words: the impact of a migration can only be determined in combination with the traffic hitting production. How does PlanetScale do this? How, for example, is the case handled where a DDL changes the type of a column to another type that causes a table rewrite, which essentially locks the table and prevents concurrent writes?
Again, not saying both concepts are bad. Terminology and methodology may be already an innovation. And surely I'm missing a lot. But other than this, I don't see myself using this (testing migrations without data is a showstopper, and not the only one) and I don't see much of an innovation from a safeness perspective here.
Why isn't this system one where thin clones of the database are created as the branches (e.g. like in Database Lab Engine [2]), where you can play with data too, and then some data synchronization is performed to switch over to the branch once done (it's not easy at all, but doable with many precautions)? That would be a significant improvement in the process, IMHO.
Hi, engineer at PlanetScale and maintainer for Vitess here. Appreciate your thoughtful comment, a couple answers:
> is something that sounds like I could do ... from Git platforms themselves
Git is very bad at analyzing SQL diffs. It can show you the textual diff between two CREATE TABLE statements, but it will not know what it takes to get you from _here_ to _there_. It has many parsing issues, like capturing irrelevant columns due to trailing commas. Or, for example, if you change a column's data type as per your suggestion, Git cannot differentiate that from a complete drop and recreation of the column. It just doesn't have insight into how SQL works/parses. And most importantly, it is unable to provide the actual operational diff you're seeking: the ALTER TABLE statement to take you from state A to state B.
At this point I just want to give a shout out to skeema [0] and its underlying library tengo [1], which tackled this issue in a git-like manner for MySQL dialects/flavors.
> Many DDL changes will take different amounts of locks on rows or tables, which may cause some queuing and even lock storms in the presence of incoming traffic.
This is indeed one of our main premises. Online schema changes in PlanetScale (and based on Vitess) will:
- Run concurrently to your production traffic
- Will automatically throttle when your production traffic gets too high, and in particular taking care not to affect replication lag
- Will run completely lockless throughout the migration, up to the cut-over point, where locking is required
- At cut-over point, will only cut-over when it predicts smooth operation (i.e. when satisfied that its own backlog for cutting over is short enough that lock time is minimal)
- To top it all, PS also manages the lifecycle of the Online DDL, such as service discovery, scheduling, error handling, throttling (mentioned), cleanup and garbage collection and more.
I just want to clarify the above is based on proven technologies, widely used in the MySQL community, which I was fortunate enough to be involved in for the past years. These run at scale for the largest deployments in the world today. We keep evolving Online DDL with more to come.
Please also see these docs on the Vitess website: [2]
Do note, earlier this month I archived the repo linked in [1] as a last resort, directly in response to the repeated behavior of you and your colleagues.
I don't even know where to begin with explaining this to outsiders, but here's a sample: I receive confused support emails from companies and users (always non-paying ones, at that) about Skeefree literally every single week; at no point in our lengthy email discussion last year did you ever disclose that your employer would be basing the entirety of its marketing campaign for its commercial offering around schema management in direct competition with my own bootstrapped products; then more recently in your fork of [1] your coworker is adding functionality that once again directly competes with the functionality in my commercial products.
In brief, repeatedly using my own open source work to compete with me, which in turn is preventing me from ever generating revenue from this work, which is needed for the work to continue.
It seemingly never ends. At this point I'm literally on the verge of throwing all my work in the trash and never touching a database again. I've clearly wasted the last several years of my life and all I get in return is a shout-out on a day-old HN thread, cool cool.
Thank you for your comments, I appreciate it. I'm still not sold, however. I would like to understand the underlying principles, "how this works". I don't need implementation details (happy if they are shared, though) but more on the main principles of operation. Please see my further comments below:
> Git is very bad at analyzing SQL diffs.
Agreed, nothing against. So PS has built in a nice SQL diff. Neat! But what does this really bring? I mean, it's not that there aren't SQL diff tools, tools to manage DDL migrations. Besides this, why not layer it on top of Git? Many orgs and integration tools already have similar workflows (e.g. approval workflows, issue management tools, CI, etc.) and if instead of coming up with a new system it were a layer on top of the existing ones, it would probably have less friction to use. Just my perspective on this, of course.
> Run concurrently to your production traffic
Can you elaborate? How? Do they run on other servers? Or are they waiting in a queue of changes to be applied? If they run on different servers, what do they run there, since AFAIK the migration is only DDL and there's no data?
> Will automatically throttle when your production traffic gets too high, and in particular taking care not to affect replication lag
Same as above: who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running, on the main server (apparently not) or on a shadow server (apparently not either, since migrations have no data)? Actually you mention "when your production traffic gets too high". What is "high", can you quantify? We run customers that do dozens to thousands of transactions per second. Is this high enough? Will their migrations ever run, or will they wait for very long periods of time, maybe forever?
> Will run completely lockless throughout the migration
How is this possible? Where is the migration running, then? A shadow table, a shadow server... none?
> At cut-over point
What's cut-over? Are groups of servers switched? That's what it sounds like to me, and that would explain how it could be lock-less and not affect production traffic. However, it does not explain how data is synchronized from the production database to the migration branch, nor how it keeps being updated with the real production traffic. This is essentially the crux of me failing to understand how this system works.
In general, I apologize if these are too many questions. But in essence, I feel this all sounds really good, but unless I have a deeper understanding of how the principles work, and they are sound to me, I won't be able to recommend this for production usage, as I know from experience the many caveats migrations have. If they are all solved, hats off, but I would appreciate it if from a technical perspective this were more clearly explained.
I think this comment answers most of my questions: https://news.ycombinator.com/item?id=29248306 Can you confirm (PS) this is how it works? From what I understand here, there are "shadow servers", replicating from the production traffic.
If so, this is cool. I still see some caveats:
* One already mentioned, the scope of migrations is limited to those where both old and new DDL are compatible with the currently running application. If this is the case, I believe it should be clearly advertised as such.
* Since the migration is asynchronous, I lose control of when to deploy changes to the application. Even a hook to trigger this would go a long way.
* Not knowing exactly when the cut-over process is going to happen is also potentially a problem. I understand the cut-over may involve performance degradation (e.g. higher latency) or even connection loss (can you also confirm, PS, how it is performed?) during some period of time, possibly small. But still, I may need to plan a small maintenance window. And if this is async, I cannot plan the window appropriately.
None of this takes away any merit from the solution.
Thank you! Please first see my comments to parent, as they describe how online schema change works within the same server; with PlanetScale branching, we do give you a development branch with which you can play as much as you want, without affecting production. Online schema change kicks in when you deploy your changes to production.
> the scope of migrations is limited to those where both old and new DDL are compatible with the currently running application.
You are absolutely correct, and that is the paradigm. Say, for example, you want to add a column, so you first run the migration that adds the column, and only afterwards can you deploy an application change that actually utilizes that column. Likewise if you want to DROP a column, you first deploy an app change that ceases to reference the column, and only then can you actually drop it.
This paradigm worked very well for the companies I worked with, and makes for both loose and tight coupling between code and database. It's loose when you have your test databases where you can deploy schema changes at will. It's loose in the sense you can take small steps at a time, each isolated from the other (e.g. ADD COLUMN does not require you to make any app changes _yet_).
Then, it's tight where you couple your code changes with the schema in your git repo. It's tight in that the app never gets too far from the database (normally one change away at any given time, per development branch).
> Since the migration is asynchronous, I lose control of when to deploy changes to the application
Great point and absolutely on our radar.
> Not knowing exactly when the cut-over process is going to happen is also potentially a problem.
Again great point and on our radar. To be honest I previously moved away from caring about the exact cut-over time. We designed gh-ost to do just that: stall cut-over until the engineer/developer is happy to sit at their desk. Over time, we found it was unnecessary. But absolutely there's use cases for both approaches.
> Thank you! Please first see my comments to parent
Thank you indeed for the time taken to answer all my comments. Now together with all the information here, I understand how it works, and what the trade-offs are.
If my input serves for anything, I'd strongly recommend taking all the information here and writing it up in a structured way as part of the documentation. I didn't see any information there as valuable as this. For me, and possibly many others, knowing this information is required in order to make informed decisions about whether to use this or not; and if so, how, and what the trade-offs are (e.g. atomizing the changes such that db changes and code changes are independent, which I agree is in general a good thing, but is something to be clearly aware of).
> Again great point and on our radar. To be honest I previously moved away from caring about the exact cut-over time. We designed gh-ost to do just that: stall cut-over until the engineer/developer is happy to sit at their desk. Over time, we found it was unnecessary. But absolutely there's use cases for both approaches.
For me it's important as cut-over takes some locks. Sure, for a small amount of time. But these locks may create some problems, so that's why I want to be aware. Most of the time these are other DDL changes, which are a non-issue here since you already prevent that. But there could be others related to normal db operation. For example, and this may not apply here but does apply with Postgres, such a lock may queue other locks behind it (including read-only queries). And if the cut-over lock is itself blocked by another lock (say an explicit table lock), then everything queues on that table and leads to a lock storm, which in turn may cause effective downtime. That's why when we plan migrations or operations similar to this cut-over (for example in Postgres a repack operation, which is essentially rewriting a shadow table, in this case just for the purpose of reducing bloat), we really need to take this into account.
Yes, there are many tools, but one of my frustrations is that they're, well, tooling. In Vitess and in PSDB, the database itself gives you the developer flow you expect. You can write a solution that integrates with GitHub, but then someone else uses BitBucket, or Phabricator, or whatever other framework they have.
I've been in the MySQL space for some 20 years now, 12 of which are active in open source. What frustrates me more than anything is how different companies have to come up with similar solutions to the same problems - but cannot afford to "just use" some existing tooling or framework, because it was built for a specific kind of infrastructure, or assumes this and that setup. Cloud or no cloud? Kubernetes or bare metal? DNS or proxies? Central service discovery or distributed configuration? And so on and on...
So much tooling has been written to compensate for functionality missing in the database. We all wished the database would _just do it_. As an engineer, here is my opportunity to write stuff into the database system, or into a framework that presents itself to the app as the database. It can then be used however users choose, because they get functionality they don't need to worry about, and less boilerplate code to fit it into their infrastructure.
I am estimating that your database space isn't MySQL, which is just fine of course. Reason I'm asking/guessing, is that in the MySQL space, online schema change tools have been around for over a decade and are the go-to solution for schema changes. A small minority of the industry, based on my understanding as a member of the community, uses other techniques such as rolling migrations on replicas etc., but the vast majority uses one of the common schema change tools:
I authored the original schema change tool, oak-online-alter-table https://shlomi-noach.github.io/openarkkit/oak-online-alter-t..., which is no longer supported, but thankfully I did invest some time in documenting how it works. Similarly, I co-designed and was the main author for gh-ost, https://github.com/github/gh-ost, as part of the database infrastructure team at GitHub. We developed gh-ost because the existing schema change tools could not cope with our particular workloads. Read this engineering blog: https://github.blog/2016-08-01-gh-ost-github-s-online-migrat... to get a better sense of what gh-ost is and how it works. I in particular suggest reading these:
At PlanetScale I also integrated VReplication into the Online DDL flow. This comment is far too short to explain how VReplication works, but thankfully we again have some docs:
Not to leave you with only a bunch of reading material, I'll answer some questions here:
> Can you elaborate? How? Do they run on other servers? Or are they waiting in a queue of changes to be applied? If they run on different servers, what do they run there, since AFAIK the migration is only DDL and there's no data?
The way all the schema change tools mentioned above work is by creating a shadow aka ghost table on the same primary server where your original table is located. By carefully both copying data from the original table as well as tracking ongoing changes to the table (whether by utilizing triggers or by tailing the binary logs), and using different techniques to mitigate conflicts between the two, the tools populate the shadow table with up-to-date data from your original table.
This can take a long time, and requires an extra amount of space to accommodate the shadow table (both time and space are also required by "natural" ALTER TABLE implementations in DBs I'm aware of).
With non-trigger solutions, such as gh-ost and VReplication, the tooling has almost complete control over the pace. Given load on the primary server or given increasing replication lag, they can choose to throttle or completely halt execution, to resume later on when load has subsided. We used this technique specifically at GitHub to run the largest migrations on our busiest tables at any time of the week, including at peak traffic, and this was shown to pose little to no impact to production. Again, these techniques are universally used today by almost all large scale MySQL players, including Facebook, Shopify, Slack, etc.
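In very rough strokes, the shadow-table approach looks like this (a heavily simplified sketch, not gh-ost's or VReplication's actual internals; table and column names are made up):

    -- create a shadow table and apply the desired schema change to it
    CREATE TABLE _orders_shadow LIKE orders;
    ALTER TABLE _orders_shadow MODIFY amount DECIMAL(12,2);
    -- copy existing rows over in small chunks, throttling as needed
    INSERT INTO _orders_shadow (id, customer_id, amount)
      SELECT id, customer_id, amount FROM orders
       WHERE id BETWEEN 1 AND 100;   -- ...then 101-200, and so on
    -- meanwhile, ongoing writes against `orders` are picked up (from the
    -- binlog, or via triggers) and replayed onto the shadow table until
    -- it is fully caught up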
> who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running, on the main server (apparently not) or on a shadow server (apparently not either, since migrations have no data)? Actually you mention "when your production traffic gets too high". What is "high", can you quantify?
The tool (or Vitess if you will, or PlanetScale in our discussion) will throttle based on continuously collected metrics. The single most important metric is replication lag, and we found that it predicts load better than any other metric, by far. We throttle at 1sec replication lag. A secondary metric is the number of concurrent executing threads on the primary; this is more important for pt-online-schema-change, but for gh-ost and VReplication, given their nature of single-threaded writes, we found that the metric is not very important to throttle on. It is also trickier, since the threshold to throttle at depends on your time of day, particular expected workload, etc.
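The lag signal itself can be as simple as a heartbeat table (a conceptual sketch, not the exact implementation):

    -- on the primary, a process stamps a row at a known interval:
    UPDATE _heartbeat SET ts = NOW(6) WHERE id = 1;
    -- on each replica, lag is simply the age of the replicated stamp:
    SELECT TIMESTAMPDIFF(MICROSECOND, ts, NOW(6)) AS lag_us
      FROM _heartbeat WHERE id = 1;
    -- throttle the copy whenever lag_us exceeds the threshold (~1 sec)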
> We run customers that do dozens to thousands of transactions per second. Is this high enough?
The tools are known to work well with these transaction rates. VReplication and gh-ost will add one more transaction at a time (well, two really, but the 2nd one is book-keeping and of so low volume that we can neglect it); the transactions are intentionally kept small so as to not overload the transaction log or the MVCC mechanism; the rule of thumb is to only copy 100 rows at a time, so expect possibly millions of sequential such small transactions on a billion-row table.
> Will their migrations ever run, or will they wait for very long periods of time, maybe forever?
Sometimes, if the load is very high, migrations will throttle more. At other times, they will push as fast as they can while still keeping to the low replication lag threshold. In my experience a gh-ost or VReplication migration is normally good to run even at the busiest times. If a database system is such that it _always_ has substantial replication lag, such that a migration cannot complete in a timely manner, then I'd say the database system is beyond its own capacity anyway, and should be optimized/sharded/whatever.
> How is this possible? Where is the migration running, then? A shadow table, a shadow server... none?
So I already mentioned the ghost table. And then, SELECTs are non-blocking on the original table.
> What's cut-over?
Cut-over is what we call the final step of the migration: flipping the tables. Specifically, moving away your original table, and renaming the ghost table in its place. This requires a metadata lock, and is the single most critical part of the schema migration, for any tooling involved. This is where something has to give. Tools such as gh-ost and pt-online-schema-change acquire a metadata lock such that queries are blocked momentarily, until cut-over is complete. With very high load the app will feel it. With extremely high load the database may not be able to (or may not be configured to) accommodate so many blocked queries, and the app will see rejections. For low volume loads, apps may not even notice.
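At its core the flip is an atomic multi-table rename, something like this (simplified; the real tools add safety measures around it):

    -- atomic in MySQL: no client ever sees `orders` missing
    RENAME TABLE orders TO _orders_old, _orders_shadow TO orders;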
I hope this helps. Obviously this comment cannot accommodate so much more, but hopefully the documentation links I provided are of help.
> I am estimating that your database space isn't MySQL, which is just fine of course.
You are absolutely right :) My background is strongly on Postgres, you can see from my profile more information if you want to.
So yes, I apologize if some of my questions do not apply or become too obvious for cases that are MySQL-based. But for the most part, I believe the principles of operation are the same.
> [other comments]
As mentioned, thank you very much for the detailed information. This completes the picture that I was looking for. I will definitely go in more detail for some of the links provided.
This principle of operation is not too different from something I proposed to a Postgres project some time ago (https://github.com/cybertec-postgresql/pg_squeeze/issues/18). This tool indeed is conceptually pretty similar. It's a shame that supporting schema changes is not part of their focus at this point. It wouldn't do throttling either, but it shouldn't be a difficult feature to add, I guess.
For other users here that may be interested in the Postgres world, there are two tools that perform similar operation (creating a shadow table and filling it in the background), but are both focused on rewriting the table to avoid bloat, rather than for doing a schema migration:
* pg_repack (https://reorg.github.io/pg_repack/): the most used one, relies on triggers
* pg_squeeze: already mentioned, uses logical replication
I'm skeptical that YouTube would be using this for anything remotely significant given Google's strong propensity to only use internal technologies that they fully control.
I doubt YouTube is using the cloud service. If they're not 100% on Google's services by now, they still use Vitess. Vitess is the open source project PlanetScale manages which lets you horizontally scale MySQL databases.
I would love to try this out, but having only worked with MSSQL and Postgres for any large datasets, I wanted to check the size limitations of MySQL and found this from 2020:
"In addition, a practical size limit on MySQL databases with shared hosting is: A database should not contain more than 1,000 tables; Each individual table should not exceed 1 GB in size or 20 million rows; The total size of all the tables in a database should not exceed 2 GB."
Indexes, rather. They charge by rows read, so if you don't add indexes you will (very quickly) read all the rows in your table on every query. That racks up your bill really fast.
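For example (hypothetical schema, billing model as described above):

    -- with no index on email, this inspects every row in `users` on
    -- every call, and you pay for each row read
    SELECT id FROM users WHERE email = 'a@example.com';
    -- after adding one, the same query inspects only the matching rows
    CREATE INDEX idx_users_email ON users (email);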
The foreign key thing is because the author thought foreign keys add indexes, but here foreign keys are no-ops and don't do anything. Because it's so distributed there's no support for foreign keys.
So does this share MySQL’s weaknesses? As someone that used MySQL for over a decade before seeing the light and switching to Postgres, I’d be more reassured if this wasn’t compatible with any existing RDBMS than with it being MySQL compatible.
Vitess is basically a sharding layer on top of MySQL nodes, so yes, it will almost certainly retain whatever MySQL warts you dislike. It's clustered MySQL - not simply a different database speaking the same dialect, like CockroachDB is to Postgres.
As with most similar rants against the earlier days of MySQL, the majority of these points are based on old versions and don't reflect how things have been for almost a decade.
The question was what made me switch, which was all the crap MySQL does to a dev. That blog post is the best that I know of that captures the fractal of bad design that is MySQL. While, yes, it matches my understanding that some of those points are no longer valid, it doesn't match my understanding that most of the points are no longer valid. (E.g., just a few years ago, when I last used MySQL, I seem to remember that it still struggled w/ UTF-8, still struggled with the definition of what a "key" is, and still struggled with functional dependencies.)
Postgres is a great choice too, and for most intents and purposes, Postgres or MySQL is the correct default choice for a database, so I'm not trying to persuade you to switch back. Just clarifying that almost none of the points in that article are still valid, including UTF-8 handling. The author themselves put a disclaimer at the top of the article stating as much. Not quite sure what you mean by "definition of a key" or "struggled with functional dependencies", so I can't speak to those directly.
E.g., the following used to be a statement in MySQL, at the time I used it; I presume it still is:
CREATE KEY …
It is the primary form of this alias:
CREATE INDEX
which is what the command does (it creates an index; it does not "create" a "key".) But since the form of the command with the correct words is itself an alias, the wrong form gets output any time MySQL self-describes something, generating endless confusion from people thinking a set of columns are a key, when they actually may or may not be. (They might also be a key, but index does not imply key.)
> struggled with functional dependencies
E.g., in GROUP BYs:
SELECT a FROM table GROUP BY b;
where a is not functionally dependent on b. (The statement is thus nonsense.) MySQL will permit this statement to execute, and generate undefined results for column `a` in the output.
In the day I used it, UTF-8 was not the default (the default was latin1, IIRC). Even better, the "utf8" encoding wasn't UTF-8. It was like UTF-8 except where it wasn't. Generated endless numbers of "ah crap, this table/DB isn't UTF-8 and will need a migration."
Simple date math in SELECTS would result in idiocy. (The output of SELECT date_a - date_b was something like "stringify the dates, remove the hyphens, cast to int, subtract" — yes, that sounds crazy, but yet that was a bug I had to work out once. There's a separate DATEDIFF function or something to do the right thing, but the point is that the naïve thing neither errors nor does anything sensible.)
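To make that concrete (behavior as I remember it; easy to verify in a MySQL shell):

    SELECT DATE '2021-11-01' - DATE '2021-10-25';
    -- 76: the dates become the integers 20211101 and 20211025, then subtract
    SELECT DATEDIFF('2021-11-01', '2021-10-25');
    -- 7: the actual number of days between the two dates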
> it creates an index; it does not "create" a "key"
I'm still not sure what you mean. By key, do you mean a unique key? With some quick searching, I'm not finding any difference in the Postgres docs.
> MySQL will permit this statement to execute
That was disabled by default in 5.7, 6 years ago. I totally agree that it should have been turned on by default earlier, but isn't a valid criticism of modern MySQL.
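For reference, that's the ONLY_FULL_GROUP_BY sql_mode, on by default since 5.7:

    SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';
    SELECT a FROM t GROUP BY b;
    -- ERROR 1055: 'a' is not in GROUP BY and is not functionally
    -- dependent on the GROUP BY columns (message paraphrased)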
> Simple date math in SELECTS would result in idiocy
I've never tried that before, and it looks like that hasn't changed. No idea why that doesn't throw an error.
Since 8.0 (April 2018), schema changes are atomic (a DDL statement either fully applies or not at all), but you can't include DML changes in the same transaction as a DDL. I haven't found that to be a dealbreaker, but it would certainly be nice to have.
No, I mean key[1]. A key is a set of attributes that uniquely identifies a row. The primary key in a table is typically one such key, but tables may also have other such keys. In particular, if the primary key is a surrogate key, there is often another natural key. "Key" is just the general form of all of those things. A key MUST be non-null & unique in most RDBMSs, or it trivially isn't a key.
But an index is just a data structure to speed up access. You almost always cover the key with an index, as keys are generally the means with which you're going to look up other attributes. But one often has other indexes that cover non-keys, e.g., if one wants to search based on some non-key attribute.
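In SQL terms, the distinction being drawn here:

    -- a key: a uniqueness constraint, part of the table's logical design
    ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE (email);
    -- an index: purely a physical access path; duplicate last_login
    -- values are perfectly fine
    CREATE INDEX idx_users_last_login ON users (last_login);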
So naming the create index command "CREATE KEY" is kind of dumb. I don't think they'll remove it, as it would be a breaking change. (Probably even changing the SHOW CREATE TABLE output to use CREATE INDEX would be breaking to somebody, and I don't see MySQL as being willing to do it…)
> That was disabled by default in 5.7, 6 years ago. I totally agree that it should have been turned on by default earlier, but isn't a valid criticism of modern MySQL.
So it has! I've used MySQL within the last 6 years, and definitely have hit this, so… IDK. It was 5.<something> but that's not really saying a lot given how long the 5.x series persisted. We were on a managed service, and I wonder if our service host was holding us back, here.
Also, at least in the Docker container, the default encoding does seem to at least be utf8mb4, which is good, though I kinda wish they'd just get rid of "utf8" and make utf8mb4 utf8, but, this'll probably do the job. So, good to see MySQL has improved, I guess, but for me personally it took much too long.
Related to some of the comments here, we are building PolyScale.ai https://www.polyscale.ai/ which is a global database cache. This solves regional latency and read query performance (concurrency), whilst maintaining the transactionality of the origin database.