Back from the Future: Global Tables in CockroachDB (cockroachlabs.com)
94 points by andreimatei1 on July 19, 2022 | 12 comments



I recently tried CockroachDB's serverless offering and I was very satisfied with it. It has a generous free tier, easy-to-understand pricing, and a query analyzer that helps me estimate how much a query would cost. It is still in beta but already feels extremely polished.

The only complaint so far is that there are very few supported regions. (Oregon, N. Virginia, Frankfurt, Ireland, Singapore, Mumbai for AWS, and São Paulo, California, South Carolina, Iowa, St. Ghislain, Jurong West for GCP, as of now.) I couldn't even find the list of supported regions online; it only showed up after I signed up. Was it intentional to drop the information from the documentation? Not that this is a huge problem, considering I'm still on the free plan, but I wonder if they're planning to add more regions in the near future.


Does their serverless offering actually support Global Tables? Last time I checked I remember only being able to select a single region. Their dedicated offering supported multiple regions, but started at several hundred dollars per month, which is pretty prohibitive for someone just starting out.

I'm currently using Fauna which offers the same strongly consistent global reads with higher latency writes approach, and their pricing is much better for smaller-budget projects that can benefit from global replication.


They said they were planning to expand its coverage to the multi-region use case last year[1], but nothing's concrete. I've heard good things about Fauna, but having to rely on their own query language put me off. I might have to revisit it in the future, though.

[1] https://www.infoq.com/news/2021/10/cockroachdb-serverless/


Thanks, that's good to know! Happy to see more competition in this space.

Really love being able to reason about a globally replicated DB as if it was in a single location with strongly consistent reads. The mental model is so much simpler than a single-primary read replica setup.

The cost in write latency is worth it IMO since it's still usually fast enough for most use cases, and nudges me towards using alternative replication strategies for use cases that are sensitive to write latency (instead of deluding myself into believing my app is fast for everybody when it's only fast for me because I chose a primary that's physically close to where I live).


Global Tables let database clients in any region read strongly consistent data with region-local latencies. They’re an important piece of the multi-region puzzle — providing latency characteristics that are well suited for read-mostly, non-localized data.


Instead of doing all of this complicated work, how about simply following a Raft-like consensus protocol with one minor modification: the leader won't include a write op in its read processing until that write op has been applied to the logs of all the replicas, not just a quorum? When the heartbeat responses from the replicas indicate to the leader that this write op has been applied everywhere, it can advance its internal marker to include this write op in read operations.

This simple scheme lets all members, including replicas, serve reads with read-after-write consistency, and it penalizes only the write op itself. That write op won't be acknowledged to the caller until it has been applied everywhere.

There are no fault tolerance issues here btw. If any replica fails, as long as quorum was reached, the repair procedure will ensure that the write will eventually be applied to all replicas. If the quorum itself could not be reached, then the write is lost anyway, which is no different from the typical case of reading just from the leader.
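To make the proposal concrete, here is a rough sketch of the bookkeeping the leader would need. It is a toy model, not real Raft: the `Leader` class, its method names, and the follower ids are all made up for illustration, and it ignores terms, elections, and repair. It only shows the difference between the usual quorum index and the stricter "applied everywhere" marker.

```python
class Leader:
    """Toy leader for the 'wait for every replica' variant (hypothetical names)."""

    def __init__(self, follower_ids):
        self.log = []                                  # ordered write ops
        self.applied = {f: 0 for f in follower_ids}    # last index each follower reported

    def append(self, op):
        self.log.append(op)
        return len(self.log)                           # 1-based log index of the op

    def on_heartbeat_response(self, follower_id, applied_index):
        # Followers report how far they have applied the log.
        self.applied[follower_id] = max(self.applied[follower_id], applied_index)

    def quorum_index(self):
        # Highest index held by a majority, counting the leader itself.
        indexes = sorted([len(self.log), *self.applied.values()], reverse=True)
        majority = len(indexes) // 2 + 1
        return indexes[majority - 1]

    def read_marker(self):
        # The proposed modification: only ops applied on *every* member are readable.
        return min([len(self.log), *self.applied.values()])

    def can_acknowledge(self, index):
        # The write is acknowledged to the caller only once it is readable everywhere.
        return index <= self.read_marker()


leader = Leader(["r2", "r3"])
idx = leader.append("x = 1")

leader.on_heartbeat_response("r2", idx)       # r2 has applied the write, r3 has not
quorum_before = leader.quorum_index()         # quorum reached (leader + r2)
acked_before = leader.can_acknowledge(idx)    # but not yet acknowledged

leader.on_heartbeat_response("r3", idx)       # now every member has it
acked_after = leader.can_acknowledge(idx)
```

In this model the write reaches quorum after the first response but is only acknowledged after the second, which is where the extra write latency comes from.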


I don't think this scheme provides the "monotonic reads" property discussed in the blog post. Specifically, it would be possible for a reader to observe a new value from r2 (who received a timely heartbeat), then to later observe an older value from r3 (who received a delayed heartbeat). This would be a violation of linearizability, which mandates that operations appear to take place atomically, regardless of which replica is consulted behind the scenes. This is important because linearizability is compositional, so users of CockroachDB and internal systems within CockroachDB can both use global tables as a building block without needing to design around subtle race conditions.
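A toy illustration of that anomaly, in case it helps. This is not CockroachDB code: the `Replica` class and its per-replica read marker are hypothetical names, and the sketch assumes each replica learns the "readable" index from the leader's heartbeats, which can arrive at different times.

```python
class Replica:
    """Toy replica that only serves log entries below its read marker."""

    def __init__(self, name):
        self.name = name
        self.log = []           # applied (key, value) writes, in order
        self.read_marker = 0    # number of leading log entries visible to reads

    def apply(self, key, value):
        self.log.append((key, value))

    def on_heartbeat(self, readable_index):
        # The leader tells the replica how much of the log is safe to read.
        self.read_marker = max(self.read_marker, readable_index)

    def read(self, key):
        value = None
        for k, v in self.log[:self.read_marker]:
            if k == key:
                value = v       # later visible writes overwrite earlier ones
        return value


r2, r3 = Replica("r2"), Replica("r3")

# Both replicas start with an old, fully visible value.
for r in (r2, r3):
    r.apply("x", "old")
    r.on_heartbeat(1)

# The new write is applied everywhere, so the leader advances its marker.
for r in (r2, r3):
    r.apply("x", "new")

r2.on_heartbeat(2)              # timely heartbeat reaches r2
first_read = r2.read("x")       # the client sees the new value
second_read = r3.read("x")      # r3's heartbeat is delayed: the old value again

r3.on_heartbeat(2)              # the delayed heartbeat finally arrives
third_read = r3.read("x")
```

The same client reads the new value from r2 and then the old one from r3, so its view of `x` goes backwards even though the write was applied on every replica.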

However, for the sake of discussion, this is an interesting point on the design spectrum! A scheme that provides read-your-writes but not monotonic reads is essentially what you would get if you took global tables as described in this blog post, but then never had read-only transactions commit-wait. It's a trade-off we considered early in the design of this work and one that we may consider exposing in the future for select use cases. Here's the relevant section of the original design proposal, if you're interested: https://github.com/cockroachdb/cockroach/blob/master/docs/RF....


Thanks. Yes this explanation is something I can agree with. It does not provide monotonic reads.


> until that write op has been applied to the log of all the replicas, not just the quorum

That removes all the fault tolerance. What do you do if you never get the acknowledgement from all replicas?


That question doesn’t make much sense. If you have quorum then eventually repairs will kick in and will get replicated everywhere.

So it can tolerate the failure of any minority of replicas (fewer than N/2), just like other consensus systems. Because this is basically Raft.


It's an exciting time for cloud-native databases. I'm especially curious how various emulation layer approaches will shake out, for example, between CRDB and Neon which do pgsql wire protocol emulation vs block device emulation, respectively.


Is something similar possible using Postgres / Citus?

E.g., if I have a multi-tenant architecture but want a global table for, say, a Postgres full-text-search index of content from the distributed tenants, what would the recommended route be?

Global tables seem like a great feature. I'd gladly sacrifice write speed for it.



