Whilst not making light of the complexity of the field. Distributed systems, it ...

antoinealb · on Feb 11, 2022

But in a way you are doing distributed systems; that's what the "with replication" part of your architecture is saying. Now you can choose to never ever test that your replication setup is working, but I think that "distributed systems testing is for big tech" is bad advice. Something that would work pretty well in your setup: take down the Postgres leader, and see if your website is still up. That's already Chaos Engineering.

urthor · on Feb 11, 2022

Well yes exactly.

That's kind of why I brought up the idea of clock errors.

Single leader distributed systems are honestly an entirely different paradigm to multi-leader. You have a main node, the main node controls routing and which data goes where.

There can only be one main node at one time, hence no latency/clock errors/multi leader fail-over/PAXOS/voting/stale keys/exotic application specific conflict resolution logic.

Done, simple.

(Secretly not so simple, all of the above problems are actually hugely important to resolve in any monolithic database. However, Postgres sorts it out for you.)

Really, what I am calling "distributed systems problems" are the subset with multiple simultaneous, geographically separate leaders.

jorangreef · on Feb 11, 2022

> There can only be one main node at one time, hence no latency/clock errors/multi leader fail-over/PAXOS/voting/stale keys/exotic application specific conflict resolution logic.

This is not accurate.

You can certainly have clock errors in "single leader distributed systems", simply by virtue of the fact that the primary rotates throughout the cluster.

And, perhaps even more surprisingly, you can certainly have clock errors in "single node non-distributed systems like Postgres", i.e. clock errors are completely orthogonal to distributed consensus protocols (unless you're doing something dangerous like leader leases, which I don't recommend).

The problem is simple: what if the system time strobes or gets stuck in the future? How does this affect things like recording financial transactions? What if NTP stops working? Distributed systems aside, it's pretty tricky.

I've worked on this problem of clock errors specifically for TigerBeetle [1], a financial accounting database that can process a million transactions a second. In our domain, we can't afford to timestamp transactions with the wrong timestamp, because this could mean that financial transactions (e.g. credit card payments) don't rollback if the other party to the transaction times out, so liquidity could get stuck and tied up, for the duration of the clock error.

[1] https://www.tigerbeetle.com/post/three-clocks-are-better-tha...

urthor · on Feb 11, 2022

Fair point.

dikei · on Feb 11, 2022

>> There can only be one main node at one time

This is how most distributed system solve the consensus problem: the nodes automatically run leader election algorithm so that eventually only one leader remains.

>> (Secretly not so simple, all of the above problems are actually hugely important to resolve in any monolithic database. However, Postgres sorts it out for you.)

Except that in case of network partition of the primary node, Postgres can't promote a new primary node from the replicas automatically, unlike consensus systems such as ETCD/Zookeeper.

yakshaving_jgt · on Feb 11, 2022

In a way, yes. But that’s not what most people are referring to when they talk about “distributed systems”, so I think you’ve constructed a straw man.

eklavya · on Feb 11, 2022

What do those people mean by "distributed systems"? Are they running a single node server with web app and database on the same node? Is there a load balancer/cache/replica/cluster? If it's 1+ machine, it is distributed.

In your opinion what would qualify as a "distributed system"?

urthor · on Feb 11, 2022

Distributed system probably isn't precise enough in of itself.

But I was very much thinking of multiple leader distributed systems.

Anecdotal. But dealing with clock errors and multi-leader issues are of far greater complexity than caching data.

Replicas, load balancers, and all the vagaries of single leader systems are not absurdly difficult technical problems.

They're problems that can be solved by a team with a single senior engineer who knows what they're doing.

yakshaving_jgt · on Feb 11, 2022

I'm not taking a position on what would qualify as a distributed system, and neither was that the point of my comment.

I believe that when most people talk about "distributed systems", they're talking about some form of microservices architecture, where components of a single system are arbitrarily separated by network requests for nebulous reasons.

zozbot234 · on Feb 11, 2022

> Distributed systems, it honestly is just a toy for the tech monopolies.

Not so, decentralized systems are also inherently distributed. The whole point of both the old "Web 3.0" and the newer "Web3" is to make "distributed" thinking a part of the web, beyond the now trivial "linking to a document-like resource on a different host" scenario that defines hypertext/the "Web of documents".

(Of course one may then retort, as Jack Dorsey apparently has, that "Web3" itself is merely "a VCs' plaything". One has to be aware that these developments are somewhat speculative and pretty far from a real consensus.)

hughrr · on Feb 11, 2022

That’s ok until you reach the logical IO and compute capacity of off the shelf PC hardware. Also 20 years of insane coupling and complexity makes it pretty difficult to move away from that architecture which is where most companies seem to end up with from experience.

Also it’s a complete fallacy that “if we’re successful or grow large enough, we’ll just rewrite it”.

And then there’s the logical problems of managing complexity, distributing work to development teams and even things as determining change impact that become terrible difficult.

I say we should design with distribution in mind but deploy without it. Assuming you can scale up forever is an expensive and stupid mistake.

Incidentally it turns out that even adding any deferred processing (in our case SQS+workers) on simple LOB systems can have nasty problems like clock skew and QoS to consider which are very distributed pains :)

urthor · on Feb 11, 2022

> 20 years of insane coupling and complexity makes it pretty difficult to move away from that architecture which is where most companies seem to end up with from experience.

Pretty much. I don't want to generalise for legacy context however.

> I say we should design with distribution in mind but deploy without it.

Oh yes, absolutely. You can pretty reasonably predict when your compute needs will scale up such that you genuinely need to make the shift, and plan for it with a modular architecture.

> Assuming you can scale up forever is an expensive and stupid mistake.

See here's the thing. I think it has been, in the past.

I'm definitely sure that there will be applications in the future which will scale past single leader, multiple nodes.

What I've seen in the last few years was we just didn't ever have those demands (except for deep learning).

I wasn't working for small business, at a genuine mega-corp.

The total production data inflows and egresses of my mega corp, never peaked past I think 200MB/s in any year (for business critical systems. There was some user facing video that was jettison-able). The daily peak was far lower than that.

All production compute needs, bandwidth and compute, were DWARFED by employees on Zoom and Youtube.

The sum total of all proprietary OLTP data, across the company? And we had roughly 14 different legacy proprietary business systems from acquisitions. We had COBAL, we had DB2, we had it all.

The architect of our consolidation had it at less than 5TB (excluding photographs, backups and duplication etc).

40+ years of business critical OLTP data. And all the analytics was done on trailing views of the replicas.

Given the direction of Moore's law, I can safely say that for my former multi-billion dollar employer, Moore's law is going to outpace any of our business requirements.

(Except for photographs. But all our deep learning was being done by a spin out).

hughrr · on Feb 11, 2022

One size doesn’t fit all. Wait until you’re bought by another multi billion dollar employer and expected to scale out to their workload…

urthor · on Feb 11, 2022

Oh definitely. I'm not saying it doesn't happen.

legulere · on Feb 11, 2022

If you have good enough engineers and manpower for distributed systems, you can also optimize your systems enough to fit again, if you are not a billion dollar company.

Distributed systems are exciting and interesting, but currently not worth their investment for 99% of the cases. Things might change though in some decades.

urthor · on Feb 11, 2022

Generally speaking, the second you're running tricks to "jam" something that's overly large into a single machine, yeah you should just give up.

The issue becomes when you reach the scale of compute needs such that you cannot "jam" all your requirements into a single leader multiple node cluster.

The amount of data you'll need for THAT, is becoming unimaginable.

64 core 128 thread leader nodes have outrageous potential to scale.

That said, that doesn't quite apply to my Postgres example (barring some nifty DBA tricks and third party extensions).

But you can appreciate how much data you're going to need to blow past what 128 threads can do for you.

hughrr · on Feb 11, 2022

Actually 20 years of shitty code built on assumptions like this can break an AWS 96 core instance running SQL server. That’s my life.

It does however shift 10k shitty queries per second which is impressive.

hughrr · on Feb 11, 2022

Been there done that. That gave us two years and cost a year of developer time.

dikei · on Feb 11, 2022

Distributed systems are not limited to distributed computing: once you start breaking your monolith application into multiple services, you have yourself a distributed system.

cle · on Feb 11, 2022

Distributed systems are really about persistent state. As soon as you have/need that, you run into these problems. If you need to deploy while maintaining state (like DB contents), or fail gracefully without losing your entire DB, then you have entered the realm of distributed systems. These properties are present even in most "monoliths".

cmeiklejohn · on Feb 11, 2022

A bit of a shameless plug here, but I'm a Ph.D. student at Carnegie Mellon University (Ph.D. in Software Engineering) and my thesis is on the resilience of microservice applications at scale. We've published a little bit on this topic, with some new work under submission currently, but I've put together a website that talks about a.) why doing this work as a student is very difficult due to the lack of open-source applications in this style and b.) proposed a new technique for addressing these problems.

It's called Filibuster:

https://filibuster.cloud/index.html

Example of testing a small Java application with Filibuster:

https://www.youtube.com/watch?v=BhZLHpxQ7mI

nijave · on Feb 12, 2022

I think there's a fair bit of microservices in the CNCF space. Cortex is implemented as micro services (and Kubernetes largely is, too)

cgio · on Feb 11, 2022

Plenty non tech monopolies out there with more than 1bn revenue. Also distribution is not just relevant on the basis of data volume but also e.g. to accommodate variety of use cases. Don’t get me wrong, Postgres is a great tool but the moment you do replication, the only reason you potentially don’t have to test your distributed system is because other people did it for you, not because it’s not relevant.

tzone · on Feb 11, 2022

High availability + uptime is now necessary for any business no matter how small. If you need to run 24/7 available high uptime services you can’t avoid distributed systems one way or another. (Distributed systems vary by complexity of course but anything that has some failover type of mechanism is going to be a distributed system )

Also, industry is now transitioning where even small business might need geo distribution , not just for compute but also for data. That is going to make distributed systems even more complex even for the small guys.

Cloud services have made a lot of this stuff easier for average Joe but you can’t really avoid dealing with at least some fundamental concepts of distributed systems

adevx · on Feb 11, 2022

Not sure I agree. My Hetzner vps has less downtime than some AWS regions. To be honest, I have yet to experience any downtime from them over multiple years. There is a good chance you introduce more potential downtime by blowing up the complexity with a distributed system.