And have to be actually tested. Most of them are designs based on nothing but uninformed intuition. There is an art to back pressure and keeping pipelines optimally utilized. Queueing doesn’t work like you think until you really know.
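For anyone who hasn't seen back pressure spelled out, here is a minimal sketch of the idea, assuming a single in-process bounded buffer. The `offer` helper and the queue size of 4 are made up for illustration. The point is only that a full queue reports "no" to the producer instead of growing without bound:

```python
import queue


def offer(q, item):
    """Try to enqueue without blocking; return False when the queue is full.

    The False return is the back-pressure signal: the producer has to
    slow down, retry later, or shed the item. (Hypothetical helper.)
    """
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False


# Assumed scenario: a burst of 10 items hits a buffer of size 4
# while the consumer is stalled.
buffer = queue.Queue(maxsize=4)
accepted = sum(offer(buffer, i) for i in range(10))
rejected = 10 - accepted
print(f"accepted={accepted} rejected={rejected}")
```

What the producer should do with a rejection (block, retry, drop, push the signal further upstream) is exactly the part that has to be tuned against the real system.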
Why is this hard, and can’t just be written down somewhere as part of the engineering discipline? This aspect of systems in 2021 really shouldn’t be an “art.”
It is, in itself, a separate engineering discipline, and one that cannot really be practiced analytically unless you understand really well the behavior of individual pieces which interact with each other. Most don't, and don't care to.
It is something that needs to be designed and tuned in place; you can't "get it right" at design time without real-world feedback.
And you also simply have to reach a fairly large scale for it to matter at all. At smaller scales capacity only comes in coarse increments, so the excess capacity you end up with absorbs most of the need for it, and you can get away with wasting a bit of money on extra capacity to make the problem go away.
It is also sensitive to small changes, so a textbook example might be implemented with one small detail wrong that won't show itself until a critical failure is happening.
It is usually the location of the highest-complexity interaction in a business infrastructure, which is not easily distilled to a formula (and most people just aren't educationally prepared for nonlinear dynamics).
It absolutely is written down. The issue is that the results you get from modeling systems using queuing theory are often unintuitive and surprising. On top of that it's hard to account for all the seemingly minor implementation details in a real system.
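A small example of the unintuitive part, as a sketch only: for the textbook M/M/1 queue (one server, Poisson arrivals, exponential service times, which real systems rarely match), the mean time in the system is W = 1 / (mu - lambda). The service rate of 100 jobs per second below is an assumed number for illustration:

```python
def mm1_mean_time_in_system(arrival_rate, service_rate):
    """Mean time (waiting + service) a job spends in an M/M/1 queue.

    Uses the standard result W = 1 / (mu - lambda), valid only while
    the arrival rate stays below the service rate.
    """
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


service_rate = 100.0  # jobs per second (assumed for illustration)
for utilization in (0.5, 0.8, 0.9, 0.95, 0.99):
    w = mm1_mean_time_in_system(utilization * service_rate, service_rate)
    print(f"utilization {utilization:.2f}: mean latency {w * 1000:.0f} ms")
```

Going from 50% to 99% utilization does not double the latency; under this model it goes up by a factor of 50. That is the kind of result people don't expect from intuition.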
During my studies we had a course where we built a distributed system and had to model its performance mathematically. It was really hard to get the model to match the reality and vice versa. So many details are hidden in a library, framework or network adapter somewhere (e.g. buffers or things like packet fragmentation).
We used the book "The Art of Computer Systems Performance Analysis" (R. Jain), but I don't recommend it. At least not the 1st edition which had a frustrating amount of serious, experiment-ruining errata.
Think of other extremely complex systems and how we’ve managed to make them stable:
1) airplanes: they crashed, _a lot_. We used data recorders and stringent process to make air travel safety commonplace.
2) cars: so many accidents, then accident research. The solution comes after the disaster.
3) large buildings and structures: again, the master work of time, attempts, failures, research and solutions.
If we really want to get serious about this (and I think we do) we need to stop reinventing infrastructure every 10 years and start doubling down on stability. Cloud computing, in earnest, has only been around a short while. I’m not even convinced it’s the right path forward, just happens to align best with business interests, but it seems to be the devil we’re stuck with so now we need to really dig in and make it solid. I think we’re actually in that process right now.