> The other thing is that cloud hardware is generally very very slow and many engineers don't seem to appreciate how bad it is.

This. Mostly disk latency, for me. People who have only ever known DBaaS have no idea how absurdly fast they can be when you don’t have compute and disk split by network hops, and your disks are NVMe.

Of course, it doesn’t matter, because the 10x latency hit is overshadowed by the miasma of everything else in a modern stack. My favorite is introducing a caching layer because you can’t write performant SQL, and your DB would struggle to deliver it anyway.

> Of course, it doesn’t matter, because the 10x latency hit is overshadowed by the miasma of everything else in a modern stack.

This. The complaints about performance seem to come from people who aren't aware of actual latency numbers.

Sure, reading data from a local drive can take less than 1 ms, whereas a block storage service like AWS EBS can take more than 10 ms. An order of magnitude slower. Gosh, that's a lot.

But whatever your disk access needs, your response still has to be sent over the wire to clients, and that takes somewhere between 100 and 250 ms.

Will your users even notice a difference if your response times are 110ms instead of 100ms? Come on.


While network latency may overshadow that of a single query, many apps have many such queries to accomplish one action, and it can start to add up.

I was referring more to how it's extremely rare to have a stack as simple as request --> LB --> app --> DB. Instead, the app is almost always a set of microservices, even when that wasn't warranted, and each service is still making calls to DBs. Many of the services depend on other services, so there's no parallelization there. Then there's the caching layer stuck between service --> DB, because by and large the RDBMS isn't understood or managed well, so the fix is to just throw Redis between them.
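
To make that concrete, here's a minimal sketch of the shape of the problem (service names and timings are entirely made up): each hop depends on the previous one's result, so the per-hop network and DB costs add up instead of overlapping.

    import asyncio

    # Hypothetical services and latencies, purely illustrative. The
    # awaits are serial because each call needs the previous result,
    # so the latencies sum rather than overlap.
    async def call(service: str, ms: float) -> float:
        await asyncio.sleep(ms / 1000)  # stand-in for network + DB time
        return ms

    async def handle_request() -> float:
        auth = await call("auth-svc", 15)     # must complete first
        user = await call("user-svc", 20)     # needs the auth result
        orders = await call("order-svc", 30)  # needs the user record
        return auth + user + orders           # strictly serial

    print(asyncio.run(handle_request()))  # 65.0 ms before any disk stall

And a cache miss or a slow disk read inside any one of those services gets inherited by the entire chain.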


> While network latency may overshadow that of a single query, many apps have many such queries to accomplish one action, and it can start to add up.

I don't think this is a good argument. Even though disk latencies can add up, unless you're doing IO-heavy operations (which should really be async calls anyway), they are always a few orders of magnitude smaller than the overall response time.

The hypothetical gain from eliminating 100% of your IO latency tops out at a couple of dozen milliseconds. In platform-as-a-service offerings such as AWS's DynamoDB or Azure's CosmosDB, which involve a few network calls, an index query normally takes between 10 and 20 ms. You barely get above single-digit percentage gains even if you drive disk latencies down to zero.
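
Back-of-envelope version of that argument (all figures below are illustrative assumptions, not measurements):

    # Assumed: a 150 ms WAN round trip to the client, three managed-DB
    # queries at ~15 ms each, of which ~10 ms total is block storage
    # latency that a local NVMe drive could eliminate.
    wan_round_trip_ms = 150
    index_query_ms = 15
    queries_per_request = 3
    disk_share_ms = 10

    before = wan_round_trip_ms + queries_per_request * index_query_ms
    after = before - disk_share_ms
    print(f"{before} ms -> {after} ms, "
          f"{disk_share_ms / before:.0%} saved")  # 195 ms -> 185 ms, 5% saved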

In relative terms, if you're operating an app where single-millisecond latency deltas matter, you get far greater reductions in response time from regional and edge deployments than from switching to bare metal. And forget about doing regional deployments by running your hardware in-house.

There are many reasons why any discussion of performance needs to start with measured numbers and with figuring out where the bottlenecks actually are.


Did you miss where I said “…each service is still making calls to DBs. Many of the services depend on other services…?”

I’ve seen API calls that result in hundreds of DB calls. While yes, of course refactoring should be done to drop that, the fact remains that if even a small number of those calls have to read from disk, the latency starts adding up.
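
Rough arithmetic for that point (the figures are assumptions for illustration, not benchmarks): even when only a small fraction of sequential calls miss the cache and touch storage, the NVMe-vs-EBS gap becomes user-visible.

    calls = 200           # DB calls behind one API call
    cache_hit_ms = 0.1    # served from buffer pool / cache
    nvme_miss_ms = 0.2    # local NVMe read
    ebs_miss_ms = 5.0     # network block storage read
    miss_rate = 0.10      # only 10% of calls actually hit disk

    def total_ms(miss_ms: float) -> float:
        # sequential calls, so hits and misses simply add up
        return calls * ((1 - miss_rate) * cache_hit_ms + miss_rate * miss_ms)

    print(f"NVMe: {total_ms(nvme_miss_ms):.0f} ms, "
          f"EBS: {total_ms(ebs_miss_ms):.0f} ms")  # NVMe: 22 ms, EBS: 118 ms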

It’s also not uncommon to have a horrendously suboptimal schema, with UUIDv4 as the PK, JSON blobs, etc. Querying those often results in lots of disk reads simply due to how an RDBMS lays out data. The only way those result in anything resembling acceptable UX is with local NVMe drives for the DB, because EBS just isn’t going to cut it.
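
A toy model of the UUIDv4 point (a sketch, not a benchmark): a B-tree keeps rows in key order, so with a sequential key the most recent inserts share a page or two, while random UUIDs scatter them across the whole index, and on a cold cache each scattered page is a separate disk read.

    import uuid

    ROWS_PER_PAGE = 100
    N = 100_000  # rows already in the table

    seq_inserted = list(range(N))                     # e.g. BIGINT identity
    uuid_inserted = [uuid.uuid4().int for _ in range(N)]

    def pages_for_recent_rows(inserted):
        order = sorted(inserted)                      # on-disk key order
        position = {k: i for i, k in enumerate(order)}
        # distinct index pages holding the 100 most recent inserts
        return len({position[k] // ROWS_PER_PAGE for k in inserted[-100:]})

    print("sequential PK:", pages_for_recent_rows(seq_inserted))  # 1 page
    print("UUIDv4 PK:", pages_for_recent_rows(uuid_inserted))     # ~95 pages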


It's still a problem if you need to do multiple sequential IO requests that depend on each other (example: read an index to find a record, then read the record itself) and thus can't be parallelized. These batches of IO must sometimes themselves run sequentially, with no parallelization possible, and suddenly this bottlenecks the total throughput of your system.
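
A sketch with assumed per-read latencies (roughly the 10x local-vs-network-storage gap from upthread): the record read can't start until the index read returns, so the two latencies add on every lookup, and dependent lookups serialize on top of that.

    def lookup_ms(read_ms: float) -> float:
        index_read = read_ms             # find where the record lives
        record_read = read_ms            # only then fetch the record
        return index_read + record_read  # strictly sequential

    # 50 dependent lookups in one request (assumed figures):
    print(50 * lookup_ms(0.2), "ms on local NVMe")     # 20.0 ms
    print(50 * lookup_ms(2.0), "ms on block storage")  # 200.0 ms

At that point storage round trips cap your throughput, not just your latency.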


