If I look at traces of all the service calls at my company within our microservices environment, the "meat" of each service is a fraction of the latency -- the part that's actually fetching the data from a database, or running an intense calculation. Often it's between 20 and 40 ms.
Everything else is network hops and what I call "distributed clutter": authorizing via a third party like Auth0 multiple times for machine-to-machine tokens (because "zero trust"!), multiple parameter store calls, hitting a dcache, cold starts if a serverless function is involved, API gateway latency, etc...
So for 20-40 ms of actual work, we end up with a 400 ms to 2 s backend response time.
Then if you're loading a front-end SPA with JavaScript on top of that... fuggedaboutit.
But DevOps will say "but my managed services and infinite scalability!"