I can't release any of our traffic numbers, but I can tell you that 1M req/s is an enormous number. For some context, here's a post from Netflix from the end of 2011 where they state that their API received 20,000 req/s at peak: http://techblog.netflix.com/2011/12/making-netflix-api-more-...
Yes, but! The impression I get from this 1 million req/s test is that no actual logic is happening on the backend: no database queries, no business logic, etc. It is basically a noop call.
As we saw when running the TechEmpower benchmarks, simply going from the plaintext test to the single database query test dropped the best performer from ~600,000 req/s to ~100,000 req/s. Throw in a bit more business logic, another query, and a slightly heavier response, and it is easy to imagine that 1 million req/s sitting much nearer to 20,000 req/s.
My point is that 1 million req/s is a very optimistic number when used in such a comparison. Is it still an impressive max throughput? Yes. I just don't want anyone to think that they can now, say, host 50 Netflixes on this setup.
Note: I realize you probably didn't mean to directly compare those two numbers, but it somewhat read that way. I definitely do appreciate the context though. It's quite interesting to know that the Netflix API was peaking at ~20,000 req/s in 2011.
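For anyone who wants the arithmetic behind the "50 Netflixes" remark spelled out, here is a rough sketch. The four throughput figures are the approximate ones quoted in this thread; the function names and the idea of applying the plaintext-to-single-query drop to the 1 million req/s figure are purely illustrative assumptions, not measurements.

```python
# Approximate figures quoted in the thread (req/s).
GCE_LB_PEAK = 1_000_000      # load balancer test, 1-byte responses
NETFLIX_API_PEAK = 20_000    # Netflix API peak, late 2011
PLAINTEXT_BEST = 600_000     # best plaintext benchmark result cited
SINGLE_QUERY_BEST = 100_000  # best single-database-query result cited


def naive_ratio(synthetic_peak, real_peak):
    """How many 'real' peaks fit into the synthetic peak if compared directly."""
    return synthetic_peak / real_peak


def scaled_throughput(peak, before, after):
    """ASSUMPTION: the peak shrinks by the same factor as before -> after.

    This is a back-of-the-envelope extrapolation, not a benchmark result.
    """
    return peak * after / before


naive = naive_ratio(GCE_LB_PEAK, NETFLIX_API_PEAK)
drop = PLAINTEXT_BEST / SINGLE_QUERY_BEST
adjusted = scaled_throughput(GCE_LB_PEAK, PLAINTEXT_BEST, SINGLE_QUERY_BEST)

print(f"naive comparison: {naive:.0f}x the Netflix peak")
print(f"plaintext -> single query drop: {drop:.0f}x")
print(f"hypothetical throughput with one query: {adjusted:,.0f} req/s")
print(f"that is about {adjusted / NETFLIX_API_PEAK:.1f}x the Netflix peak")
```

Under that assumption a single database query alone would take the direct 50x comparison down to roughly 8x, before adding any further business logic or heavier responses.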
This is not the point of the test. The test is about showing you that the load balancer in GCE can handle that many requests per second, and with a single IP address. Whatever the machines behind it are doing doesn't matter, since the load balancer's job is to handle a ton of traffic. This is practically the only case in which responding with 1 byte makes sense in a test.
I completely get that. I responded to the parent because he introduced the 20,000 req/s number as a comparison point.
The Google test is both a theoretical max throughput (that one wouldn't reach under basically any normal use case) and a test of the load balancer capabilities. The Netflix 20,000 req/s number is, instead, a real use case example.
My point was that one shouldn't directly compare those numbers and say, for example, that this GCE setup has 50x better throughput than Netflix.
I imagine that if Netflix were to stub all of their API calls with noops that returned 1 byte responses, they would be able to handle significantly more than 20,000 req/s. Basically, I don't think we actually disagree here.