At such a high volume of requests it probably makes sense to consider going one abstraction level lower and replacing HTTPS with communication over plain SSL/TLS sockets, for further cost reduction.
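To make "plain SSL sockets" concrete, here is a minimal Go sketch; the hostname, port, and the length-prefixed framing are hypothetical, just one way it could look:

```go
package main

import (
	"crypto/tls"
	"encoding/binary"
	"log"
)

func main() {
	// Still pays for the TLS handshake, but skips HTTP framing entirely.
	conn, err := tls.Dial("tcp", "ingest.example.com:9000", &tls.Config{})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// The catch: you now have to invent your own wire format,
	// e.g. a 4-byte length prefix followed by the payload.
	payload := []byte(`{"event":"click"}`)
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := conn.Write(append(hdr[:], payload...)); err != nil {
		log.Fatal(err)
	}
}
```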
I think using HTTPS is fine. But there is probably some value in using gRPC+proto by default instead of REST+JSON. With client-side streaming, you set up and tear down the connection less frequently, which means you negotiate TLS and send the initial headers less frequently. And the protobuf messages themselves are smaller than their JSON equivalents, especially small ones.
gRPC streaming is almost as efficient as a raw TCP stream, but saves you writing the protocol glue code. There are already working clients and servers, and you just write your protocol definition as a protocol buffer. Worth a look for this use case.
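A minimal sketch of what that looks like in Go, assuming a hypothetical proto like `service Ingest { rpc Push(stream Event) returns (Ack); }` and its generated `pb` package:

```go
package main

import (
	"context"
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"

	pb "example.com/ingest/pb" // hypothetical generated code
)

func main() {
	creds, err := credentials.NewClientTLSFromFile("ca.pem", "")
	if err != nil {
		log.Fatal(err)
	}
	// One connection, one TLS handshake.
	conn, err := grpc.Dial("ingest.example.com:443", grpc.WithTransportCredentials(creds))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	stream, err := pb.NewIngestClient(conn).Push(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	for i := 0; i < 1000; i++ {
		// Every Send reuses the same HTTP/2 stream: no new handshake,
		// no repeated headers, compact binary encoding.
		if err := stream.Send(&pb.Event{Seq: int64(i)}); err != nil {
			log.Fatal(err)
		}
	}
	if _, err := stream.CloseAndRecv(); err != nil {
		log.Fatal(err)
	}
}
```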
(Also, the clients know how to do load balancing, so you don't have to pay Amazon to do it for you. Unlike browsers, most languages' gRPC clients are happy to take a list of IP addresses from DNS and only send requests to the healthy endpoints. Browsers, if you're lucky, will try another address when the TCP connection fails, but will happily keep the same IP address even if it 503s on every request. Chrome, Firefox, and Safari all do different things.)
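For example, in Go (service name hypothetical), you point the client at a DNS name that resolves to several addresses and ask it to spread requests across them:

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

func main() {
	creds, err := credentials.NewClientTLSFromFile("ca.pem", "")
	if err != nil {
		log.Fatal(err)
	}
	// "dns:///" tells the client to resolve every A/AAAA record itself;
	// round_robin spreads RPCs across all of them instead of pinning to
	// the first address the way a browser would.
	conn, err := grpc.Dial(
		"dns:///ingest.example.com:443",
		grpc.WithTransportCredentials(creds),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close() // hand conn to the generated client stubs as usual
}
```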
That is of course true, but they won't be able to omit failed or overloaded nodes, whereas a load balancer might be able to do so. On the other hand, the client could be programmed to just use another IP from the list and resend the request if one node fails to answer, but that would increase the total time the client needs to complete a successful request.
I also realise that non-responsive nodes might be rare enough for this to be a negligible problem - just playing devil's advocate here.
No, you can do all of that with gRPC. You can use active health checks (grpc.health.v1) to add or remove nodes from the pool. (You can configure the algorithm used to select a healthy channel for the next request, too.) You can also talk to a central load balancer that provides your client with a list of endpoints it's allowed to talk to.
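Sketch of the client side in Go: a blank import of the health package enables the grpc.health.v1 checks, and the service config turns them on (names hypothetical; note pick_first doesn't support health checking, so you need round_robin):

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
	_ "google.golang.org/grpc/health" // registers the client-side health-check function
)

func main() {
	creds, err := credentials.NewClientTLSFromFile("ca.pem", "")
	if err != nil {
		log.Fatal(err)
	}
	conn, err := grpc.Dial(
		"dns:///ingest.example.com:443",
		grpc.WithTransportCredentials(creds),
		// Endpoints whose health service reports NOT_SERVING are taken
		// out of the rotation until they report SERVING again.
		grpc.WithDefaultServiceConfig(`{
		  "loadBalancingConfig": [{"round_robin":{}}],
		  "healthCheckConfig": {"serviceName": ""}
		}`),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
}
```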
When you control the client, you don't have to resort to L3 hacks to distribute load. You can just tell the client which replicas are healthy. (And both ends can report back, giving the central load balancer information on whether the supposedly healthy endpoints actually are.)
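The server half of that, sketched in Go with the stock grpc.health.v1 implementation (port hypothetical):

```go
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

func main() {
	lis, err := net.Listen("tcp", ":9000")
	if err != nil {
		log.Fatal(err)
	}
	srv := grpc.NewServer()
	h := health.NewServer()
	healthpb.RegisterHealthServer(srv, h)
	h.SetServingStatus("", healthpb.HealthCheckResponse_SERVING)
	// Flip to NOT_SERVING when overloaded or draining; health-checking
	// clients drop this endpoint until it recovers:
	//   h.SetServingStatus("", healthpb.HealthCheckResponse_NOT_SERVING)
	log.Fatal(srv.Serve(lis))
}
```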
L3 load balancing actually works somewhat poorly for HTTP/2 and gRPC anyway: it only balances TCP connections, but you really want to balance requests. That is why people put proxies like Envoy in the middle; a browser client isn't smart enough to balance per request, but the proxy is. If you control the client, though, you can skip all that and do the right thing with very few resources.
Nice deep dive into the S of HTTPS anyway.