
Google can't necessarily upstream everything because of social problems in the kernel process. For example, their datacenter TCP improvements have never been accepted by the gatekeeper of the net subsystem, which was a significant motivation to develop QUIC.



I'm not sure where you heard this. Their DCTCP extensions have never even been posted to a public list as of today. Pretty much all of the core TCP developers for the (upstream) kernel's networking subsystem are employed by Google, and they are doing an excellent job. That said, I would love to see their extensions integrated into the upstream tcp_dctcp module.


Is Facebook running DCTCP in production these days?


They did get BBR into the kernel, though, and many moons ago BQL too, which was a prerequisite.


Isn't DCTCP generalized by TCP Prague and L4S? If those get the IETF stamp of approval and the potential patent issues around L4S get sorted out, I'd guess they would be implemented in the upstream Linux kernel pretty quickly.


Social problems, a.k.a. Linux must work for everyone and not just Google.


Reinventing TCP over UDP is sort of silly; I hope they have a better reason than "they wouldn't upstream our changes", lol.


I think the inability to upstream changes into Windows and (ironically) old versions of Android is a bigger motivation for using UDP.


Isn't it a pretty good reason? gRPC is terrible in a datacenter context without Google's internal TCP fixes that Linux won't adopt (and which have been advocated for in numerous conference papers since at least 2009). If they are steadfast cavemen, what other workaround exists?


Apparently Microsoft is considering gRPC as a future replacement for WCF, so that might change. https://news.ycombinator.com/item?id=21055487


The standard workaround is to send short messages using UDP and long ones using TCP.
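A minimal sketch of that split in Python; the 1200-byte cutoff (roughly one MTU-safe datagram) and the single send entry point are assumptions for illustration, not anything from a real RPC stack:

    import socket

    UDP_MAX = 1200  # assumed cutoff: keep UDP messages inside one MTU-safe datagram

    def send_message(host: str, port: int, payload: bytes) -> None:
        """Short messages go out as one UDP datagram; long ones use a TCP stream."""
        if len(payload) <= UDP_MAX:
            with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
                s.sendto(payload, (host, port))
        else:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.connect((host, port))
                s.sendall(payload)

Of course the receiver then has to listen on both sockets and handle loss and reordering of the UDP-sized messages at the application layer, which is exactly where schemes like this get messy.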


What parts of gRPC are fixed by using it over QUIC vs. TCP (presuming intra-DC traffic and equally long-lived flows)?


Latency caused by packet loss. TCP needs microsecond timestamps and the ability to tune RTOmin down to 1ms before it is suitable for use in a datacenter. With the mainline kernel TCP stack you are looking at a penalty of at least 20ms whenever a packet is dropped.
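A back-of-the-envelope sketch of why the floor dominates, using the RFC 6298 RTO formula; the RTT figures are assumptions for a typical intra-DC path, and the 20ms case is the effective floor the parent describes (the classic kernel RTO_min is 200ms):

    # RFC 6298: RTO = SRTT + max(G, 4 * RTTVAR), clamped below by RTO_min.

    def rto(srtt: float, rttvar: float, rto_min: float, granularity: float = 1e-6) -> float:
        return max(rto_min, srtt + max(granularity, 4 * rttvar))

    srtt, rttvar = 100e-6, 20e-6          # assumed: ~100us smoothed RTT, ~20us variance

    for floor in (0.200, 0.020, 0.001):   # classic 200ms floor, ~20ms, tuned 1ms
        print(f"rto_min={floor * 1000:.0f}ms -> rto={rto(srtt, rttvar, floor) * 1000:.2f}ms")

With a ~100us RTT, the computed RTO is dwarfed by the floor in every case, so a single drop stalls the flow for hundreds or thousands of RTTs unless you can tune the floor down to ~1ms.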


TCP over UDP seems rather silly to me, but congestion control and segmentation in userland are pretty useful, especially since Google and its partners have built an ecosystem in which kernel updates on deployed devices don't happen.
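For a concrete sense of what "congestion control in userland" buys you: a userspace QUIC stack can host the controller inside the application binary and swap it per-connection. A toy sketch, with a hypothetical interface not taken from any real QUIC library (slow start omitted for brevity):

    from abc import ABC, abstractmethod

    class CongestionController(ABC):
        """Hypothetical per-connection hook a userspace QUIC stack might expose."""

        @abstractmethod
        def on_ack(self, bytes_acked: int, rtt: float) -> None: ...

        @abstractmethod
        def on_loss(self, bytes_lost: int) -> None: ...

        @abstractmethod
        def cwnd(self) -> int: ...

    class Reno(CongestionController):
        def __init__(self, mss: int = 1200):
            self.mss = mss
            self._cwnd = 10 * mss        # common initial window of 10 packets

        def on_ack(self, bytes_acked, rtt):
            # Additive increase: ~one MSS per cwnd's worth of acked bytes.
            self._cwnd += self.mss * bytes_acked // self._cwnd

        def on_loss(self, bytes_lost):
            # Multiplicative decrease, never below two segments.
            self._cwnd = max(2 * self.mss, self._cwnd // 2)

        def cwnd(self):
            return self._cwnd

Shipping that in the application means a congestion-control fix rides the normal app-update channel instead of waiting on an OEM kernel update that may never come.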



