TurboTLS: TLS connection establishment with 1 less round trip

r1ch · on Feb 14, 2023

An interesting idea, but QUIC / HTTP/3 also avoids the extra RTT for TLS negotiation by bundling it with the connection handshake and in a less janky way than this. I don't see a good reason for a server or browser developer to implement this when QUIC exists.

dwheeler · on Feb 14, 2023

TLS is used for other protocols, e.g., SMTPS (SMTP + TLS). But there's an extra DNS query for this case, and I don't think TLS setup time is a significant cause of delays. So I don't know how useful this is.

aseipp · on Feb 14, 2023

The latency hit of the extra trip will definitely be felt by end users if the endpoint is far enough away (e.g. ~100-200ms.) You can mitigate the initial setup other ways though, like the CDN approach: terminate TLS much closer with a proxy and use a warm, pre-established backhaul connection to the origin.

Less round trips are always good, though, without any extra stuff to put in place.
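
To put rough numbers on that, here is a back-of-the-envelope sketch. The round-trip counts are the usual textbook figures, and the names `SETUP_ROUND_TRIPS` and `time_to_first_response_ms` are made up for illustration. It ignores DNS lookups, packet loss and server processing time, so treat the output as an estimate only.

```python
# Round trips spent on setup before the client can send its first request.
# Textbook figures only: no DNS lookup, packet loss, or server think time.
SETUP_ROUND_TRIPS = {
    "TCP + TLS 1.3": 2,      # 1 for the TCP handshake, 1 for TLS 1.3
    "QUIC": 1,               # transport and TLS handshake combined
    "0-RTT resumption": 0,   # request rides along with the first flight
}


def time_to_first_response_ms(rtt_ms, setup_round_trips):
    """Setup round trips plus one more for the request/response itself."""
    return rtt_ms * (setup_round_trips + 1)


if __name__ == "__main__":
    for rtt_ms in (100, 200):
        for name, trips in SETUP_ROUND_TRIPS.items():
            total = time_to_first_response_ms(rtt_ms, trips)
            print(f"RTT {rtt_ms} ms, {name}: {total} ms to first response")
```

Under these assumptions, removing one setup round trip at a 200 ms RTT takes the first response from 600 ms to 400 ms.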

sam0x17 · on Feb 15, 2023

Having debugged stuff for someone crazy who wanted less than 50ms global latency for a private search engine, I can tell you the TLS negotiation adds significant time the first time you initialize the connection.

Karrot_Kream · on Feb 15, 2023

I've designed my own private CDN and unless you have edges near each geo, you're not getting that kind of latency. And even after TLS negotiation you have no idea what the MSS on a link along the way will be, which can cause fragmentation and retransmits. I've made do with some shenanigans with Noise and pre-shared certs and have tried to make payload sizes as small as possible.

sam0x17 · on Feb 15, 2023

Yes, this was using a presence in every GCP, Cloudflare, and AWS edge location simultaneously, all hooked together with latency-based routing.

Karrot_Kream · on Feb 16, 2023

Hah gotcha. Hope you got paid well for that. I do it for my own pleasure, but it's a lot of work to do as the internet just isn't designed for low latency interconnect.

8n4vidtmkvmk · on Feb 15, 2023

50ms is impossible. Lower bound is 68. One-way.
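
A figure in that range falls out of a speed-of-light estimate for two points on opposite sides of the Earth. The sketch below is an assumption about where the number comes from, not something stated in the comment. The circumference is the approximate equatorial value and the fiber refractive index of 1.47 is an assumed typical value. The bound only applies to antipodal endpoints; closer endpoints can do much better.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # exact, by definition of the metre
EARTH_CIRCUMFERENCE_KM = 40_075.0   # approximate equatorial circumference
FIBER_REFRACTIVE_INDEX = 1.47       # assumed typical value for optical fiber


def one_way_delay_ms(distance_km, refractive_index=1.0):
    """Propagation delay over distance_km at c divided by the refractive index."""
    speed_km_s = SPEED_OF_LIGHT_KM_S / refractive_index
    return distance_km / speed_km_s * 1000.0


# Farthest possible pair of points along the surface: half the circumference.
antipodal_km = EARTH_CIRCUMFERENCE_KM / 2

vacuum_ms = one_way_delay_ms(antipodal_km)
fiber_ms = one_way_delay_ms(antipodal_km, FIBER_REFRACTIVE_INDEX)

if __name__ == "__main__":
    print(f"antipodal, vacuum: {vacuum_ms:.1f} ms one-way")
    print(f"antipodal, fiber:  {fiber_ms:.1f} ms one-way")
```

With these inputs the vacuum figure comes out a little under 68 ms, and the fiber figure is closer to 100 ms.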

sam0x17 · on Feb 15, 2023

Emphasis on wanted. And it was quite obtainable depending on geo location. In some locations 70 was the best we could do, but for most of the US east coast we were able to get around 40.

j16sdiz · on Feb 14, 2023

I think most SMTP uses STARTTLS over a pre-established TCP connection.

mananaysiempre · on Feb 14, 2023

Not sure about real-world statistics, but the current IETF position is that SMTP STARTTLS for mail submission (not transport) is to be phased out in favour of “implicit” SMTP-over-TLS with no cleartext portion, due in part to the former being an implementation minefield[1].

[1] https://datatracker.ietf.org/doc/html/rfc8314#appendix-A

fomine3 · on Feb 15, 2023

Recent submission https://news.ycombinator.com/item?id=34736416

dwheeler · on Feb 14, 2023

There's a big push to use implicit TLS with SMTP, instead of STARTTLS. Here's a post about that:

https://blog.apnic.net/2021/11/18/vulnerabilities-show-why-s...

drewg123 · on Feb 14, 2023

QUIC is horribly inefficient on the server side, so there are legitimate reasons to use TLS 1.3 over TCP in high traffic scenarios (like CDN servers).

10000truths · on Feb 14, 2023

Interesting, can you explain in more detail on what makes QUIC more inefficient than TLS over TCP on the server side?

kixelated · on Feb 15, 2023

Kernels and hardware have been optimized for TCP. QUIC will catch up eventually.

10000truths · on Feb 15, 2023

AFAIK, kTLS and hardware TLS offload don't solve the latency problem anyways. Those only handle the AEAD and record encapsulation/decapsulation in the critical path, where maximizing throughput is the concern. Control messages are not handled, so session establishment with client hello, cipher exchange, key exchange etc. is still done entirely in userspace, and the handshaking process is where the latency issues arise.

kixelated · on Feb 15, 2023

Oh yeah, QUIC is an improvement in terms of latency and even throughput over TCP/TLS. The claim that "QUIC is horribly inefficient" centers around CPU utilization and the cost of delivering each byte. That's where hardware offload shines but it doesn't exist for QUIC yet.

drewg123 · on Feb 18, 2023

The only reason QUIC delivers throughput improvements over TCP is because it uses the same congestion control as BBR, so occasional packet loss doesn't kneecap the connection. If you use BBR TCP (or RACK TCP) on FreeBSD, you'll see the same improvement in throughput vs older TCP congestion control.

The same server that will do 375Gb/s at close to 50% idle will maybe deliver 70-90Gb/s of QUIC with the CPU maxed.

TLS+TCP delivers that performance with a lot of optimizations that just don't exist for QUIC:

- inline TLS offload (Mellanox CX6DX)

- async sendfile -- note, the above 2 mean that the kernel never even maps data from a file being sent to a client into memory, much less copies it to/from userspace like with QUIC. This cuts memory bandwidth to roughly 1/4 of what it would be with a traditional read/encrypt/write to socket server.

- TCP segmentation offload

- TCP & IP checksum offload

- TCP large receive offload

Some of these optimizations are slowly becoming available, but until all are present, QUIC will cost 2x - 4x as much to serve as TCP for a CDN workload.

drowsspa · on Feb 14, 2023

Doesn't it require you to already have connected to the server once?

r1ch · on Feb 14, 2023

There's a couple of "previously connected" bits with QUIC:

- The very first connection to a site is usually HTTP/2, which requires an additional RTT compared to QUIC, as the browser doesn't yet know if the server supports QUIC. In the response, the server can advertise the presence of QUIC support with the Alt-Svc header. This support flag can also be present in an HTTPS DNS record for the domain, but that isn't queried by all browsers yet. Future connections to the same server will default to QUIC once the browser is aware of support, saving an additional RTT.

- Once connected to a QUIC server, the encryption keys can be cached for future connections, allowing the client to send data with the initial connection request (so-called 0-RTT). This is only safe for idempotent requests, though, as this part of the protocol could be replayed by an attacker.
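
The replay caveat in the second point can be sketched as a client-side check. This is a minimal illustration, not how any particular browser decides; the function name `may_send_as_early_data` is hypothetical. The method set is the list of idempotent methods from the HTTP semantics spec, and real clients are often more conservative than this.

```python
# HTTP methods defined as idempotent by the HTTP semantics specification.
# Replaying one of these should leave the server in the same state.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})


def may_send_as_early_data(method: str) -> bool:
    """Return True if a request with this method could tolerate being replayed.

    Hypothetical helper: 0-RTT data can be captured and resent by an
    attacker, so anything non-idempotent should wait for the full handshake.
    """
    return method.upper() in IDEMPOTENT_METHODS


if __name__ == "__main__":
    for method in ("GET", "POST"):
        where = "0-RTT early data" if may_send_as_early_data(method) else "after handshake"
        print(f"{method}: send {where}")
```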

kixelated · on Feb 15, 2023

You're right that HTTP/3 requires Alt-Svc at the moment. QUIC itself doesn't require a pre-established connection (1-RTT), which is notable for non-HTTP/3 protocols and WebTransport.

londons_explore · on Feb 14, 2023

HTTP/3 seems to offer all these benefits already... And seems to be simpler and more compatible... And doesn't require a new DNS field which will surely trip up plenty of middleboxes...

ekr____ · on Feb 14, 2023

Without taking a position on TurboTLS versus H3...

There are actually two types of middlebox problems:

1. DNS resolvers which don't carry new record types.

2. Middleboxes which don't properly handle UDP-based protocols.

(2) applies to both H3 and to TurboTLS, and in both cases you need some kind of fallback in case things fail.

(1) applies to TurboTLS as specified. However, it's worth noting that H3 also has a DNS-based mechanism for advertising support via the HTTPS record (that is also used for ECH). However, you can also advertise H3 support via Alt-Svc, so presumably you could do the same with TurboTLS.

In general, any new transport like H3 or TurboTLS has to be offered on a best-effort basis with a fallback, otherwise you'll have a lot of hard failures.
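
A minimal sketch of that best-effort pattern, with made-up names throughout (`connect_with_fallback` and the two stand-in connect functions are illustrative, not part of any real stack). It tries transports strictly in order; real clients may instead race the attempts in parallel.

```python
def connect_with_fallback(attempts):
    """Try each (name, connect) pair in preference order.

    Returns (name, connection) for the first attempt that succeeds and
    raises ConnectionError if every attempt fails with an OSError.
    """
    errors = []
    for name, connect in attempts:
        try:
            return name, connect()
        except OSError as exc:  # TimeoutError and ConnectionError are OSErrors
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all transports failed: " + "; ".join(errors))


# Stand-ins for real connection attempts (hypothetical, for illustration).
def udp_blocked_by_middlebox():
    raise TimeoutError("no reply over UDP")


def plain_tcp_tls():
    return "tcp+tls connection"


if __name__ == "__main__":
    name, conn = connect_with_fallback([
        ("new UDP-based transport", udp_blocked_by_middlebox),
        ("TCP + TLS fallback", plain_tcp_tls),
    ])
    print(f"connected via {name}: {conn}")
```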

SahAssar · on Feb 14, 2023

> However, you can also advertise H3 support via Alt-Svc, so presumably you could do the same with TurboTLS.

Alt-Svc is an HTTP header, so by the time you get HTTP you already have TLS set up. In an Alt-Svc workflow you generally advertise h2 in TLS ALPN, then use Alt-Svc to advertise h3 capability in the headers of the first response, and then the client establishes an h3 connection and closes the h2 connection when it is ready (and the h2 connection does not have any queued requests). At least that is my understanding.
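
For illustration, here is a minimal sketch of reading such a header on the client side. The function name `parse_alt_svc` and the returned dictionary layout are made up for this example. The parser is deliberately naive: it splits on commas and semicolons without handling those characters inside quoted strings, so it is not a complete implementation of the header grammar.

```python
DEFAULT_MAX_AGE_S = 24 * 60 * 60  # Alt-Svc entries default to 24 hours without "ma"


def parse_alt_svc(value):
    """Parse an Alt-Svc header value into a list of advertised alternatives."""
    value = value.strip()
    if value == "clear":  # server asks the client to forget cached alternatives
        return []
    services = []
    for entry in value.split(","):
        parts = [p.strip() for p in entry.split(";") if p.strip()]
        protocol, _, authority = parts[0].partition("=")
        params = {}
        for param in parts[1:]:
            key, _, val = param.partition("=")
            params[key.strip()] = val.strip().strip('"')
        services.append({
            "protocol": protocol.strip(),               # ALPN id, e.g. "h3"
            "authority": authority.strip().strip('"'),  # e.g. ":443"
            "max_age": int(params.get("ma", DEFAULT_MAX_AGE_S)),
        })
    return services


if __name__ == "__main__":
    for svc in parse_alt_svc('h3=":443"; ma=3600, h2="alt.example.com:8443"'):
        print(svc)
```

A client would cache the `h3` entry and try QUIC on the next connection, which is why this mechanism cannot help the very first contact.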

ekr____ · on Feb 14, 2023

Yes, that's correct. Basically the only good way to have a "first contact" fast track setup like this is by priming in DNS. My point is that the situation is the same for TurboTLS and QUIC/H3 in this respect.

SahAssar · on Feb 15, 2023

I just meant that QUIC/H3 is beneficial even after you already have a connection, so upgrading an existing connection is a valid strategy. TurboTLS is only beneficial for establishing the connection, so it needs to have prior knowledge about the support to be useful at all.

londons_explore · on Feb 14, 2023

And H3 is already pretty widely deployed, so unless TurboTLS has some really compelling benefits, it will lose by default.

HTTP/3 is in most browsers that are under 3 years old, and most server stacks support it (usually with a bit of extra config).

ThePhysicist · on Feb 14, 2023

Encrypted Client Hello (ECH) is similar as it requires an extra DNS record, for a different purpose though.

bqmjjx0kac · on Feb 14, 2023

In fact, they both require the same DNS record: the HTTPS record.

(Well, technically ECH can obtain server configs any way, but HTTPS is recommended.)

JohnFen · on Feb 14, 2023

But what about all the non-hypertext protocols?

ekr____ · on Feb 15, 2023

HTTP/3 is based on QUIC, and these perf properties are due to QUIC, not H3.

With that said, the unusual performance dynamics of the Web make connection setup latency particularly important by contrast to, for instance, SMTP, where people don't really get bent out of shape if a message takes another 200ms to be delivered.

kixelated · on Feb 15, 2023

You can use QUIC without the HTTP/3 layer on top. It's a general purpose replacement for TCP+TLS.

londons_explore · on Feb 15, 2023

Does anyone use it for this?

jewel · on Feb 14, 2023

This reminds me of the Noise protocol, which lets you communicate securely with a single round trip.

http://www.noiseprotocol.org/noise.html#zero-rtt-and-noise-p...

jakear · on Feb 14, 2023

...provided a round trip has already been made. In other words, not a single round trip.

baby · on Feb 14, 2023

Not really, provided some data is already known, which is the case in TLS as well (either you know the public key of the server, or you trust a set of CAs).

jakear · on Feb 14, 2023

"Secure communication in a single round-trip" implies securely transmitting to a specific audience without any correspondence beforehand and securely receiving information from them afterwards - which seems impossible - probably because it is.

If you relax the constraints to allow for shared state beforehand (which could only arise from prior communication-trips of some sort), you're just at RSA: cool to be sure, but coming up on 50 years old at this point.

ekr____ · on Feb 14, 2023

As you say, if you want to have 0-RTT data, you must have some prior information about the server. This information can come in two main forms.

1. Have the client know the server's public key in advance. You can then do a RSA or ephemeral-static key exchange, as in gQUIC, OPTLS, or TLS Fasttrack.

2. Have the client and the server do an initial handshake and then reuse the symmetric key for future connections, as in TLS 1.3 and IETF QUIC (which uses TLS 1.3).

The public key version has a number of superficially compelling properties, in particular that you can publish the server's data somewhere like DNS ("0-RTT Priming") and thus have 0-RTT with the first connection. By contrast with the symmetric version you have to have a connection first. The public key mode also will work in this setting. However, making this work in practice has a number of challenges around authentication, anti-replay, anti-amplification, etc. Initially TLS 1.3 had both of these modes but we ultimately removed the public-key based one in favor of a symmetric-key based mode.

baby · on Feb 15, 2023

I’m not sure what the relation with RSA is, but you’re right that it’s simple. And that’s what Noise has been doing for years. There isn’t much else to look at here: unless you specifically need TLS, you should be using Noise.

ekr____ · on Feb 15, 2023

TLS 1.3 and QUIC both support 0-RTT modes as well (as noted in the original paper). Note that anything that runs over TCP of course first has to absorb the TCP round-trip (modulo TFO) whatever crypto it uses. That's why QUIC uses UDP, and the motivation for the use of UDP in TurboTLS.

baby · on Feb 15, 2023

You can also use QUIC with Noise (check nQUIC). The motivation behind using Noise is that it's much, much, much simpler, and so are the implementations.

foo42 · on Feb 14, 2023

fewer

dbj99 · on Feb 15, 2023

less. If you correct people's grammar you have to make damn sure you're right otherwise you look a little bit foolish...

less modifies singular nouns, fewer modifies plural nouns. Check here for more

https://www.latimes.com/socal/daily-pilot/opinion/tn-dpt-me-...

coreyp_1 · on Feb 14, 2023

https://www.merriam-webster.com/words-at-play/fewer-vs-less

1. See the exceptions section. Less is preferred in this construction.

2. Please do not misconstrue the opinion of one writer that lived 200 years ago into a proper grammar rule (see the history section). Worse, please do not be dogmatic about it when it has nothing to do with the topic. It is, in essence, the equivalence of an ad hominem attack.

foo42 · on Feb 15, 2023

Interesting. I wasn't aware of that. I had a colleague who constantly "corrected" us (with some humour) and even long after he left the team we all wince when we see/hear "less" when it should be (or so we believed) "fewer". I preferred it before, when I blindly said "less" but didn't always hear a voice in my head commenting on everyone else's use.

semafour · on Feb 14, 2023

It's also a joke, though: https://www.youtube.com/watch?v=zXINZxodu9U