A lot of this sounds like transport layer instead of application layer. I'm curious why it ended up in HTTP instead of TCP.
For example, you could specify that ports 49152 to 50179 formed a block on a given host, and so on in groups of 1028. Ports in a block would have to connect to the same port on the other end, and would share a packet buffer. There could be rules to let the network stack pass data on to a process in anticipation of a three-way handshake completing, TLS could assume that new connections share the same secrets as existing ones, and so on. That seems simpler, and it saves the wheel from having to be reinvented for every new application protocol.
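A toy sketch of the bookkeeping that scheme implies (the base port and the 1028-port block size are just the numbers from the example above; no real TCP stack does anything like this):

    # Toy sketch of the hypothetical port-block scheme described above.
    BASE_PORT = 49152
    BLOCK_SIZE = 1028

    def block_of(port: int) -> int:
        """Index of the block a port belongs to."""
        if port < BASE_PORT:
            raise ValueError("port is below the blocked range")
        return (port - BASE_PORT) // BLOCK_SIZE

    def same_block(a: int, b: int) -> bool:
        # Ports in the same block would share a peer port and a
        # packet buffer under this scheme.
        return block_of(a) == block_of(b)

    assert block_of(49152) == 0
    assert block_of(50179) == 0        # last port of the first block
    assert not same_block(50179, 50180)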
An obvious reason not to do that is that TCP is too widely used for the effects of changes to be predicted. Is that the only reason?
It ended up in the application layer because of HTTP gateways and NATs. The reason SPDY was HTTPS-only at Google was that HTTPS is pretty much the only port that nearly all corporate gateways let through as a "black box".
Google even did tests with HTTP over SCTP[1] and found that it solved a lot of the problems that SPDY did. It's well accepted that the transport layer is the "right" layer to fix this in, but it would not have allowed as wide a deployment.
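For the curious, opening an SCTP association from user space is already possible where the kernel supports it. A minimal Python sketch (Linux-only; example.com is a placeholder, since almost no public host or middlebox accepts SCTP, which is rather the point):

    # Minimal sketch: an SCTP socket from user space. Works only where
    # the OS ships SCTP (e.g. Linux); socket.IPPROTO_SCTP does not
    # exist on Windows or OS X, which is exactly the deployment problem.
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM,
                         socket.IPPROTO_SCTP)
    # Expect this connect to fail on the open internet: servers and
    # middleboxes overwhelmingly speak TCP only.
    sock.connect(("example.com", 80))
    sock.close()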
This is the major knock on HTTP/2. It's entirely focused on hacking around transport-layer issues while keeping TCP intact, and has shied away from fixing real and serious problems with the HTTP protocol itself. Here's a quote from the HTTP/2 FAQ: ( https://http2.github.io/faq/#can-http2-make-cookies-or-other... )
"In particular, we want to be able to translate from HTTP/1 to HTTP/2 and back with no loss of information. If we started “cleaning up” the headers (and most will agree that HTTP headers are pretty messy), we’d have interoperability problems with much of the existing Web."
Pretty disappointing really. It took 15 years of haggling and we ended up with precisely the old protocol but now in a new, improved binary format with multiplexing.
> Pretty disappointing really. It took 15 years of haggling and we ended up with precisely the old protocol but now in a new, improved binary format with multiplexing.
Yeah, we can have a perfect reimplementation of the widely used protocol to fix all of the problems at once. Just look at how successful IPv6 has been.
IPv6 is doing fine. I have IPv6 at home, and natively on all of my servers. According to my OpenWrt AP stats, about 47% of my household's inbound traffic is via IPv6 (over 8 days of basically just web browsing).
IPv6 doesn't fix most of the problems with IP, in fact it makes some worse (routing tables are now bigger because of bigger addresses and there can be more of them). If IPv6 actually did attempt to fix issues in IP it might have seen faster adoption.
I think you're confusing the transport layer with the session and presentation layers. That being said, HTTP/2 is more than just a different way of opening connections, and yes, changing TCP in such fundamental ways would be impossible given its widespread adoption; it would also burden a layer that isn't expected to do all of that, when it should stay flexible and enable more complex operations from the upper layers.
It's funny to think that a mere 20 or so years ago, Ethernet wasn't even a foregone conclusion. The first time I connected a computer to a network, I had to make sure I was plugging into the Ethernet and not the Token Ring. Now here we are, pushing transport-layer fixes into the application layer just because we don't want to have to explain to sysadmins how to reconfigure their firewalls?
Major residential ISPs aren't rolling out IPv6 largely because they already have enough IPv4 space for their customers.
Take a major UK ISP like Virgin Media. According to Hurricane Electric's BGP looking glass project, they originate 9.4M IPv4 addresses[0]; however, at the end of 2012, according to Wikipedia[1], they only had 4.8M customers.
Virgin Media can't even expand that easily, because they must lay cable in areas that weren't covered by NTL/Telewest infrastructure, which I believe is extremely expensive and time-consuming, with planning permission being what it is.
It's not supported at all on Windows or OS X and the implementations everywhere else are far less tested than TCP. That's a large immediate problem, particularly since you have to update the kernel to fix it, and it also means that many, many intermediaries (home routers, proxies, corporate firewalls, etc.) have never been pushed to support it at all, let alone as well as commonly-used protocols.
The good news is that the semantics are close enough that if the situation improves for both client support and, critically, intermediaries, it would be relatively straightforward to migrate to an HTTP/2-over-SCTP hybrid if it proved better in some way.
I'm completely aware of the fact that SCTP isn't in the native Windows networking stack, and I'll take your word that it's not in OS X. But as long as we put up all these Frankenstein solutions of handling things up the stack because firewall admins can't be bothered to upgrade or configure their stuff correctly, we are just not applying enough pressure for this to get changed!
And even if we now introduce this sad compromise that is HTTP/2, there also will be a lot of proxy/firewall appliances that block HTTP/2, and there will be the equivalent of the government entity or large corporation that was stuck on IE4/WinXP until 2014, using an outdated web browser or intranet server.
So maybe we should try to get incompatible protocols out much earlier, and if they turn out to have merit, we could enable them in released products and have Chrome/Firefox put up a nagging reminder: "Your web experience would be much improved (or: this premium content could be watched at higher resolution, or your banking website would be more secure, or ...) if your network infrastructure supported SCTP/IPv6/DNSSEC; please ask your ISP or administrator".
First, “sad compromise” is a pejorative value judgement, and that line of reasoning has mostly been marketed by people appealing to the authority of the legacy OSI model to make “this is new and different and I don't like it” sound more compelling. To make that argument compelling, someone has to actually do the hard work of analyzing the protocol and pointing out actual, specific engineering problems caused by it which would be fixed by using something like SCTP, or explain why, for example, the predicted sky-falling hasn't occurred in 15 years of TLS not being implemented at the kernel level.
Thus far, the only serious work I've seen shows that something like SCTP or QUIC could possibly be a fair percentage faster on lossy networks. That's something which merits future work, particularly since either would be relatively easy to swap into place for the lower levels of HTTP/2 now that the protocol has first-class support for the concepts, but it doesn't seem like a good reason to roll back deployment of a production-ready protocol to wait for everyone to upgrade their kernels first.
> there also will be a lot of proxy/firewall appliances that block HTTP/2
The beauty of reusing HTTPS is that this is not the case for most firewalls, and since HTTP/2 did not change the semantics, the default behaviour for anyone running an old tampering proxy is not to enjoy the performance benefits but otherwise experience no problems. That seems like a good compromise to me: full backwards compatibility, with the cost of non-support borne by the slackers, and reusing existing practice means that a much smaller percentage of users are affected.
> nagging reminder: "Your web experience would be much improved (or: this premium content could be watched at higher resolution, or your banking website would be more secure, or ...) if your network infrastructure supported SCTP/IPv6/DNSSEC; please ask your ISP or administrator".
The problem with this is that most users will just ignore the message and the few who try to escalate it are probably going to be told no because if their ISP/corporate IT was good they'd never have seen the message in the first place.
More seriously, though, there was a time when I had to install MacTCP and others had to install WinSock or Trumpet or whatever it was called. The upgrade was worth it then, and I bet it could be worth it again.
HTTP over SCTP gives you more than what HTTP/2 does. I just wish Google had made SPDY work only over SCTP as a measure to push for widespread adoption. Imagine finally having a seamless connection for every application on mobile devices, where it would switch from wifi to 4G without any serious latency or dropped connections.
To answer your specific question about why not in TCP, it's not deployable. Here's why:
* In order to get multiplexing and the other features introduced with HTTP/2, you need to change the protocol framing (see the sketch after this list). However, this means that the protocol is no longer backwards compatible. There are many ways to roll out non-backwards-compatible changes, but for TCP, they were deemed unacceptable. For example, you could negotiate the protocol change via a TCP extension. However, TCP extensions are known not to be widely deployable (https://www.imperialviolet.org/binary/ecntest.pdf) over the public internet. You could use a specific port, but that doesn't traverse enough middleboxes on the public internet (http://www.ietf.org/mail-archive/web/tls/current/msg05593.ht...). Yada yada.
* More importantly, TCP is generally implemented in the OS. That means updating the protocol requires OS updates, so we need both server and client OS updates in order to speak the new protocol. If you look at the very sad numbers on Windows version uptake, Android version uptake, etc., you'll understand why many people don't want to wait for all OSes to update in order to take advantage of new protocol features.
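To make the framing point in the first bullet concrete: every HTTP/2 frame starts with a fixed 9-octet header (RFC 7540, section 4.1), something TCP's wire format has no room for without an extension. A quick Python sketch of parsing it:

    # Parsing the fixed 9-octet HTTP/2 frame header (RFC 7540, 4.1):
    # 24-bit length, 8-bit type, 8-bit flags, then 1 reserved bit and
    # a 31-bit stream identifier.
    import struct

    def parse_frame_header(buf: bytes):
        if len(buf) < 9:
            raise ValueError("need at least 9 octets")
        len_hi, len_lo, ftype, flags, stream_id = struct.unpack(
            ">BHBBI", buf[:9])
        length = (len_hi << 16) | len_lo
        stream_id &= 0x7FFFFFFF    # drop the reserved bit
        return length, ftype, flags, stream_id

    # An empty SETTINGS frame (type 0x4) on stream 0, as sent at the
    # start of every HTTP/2 connection:
    assert parse_frame_header(b"\x00\x00\x00\x04\x00\x00\x00\x00\x00") \
        == (0, 4, 0, 0)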
"""
I understand the points you make, and am sympathetic. I feel the same way when I see people abusing HTTP to provide async notifications, etc.
The fact is, however, if it isn't deployed, no matter how nice it would theoretically be when it is, it isn't useful and people WILL work around the problem.
That is true regardless of whether the problem is at the application layer (e.g. HTTP), or at the transport layer (e.g. TCP), or elsewhere.
The primary motivation is to get things working, and making things that lack redundancy, or are elegant comes in at a distant second.
Deployed is the most important feature, thus things which quickly move protocols and protocol changes from theoretical to deployed are by far the most important things.
The longer this takes, the more likely that the work-around becomes standard practice, at which point we've all "lost" the game.
-=R
"""
Basically, this is another instance of Linus's quote (https://lkml.org/lkml/2009/3/25/632): "Theory and practice sometimes clash. And when that happens, theory loses. Every single time." Theoretically, it'd be far more elegant to fix this in the transport layer. But in practice, it doesn't work. Except, maybe, if you implement on top of UDP (e.g. QUIC), since that would be in user space and firewalls don't filter out UDP as much.
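A sketch of why UDP works as the escape hatch: all the framing lives in user space, so the kernel (and most firewalls) only ever see opaque datagrams. The one-byte "stream id" prefix below is made up for illustration; it is not QUIC's actual wire format.

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    PEER = ("203.0.113.1", 4433)    # placeholder address

    def send_on_stream(stream_id: int, payload: bytes) -> None:
        # Our own framing, invisible to the kernel: it just sees UDP.
        sock.sendto(bytes([stream_id]) + payload, PEER)

    send_on_stream(1, b"GET /")
    send_on_stream(2, b"GET /style.css")  # streams can't block each other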
If you're working on a new app and your primary concern is modern browsers, and your network stack supports it, you could probably design for HTTP/2 from the get-go. That's exciting: no more choosing between semantic code and boosting performance with hacks like image spriting.
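A sketch of what that can look like from Python today, assuming the third-party httpx library installed with its HTTP/2 extra (pip install 'httpx[http2]'):

    import httpx

    # http2=True lets the client offer "h2" via ALPN; it falls back
    # to HTTP/1.1 if the server doesn't speak it.
    with httpx.Client(http2=True) as client:
        r = client.get("https://example.com/")
        print(r.http_version)    # "HTTP/2" if negotiated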
It's annoying that the IETF is so behind the times that they still don't use HTML for their RFCs. In the last 20 years they've published three specs for the hypertext transfer protocol, but still seem unsure about actually using hypertext.
They're probably very concerned about readability on a VT-100 by users who don't have enough memory to run Lynx.
In the meantime the spec is unreadable on a mobile phone screen (their "HTML" version of the RFC is still the same fixed-width text stuffed in a `<pre>` tag).
This is something Google pushed so that Google can have as many tracking cookies as they like when you browse the internet, without the cookies causing a noticeable performance degradation because an HTTP request might exceed the MTU of American DSL connections.
This was one of the primary engineering criteria. No, really.
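For what it's worth, the mechanism in question is HPACK header compression (RFC 7541): headers that repeat across requests on one connection, cookies included, are sent once and then referenced by index. A toy illustration of the idea (not the real wire encoding):

    class ToyHeaderTable:
        """Toy stand-in for HPACK's dynamic table; not the real encoding."""
        def __init__(self):
            self.table = {}        # (name, value) -> index
            self.next_index = 1

        def encode(self, headers):
            out = []
            for pair in headers:
                if pair in self.table:
                    out.append(("index", self.table[pair]))  # a few bytes
                else:
                    self.table[pair] = self.next_index
                    self.next_index += 1
                    out.append(("literal", pair))  # full header bytes
            return out

    enc = ToyHeaderTable()
    req = [(":method", "GET"), ("cookie", "a-very-large-tracking-cookie")]
    print(enc.encode(req))    # first request: all literals
    print(enc.encode(req))    # second request: tiny index references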
I've only begun to study the protocol, but some things in the privacy section do concern me. Particularly:
HTTP/2's preference for using a single TCP connection allows correlation of a user's activity on a site. Reusing connections for different origins allows tracking across those origins.
Imagine a home or business where devices are configured for some privacy (cookies and various other things generally blocked) and the machines are communicating through a NAT or Gateway/Proxy. Wouldn't reused connections represent a unique threat in terms of allowing (third-party) websites to discern the behavior of individual machines and users?
Perhaps my understanding is way off, but I don't think it's possible to reuse a connection across different origins (unless perhaps the IP address of the server was the same for those origins)
Clients SHOULD NOT open more than one HTTP/2 connection to a given host and port pair, where the host is derived from a URI, a selected alternative service [ALT-SVC], or a configured proxy.
suggests to me that if you have TopLevelOrigin1 open and TopLevelOrigin2 open, and they both embed content from ThirdPartyOriginA, you'd (likely) have only ONE connection to ThirdPartyOriginA. I thought that would be an example of "reusing connections for different origins", and there would seem to be a correlation risk.
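In other words, the concern is a client connection pool keyed only on (host, port), along these lines (a hypothetical sketch, with made-up names):

    pool = {}

    def get_connection(host: str, port: int = 443):
        # One entry per (host, port); the stand-in string represents a
        # real HTTP/2 connection object.
        key = (host, port)
        if key not in pool:
            pool[key] = f"connection to {host}:{port}"
        return pool[key]

    # Pages from TopLevelOrigin1 and TopLevelOrigin2 both embed the
    # third party, so both requests land on the same connection:
    c1 = get_connection("thirdparty-a.example")
    c2 = get_connection("thirdparty-a.example")
    assert c1 is c2    # hence the correlation concern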
I was thinking that the "one connection per host:port" would (normally) be based on host NAME. However, your comment and the fourth paragraph in 9.1 Connection Management leave me thinking that it could (optionally) be based on resolved IP address.
To further complicate things, Alt-Svc can redirect requests to a different host. So HTTPS requests to TopLevelOrigin1 and TopLevelOrigin2 could all go to example.com and its IP Address.
Maybe a (welcomed!) reply will help clarify the correlation risk scenarios. I'll also search for prior discussions about it.
That doesn't increase the correlation risk, since the same could be done today with TLS session IDs or session tickets: even if you open more than one connection to the third party, your browser will attempt to reuse the same TLS session for performance reasons (it shortens the handshake and avoids expensive public key calculations).
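A sketch of that session reuse with Python's ssl module (3.6+); note that with TLS 1.3 the session ticket can arrive after the handshake, so this is most reliable against TLS 1.2 servers:

    import socket, ssl

    ctx = ssl.create_default_context()

    def connect(host: str, session=None):
        raw = socket.create_connection((host, 443))
        return ctx.wrap_socket(raw, server_hostname=host, session=session)

    first = connect("example.com")
    # A second, separate TCP connection that resumes the same TLS
    # session, giving the server a stable identifier across both:
    second = connect("example.com", session=first.session)
    print(second.session_reused)   # True if the server accepted resumption
    first.close(); second.close()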
On the flip side, the use of TLS means that connections to different origins must be kept separate, even with Alt-Svc. If I'm reading https://tools.ietf.org/html/draft-ietf-httpbis-alt-svc-06 correctly, "example.com" MUST present the certificate for TopLevelOrigin1 or TopLevelOrigin2, which obviously requires a separate TLS handshake and thus a separate TLS session for each one.
I appreciate you mentioning the Session ID and Session Ticket scenarios. Those can be disabled though, correct? HTTP/2 over cleartext TCP is also possible? Maybe still some increase in correlation risk due to those?
Do you think the Alt-Svc scenario we are talking about would guarantee two separate HTTP/2 connections?
> I appreciate you mentioning the Session ID and Session Ticket scenarios. Those can be disabled though, correct?
Yes, the server can ignore both and always start a new session. And you could modify a client to never send either, but if you are already modifying the client code you could also modify it to always use a separate connection.
> HTTP/2 over cleartext TCP is also possible?
In theory yes; in practice most will only implement it over TLS, to avoid middleboxes breaking it. TLS-intercepting middleboxes won't negotiate HTTP/2, so it'll fall back cleanly in that case.
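Concretely, that negotiation happens via ALPN inside the TLS handshake; a middlebox that re-terminates TLS and knows nothing about HTTP/2 simply never offers "h2". A sketch with Python's ssl module:

    import socket, ssl

    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])    # preference order

    raw = socket.create_connection(("example.com", 443))
    tls = ctx.wrap_socket(raw, server_hostname="example.com")
    # "h2" if both ends speak HTTP/2, else "http/1.1" (or None if the
    # peer ignores ALPN entirely); that's the clean fallback.
    print(tls.selected_alpn_protocol())
    tls.close()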
> Do you think the Alt-Svc scenario we are talking about would guarantee two separate HTTP/2 connections?
Alt-Svc to a different hostname will always use TLS. From the draft: "Clients MUST NOT use alternative services with a host that is different than the origin's without strong server authentication; this mitigates the attack described in Section 9.2. One way to achieve this is for the alternative to use TLS with a certificate that is valid for that origin."
And with current TLS, the hostname is sent on the handshake (SNI), so it can't use the same session for different hostnames.
I think the security.ssl.disable_session_identifiers pref in Gecko browsers is meant to allow for it without modifying the code, and I haven't spotted a similarly easy way of controlling HTTP/2 connections. Otherwise, point taken.
Is there anything we haven't discussed that would fall within the HTTP/2 spec's "Reusing connections for different origins allows tracking across those origins." warning?
The HTTP protocol will have nothing to do with making UI/UX better. People will continue to build interfaces that are unintuitive, resource-intensive, unnecessarily complex, and utterly stupid, because they can.