Important to note that unless your Nginx instance has a special (read: very high) keepalive limit configured, Nginx has a fairly reasonable defense against HTTP/2 rapid reset attack by default, as the article says. Still, interesting to see the response to these attacks.
I’m stuck trying to figure out if this is technically desired behavior or not. If you were retroactively designing http/2 with this knowledge, would you have done anything different?
Systems that require very high cognitive load on their human operators (whether machines, programming languages, etc.) are always destined to fail. Human beings are not good at doing boring, repetitious work that requires them to stay focused: lapses will occur. And hackers are going to find and exploit those gaps. So the best way to avoid those problems is to build solutions or specifications in which those gaps are not even possible.
Modern software systems are some of the most complex systems that humans have ever invented--and they just keep getting more complex over time as we layer new things on top of them. Think of a really high Jenga tower with lots of holes in the base.
That means we need to strive to keep things simple. That may mean making decisions that prevent common or high-impact failure cases from being possible, even if doing so is a little more difficult, or eliminates an esoteric use case. This is even more true when writing specifications that others will implement and/or be expected to conform to.
I guess the parent post wants to know if there are any _specific_ and _effective_ changes for this kind of attack.
In this case, simplifying the protocol won't help -- the vulnerability is:
1. Backend servers can't cancel immediately (this is not a protocol problem).
2. The client can make concurrent requests on a single connection (this is the whole point of HTTP/2).
3. The concurrency limit is predetermined; there is no way for the server to throttle without user-visible errors.
4. The client can cancel any request mid-flight (removing this would be equally bad, security-wise).
Unless you remove the concurrency, making the protocol simpler won't fix it.
The protocol designer needs an adversarial mindset, not a simpler mind.
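To make the attack concrete: a single packet can carry many request+cancel pairs, because HTTP/2 frames are tiny. This is a pure-Python sketch of how such a payload is laid out (frame format per RFC 9113; the HPACK header block and host name here are placeholder assumptions, not a working exploit):

```python
import struct

def frame(ftype: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    # 9-byte frame header: 24-bit length, 8-bit type, 8-bit flags, 31-bit stream id
    header = struct.pack(">I", len(payload))[1:] + bytes([ftype, flags])
    header += struct.pack(">I", stream_id & 0x7FFFFFFF)
    return header + payload

HEADERS, RST_STREAM = 0x1, 0x3
END_STREAM_END_HEADERS = 0x5
CANCEL = 0x8  # RST_STREAM error code

# Stand-in HPACK block: indexed :method GET, :path /, :scheme http,
# plus a literal :authority of "example.com" (11 bytes).
header_block = b"\x82\x84\x86\x41\x0b" + b"example.com"

payload = b""
for stream_id in range(1, 200, 2):  # client-initiated streams use odd IDs
    payload += frame(HEADERS, END_STREAM_END_HEADERS, stream_id, header_block)
    payload += frame(RST_STREAM, 0x0, stream_id, struct.pack(">I", CANCEL))

# 100 request+cancel pairs at 38 bytes each fit in a handful of TCP packets.
print(len(payload))
```

Each pair is a 25-byte HEADERS frame plus a 13-byte RST_STREAM frame, which is why the amplification per packet is so large.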
If I understand it correctly, a big part of the problem is that 1) requests which are in the process of being cancelled are not counted towards the concurrency limit, and 2) you can create and cancel a request in the same packet.
1 allows you to have more pending requests than intended, making some form of DDoS possible. 2 allows it to trivially scale to hundreds of requests per packet rather than just the pending-stream limit, limited only by the packet size.
Disallowing 2 should be fairly trivial, as there is no valid reason to cancel a request in the same packet you started it. I'd consider it more of an implementation bug than a protocol problem.
Issue 1 is definitely a protocol problem though, and it's going to be a bit trickier to fix as it would require nontrivial changes to the request state machine. A fix would require subtracting a request from the pending-stream count not when it is cancelled, but when its resources have been fully cleaned up - and ideally you'd even add some sort of throttling on that to make it even harder to abuse.
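That fix can be expressed as a small accounting sketch (hedged illustration; the class and method names are invented, not any real server's internals): a stream occupies a concurrency slot from open until cleanup, not merely until cancellation.

```python
class StreamAccounting:
    """Counts a stream against the concurrency limit until its
    resources are released, not merely until it is cancelled."""

    def __init__(self, max_concurrent: int):
        self.max_concurrent = max_concurrent
        self.active = set()      # streams still doing work
        self.cancelled = set()   # cancelled but not yet cleaned up

    def open(self, stream_id: int) -> bool:
        # Cancelled-but-uncleaned streams still occupy a slot.
        if len(self.active) + len(self.cancelled) >= self.max_concurrent:
            return False  # refuse the new stream
        self.active.add(stream_id)
        return True

    def cancel(self, stream_id: int):
        self.active.discard(stream_id)
        self.cancelled.add(stream_id)  # slot is NOT freed yet

    def cleanup(self, stream_id: int):
        self.cancelled.discard(stream_id)  # slot freed only now

acct = StreamAccounting(max_concurrent=2)
assert acct.open(1) and acct.open(3)
acct.cancel(1)            # rapid reset: cancel immediately...
assert not acct.open(5)   # ...but the slot is not free yet
acct.cleanup(1)
assert acct.open(5)       # freed only after full cleanup
```

With this accounting, a flood of cancellations can't push the server past its concurrency limit, because the cost of cleanup is charged to the client's slot budget.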
FYI this is for the commercial nginx product, hastily purchased by F5 a few years back when software load balancers were annihilating their hardware offering.
Curious to see F5 still playing games with their own CVE disclosure on the BIG-IP product though... assigning it MITRE CWE-400 is just lying.
Both http2_max_concurrent_streams and keepalive_requests (the configuration parameters discussed in this article) are configuration parameters available in open source nginx:
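For example, something like the following in a server block (the values shown are the documented defaults, included here purely for illustration):

```nginx
server {
    listen 443 ssl http2;

    # Cap on concurrent streams per HTTP/2 connection (default: 128).
    http2_max_concurrent_streams 128;

    # Close the connection after this many requests (default: 1000).
    keepalive_requests 1000;
}
```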
From some first-hand experience over the last few months… these suggestions and the patch will help prevent a single client from overwhelming an NGINX server, but they will do little to stop even a modest botnet from generating enough requests to be a problem. I believe keeping some state on IPs and downgrading those that exceed limits to HTTP/1.1 is the only effective defense. Tuning those thresholds to get them right is… challenging.
If the only viable fix is to downgrade clients to an earlier protocol, do you take that to mean that there is a fundamental weakness in the protocol itself?
Hehe, when I heard about the attack a couple of days ago I was interested to know if Nginx was affected and did a search on Google for the CVE of that attack followed by the name of Nginx.
I didn’t find anything relevant so I assumed that Nginx was not affected.
There is a difference between an application being innately vulnerable and a user configuration exposing a vulnerability.
Interestingly enough, HAProxy seems to have the same mitigation:
> Until HAProxy dips below the configured stream limit again, new stream creation remains pending—regular timeouts eventually apply and the stream is cut if the situation does not resolve itself. This can occur during an attack.
That is, if I read it correctly, default configuration is safe and you can use configuration of stream limits to ensure you are not vulnerable, but they are saying HAProxy is not vulnerable...at least in the title. Later on they soften the language:
I think the important distinction is ‘a user may plausibly have this non-default config’ vs ‘this config is so obscure almost nobody will be running it this way’.
I am not sure I understand how stream limit configuration between two L4/L7 load balancers is meaningfully different. In my mind, either the configuration of stream limits is a vulnerability for all L4/L7 load balancers that offer that configuration or it's not for all of them.
If one doesn't _offer_ configuration of stream limits and therefore is not susceptible to user misconfiguration, then I would get the distinction. But as I understand it both HAProxy and NGINX have the same configuration options which _could_ be vulnerable if configured poorly by the user. One is just putting a lot more positive spin on it.
Nginx and HAProxy work around the issue in different ways.
Nginx by default simply kills the entire connection after 1000 requests. With this attack, that's two packets. This severely limits its amplification and basically makes the bypass of the concurrent stream limit a moot point - unless you manually increased the requests-until-killed count.
HAProxy avoids the issue by deviating from the specification. You are supposed to only count active requests towards the concurrent stream limit and ignore cancelled ones, but HAProxy does count cancelled requests and only removes them from the stream count once their resources are fully released. In practice this means the attack isn't any worse than regular HTTP/2 requests.
The protocol-level bug still exists, but in both cases it just can't be used to launch an attack anymore.
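The nginx side of this is easy to model (toy sketch; the 1000-request default is real, everything else here is invented): the per-connection request cap converts the attack's cheap in-connection amplification into expensive new connections.

```python
import math

def connections_needed(total_resets: int, keepalive_requests: int = 1000) -> int:
    # Each connection absorbs at most keepalive_requests streams before
    # nginx closes it, so the attacker pays a fresh TCP+TLS handshake
    # for every keepalive_requests rapid resets.
    return math.ceil(total_resets / keepalive_requests)

# A million rapid resets now cost a thousand full connection setups,
# which puts the attacker back in ordinary-DDoS territory.
print(connections_needed(1_000_000))
```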
OP mentioned they didn't find Nginx listed on the CVE, and the reply said
>If you read the article, you'll see that the default configuration is not affected.
Which, in the context of OP's comment, implies that the CVE wouldn't be associated because the default config is not affected.
Hence my reply that CVEs don't care whether it's the default config or not. If there is a CVE associated with the program, there is a CVE associated with the program, rare config or not.
Here’s the thing. I use Nginx. Some of the configurations in which I use Nginx were mostly copy-pasted from third-party recommendations. Hence my initial assumption, that no action was needed because I didn’t find anything mentioning Nginx when I searched for this attack a few days ago, needed to be revisited.
When I saw the OP article, it turned out there was reason for me to take a closer look at my Nginx instances, to see whether any of the configs recommended by third parties involved changing values that could make this attack affect me.
If someone asked me how to "speed up the web", I would not suggest "use HTTP/2".
I would remove ads and other garbage. As a decades long non-popular browser and TCP client user, I can testify this works very effectively. I prefer to have full control over the resources that I request, whether text or binary, so no auto-loading resources, no Javascript-requested resources and no HTTP/2 "server push". The clients I use do not auto-load resources, run Javascript nor carry out "server push". Works great for me. Web is not slow.
The protocol originated at an online advertising services company and was developed by companies that profit from the sale and delivery of online advertising; according to its proponents, HTTP/2 was designed to "speed up the web".
I respect that opinions on HTTP/2 may differ. If someone loves HTTP/2, then I respect that opinion. In return I ask that others respect opinions that may differ from their own, including mine. NB. This comment speaks only for the web user submitting it. It does not speak for other web users. IMHO, no HN commenter can speak for other web users either. Thank you.
I would say that a text-only experience is valid, but I don't think it's how the majority of people want to use the web. Users want a rich multimedia experience.
If HTTP/2 speeds up a rich multimedia web experience then it may legitimately be one way to "speed up the web" for someone who expects that level of experience.
I don't think it's fair to criticize a protocol for who designed it. The specification is out there for anyone to interpret, and if there is a specific complaint about its design then make it.
HTTP/3 is now the version to upgrade to. We're in Google cloud so we've been running that for some time now without issues. Works great actually.
I don't get all this anti HTTP 2 & 3 sentiment on hacker news. What's wrong with people here? HTTP 1.1 is a quarter century old at this point. This sounds just like a bunch of grumpy old men arguing against progress. Time to move on. Yes HTTP/1.1 works. But it's also a bit limited and slow in various ways that both new HTTP variants address. One little bug in nginx is not going to change anything. Bugs happen all the time. They get fixed and people move on. I'm not hearing a lot of rational arguments here.
You could 'caddy upgrade' pretty quickly to get the patch (servers had updated go), though the release number bump didn't happen immediately.
Running the same now, or pulling a new binary, using xcaddy, etc. will get you 2.7.5 which also includes some other small fixes not related to rapid reset.
> Like it was not enough to make HTTPS default, they need to eradicate the opposition.
I think you need to elaborate your world view by many paragraphs before I can understand what you're trying to say.
You're against HTTPS? Plain TLS wrapping HTTP?
> they profit from root certificates.
The (web) root certificate industry has never been weaker than it is today, thanks to free root CAs like LetsEncrypt.
> HTTP/1.1 is the fastest, most open, web you'll ever have; because it is small.
So run it. Nobody's stopping you. I won't. Between Firesheep, the fact that (thanks to Snowden) we no longer have to be called conspiracy theorists for believing in Echelon, all the webpage ad and bitcoin-mining injectors, and plain DPI middleboxes, I don't think HTTP is a reasonable default.
Google used to send "Conncetion: Close" (with the typo) to work around middleboxes that did shenanigans to headers but luckily only compared a weak checksum of the header rather than doing a string comparison.
HTTP/1.1 is open and simple, sure, but it's also being interfered with on a massive scale. Companies out there are selling ad injector boxes to ISPs. They only work on HTTP. You don't have ads on your blog? Well, you do now, for the visitors from some ISPs.
Now HTTP/2 and HTTP/3, I see more of your point. Encryption is just table stakes to get a working website, at this point, but nobody's obligated to race to eliminate every RTT. Plain HTTPS is fine.
CSS sprites are still (last I checked) a bit faster than individual resources over HTTP/2 or 3. But if I'm making a photo album showing 50 thumbnails at once, then I'm unlikely to use plain simple HTTPS with individual resources. I'd at least have to choose between CSS sprites and HTTP/2/3. It would just make for a poor user experience otherwise. And if you're not making a website for your users, then what is it for?
All major browsers have now disabled plain HTTP by default: you need to change a setting to even be able to access an HTTP URL.
Anti-virus software blocks native apps that try to connect on port 80 and you cannot make them open the port even if the setting is available.
I will always use HTTP/1.1 on port 80, but my customers won't be able to connect even if they try; my only option is to tell them to uninstall their anti-virus and hope that works.
Forcing a certificate that is gatekept by root certificates and forces you to identify yourself is the largest censorship mechanism humanity has had so far: it protects the consumer from those who don't have a root cert, but hurts a producer that is not complying with the authority.
And HTTP/2 and 3 try to make the certificate a base requirement, meaning you won't even be able to connect without one.
Pseudonymity was always the most important feature of the internet.
Watch them come after TCP and UDP soon: They will say that to use unencrypted protocols you need a license from your government.
> I will always use HTTP/1.1 on port 80 but my customers won't be able to connect even if they try, my only option is to tell them to uninstall their anti-virus and hope that works.
Because of infrastructure malware like ad injectors at ISPs, it's probably in your customers' best interest to use HTTPS.
Hell, it's better to use HTTPS with a self-signed cert than plain HTTP.
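For anyone who hasn't done it, a self-signed cert is one command (a common openssl invocation; the key size, lifetime, and subject name here are just example choices):

```shell
# Generate a self-signed certificate and unencrypted key, valid one year.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout key.pem -out cert.pem -days 365 \
  -subj "/CN=example.internal"
```

Browsers will still warn on it, but the transport is at least encrypted, which is the point being made above.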
> hurts a producer that is not complying with the authority.
There's always the risk of a conspiracy of vendors deplatforming someone. That's true. I'd be more worried about your ISPs or electricity companies unilaterally shutting off your service.
If you see Letsencrypt going all political, like Patreon, and kicking off people with the wrong views, then yeah we have a problem.
> They will say that to use unencrypted protocols you need a license from your government.
It impacts the product; there is nothing sensational about that. If it wasn't impacted, that's what they would have said, and with Nginx being the tool of choice for many high-volume installations, you can bet there are many thousands running in non-standard configs.