In some ongoing DDoS attacks, I've heard companies say they're dedicating more resources to combatting the problem.
Is active mitigation mainly about sifting through incoming traffic, trying to find signatures and update filters in real time? Or is it more about replacing dynamic content with static content as it's targeted? Or are there other ways resources get surged during a crisis?
I.e., is it heavy on manual analysis, or are we at the point yet where we can automate mitigation?
Just curious. Would recommend an AMA, but that's not this site.
(Answer from one of my more knowledgeable colleagues since I work on supporting the infrastructure rather than DDoS mitigation itself):
The larger the attack, the simpler the vector.
For the flood attacks (TCP, UDP, DNS, NTP, etc.), creating accurate firewall rules within your cloud-scrubbing provider handles a large portion of this; the remainder can be mitigated by connection rate-limiting or TCP connection mitigations (checking that it's a valid 3WHS before allowing connections to the origin).
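To make the per-source connection rate-limiting concrete, here is a minimal token-bucket sketch. This is illustrative only: real mitigation gear does this in the dataplane (firewall modules, XDP), not in Python, and the rate/burst numbers are arbitrary assumptions.

```python
import time

class SourceRateLimiter:
    """Token bucket per source IP: each source may open at most `rate`
    new connections per second, with bursts of up to `burst`."""

    def __init__(self, rate=10.0, burst=20.0):
        self.rate = rate
        self.burst = burst
        self.buckets = {}  # src_ip -> (tokens, last_timestamp)

    def allow(self, src_ip, now=None):
        if now is None:
            now = time.monotonic()
        tokens, last = self.buckets.get(src_ip, (self.burst, now))
        # Refill tokens for the time elapsed, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[src_ip] = (tokens, now)
        return allowed
```

A source that floods SYNs exhausts its bucket and gets dropped, while other sources (and the same source after backing off) are unaffected.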
Complex L7 attacks require more effort and usually shift around in what they are attacking; this takes more analysis to pin down, though the L7 bot defenses, intelligent rate limiting and automatic traffic analysis help with this.
DDoS mitigation is not strictly a science; there is an art to it as well that comes from experience and from learning day to day as attacks evolve. As attacks start up, mitigations may be too strict or too loose; that is the benefit of an expert SOC staff who monitor the situation and adjust as needed. Relying on automation alone for this will likely leave the customer frustrated with the outcome.
>"buy a ton of network capacity in multiple locations around the world"
There are estimates that the Dyn attack was 1.2Tbps. What provider do you work for that can absorb that?
>"scrub the traffic using some expensive fancy appliances(*)"
Does scrubbing work with a bunch of randomized source ports and source addresses? How do you find the "signature" in that to scrub? It was my understanding that the traffic in the Dyn attacks was indistinguishable from legitimate traffic. Can you explain how traffic scrubbing works in such scenarios? Or do you just drop stuff on the floor?
Yes it is possible. For example, uniformly randomized ports and addresses are a signature - if you study real traffic you find it doesn't look like that.
A lot of smart people have spent a lot of time thinking about this problem.
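The "uniform randomness is itself a signature" point can be made concrete with a chi-square test against a uniform distribution over the port space. A rough sketch (the 16-bin choice is arbitrary, and real systems baseline against observed traffic rather than pure uniformity):

```python
def uniformity_score(ports, bins=16):
    """Chi-square statistic of observed source ports against a uniform
    distribution over the 16-bit port space. Values near `bins - 1`
    are consistent with uniform randomness; large values are not."""
    counts = [0] * bins
    for p in ports:
        counts[p * bins // 65536] += 1
    expected = len(ports) / bins
    return sum((c - expected) ** 2 / expected for c in counts)
```

Legitimate clients cluster in the OS ephemeral-port range (e.g. 32768-60999 on modern Linux), so real traffic scores high here while spoofed traffic with uniformly random ports scores suspiciously low, which is exactly the tell.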
Well sure it's a signature of being under attack, but is there also a solution to whom you are going to block and whom you are going to send a response to? That is not immediately obvious to me when packets have random source IPs, ports, query IDs and normal-looking domains.
But I don't believe that most of these DDoS providers' "scrubbing services" handle this, do they? It seems like a great fit for machine learning, but in practice is anyone doing this? That's kind of what I was trying to ask.
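On the "normal-looking domains" point: public write-ups of the Dyn attack described floods of queries for random labels under valid zones, and one simple per-query feature for that is label entropy. A sketch, under the assumption that attack labels are machine-generated:

```python
import math
from collections import Counter

def label_entropy(label):
    """Shannon entropy (bits per character) of a DNS label.
    Random machine-generated labels score near log2(alphabet size);
    human-chosen labels like "www" or "mail" score far lower."""
    counts = Counter(label)
    n = len(label)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

This is a signal to feed into a classifier, not a verdict on its own: plenty of legitimate CDN and tracking hostnames are also high-entropy.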
The second part is the tricky part - I've gotten offers from DDoS mitigation providers that advertise that level of capacity, but how much it helps depends entirely on how much of that traffic they can scrub.
If they only scrub half of that traffic, you're still getting a massive DDoS attack against your origin servers anyway. And the frustrating thing about buying DDoS mitigation is that you have absolutely no way of knowing how much traffic any DDoS mitigation service can actually scrub of a given attack before it hits you...
(Answer from one of my more knowledgeable colleagues since I work on supporting the infrastructure rather than DDoS mitigation itself):
If the DDoS mitigation provider is doing their job, they will ingest all the traffic, scrub out the bad and return the good. While obvious, there are things a customer should know when entering into that arrangement:
- Does the provider have the ingress capacity to absorb all the attack traffic?
- Are their scrubbing centres peered with Tier 1 transit providers to reduce carrier congestion?
- Do they have a policy on dropping traffic at certain volumes?
- Do they charge you based on attack volume or clean traffic?
- Do they have rate-limiting in place towards the customer to protect them from high-volume attacks while mitigations are optimized to catch all the attack traffic?
Ensuring your provider has these technical and contractual terms in place will make sure they can actually offer value when under attack.
Agreed. Even if they have the capacity, if they can only identify a portion of it as "interesting" and then pass the rest on to you, they haven't solved the problem. And like you said, you are still getting DDoSed and paying dearly for scrubbing.
I am curious if the traffic analysis at the scrubbing center is anything more than coarse-grained. I am assuming it is.
> There are estimates that the Dyn attack was 1.2Tbps. What provider do you work for that can absorb that?
I work for F5 Silverline.
It would be naive to brush off an attack of that size, as it would stress any DDoS mitigation provider.
We have the capacity to take it (and are adding a lot more in the coming months, precisely because of attacks like this and the previous big one that hit Brian Krebs), but you would have to talk to Sales about what sort of guarantees we would be willing to provide.
That said, at least that would be our problem, since we would have to pay for the bandwidth consumed - not the customer! ;)
> Does scrubbing work with a bunch of randomized source ports and source addresses? How do you find the "signature" in that to scrub? It was my understanding that the traffic in the Dyn attacks was indistinguishable from legitimate traffic. Can you explain how traffic scrubbing works in such scenarios? Or do you just drop stuff on the floor?
(Answer from one of my more knowledgeable colleagues since I work on supporting the infrastructure rather than DDoS mitigation itself):
Most attack traffic has a signature of its behaviour beyond source IP and port.
For simple flood attacks, connection rate-limiting per source IP is very effective. Typically there are some simple firewall rules that will work too, especially with attacks using a vector the target doesn't support (e.g. block all GRE traffic coming to your website, which only serves TCP 80/443).
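The "block vectors the target doesn't support" rule reduces to a tiny allow-list. A sketch for a hypothetical web-only origin (the allowed set is an assumption, not anyone's actual policy):

```python
# Web-only origin: anything that is not TCP to 80/443 can be dropped
# outright, which kills reflection vectors (GRE, NTP, DNS floods)
# before any per-flow analysis is needed.
ALLOWED = {("tcp", 80), ("tcp", 443)}

def should_drop(protocol, dst_port):
    """Return True for packets outside the service's small allowed surface."""
    return (protocol, dst_port) not in ALLOWED
```

In practice this lives in ACLs at the scrubbing edge, but the logic is exactly this simple, which is why the large-volume attacks are often the easiest to filter.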
If the attack is operating at L7, there are additional fingerprinting methods we can use around HTTP headers, intelligent rate-limiters, JavaScript injection to prove an actual human, etc.
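As a flavor of what L7 header fingerprinting can look like, here is a toy scoring heuristic. The header choices and weights are invented for illustration; real bot defenses use far richer signals (header order, TLS fingerprints, JavaScript challenges).

```python
def bot_suspicion_score(headers):
    """Crude L7 heuristic: browsers send a fairly predictable header
    set, while naive flood tools often omit or mangle them.
    `headers` maps lowercased header names to values."""
    score = 0
    if "user-agent" not in headers:
        score += 3
    if "accept" not in headers:
        score += 2
    if "accept-language" not in headers:
        score += 1
    ua = headers.get("user-agent", "").lower()
    # Flood scripts frequently reuse known HTTP-library default UAs.
    if any(tag in ua for tag in ("python-requests", "curl", "go-http-client")):
        score += 2
    return score
```

A score threshold would then feed the intelligent rate-limiter: suspicious clients get challenged or throttled first rather than blocked outright.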
Thanks for the response. I didn't know F5 had such an offering, I will certainly check it out.
> For simple flood attacks, connection rate-limiting per source IP is very effective.
But if each of the 8 million bots in a hypothetical botnet sends exactly one SYN packet, then you can't really rate-limit per source IP. Or, in the case of DNS, a single UDP datagram, which might be indistinguishable from a "legitimate" DNS query datagram.
Also, rate limiting is different from what I would think of as "scrubbing", unless I am thinking about scrubbing wrong. Back to my example of the botnet where the command and control tells each node to send a single packet or a small handful of packets towards a destination: there's very little for a rate-limit filter to match on in that case, no?
I think "rate limiting" is used as an indicator of malicious intent rather than in the QoS sense. If we classify traffic as malicious we drop it.
The single SYN will fail the 3WHS test, so we would drop it. For DNS we can usually fingerprint the attacker and block traffic. Because we are a full proxy we can do packet inspection as well. ;)
There's a fair amount of intelligence already built into our products which we of course make use of (BIG-IP LTM/AFM/ASM, "IP Intelligence").
Eventually it may well come down to the SOC monitoring and adjusting the filters in real time as required - that's life.
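The 3WHS test that drops a lone spoofed SYN is typically implemented statelessly with SYN cookies: the server derives its initial sequence number from the connection tuple and a secret, so no state is allocated until a valid final ACK arrives, and a spoofed source (which never sees the SYN-ACK) can't produce one. A simplified sketch; real implementations (see RFC 4987) also encode MSS and a timestamp, and the secret here is a made-up placeholder:

```python
import hashlib

SECRET = b"rotate-me-periodically"  # hypothetical per-box secret

def syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn):
    """Derive the server's initial sequence number from the 4-tuple,
    so no per-connection state is kept until the handshake completes."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}:{client_isn}".encode()
    digest = hashlib.sha256(SECRET + msg).digest()
    return int.from_bytes(digest[:4], "big")

def ack_is_valid(src_ip, src_port, dst_ip, dst_port, client_isn, ack_number):
    """The third handshake packet must acknowledge cookie + 1; a spoofed
    source never sees the SYN-ACK, so it cannot produce this value."""
    cookie = syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn)
    return ack_number == (cookie + 1) % 2**32
```

This is why the single-SYN-per-bot scenario costs the mitigator almost nothing: each bogus SYN is answered statelessly and then forgotten.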
Any provider with a 1.2Tbps+ network can handle that. There are a few out there. The big ones have no problem (of course), most CDN providers will also easily handle this. On the hosting side, both OVH (they saw one of similar size) and Hetzner should be able to handle it (although Hetzner would be close to the limit). Just naming two where I know the size. I guess other large cloud providers (Vultr, DO, Linode) should have similar bandwidth. I just don't know how they handle routing.
>"Any provider with a 1.2Tbps+ network can handle that."
This isn't global, this is per region/per AS. 1.2Tbps would be 12 100GbE ports on a router, or 120 10GbE ports. There are not many people that have this capacity in a POP, or even in multiple POPs in a region such as North America or Europe. Just the money involved in connecting that much transit to your edge is insane: not only do you need enough transit-provider diversity, you also need multiple chassis populated with enough switching fabric. I doubt there is a CDN or hosting provider that has 1.2T of capacity at any kind of regional level. A Tier 1 provider, yes, but not a CDN or hosting provider.
Not really getting your point. All but "having enough money" are actually good advice for not getting robbed. E.g. if you carry around a gold Rolex and/or the newest iPhone, you are much more likely to get robbed than if you have no watch and a $150 Android device in your hand. And most people don't carry around all their money for exactly that reason: to not lose too much.
Except, UNLIKE the Rolex analogy, the recommendation is not "do not carry your Rolex"; it is more akin to "do not own a Rolex". Or more directly, "avoid the problem by not having enough resources to own a Rolex".
True but now there's a separate danger with botnets. Now, instead of saying "don't own a Rolex" it's "don't own any silver, because that could be used to make a knife and steal someone else's Rolex."
With botnets like Mirai, innocent networks need to protect themselves not to avoid getting attacked themselves but rather to ensure that they don't participate in an attack against someone else.
I suppose anything more specific - both in case of DDoS and robbery prevention - would require a person/organization specific analysis that takes into account what you want to protect, what resources you have at your disposal and what threat model you operate under.
I would love a link to some resources about how to do such analysis. Maybe someone here knows a good one?
If you'd like more: I'm a professor at CMU, and my office is currently offline because someone decided to throw 5gbit/sec at my machine starting last Thursday. CMU's response - which I can't really fault, given that my office is not geo-distributed - was to filter inbound traffic at our upstream provider. No Internet for me; I'm back on the wifi. :)
Unfortunately, all of these "Best Practices" boil down to "spend more money," which effectively means the attacker wins: they're forcing you to spend more money even if they're not attacking you now. Would love to see more recommendations that reference open-source mitigation software, e.g. tossing a hardened nginx in front of your Tomcat server.
Granted, at some point, you're going to have to spend money to mitigate the attack no matter what, but if mitigation of DDoSs becomes entirely focused on "Go with a big centralized provider" or "Spend lots of money to mitigate the attacks," we end up in a much different Internet.
Not necessarily "spend more money." Using dedicated hardware with prevention technology built in and making sure your servers are spread around the planet sounds more like having a better architecture than necessarily spending more money.
True, but there are certain scenarios (think gaming, VoIP, etc) where you can't really decentralize in the way that traditional websites can. On top of that, many lower-tier websites and service companies have to use budget hosts simply by the nature of them being an initial startup. Unless we're talking SV startups, it's much more difficult to throw big money around when you're just starting out to get that kind of dedicated hardware protection.
True, and while Cloudflare is a great company, putting all our eggs in one basket isn't particularly wise. Not saying this will happen, but we've seen "great companies" that were great while there was strong competition, but once they became the monopoly began to strangle out anything they were against. Protection providers like Cloudflare would have enormous power to simply kick out a user for being "too costly to host," like Akamai did to KrebsOnSecurity, and then you'd get destroyed by attacks.
Krebs didn't pay Akamai. It will be similar with Cloudflare: if you're on a free plan, they will have a limit on what they will defend for you. Layer 3/4 and 7 attacks are only covered in the Business plan, but if you use that plan (which likely makes sense for many due to other features), I'm pretty sure they won't throw you out.
Mitigating DDoS is one of the main selling points nowadays. That's why OVH wrote so much about the huge attack they defeated (without naming the impacted clients), Cloudflare offered to host Krebs (he refused), and other providers are adding scrubbing centers.
I actually think it's rather cheap to defend against DDoS if you're a small company. Large companies will have it harder, as they typically have more complex requirements and cannot just shift everything behind Cloudflare or similar services.
The CEO of Cloudflare talked about it at Black Hat[1], and for instance they protected a Hong Kong voting website for free while it was under heavy DDoS attacks.
Well I finally got the article on DDoS attacks to load (I haven't been this put out since it raaaained on my wedding day). I didn't find it particularly illuminating.
> Deploy appropriate hardware that can handle known attack types and use the options that are in the hardware that would protect network resources...
> If affordable, scale up network bandwidth...
> There are several large providers that specialize in scaling infrastructure to respond to attacks...
People usually fight against filtering that limits legitimate usage (like filtering port 80 to prevent hosting a website at home). I doubt that anybody fights against RFC 2827, but there are still many ISPs which don't implement it.
Because null-routing malformed packets, for example, is completely different from null-routing the port of a service the ISP doesn't want, which is, again, completely different from null-routing BitTorrent.
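The RFC 2827 (BCP 38) check being discussed is conceptually tiny: an access network should refuse to forward packets whose source address isn't inside its own prefixes, which is what makes spoofed-source floods possible in the first place. A sketch using a documentation prefix as a stand-in for a real ISP's allocations:

```python
import ipaddress

# BCP 38 style source-address validation: packets leaving this network
# must carry a source address from the network's own prefixes.
OUR_PREFIXES = [ipaddress.ip_network("198.51.100.0/24")]  # example prefix

def egress_allowed(src_ip):
    """Return True only if the source address belongs to this network."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in OUR_PREFIXES)
```

In real deployments this is a one-line ACL or uRPF setting on the edge router; the barrier is operational will, not complexity.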
As someone who experienced a DDoS that the whole of Leaseweb felt (the site I was operating was on Leaseweb servers), I can tell you one thing: unless you are DDoSed by someone from their home connection or a single server, you need professional DDoS protection. No guide will help you here; you probably don't have a big enough pipe in the case of a UDP flood, and don't have enough resources in the case of sophisticated TCP attacks.
If you only want to protect your website, then go for Cloudflare Pro; it will be enough for 95% of DDoSers. If Cloudflare is not enough, then you need thousands or tens of thousands of dollars to get protection.
Do you know if anyone has ever been kicked off Cloudflare Pro because the attack was too large?
I'd guess that they're one of the largest companies providing DDoS defenses, and if they can't handle it there aren't many more that could. But I don't know if that has ever happened.
I was hoping that IPv6 would help with attacks from non-spoofed IPs: each device participating in a DDoS would get a permanent sticky IP address, and there would be a global blacklist of IP addresses maintained by a neutral organization, such that these IPs would get null-routed by transit carriers.
Could still be construed as ethical so long as they were focused on catching anonymized criminals (CP, scammers, etc.). Of course, that kind of research can lead to others using the same strategy to de-anonymize all Tor users.
Make sure you understand your infrastructure and web deployment.
Lock this down as tightly as possible.
Use a 3rd party to protect your web infrastructure, such as Incapsula.
So are there any ways to mitigate a DDoS attack, aside from throwing money at it by buying a large pipe? Null routing/ taking the site offline doesn't sound like a solution.
>Locate servers in different data centers.
>Ensure that data centers are located on different networks.
>Ensure that data centers have diverse paths.
>Ensure that the data centers, or the networks that the data centers are connected to, have no notable bottlenecks or single points of failure.
Getting Robbed: Best practices for Prevention and Response
* Don't carry all of your money with you at all times.
* Don't advertise that you're carrying large sums of money.
* Have enough money that getting robbed doesn't really affect you.
* Pay someone else to do your errands so you're not at risk.