To block something like this you need to determine what is botnet traffic vs legit traffic. It's hard.
Source IP doesn't work since it's random and constantly changing. You need to look at things such as HTTP headers, TCP window size, and any odd flags that might be set. If you're lucky the botnet isn't capable of running a copy of Chrome or Safari, or of replaying request templates sampled from legit traffic. Lots of botnets are made up of low-power IoT devices, so once those devices are capable of running full headless Chrome it will get harder.
Not to mention that once you do figure out how to discriminate traffic, you have to code it. And the code that separates valid traffic from invalid had better run fast, because you're getting hit with 100k requests per second. Oh, did I mention the attacker can change their algorithm whenever they want? Hope you have a full TensorFlow ML/AI pipeline that reconfigures your hardware-based ingress of choice just in time. All this while making sure your current production traffic is being served at a speedy pace and legit customers aren't being blocked.
These are some of the issues Cloudflare and companies like them have to deal with.
In cases like this it's actually not that difficult, as they're using devices that can be fingerprinted from the Internet. We at Shodan provide a local, embedded database (SQLite or RocksDB) so you can see which ports a connecting IP has open. If an IP is connecting from a device that's running weird ports, is compromised, or has other unusual characteristics, then you can either flag the connection as high risk or outright drop it if you're under attack. It's mostly used by banks etc. for fraud prevention, but we have a few customers that use it for blocking traffic based on IP risk.
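A sketch of what such a lookup could look like. The SQLite schema here is hypothetical, invented for illustration — the comment doesn't describe Shodan's actual data format — but the shape of the check (connecting IP → known open ports → risk flag) is the one described:

```python
import sqlite3

# Hypothetical schema for illustration only -- not Shodan's actual format.
# The table maps an IP to a comma-separated list of its open ports.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ip_ports (ip TEXT PRIMARY KEY, ports TEXT)")
conn.execute("INSERT INTO ip_ports VALUES ('203.0.113.7', '2000,5678,8291')")

SUSPICIOUS_PORTS = {2000, 5678}  # ports the article associates with the bots

def risk_flag(ip: str) -> bool:
    """Return True if the connecting IP exposes ports linked to the botnet."""
    row = conn.execute(
        "SELECT ports FROM ip_ports WHERE ip = ?", (ip,)
    ).fetchone()
    if row is None:
        return False  # no scan data: treat as unknown, not hostile
    open_ports = {int(p) for p in row[0].split(",")}
    return bool(open_ports & SUSPICIOUS_PORTS)
```

An embedded database matters here because the check sits on the connection hot path: a network round-trip per lookup would be too slow at attack volumes.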
How does fingerprinting them help? You can fingerprint them but they are just desktops/mobile phones/laptops that have been compromised to be part of the botnet.
The compromised hosts that are part of the botnet look exactly like normal traffic.
If you have a database of known-compromised hosts (because a fingerprint scan of them shows something clearly identifiable as part of a botnet, which I think is usually rare [but possibly not for Mēris]), it can mitigate an attack if you've already blocked them.
But the problem that still exists is the initial connection traffic -- there are still up to 200k hosts that may hit your site (essentially, a SYN flood). Depending on your infrastructure, that can still hurt your firewall or a single server. But it's unlikely to hurt as much as actually responding (through a full request stack) to those requests.
That's not what the article said though. They say that the compromised devices had these characteristics among others:
* Port 2000 open
* Port 5678 open
* SOCKS proxy on port 80 (maybe)
Most visitors to your website almost certainly won't have those ports open and exposed to the Internet. That makes it a really easy way to filter traffic based on the network fingerprint. Especially when you're under attack, it's a great way to eliminate the majority of the impact without requiring any AI/ML - just filter traffic from IPs that have TCP port 5678 open. The same technique was used to identify Mirai bots, and it worked well.
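A minimal sketch of that kind of filter, probing the connecting IP directly instead of consulting a pre-built scan database. The port numbers come from the list above; everything else (probing inline, the one-second timeout) is an illustrative simplification — in practice you'd scan out-of-band and cache results, since a live probe per connection is far too slow under attack:

```python
import socket

def has_open_port(ip: str, port: int, timeout: float = 1.0) -> bool:
    """Actively probe whether `ip` accepts TCP connections on `port`."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, or unreachable

def looks_like_bot(ip: str) -> bool:
    # Ports called out above as characteristic of the compromised devices;
    # a hit on either is treated as high risk in this sketch.
    return has_open_port(ip, 5678) or has_open_port(ip, 2000)
```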
I think in the future servers will ask clients to solve a small computation. It can in theory be incorporated into the handshake, and if it takes something like 100ms, human users would not notice but bot farms will feel the pinch. An additional benefit is that servers can monetise the computation, offsetting some of their costs.
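The usual shape of this idea is a hashcash-style proof of work: the server hands out a random challenge, the client burns CPU finding a nonce, and verification costs the server a single hash. A minimal sketch (the difficulty parameter is illustrative; a monetisable computation, as the comment suggests, would need a different kind of puzzle):

```python
import hashlib
import itertools
import os

def make_challenge() -> bytes:
    """Server side: issue a fresh random challenge per client."""
    return os.urandom(16)

def solve(challenge: bytes, difficulty_bits: int = 20) -> int:
    """Client side: brute-force a nonce whose SHA-256 digest has
    `difficulty_bits` leading zero bits. Cost doubles per extra bit."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify(challenge: bytes, nonce: int, difficulty_bits: int = 20) -> bool:
    """Server side: one hash to check, so verification stays cheap."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))
```

The asymmetry is the point: the server pays one hash per verification while the client pays ~2^difficulty hashes, and the difficulty can be raised for clients that look suspicious.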
Wouldn't bot farms just incorporate that as a "cost of doing business" and expand to absorb the computational load? After all, it's not like the bot farmers are paying to add more hardware.
Bot farms exist because they are cheap. You don't need to be perfect; ultimately you need to adjust the cost of the handshake to ensure it's higher than the farmer's average earnings.
E.g., the handshake can be made more expensive by choosing a "harder" function, while clients that behave "well" are given the possibility of reusing their connections. Bots are penalised because they constantly have to make new handshakes.
But the economic incentives of a botnet are very different from those of a bot farm.
How many server resources are consumed before you even get to ask the client to do work? If they've got 100k clients, and each opens 100 TCP connections to your server, will your TCP stack or your load balancer fall over before you even start a TLS handshake?
Can you manage as many TLS handshakes as they can throw at you?
This does not help one bit with botnets. The problem of defending against botnets is not blocking many requests coming from each IP address, it's blocking requests coming from all those compromised devices. Those devices are perfectly capable of doing that computation.
Introducing JavaScript into the mix will not make the botnet much harder to detect. Headless browsers have their own fingerprints, which let defenders distinguish them from legitimate traffic. You can spoof the features that headless browsers lack, but that will always be a cat-and-mouse game.
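A toy server-side version of that kind of check, flagging requests whose headers look like a headless browser. Real fingerprinting (JavaScript challenges, TLS fingerprints such as JA3) goes much deeper than header inspection; this only shows the shape of the heuristic, and the marker strings are examples, not an exhaustive list:

```python
# User-Agent substrings that common headless/automation stacks emit by default.
HEADLESS_UA_MARKERS = ("HeadlessChrome", "PhantomJS", "Electron")

def looks_headless(headers: dict) -> bool:
    """Crude heuristic: flag obviously-headless or header-incomplete clients."""
    ua = headers.get("User-Agent", "")
    if any(marker in ua for marker in HEADLESS_UA_MARKERS):
        return True
    # Real interactive browsers send Accept-Language; many naive bots don't.
    if "Accept-Language" not in headers:
        return True
    return False
```

This is exactly the cat-and-mouse part: a bot can trivially spoof both signals, which is why defenders keep moving to harder-to-fake fingerprints.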
20 million requests per second from a single beefy AWS server is easy to detect and block.
20 million requests per second coming from a rotating list of hosts from generic IP addresses is a nightmare:
> However, we suppose the number to be higher – probably more than 200 000 devices, due to the rotation and absence of will to show the "full force" attacking at once.
If your site normally has 10,000 users per day and suddenly you’re flooded with 200,000 additional IP addresses hammering at your site, you have a problem.
To put it in perspective, the top post on HN most of yesterday was about someone benchmarking their personal server as being able to handle about 5 million requests per day (Granted, that’s quite slow, but it will suffice for making a point). This botnet can deliver 4X that server’s total daily capacity every second.
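The arithmetic behind that comparison, as a sanity check on the "4X" figure:

```python
botnet_rps = 20_000_000            # requests per second attributed to the botnet
server_daily_capacity = 5_000_000  # requests/day the benchmarked server handled

# How many of that server's full days of traffic arrive every second?
ratio = botnet_rps / server_daily_capacity
print(ratio)  # 4.0 -- four days of the server's capacity, per second
```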
Cloudflare uses CAPTCHA to drive away proxy users. Privacy conflicts with Cloudflare's endgame of profiling every Internet user and then monetizing that data.
DDoS attacks like this are usually launched from a large number of malware-ridden personal computers. Since the attacks come from IP addresses on residential networks, they're very hard to differentiate from legitimate traffic.
What's different about this attack is that it appears to not be PCs but network devices (routers) that are being taken over and used to launch attacks.
People are much less likely to catch that this is occurring and as a result there's concerns that this botnet is going to persist as a threat for a much longer time than is typical. Additionally, network devices may have access to a greater amount of bandwidth than a PC increasing the threat.
One more thing: when the botnet makes a request (attack) against a site, it uses the performance optimization technique of HTTP "pipelining": instead of sending a GET for "/index.html" and waiting for the response, it sends several requests back-to-back on the same open connection (e.g. for all the other assets on the site) before any response comes back. In normal usage this is great, as it makes a site feel more responsive and reduces network overhead. In the context of this botnet, however, it increases the number of requests each bot can push through a single connection (which is bad).
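A sketch of what pipelining looks like on the wire: several HTTP/1.1 requests written in one burst before reading anything back. The host and asset paths here are placeholders, not taken from the article:

```python
import socket

HOST = "example.com"  # placeholder target, purely for illustration

# Three requests concatenated into a single write -- this is the "burst"
# that makes pipelining cheap for the sender and expensive for the server.
payload = b"".join(
    f"GET {path} HTTP/1.1\r\nHost: {HOST}\r\n\r\n".encode()
    for path in ("/index.html", "/style.css", "/app.js")
)

def send_pipelined(host: str = HOST, port: int = 80) -> bytes:
    """Send all requests at once, then read the concatenated responses."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)           # all three requests, one burst
        sock.shutdown(socket.SHUT_WR)   # signal we're done writing
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)
```

From the server's perspective, one accepted connection now carries three requests' worth of work, which is exactly why a pipelining botnet multiplies its effective request rate.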
Great explanation...
For anyone interested, the following Jupyter notebook explains three different ways to process HTTP requests: serial requests (the baseline), pipelined requests, and parallel requests with multiple connections (and without threads).
0. There seems to be a MikroTik exploit, and all versions are vulnerable. Well, it's possible that someone collected the passwords back in 2018 and used them to access updated devices this year, but I guess naaah.
I wonder what should happen to that fine company to make them stop running all the potentially vulnerable system configuration services on all interfaces by default. IP > Services is one of the locations anyone should check ASAP to make sure unneeded ones are disabled, but it's actually misleading. These are just preconfigured wrappers for some components, and others, like DNS or bandwidth test server mentioned in the article, are not shown even if they are running. There isn't even a netstat-like command to check which ports are open.
1. While it reached a record number, an attack against something on the scale of Yandex and Cloudflare is more of a maximum-capacity test, not limited by the target's connectivity, and/or an advertisement for someone's DDoS services.
2. Still, it's an application-level DDoS, so you have to have a swift application-level detection in place if you don't want to just ban IP addresses (and potentially cut legal users from HTTP(S) APIs and other services that might be shared on the server or network).
3. Some skill was demonstrated in finding non-trivial weak points to amplify the server load.
I understand the purpose of a botnet. I was asking for an explanation of the technical details of this apparent advancement. But apparently snide comments purposely devoid of any detail is what I'm getting here now. Cheers
SOCKS proxy: the botnet allows tunneling non-WWW traffic through it so users of the botnet can route say BitTorrent or other P2P traffic through it.
HTTP pipelining: instead of simplistic one-request-at-a-time exchanges with a server, the botnet supports HTTP/1.1 pipelined requests. A single connection can carry many back-to-back requests for multiple files, meaning even more demand on target servers. Requests for resources not cached in memory can make the server eat up its IOPS trying to read files.
Blacklists are still a thing. Since these attacks are not spoofed, every victim sees the true attack origin. Blocking it for a while should be enough to thwart the attack without disturbing the eventual legitimate end-user.
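The "for a while" part is the key design choice: the attacking address belongs to a compromised device whose owner may be a legitimate visitor later, so entries should expire. A minimal in-memory sketch (a production setup would push this into the firewall, e.g. ipset/iptables, rather than application code):

```python
import time

class ExpiringBlacklist:
    """Block attacking IPs for a limited window, so a legitimate user behind
    the same (compromised) address isn't cut off forever."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._entries: dict[str, float] = {}  # ip -> expiry timestamp

    def block(self, ip: str) -> None:
        self._entries[ip] = time.monotonic() + self.ttl

    def is_blocked(self, ip: str) -> bool:
        expiry = self._entries.get(ip)
        if expiry is None:
            return False
        if time.monotonic() >= expiry:
            del self._entries[ip]  # window elapsed; let the IP through again
            return False
        return True
```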
Missing from the article: Links to lists of IPs that they recommend to be blacklisted. It's the same thing that's missing from pretty much every NetSec vendor.
We really need a recovery clinic for compulsive threat-data hoarders.
"Connect to cloud by default" should be banned in any sensible network.
It's probably even more devastating than "default password by default" if exploited successfully.
A single stolen cert, or access to the device provisioning server instantly gets you "keys to the kingdom," and all of the devices online.
A default password, or a vulnerable API on the device, in contrast, still requires the attacker to individually find and hack each vulnerable device.
What makes it dangerous is that those requests aren't coming from a single source - it's a distributed denial of service attack. Anybody can push huge throughput from an XL cloud server with good networking, but it's just as easy to block that IP. Blackholing thousands of nodes is much more difficult.