Block web scanners with ipset and iptables

justin_oaks · on Nov 10, 2022

I wonder why the author uses a 404 error response. I usually configure NGINX with "return 444;" which closes the connection without response. Scanners don't deserve a response. I may have wasted bytes receiving the request, but I won't waste any more once I know the request is garbage.

yabones · on Nov 10, 2022

That was mostly just for the blog post. In reality my default vhost 301's back to the IP that sent the request. I doubt it ever does anything, but I like to think it makes hackers attack themselves in the confusion :p

I also have a fake /admin path that just contains a bunch of offensive/illegal phrases in 10 ish languages, but it was out of character for the post.

444 is a good idea though, I didn't know about that response code!

quesera · on Nov 11, 2022

Beware though -- nginx 444 doesn't actually close the connection. At the packet level, it just does not respond.

This distinction is important if you have a load balancer in front of nginx. The LB will wait until timeout for a response, occupying a bit of stateful memory and probably causing an error which is indistinguishable from "backend application server is offline".

treffer · on Nov 11, 2022

That is actually cool, it is a tarpit for these bots!

On a well configured site the LB timeouts should be short enough anyway.

But it is a risk, especially with classic DoS attacks.

quesera · on Nov 11, 2022

Yep it's great for tarpitting if you are not behind an LB.

The other problem, if you are behind an LB, is that the client (DoS attacker) will get a 503 from the LB after timeout. So, no gain even if your timeouts are reasonable.

It'd be great if you could return a custom response from nginx that would tell the LB to drop the request -- or you could move the exploit-detection logic to the LB instead of nginx, and the LB could do its own 444 equivalent.

fluential · on Nov 10, 2022

You will like these more elaborate attacks you can do to those bots: https://www.hackerfactor.com/blog/index.php?/archives/762-At...

justin_oaks · on Nov 11, 2022

Good read, thanks.

For others who aren't interested in reading the whole thing:

The author of the post used zip-bombs, which are compressed HTTP responses that expand to 1000 times the size of the compressed data. He could send relatively small responses that would fill the requester's memory and crash the process. Beautiful.
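To get a feel for the ratio, here is a minimal sketch (not the linked author's code) that gzips a run of zero bytes and reports how much it shrinks. The function name `compression_ratio` and the 10 MiB size are just illustrative choices.

```python
import gzip


def compression_ratio(uncompressed_size: int) -> float:
    """Gzip a run of zero bytes and return original size / compressed size."""
    payload = bytes(uncompressed_size)  # all zeros, the best case for DEFLATE
    compressed = gzip.compress(payload, compresslevel=9)
    return uncompressed_size / len(compressed)


if __name__ == "__main__":
    size = 10 * 1024 * 1024  # 10 MiB of zeros
    print(f"{size} zero bytes compress by about {compression_ratio(size):.0f}x")
```

Highly repetitive data like this is what makes the "small response, huge expansion" trick work; real content compresses far less.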

jcynix · on Nov 10, 2022

I use "402 Payment Required" right now, which is sent to the client. Didn't know about 444, which isn't listed on the Wikipedia page about HTTP return codes ...

jonas-w · on Nov 10, 2022

It is listed on wikipedia, but under "Unofficial Codes -> nginx" as it is nginx specific and not standardized.

thrwawy74 · on Nov 10, 2022

Not to detract from the article, but we should be using nftables in 2022. :-)

https://wiki.nftables.org/wiki-nftables/index.php/Moving_fro...

TacticalCoder · on Nov 11, 2022

He probably is, in a way. iptables is now just a wrapper around nftables, and nftables understands the iptables syntax.

thrwawy74 · on Nov 11, 2022

I always feel like a git telling people "it's nftables now!", but it's been over a decade and folks keep using iptables as the common identifier. It's slow to change language. You're right, many of those iptables commands are utilities/scripts around nftables now.

BrandoElFollito · on Nov 11, 2022

This is not very much different from "use IPv6".

I work in IT and for probably 25 years now I have kept hearing that IPv6 is round the corner. "Adoption" is ~35%, but what this means is that in 35% of the cases you can get to a place through IPv6. This does not mean that you must, or do. It is just the capacity.

When a technology takes 25 or so years to become mainstream, it means that there is a problem somewhere ("too complicated", ...) or that there is no problem in the first place ("iptables work fine for me", "I NAT my 10.x network", ...)

rroot · on Nov 11, 2022

If you've set up your rules from scratch using nftables then they are not compatible.

usr1106 · on Nov 11, 2022

Where is that wrapper? In user space or in the kernel?

suprjami · on Nov 11, 2022

aiui the iptables-nft wrapper is userspace, it translates iptables syntax into nftables and applies those nftables rules.

gerdesj · on Nov 10, 2022

I started with ipfw, then ipchains, then iptables and now nftables. That is just on Linux.

To be fair: I just downloaded an ipfw setup and didn't give it much thought. I spent several weeks hand crafting several ipchains scripts. I spent ages with iptables and wrote a rubbish multi WAN effort and eventually ditched it for pfSense for edge. More ages for a host based effort. I also use ufw quite a bit for iptables. I use firewalld for nftables, these days.

gw98 · on Nov 10, 2022

I've been using fail2ban to kill this for years. Seems to be quite effective: https://github.com/fail2ban/fail2ban

Avamander · on Nov 11, 2022

If possible, report the hosts you block using f2b to AbuseIPDB or similar projects. That way we'd be collectively better able to hinder this abuse.

guerby · on Nov 11, 2022

There's crowdsec to share info about IP in a collaborative way: https://www.crowdsec.net/

throwaway888abc · on Nov 10, 2022

This.

Trusted combo: Fail2Ban + 7G firewall

https://perishablepress.com/7g-firewall-nginx/#download

wooptoo · on Nov 10, 2022

Actually you don't need to respond to bogus http clients at all: https://gist.github.com/radupotop/2aef0bdc0ccbd3a706044e3598...

creeble · on Nov 10, 2022

I understand the first part -- sending requests with no host header to a spam log (or even better, don't log).

What I don't understand is the second part -- blocking those hosts. Seems pointless now that you've de-noised your logs. They're still sending packets. Saves thousands of bytes on outbound?

What about all the scan-spam on sites WITH host headers? Whatevs.

yabones · on Nov 10, 2022

Serves a few purposes, but as you said the main objective is already done by de-noising. The other reason I do this is because it's easy to detect that kind of scanning in HTTP logs, but not as easy for other services (ssh, ftp, smtpd, etc) without something like fail2ban, and the blanket ban applies to all of them. So, if a bot scans your HTTP server enough times, they can't go after "softer" targets later.

For scan-spam that does hit your "real" site, it's a bit more tricky as there absolutely will be false positives. You can grep for all 401/403's and add them to the list, but that will sooner or later hit a real user. So it's much more specific to the application you're hosting, where this works for just about any site. The other nice thing is that even when they scan your "real" site, they'll often hit the default host via IP scans at the same time, so you can still manage to ban them.
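A rough sketch of the "grep for 401/403" idea with a threshold, so that a single mistyped password doesn't get a real user banned. It assumes the common/combined access log format; `ban_candidates` and the default threshold of 5 are made up for illustration and would need tuning per application.

```python
import re
from collections import Counter

# Assumes common/combined log format:
# IP - user [time] "request" status size ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def ban_candidates(lines, statuses=frozenset({"401", "403"}), threshold=5):
    """Return sorted IPs with at least `threshold` responses in `statuses`."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) in statuses:
            hits[match.group(1)] += 1
    return sorted(ip for ip, count in hits.items() if count >= threshold)
```

The returned list could then be fed into whatever ipset or fail2ban workflow you already have.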

It's not perfect, but it's good enough :)

dpifke · on Nov 10, 2022

Running your own mail server is a great source of data to identify botnet-compromised hosts.

When I started banning IPs that send "HELO <myhostname>" for 24 hours, I cut the number of fake login/registration attempts on a bunch of my web-based projects by ~50%.

It works the other way, too. Temporary bans on hosts that try to access /wp-admin (I don't run Wordpress anywhere) cut my email spam significantly.

(Some day, I'll get around to implementing a real reputation tracking system, with exponential ban lengths.)
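For what it's worth, the exponential part could be as small as this sketch: the ban doubles with each repeat offense, starting from the 24 hours mentioned above. The 30-day cap and the name `ban_duration` are assumptions, not anything from a real setup.

```python
from datetime import timedelta


def ban_duration(offense_count, base=timedelta(hours=24), cap=timedelta(days=30)):
    """Return the ban length for the Nth offense: base * 2**(N-1), capped."""
    if offense_count < 1:
        raise ValueError("offense_count must be at least 1")
    # Bound the exponent so the multiplication cannot overflow timedelta.
    exponent = min(offense_count - 1, 16)
    return min(base * (2 ** exponent), cap)
```

So the first offense gets 24 hours, the second 48, the third 96, and so on until the cap is reached. Tracking the offense count per IP is the part that still needs a real store.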

bombcar · on Nov 10, 2022

This is an important aspect of it - you can use information on one angle of attack to protect other devices.

Do note that doing this kind of thing can block people on Tor, because Tor is used for attacks quite often, also.

hoppla · on Nov 10, 2022

Another neat trick is to add a link in robots.txt and instruct bots to stay away. If they don’t, you add them to your blocklist

justin_oaks · on Nov 10, 2022

I was confused at what you were saying at first. For those that may also be confused:

You can add something like this to your robots.txt:

    User-agent: *
    Disallow: /some/unguessable/url

And then you ban any IPs/bots that visit that URL.
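A minimal sketch of the detection side, assuming common/combined access log format. `trap_visitors` is a hypothetical helper; you pass it the same path you listed under Disallow, and it returns the IPs that requested it anyway.

```python
import re

# Captures the client IP and the request target from a common/combined log line.
REQUEST = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)')


def trap_visitors(lines, trap_path):
    """Return sorted IPs that requested trap_path or anything beneath it."""
    prefix = trap_path.rstrip("/") + "/"
    offenders = set()
    for line in lines:
        match = REQUEST.match(line)
        if not match:
            continue
        ip, target = match.groups()
        path = target.split("?", 1)[0]  # ignore any query string
        if path == trap_path or path.startswith(prefix):
            offenders.add(ip)
    return sorted(offenders)
```

Since no legitimate link points at that URL, anything that shows up here either ignored robots.txt or read it to find targets.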

krick · on Nov 10, 2022

What are some best practices to deal with it on a PC? I mean, by default pretty much everything is closed and it's not like there is any "legitimate traffic" at all, but over time it still accumulates some open ports by running stuff in docker and elsewhere: a jupyter console here, an MPD UI there - most of the time I don't even think about the fact that I'm constantly scanned by someone, and remember only after I see some logs and get disturbed by the number of rude guests.

patja · on Nov 10, 2022

I have used wail2ban and just started using ipban

alyandon · on Nov 10, 2022

On my internet facing hosts, I use the firehol level 2 and level 3 block sets along with blocking all CN IP space that I can accurately identify. My logs are eerily quiet.

BrandoElFollito · on Nov 11, 2022

I tried firehol for some time and quite liked it (much more than iptables). This was after shorewall started to fade out (and is now abandoned or so).

I had some problems getting community support, and it seems that activity around firehol is fading away. I am not sure whether this is because it is a complete, finished product or because it is abandoned.

alyandon · on Nov 11, 2022

I don't actually use the firehol scripts - I use the source lists with my own custom iptables/pf scripts.

rjsw · on Nov 10, 2022

Just added some Digital Ocean IP blocks to my firewall config.

stanislavb · on Nov 11, 2022

This seems like a neat solution; however, my issue is that all my websites are behind Cloudflare nowadays. Hence, iptables is useless ¯\_(ツ)_/¯.

est · on Nov 11, 2022

Could this be done more easily without installing the ipset command, by modifying /proc/net/xt_recent/ directly?

https://ipset.netfilter.org/iptables-extensions.man.html

holoduke · on Nov 10, 2022

Why not use the way easier to configure i(f)tables? It's so much more straightforward and flexible.

BrandoElFollito · on Nov 11, 2022

What are i(f)tables? Google does not suggest anything.

holoduke · on Nov 11, 2022

Sorry, I meant to say nftables.

Puts · on Nov 11, 2022

I get the impression the author missed out on zcat for reading gzipped files.

anon291 · on Nov 11, 2022

fail2ban is excellent. No need for anything else. Configure it for all your server logs. It'll handle the iptables or nftables config for you.

klausagnoletti · on Nov 11, 2022

I'll challenge you on that :-) https://www.crowdsec.net/blog/crowdsec-not-your-typical-fail...

1vuio0pswjnm7 · on Nov 11, 2022

ipset requires a separate kernel module.

usr1106 · on Nov 11, 2022

ipfilter requires separate kernel modules for various options. On most common distros they have been built and installed by default and will just be loaded at runtime, I'd assume. If you run a highly customized kernel, you have probably hit this issue before when doing something with the firewall.