Curryfinger – Find the Server Behind the CDN (dualuse.io)
141 points by tbiehn on Sept 14, 2019 | 40 comments



From helping on the Cloudflare Community forum for a while, this seems to be a fairly common issue[0] - users set up Cloudflare and then continue to get attacked since the firewall isn't properly set up to only allow connections from Cloudflare IPs.

Something I don't like is how Cloudflare themselves don't really suggest you firewall off connections that aren't from CF IPs, as there's only a support article on whitelisting and not on blocking[1]. This is an area I hope CF can improve, since any competent, targeted DDoS attacker will know the IP the server had before the owner went to CF, and/or can use a tool like curryfinger to figure it out.

0: https://community.cloudflare.com/search?q=firewall%20cloudfl...

1: https://support.cloudflare.com/hc/en-us/articles/201897700-W...


A firewall doesn't stop the traffic from being sent


It does however defeat this probe method.


Correct, but it won't stop someone from killing the backend/origin server if it is known.


What if you use a firewall provided by AWS or GCP?


A nice way to route your web servers through CloudFlare is using Argo. No ports exposed :)


I haven’t set this up in a while, but I think you can set up a “fake” cert on the origin-to-Cloudflare leg and then also pin that cert in Cloudflare, so you get protection against MITM on top of protection from scanning (and from accidentally serving directly from your host). They should probably support “secret name” HTTP headers instead of the normal host, too. So, e.g., your site is set to serve for fjeiiejdndjs.dhdjdj.com and publishes via Cloudflare as www.riskysite.com.

Cloudflare also has (had? I haven’t kept up) some special accelerated serving products which would de facto protect from this. That doesn’t help if you just have HTTPS hosting rather than a full VPS, though.

It would be awesome to have some standardized containers/AMIs/etc. which were set up for “concealed hosting” via CF, IPFS, Tor, etc.


My understanding is this shouldn't be necessary. If you only start using your SSL cert after your origin is protected behind Cloudflare, and don't serve traffic to anything except Cloudflare's CDN IP ranges, your real IP can't be discovered through Shodan, Censys, or any other technique.

And even if you don't lock it down to their CDN, it may still never be discovered if your origin only serves the relevant content when a specific host header and SNI are passed (rather than served by default regardless of host header or SNI), which Censys/Shodan may never try. Someone could still scan a huge chunk of the Internet to try to look specifically for your origin, though. Anyone using Cloudflare or a similar CDN should always spend the minute or so it requires to restrict inbound 80/443 to only Cloudflare's published IPs at https://www.cloudflare.com/ips/
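A minimal sketch of the allowlist check described above, written in Python rather than as real firewall rules. The CIDR ranges are placeholders from the documentation address blocks, not Cloudflare's actual ranges, and the names `ALLOWED_RANGES`, `build_allowlist`, and `is_allowed` are made up for illustration. In practice this check belongs in the host firewall or the provider's security group, not in application code.

```python
import ipaddress

# Placeholder ranges from the RFC 5737 documentation blocks, NOT Cloudflare's
# real ranges; substitute the list published at https://www.cloudflare.com/ips/
ALLOWED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def build_allowlist(cidrs):
    """Parse CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(c) for c in cidrs]


def is_allowed(source_ip, allowlist):
    """Return True if source_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowlist)


allowlist = build_allowlist(ALLOWED_RANGES)
print(is_allowed("192.0.2.17", allowlist))   # inside a placeholder range
print(is_allowed("203.0.113.5", allowlist))  # outside every listed range
```

Anything that fails the check would be dropped before it reaches ports 80/443, so a scanner connecting directly to the origin never sees the certificate.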


I'm talking about the specific case where www.mosthated.com is sufficiently hated that people will scan the entire Internet looking for it. At some point, you'd use other intelligence to develop info on which VPS providers someone prefers for their hidden middle nodes, too. Knocking a few random VPS providers offline and then observing if the site goes down might be sufficient.

An IP ACL is best practice, but the absolute cheapest web hosting options don't make this trivial or even possible. Plus, you could conceivably scan from close to the hosting provider candidate by hijacking a more-specific route for a CF prefix.


Are you talking about adding mTLS between Cloudflare and the origin?

If the origin doesn’t check CloudFlare’s TLS cert, an invalid cert wouldn’t block a scanner like this.


Am I right that if a domain doesn't have records in Censys or Shodan, this tool will need some aggressive scanning?


In the general case you are looking for server farms (DigitalOcean/Amazon/etc.) and passing the Host header to get around reverse proxies. That could still be several million IPv4 addresses.


Utilize SNI and serve up a fake cert when someone scans you without a matching hostname. Censys is scraping you by IP, so it'll just see the fake cert.

You can do this in nginx by making the fake cert the first server block.
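For anyone not on nginx, here is a rough sketch of the same idea using Python's `ssl` module, where the listening context presents a decoy cert and `sni_callback` swaps in the real one only when the SNI name matches. The hostname `www.example.com`, the helper names, and the cert/key paths passed to `build_contexts` are all hypothetical.

```python
import ssl

# Hypothetical hostname; the real site would list its own names here.
REAL_HOSTS = {"www.example.com"}


def wants_real_cert(server_name, real_hosts=REAL_HOSTS):
    """True only when the client sent an SNI name we actually host."""
    return server_name is not None and server_name.lower() in real_hosts


def make_sni_callback(real_ctx):
    """Build a callback that swaps in the real cert on a matching SNI name."""
    def callback(ssl_socket, server_name, initial_ctx):
        if wants_real_cert(server_name):
            ssl_socket.context = real_ctx
        # Otherwise keep the initial (decoy) context; returning None
        # lets the handshake continue.
        return None
    return callback


def build_contexts(decoy_cert, decoy_key, real_cert, real_key):
    """Return the listening context, which presents the decoy cert by default."""
    real_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    real_ctx.load_cert_chain(real_cert, real_key)
    decoy_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    decoy_ctx.load_cert_chain(decoy_cert, decoy_key)
    decoy_ctx.sni_callback = make_sni_callback(real_ctx)
    return decoy_ctx
```

A by-IP scan sends no SNI (or the wrong name), so `server_name` never matches and only the decoy cert is ever shown.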


To me it looks like this tool (and cloudflair.py) can only ever find servers which are configured in a very specific way:

* The traffic between server and CDN is encrypted using a valid certificate

* The server's firewall is not properly configured

Apparently there are indeed servers with this configuration, but I just find it odd how someone would go through the trouble of setting up HTTPS (instead of terminating it at the CDN) and then not bother to block traffic from anywhere but the CDN.


Imagine someone with a fully configured HTTPS web server (without a front CDN), that adds Cloudflare in front of it when the load gets higher, and does not bother configuring their firewall.


Yeah, that does sound plausible.


Could be a shared hosting platform behind the CDN where the user doesn't have the ability to set an IP ACL but does have HTTPS. That's common among low-end users and, in the particular case of a "hidden" site, with easily purchased cheap VPSes which are de facto anonymous.


>I just find it odd how someone would go through the trouble of setting up HTTPS (instead of terminating it at the CDN)

At least wrt CloudFlare, it's actually recommended - for privacy reasons, at least so they claim - that you run SSL both between client and CDN, and between CDN and server: https://support.cloudflare.com/hc/en-us/articles/200170416-E...


Uhh they recommend it so the traffic between your server and cloudflare isn't unencrypted over HTTP


> then not bother to block traffic from anywhere but the CDN

How can you find that odd when there are so many mongodb instances on the open internet?


Because I imagine someone who would expose their mongodb instance to the web would also just terminate HTTPS at the CDN, rendering curryfinger/cloudflair.py useless.


Someone wants to lower their cdn bill, so they serve static things on cdn, and dynamic things from origin.


"One; Python has its uses, but writing highly performant multi-threaded scanners is not one of them"

Is that still true given Python 3 asyncio? My understanding is that it's really well suited to writing things like network scanners, without needing to run them in multiple threads.


It hasn't been a problem for most of our things at Shodan. I think there's a lot of confusion about when to use multithreading vs non-blocking sockets. Most security tools tend to use multithreading when they should just use non-blocking sockets. With Python3 you can also swap out the scheduler if you want better performance:

https://magic.io/blog/uvloop-blazing-fast-python-networking/
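To illustrate the non-blocking approach, here is a minimal single-threaded TCP connect scanner built on asyncio. It is only a sketch: the function names, the concurrency cap of 500, and the 2-second timeout are arbitrary choices for the example, not how Shodan or curryfinger actually work.

```python
import asyncio


async def probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError):
        return False
    writer.close()
    try:
        await writer.wait_closed()
    except OSError:
        pass  # the peer may reset the connection while we close it
    return True


async def scan(targets, concurrency=500, timeout=2.0):
    """Probe (host, port) pairs concurrently on one thread; return open ones."""
    limit = asyncio.Semaphore(concurrency)  # cap in-flight connections

    async def bounded(target):
        async with limit:
            return target, await probe(*target, timeout=timeout)

    results = await asyncio.gather(*(bounded(t) for t in targets))
    return [target for target, is_open in results if is_open]
```

You would run it with something like `asyncio.run(scan([("192.0.2.10", 443)]))`. All connections are multiplexed by the event loop, so no threads are involved; the semaphore just keeps the number of open sockets bounded.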


asyncio is still bound by the GIL. It's not like Go's goroutines, which can run in parallel across CPU cores. There still can only be one TCP payload from your Python process at one time. The author wanted to spam the network and use all his bandwidth. Python multiprocessing would have worked for this though (probably faster than the solution using shell commands, but obviously this is network-bound much more than CPU-bound, so who cares).


asyncio makes it a lot better, but I still would reach for a concurrency-oriented language such as Go or a BEAM language (Elixir, Erlang, etc.).


Depends on your definition of 'highly performant'.

While asyncio helps hugely, asyncio plus an interpreted language won't yield better performance than custom native code running true native threads which themselves also use async methods.


The post doesn't explain what Shodan and Censys are. Would anyone mind explaining them?


They're websites that scan the entire internet to see which hosts are up and responding, and which ports are open, and then make the data available to researchers.


Here is a high-level explanation of Shodan and where it's used: https://help.shodan.io/the-basics/what-is-shodan


Easy solution for hiding yourself from most scanners (Shodan, Censys): only allow requests with proper SNI, and don't serve your server's cert by default. A firewall is also nice, but you could make a mistake at some point.


How would proper SNI prevent this approach or how is it different? Doesn't this just connect to a given IP and issue a request for the target domain?


Curryfinger seems to work by querying Shodan and other scanners. Those scanners seem to work by just connecting to an IP address's port 443 and looking at the certificate. If you always require correct SNI (the domain you host), then that scanning stops working (you literally disappear from Shodan, for example). The fix (for scanners) would be to try to resolve every domain name you know of, or to scan every IP with every domain name you know, which is infeasible. Only replying to correct SNI is not a defense mechanism by itself, but it does make things more difficult for attackers.
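A rough sketch of what such a by-IP certificate grab looks like with Python's standard library, and how sending an SNI name changes it. This is my guess at the general technique, not code taken from curryfinger, Shodan, or Censys; the function names are made up, and validation is deliberately turned off because the point is to see whatever cert the server presents.

```python
import hashlib
import socket
import ssl


def fetch_cert_der(ip, sni=None, port=443, timeout=5.0):
    """Connect to ip:port and return the presented certificate as DER bytes.

    sni=None sends no server name, which is how a by-IP scan behaves.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # we want to see the cert, not validate it
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((ip, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=sni) as tls:
            return tls.getpeercert(binary_form=True)


def fingerprint(der_bytes):
    """SHA-256 fingerprint of a DER-encoded certificate, as hex."""
    return hashlib.sha256(der_bytes).hexdigest()


def sni_changes_cert(ip, hostname):
    """True if the server presents a different cert when given the hostname."""
    return fingerprint(fetch_cert_der(ip)) != fingerprint(fetch_cert_der(ip, hostname))
```

If `sni_changes_cert` comes back true for a host, a scanner that only connects by IP has indexed the default cert and would not tie that IP to the hostname.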


It turns out that if you have a targeted domain you have a good chance of finding it in one of the popular cloud hosting ranges. Masscan + curryfinger work well together. Alexatop + masscan + curryfinger makes an interesting dataset.


Shodan also scans via domains to identify the proper certificate for websites that require a valid SNI.


Interesting, some results out of Shodan were surprising - this might be the reason. How do you pick what domains to try?


Btw you can also see the info we have for a domain using our DNSDB. For example, if you have the latest version of our CLI:

shodan domain uber.com


We have a database of 400+ million hostnames that we scan each month.


So if a host supports SNI you send it 400+mil requests? C'mon, dish.


I haven't seen that, maybe if you manually request a scan, then.



