I wonder if it would be better to pretend to have a captcha while really analysing the user's timing and actions. Honestly I half suspect this is already going on.
If you wanted to go full meta ("never go full meta"), you would train an AI to figure out whether the agent on the other side was human or not. That is, invent the reverse Turing test: it's a human if the AI is unable to differentiate its responses from normal human responses, as opposed to marketing human responses.
Well, now I have to go have a lie down. I feel a little ill just from thinking on the subject.
That's kinda what every major captcha distributor does already!
Even before a captcha is served, your TLS is fingerprinted first, then your IP, then your HTTP/2 settings, then your request headers, then your JavaScript environment (including font and image rendering capabilities) and the browser itself. These signals are combined into a trust score that determines whether a captcha will be served at all. Only then does it make sense to analyze the captcha input itself, but by that point you've already caught 90% of bots either way.
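To make the layering concrete, here's a rough sketch of the trust-score idea. The signal names and weights are entirely made up for illustration; the real vendors keep theirs secret and use far more inputs:

```javascript
// Hypothetical sketch of a layered trust score. Each boolean signal
// corresponds to one fingerprinting layer mentioned above; the weights
// are invented, not anything a real vendor publishes.
function trustScore(signals) {
  const weights = {
    tlsFingerprintKnownBrowser: 0.3, // TLS handshake (JA3-style hash) matches a real browser
    ipReputationClean: 0.3,          // IP not on abuse lists or datacenter ranges
    http2SettingsMatchUa: 0.2,       // HTTP/2 SETTINGS frame consistent with the User-Agent
    jsEnvironmentConsistent: 0.2,    // canvas/font rendering matches the claimed platform
  };
  let score = 0;
  for (const [signal, weight] of Object.entries(weights)) {
    if (signals[signal]) score += weight;
  }
  return score; // e.g. only serve a captcha when score falls below some threshold
}
```

A low score would trigger the captcha; a high one lets the request straight through, which is why most humans rarely see a challenge at all.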
The amount your browser can tell any server about you without your awareness is insane, to the point where every single one of us probably has a more unique digital fingerprint than our own physical fingerprints!
My experience is that IP reputation does a lot more for Cloudflare than browsers ever did. I tried to see if they'd block me for using Ladybird and Servo, two unfinished browsers (Ladybird used to even have its own TLS stack), but I passed just fine. Public WiFi in restaurants and shared train WiFi often gets me jumping through hoops even in normal Firefox, though.
I can't imagine what the internet must be like if you're still on CG-NAT, sharing an IP address with bots and spammers and people using those "free VPN" extensions donating their bandwidth to botnets.
EFF has been running Cover Your Tracks (formerly Panopticlick) for years. It gives an estimate of how many unique traits your browser exposes; even things like screen resolution are measured.
Would it be possible to serve a fake fingerprint that appears legitimate? Or, even better, mimic the fingerprint of real users who've visited a site you own, for example?
Yes, that's what web scraping services do (full disclosure: I work at scrapfly.io). Collecting fingerprints and patching the web browser against this fingerprinting is quite a bit of work, so most people outsource it to web scraping APIs.
If the user solves the CAPTCHA in 0.0001 seconds, they're definitely a bot.
If the user keeps solving every CAPTCHA in exactly 2.0000 seconds, each repetition makes it increasingly likely that they're a bot.
If the user sets the CAPTCHA entry's input.value property directly instead of firing individual key press events with keycodes, they're probably either a bot, copy-pasting the solution, or using some kind of non-standard keyboard (maybe accessibility software?).
Basically, even if the CAPTCHA service already has a decent idea of whether the user is a bot, forcing them to solve a CAPTCHA gives the service more data to work with and raises the barrier to entry for bot makers.
I've found several websites switched to "press here until the timer runs out". Probably they're running the checks while the user holds the mouse button down, since the long press by itself would be trivial to bypass with automated mouse clickers.