No-JavaScript Fingerprinting (noscriptfingerprint.com)
222 points by yamrzou on Feb 6, 2022 | 103 comments



Note that among a sea of tracked browsers, the untrackable browser shines like a bright star.

Statistical analysis of these values over time (matched with client hints, ETags, If-Modified-Since, and IPs) will make most browsers uniquely identifiable.

If the malicious vendor is good, they even correlate the size and order of requests. Because that's unique as well and can identify TOR browsers pretty easily.

It's like saying "I can't be tracked, because I use Linux". Guess what, as long as nobody in your town uses Linux, you are the most trackable person.

I decided to go with the "behave as the statistical norm expects you to behave" approach, so I created my browser/scraper [1] and forked WebKit into a webview [2] that doesn't support anything that can be used for tracking, with the idea that those tracking features can be shimmed and faked.

I personally think this is the only way to be untrackable these days. Because let's be honest, nobody uses Firefox with ETP in my town anymore :(

WebKit was a good starting point for this because at least some of the features were implemented behind compiler flags...whereas all other browsers and engines can't be built without, say, WebRTC support, or without Audio Worklets, which are by themselves enough to uniquely identify you.

[1] https://github.com/tholian-network/stealth

[2] https://github.com/tholian-network/retrokit

(both WIP)


Wait, but the point of TOR isn't to hide that you are using TOR, but to blend in with the crowd of TOR users. Does my TOR browser give a different fingerprint than yours?


The IPs of TOR exit nodes are publicly known. I'm not sure people using TOR are aware of how the concept and peer discovery mechanism works.

I mean, at some point you gotta ask yourself why cloudflare shows all TOR users a captcha in a targeted manner :)


> I mean, at some point you gotta ask yourself why cloudflare shows all TOR users a captcha in a targeted manner :)

Because exit node IP addresses are known, as you have just said.

What does that have to do with user fingerprinting? And how does it answer @gtsop's question of whether different users of the Tor browser have distinguishable fingerprints or not?


The browser fingerprint may be software/hardware dependent.

TOR will hide your IP from interested parties (unless they have NSA resources / can buy a good part of the TOR network).


So what? There are plenty of stars in the sky, and if you are farbling [1] your star shimmers differently every time the surveillance-monster glances skyward.

The comment above regurgitates a misconception frequently found on HN: the assumption that the only defense against fingerprinting is to look exactly identical to everybody else. That is incredibly shortsighted.

[1] https://github.com/brave/brave-browser/issues/12069


I didn't review the linked projects, but isn't the "untrackable" browser one that does not implement or spoofs most APIs?

I think it would be, for the most part, trivial to make a text-based or extremely stripped down browser on top of existing projects, if you had the contracts mapped for appropriate code generation. There are IDLs for most Web APIs, so that is a head-start.

I think this would be achievable, but not as a browser that most people would want to use.


> WebIDLs

Exactly. One of the reasons I chose WebKit was that it implements its APIs based on the WebIDL schema files (iirc Firefox does this too since Aurora).

Though the C++ code generator is a really old Perl script, it's generally feasible to spoof the APIs so that they behave according to behaviour profiles of the most commonly used web browsers (e.g. Chrome/Edge on Windows).

The real challenge is to implement behaviour profiles that are also timing-specific, because some browsers have different timings in incognito mode vs. normal mode due to how memory is allocated directly in RAM. That's usually how incognito-mode browsers are identified by reCAPTCHA.


The actual blogpost is here

https://fingerprintjs.com/blog/disabling-javascript-wont-sto...

basically they use CSS trickery together with server-side stuff.

It's pretty clever.

  @font-face {
    font-family: 'Helvetica';
    src: local('Helvetica'),
         url('/signal/(token)/fontHelvetica')
         format('truetype');
  }
to detect installed fonts (which reveals the OS), and

  @media (featureX: value1) {
    .css_probe_42 {
      background: url('/signal/(token)/featureX/value1');
    }
  }
to detect browser features (which reveals the browser).
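The server-side half then just needs to record which of those signal URLs actually get requested. A minimal sketch of that idea, assuming Flask (the endpoint names, the in-memory store and the hashing are my own illustration; the real demo only exposes the /signal/(token)/... paths shown above):

  from collections import defaultdict
  from hashlib import md5

  from flask import Flask, Response

  app = Flask(__name__)
  signals = defaultdict(set)  # token -> set of observed signals (demo-only, in memory)

  @app.route("/signal/<token>/<path:signal>")
  def record_signal(token, signal):
      # Each CSS rule that applies (a local font resolves, a media query matches)
      # makes the browser fetch one of these URLs, revealing that signal.
      signals[token].add(signal)
      return Response(status=204)

  @app.route("/result/<token>")
  def result(token):
      # Hash the sorted set of observed signals into a fingerprint string.
      return md5("|".join(sorted(signals[token])).encode()).hexdigest()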


Thanks, I thought I saw the demo before on HN but I was not sure as I couldn't find the domain.

Previous discussion: https://news.ycombinator.com/item?id=29042791


Nice read, though.

I was surprised by the number of CSS tricks used for fingerprinting, all while having JS disabled.

A few days ago, I came across this "GPU fingerprinting" called "DrawnApart" => https://blog.amiunique.org/an-explicative-article-on-drawnap...

and was thinking that this must be the most advanced fingerprinting approach so far but after reading your article, I have to reconsider!


Eh. It's been pretty easy to fingerprint browsers for a while now, including those types of CSS hacks. The real feat is doing so without looking like you're doing it and for it to be durable (survive OS upgrades, reboots, etc).


navigator.userAgent will still give a more precise answer, as the number of people who fake their user agent (or have JS disabled) is smaller than the number of Windows/Linux users that have Helvetica (a clone or the real one) installed.


I thought it was going to use ETAG based fingerprinting/tracking, which I always thought was pretty clever. The etag header is supposed to be used to control caching, so it's typically a server-side generated hash of the requested resource's content. But, there's no requirement for it to be, so you can generate a unique one, and the client will send it back to you next time it asks for that uri. Sort of like a cookie.
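A minimal sketch of that idea, assuming Flask (the endpoint and the Cache-Control choice are my own illustration, not anything this demo does):

  import uuid

  from flask import Flask, Response, request

  app = Flask(__name__)

  @app.route("/pixel.gif")
  def pixel():
      # On revalidation the browser echoes the ETag it was given earlier
      # in the If-None-Match header -- effectively returning our identifier.
      client_id = request.headers.get("If-None-Match")
      if client_id:
          return Response(status=304)  # re-identified; the cached copy (and ETag) is kept
      # First visit: hand out a unique "ETag" that is really just an ID.
      resp = Response(b"GIF89a", mimetype="image/gif")
      resp.headers["ETag"] = uuid.uuid4().hex
      resp.headers["Cache-Control"] = "max-age=0, must-revalidate"  # force revalidation on every visit
      return resp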

The fingerprint sources are here: https://github.com/fingerprintjs/blog-nojs-fingerprint-demo/... Basically some css that uniquely identifies some browsers, then headers like user-agent, language, etc.


ETag based tracking shouldn't work across domains.


Specifically, it won't work across sites -- all major browsers (now) shard the HTTP cache by site. While www.example.com and forums.example.com are different domains, they're the same site ("registrable domain" or "eTLD+1"; example.com in this case). See https://publicsuffix.org/list/public_suffix_list.dat for the list of eTLDs (parents of registrable domains).
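In other words, the cache key is roughly (top-level site's eTLD+1, resource URL). A toy illustration (my own sketch with a tiny hard-coded suffix list; real browsers consult the full Public Suffix List linked above):

  def registrable_domain(host, suffixes=("com", "org", "co.uk")):
      # Toy eTLD+1: find the longest matching public suffix and keep one more label.
      labels = host.split(".")
      for i in range(len(labels)):
          if ".".join(labels[i:]) in suffixes:
              return ".".join(labels[max(i - 1, 0):])
      return host

  def cache_key(top_level_site, resource_url):
      # Same resource URL, different top-level site => different cache entry,
      # so an ETag handed out under one site is never echoed back under another.
      return (registrable_domain(top_level_site), resource_url)

  assert cache_key("www.example.com", "https://tracker.example/t.js") != \
         cache_key("www.example.org", "https://tracker.example/t.js")
  assert cache_key("www.example.com", "https://tracker.example/t.js") == \
         cache_key("forums.example.com", "https://tracker.example/t.js")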


Note that CNAME cloaking is also an issue here, because Firefox is the only browser with a userspace network stack whereas Chrome/Chromium relies on the OS...which means that Chromium- or Electron-based browsers cannot protect themselves from CNAME-cloaked domains.

Adblockers like uBlock Origin just have a domain list of known CNAME cloaked domains, but that's not based on the DNS entries directly because there's no API for this for web/chrome extensions.

Long story short: Sharding can be tricked with CNAME cloaking.


CNAME cloaking doesn't allow cross-site tracking via the cache: it's still sharded based on the site you're visiting. Imagine a setup like:

   www.example.com/foo: page you're visiting
   tracker.example.com: CNAMEd to tracker.example
   www.example.org/bar: a page you visit on another site
   tracker.example.org: CNAMEd to tracker.example
This will use two cache partitions: one for example.com, another for example.org. ETags observed by the tracker will be different between those two cases.


Do you maybe have a link to the relevant source so I can take a look at it? I was always under the assumption that Firefox and Chromium only cache the resulting A/AAAA entries.

In WebKit, the DNS entries don't play an active role in enhanced tracking protection, because they only use metrics of content vs data being sent to determine the importance of subdomains.

Theoretically, upstream WebKit could be tricked by spoofing a domain that's part of Quirks.cpp.

(e.g. microsoft.com or sony.com basically have access to all cookies, technically speaking, because their domains are hardcoded to get cross origin access)


The way ETag-based tracking works is:

* When you load a resource it may return a "ETag" header

* The browser stores the resource in its cache under a key like [site, resource url]. For example [example.com, https://example.net/foo] means "https://example.net/foo" requested when visiting some page on example.com.

* When the browser tries to use the resource again and sees that it's expired, it asks the server for it, sending along the previous ETag value in the If-None-Match header so the server can avoid sending an updated copy if the resource hasn't actually changed.

* Since the original ETag value is echoed back to the server, the server can tell this is the same browser it was talking to earlier.

Now, a few years ago, when the browser would store resources in its cache under a key like [resource url], without sharding by site, this was occasionally used like a third-party cookie. When visiting any site that is using the relevant tracker, the tracker would request https://example.net/foo, and as long as the resource had not been evicted from cache the tracker was able to reidentify users across sites. With the HTTP cache sharded by domain, however, this no longer works across sites.

This seems to me to be entirely orthogonal to CNAME cloaking; where do you see that fitting in?


The way trackers work these days is that they do not have their own domains (FQDNs) anymore.

There is a CNAME tracker.example.com and a CNAME tracker.example.org which both point to the same third-party tracker IP.

This allows them to infiltrate both site caches because - as I understand it - the sharded caches do not separate the third party domains there, as the Browser still thinks that tracker.example.com belongs to example.com.

They will probably have two separate ETag values for the same URL, but in both scenarios the cross-site protection mechanisms are basically nullified. Things like cross site cookies aren't necessary anymore because the public suffix list alone mandates that those domains are actually second-party domains by the same first-party origin.

I've seen some trackers go as far as abusing resource lifetimes (Last-Modified and Pragma/Cache-related headers), where they use a timestamp far in the past (i.e. in the 1980s) and "reserve" that millisecond as a unique identifier for a specific client they're tracking...in order to bypass implementations that try to prevent this kind of tracking via HTTP headers.
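For the curious, that trick looks roughly like this on the server (my own sketch, assuming Flask; HTTP dates only carry second precision, so the sketch reserves a second rather than a millisecond):

  from datetime import datetime, timedelta, timezone
  from email.utils import format_datetime, parsedate_to_datetime
  from itertools import count

  from flask import Flask, Response, request

  app = Flask(__name__)
  EPOCH = datetime(1980, 1, 1, tzinfo=timezone.utc)
  counter = count()  # each new client gets the next "reserved" second after 1980

  @app.route("/beacon.gif")
  def beacon():
      since = request.headers.get("If-Modified-Since")
      if since:
          # The offset from the 1980 epoch identifies the returning client.
          client_id = int((parsedate_to_datetime(since) - EPOCH).total_seconds())
          app.logger.info("returning client %d", client_id)
          return Response(status=304)
      resp = Response(b"GIF89a", mimetype="image/gif")
      resp.headers["Last-Modified"] = format_datetime(EPOCH + timedelta(seconds=next(counter)), usegmt=True)
      resp.headers["Cache-Control"] = "max-age=0, must-revalidate"
      return resp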


No, the cache is keyed by url (https://tracker.example.com/tracker.js vs https://tracker.example.org/tracker.js) and not by IP. So those resources will have separate cache entries with separate ETag/Last-Modified/etc values.


You could just use any regular cookie then, in this case.


I suppose ETags could allow you to continue tracking someone even if they purged their cookies?


I have Firefox RFP enabled, and as far as I know all these values will be the same across RFP users except the size of the browser window (which seems like a poor fingerprinting metric given the fact I can and do resize my browser during usage) - and maybe pixel density? (can't tell if this one is standardized or not).

If there are any other RFP users out there compare to mine and see if you get anything different other than browser window size: https://noscriptfingerprint.com/result/yiRFTom5qPpKPxOl


What if you are the only one in your area that uses FF RFP?


It says:

"4. Click the toggle button until “true” changes to “false”"

But for me there is no toggle button. It seems it's not able to do anything with my blocker settings.

When I allow the site, I see that it did change the screen resolution result compared to yours.


Even without RFP it seems to be very fragile, just resizing my window completely changes the result.


That's because it is including the dimensions of your window as part of your fingerprint. If your goal is to track users as they move across sites, matching browser size makes sense.


FYI, TOR Browser sets a default browser resolution that ignores resizing to mitigate this issue. This might only make you blend in with other TOR users, but this approach to resisting device/interface fingerprinting is quite interesting.


Yes, my signals only differ by screen size.


I wonder how unique this fingerprint really is. Getting the same fingerprint every time is only half of what makes a good fingerprint.

Fennec on Android: cdec914cb91d1a88fbd3e7834b7968c8


Agreed. It's not a "fingerprint" at all, as in most cases it won't be remotely unique.

For example: every user in the world with the same screen size, browser & platform will get the same result. For desktop it's window size rather than screen, but coarsely bucketed.

Thousands or millions will share the same fingerprint. It seems somewhat unrealistic to ascribe much advertisement or tracking value to this.


But these are not the only pieces of info. OS language, dark mode preference, installed fonts - all this stuff goes a long way toward pinpointing you.


Yep, this site should tell you if your fingerprint is unique among all the other visitors, like other fingerprint sites do. I know a positive would only be meaningful if the website gets popular, but a negative would say a lot regardless.


I am unpleasantly surprised that Firefox's "ui.prefersReducedMotion" preference is detectable from CSS. I expected a preference starting with "ui" to only affect the UI. It should be possible to reduce (or ideally remove) UI animations without affecting web pages.


I'd figure most web-apps have UI-like interfaces (contrasting with just static pages), so I can see why it might be useful for sites to be able to minimize transitions/effects/motion/etc. that may exist on their page.


Isn't the main point of the user preference flag to allow the website developer to serve a reduced-motion version of the site?


AFAIK, all the other ui.* preferences affect the Firefox UI, not websites, so it's a misleading name. And I have already disabled web page animations in userContent.css[0], which is the correct place for modifying CSS. I shouldn't have to choose between allowing annoying UI animations and allowing fingerprinting.

[0] To disable web page animations, in chrome/userContent.css in your Firefox profile directory:

  @namespace url(http://www.w3.org/1999/xhtml);
  *, :before, :after {
    transition: none !important;
    animation-delay: 0ms !important;
    animation-duration: 0ms !important;
  }
This also requires setting toolkit.legacyUserProfileCustomizations.stylesheets to true in about:config


This doesn’t worry me too much, the value seems like it should be the exact same hash for any iOS Safari visitor with the same screen resolution and browser language? I’d be fingerprinted as part of a group of (probably) several hundred thousand.

8d666b05c42878d6d6d364c410a4eef2

It’s a shame that browsers leak things like font presence, but of course when sites can read back canvas contents without querying the user is when it all goes out the window.


Chrome (Safari) on iPhone 13 Pro Max: db9df4c8c770242aaf1e0efdd5cf8ab2

There’s likely enough info in the hash to figure out exactly what device I’m on since nothing else will have the same pixel density, screen height, and screen width: https://stackoverflow.com/questions/46313640/iphone-x-8-8-pl...

Dark mode adds on one more bit, but everything else should be the same across iOS devices. Not sure if that’s enough to track individuals among a small userbase every time their IP address changes.


I'm pretty sure the Pro Max 12 has the same screen settings. I also wonder about maximized windows on a 1080p monitor. I know DPI of desktop monitors at one point was a constant returned value.


While this is cool and handy, sadly not of any use to me.

According to the UK ICO (Information Commissioner's Office), fingerprinting has to have consent.

https://ico.org.uk/for-organisations/guide-to-pecr/what-are-... "PECR also applies to ‘similar technologies’ like fingerprinting techniques. Therefore, unless an exemption applies, any use of device fingerprinting requires the provision of clear and comprehensive information as well as the consent of the user or subscriber."


I would rather see it as a way to easily demonstrate why GDPR is not simply about cookies and their management. It concerns any method of creating an identifier which would allow for tracking.


Why sadly? Why would you want to fingerprint users browsers anyway?


Cool. I get a different fingerprint every time. Although that doesn't prove that I'm not fingerprintable by this approach (one difference may be responsible for changing the hash), I'm kind of pleased.

My setup:

- Firefox

- Enhanced Tracking Protection

- uBlock Origin

- NoScript

- Privacy Badger

- Privacy Possum

- HTTPS Everywhere

- Clear URLs

- Decentraleyes

- Smart Referer

- JavaScript Restrictor


Not familiar with Privacy Badger or Privacy Possum, but I'm guessing they are what's subverting this technique? The other ones probably don't come into play.


NoScript allows you to skip font loading for allowed and blocked domains.


This is using local/system fonts, not remote ones.


Any other Tor Browser users get b15acf0466e7b2da06cfea09b1ab2cda? TBB is of course meant to resist fingerprinting...


c4abc6d4de3c5ccb4e9b229f50ed85be Tor Browser 11.0.4


That is quite a shitty way to fingerprint. Imagine an office full of old people who don't bother to install or configure their own OS/browser/fonts and only use company-provided laptops, all of which are the same, with default Windows 10 (or 7).

That would generate the same fingerprint. What a useless method; it should not be called fingerprinting.


This reminds me of browser fingerprinting via CSS[0], which has come up a surprising number of times. Sending the browser's width is a bit more convoluted though. From what I've seen, the popular (well, as popular as CSS-only fingerprinting is) method is to set a media query for each px value. When the user resizes the window, the new width is sent to the server. Obviously something like this would be generated code, not written by hand.

[0]: https://news.ycombinator.com/item?id=29794518
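A minimal sketch of generating such per-pixel width probes (my own illustration; the URL scheme and token are made up):

  def width_probe_css(token, min_width=320, max_width=3840):
      # One media query per candidate width; only the rule matching the current
      # window width fires, so the resulting request reveals the width -- no JS needed.
      rules = []
      for w in range(min_width, max_width + 1):
          rules.append(
              "@media (min-width: %dpx) and (max-width: %dpx) {"
              " body { background: url('/signal/%s/width/%d'); } }" % (w, w, token, w)
          )
      return "\n".join(rules)

  # Served as a static stylesheet; after a resize the newly matching rule fires again.
  print(len(width_probe_css("demo-token").splitlines()))  # 3521 generated rules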


This uses an extensive list of CSS options that each make a server request with url('URL/CONTAINING/TRACKING_CODE');

The view results button then points to https://noscriptfingerprint.com/compactResult/TRACKING_CODE

How unique a browser fingerprint is generated by the CSS options would require further consideration. An easy fix would be to disable url() values for styles, though.

Edit: Blocking CSS entirely in uMatrix causes it to mostly fail.


After 3+ months, they haven't published any data, or I just couldn't find it. Which kind of makes the whole endeavor meaningless.


For me the fingerprint changed after the first refresh, but then it stayed the same. I wonder what's different in the first visit


With the latest Tor Browser (11.0.4 based on Mozilla Firefox 91.5.0esr) on Arch Linux, I get the following:

* 9dfe9b69a9f18ef1c0d313aa4c013b52 in regular windowed mode

* 0fa96aa3ae2698bc2d1187f39b79db36 in fullscreen mode (F11).

Does anyone else get any of these fingerprints? If so, that is evidence that Tor Browser is well-designed. If not, that is evidence of some details overlooked.


Apart from window size and that Ubuntu-specific font, this mostly gets things openly sent in my User-Agent and Accept headers.


The text-only browser I am using does not support CSS.

It does not send ETag headers either. The local forward proxy removes all HTTP headers except Host and Connection, and Cookie where needed.


On many websites, you will literally be the only person doing that. This is a unique fingerprint.

This might not matter to you, since it sounds like privacy isn't your primary motivation here, but it is worth pointing out that custom-patched browsers are going to be more fingerprintable, not less.


Yes, I understand some HN readers have this thought. I have gotten similar replies before. However, consider that I only send two to three headers: Host, Connection and (optionally) Cookie. There is nothing unique about the text-only browser by virtue of the patches. The TCP connection and TLS are handled by a proxy. Sometimes I send the TCP requests with netcat, tcpclient, socat, etc. Then I open the HTML file with the browser.

Perhaps the proxy has a fingerprint, but it is a very popular proxy in widespread use.

Maybe the OS, e.g., the networking stack, has a fingerprint.

Maybe the TCP clients have fingerprints.

But seriously, what is the point of thinking about these things. Who is going to go to such lengths to try to "identify" me. What is their purpose. Am I a spy trying to hide from computer forensics nerds. No. Am I trying to "blend in". No. I am trying to improve the web experience. That involves 1. sending the minimum data (avoid feeding the online advertising juggernaut) and 2. reducing advertising, ideally to zero. I have done a good job of both 1 and 2.

Who is going to try to advertise to a user who is using a text-only browser.

Also, assuming hypothetically, for argument's sake, we tried to get every user to "blend in" by using the exact same browser with the exact same settings on the exact same computer. Which would be easier: (a) users have to copy all the settings and idiosyncrasies of a "modern" graphical browser including numerous HTTP headers or (b) users have to refrain from running CSS or Javascript and limit themselves to sending only two to three headers (Host, Connection and, optionally, Cookie) and only send one request at a time. The more points of differentiation one has to worry about, the greater the chance one will overlook something. Needless to say, the "modern" browser presents a greater number of points of differentiation than the text-only browser.

The goal for me is to improve the web experience, including minimising (a) how much data I voluntarily share and (b) advertising. I have succeeded. The use of a text-only browser in lieu of a graphical one for recreational web use is part of the solution. I also benefit from other strategies that help with (b). Attempts to advertise to me online are few and far between. It is a fruitless endeavour.

It is not a goal of mine to try to "blend in" with other web users. It appears this is a goal of some HN commenters. Alas, other web users share heaps of data voluntarily and subject themselves to large amount of advertising. There is arguably a price to pay for trying to appear "same".


> But seriously, what is the point of thinking about these things. Who is going to go to such lengths to try to "identify" me.

Nobody's going to very much trouble at all. They're just dumping every characteristic they can gather about you into an AI system, like a Bayesian classifier or a Convolutional Neural Net. It doesn't require very much work to take into account clearly discrete data like the set of headers you submit, or the delay between switching pages, or parts of your IP address.

You hear a lot of stuff on HN about how inaccurate AI is, and much of it is true. But for figuring out when the set of HTTP headers correlates with your shopping habits, it should actually do a pretty good job, because it's basically just a matter of finding ways to correlate data together. No need to recognize when it's missing some form of outside context, because it doesn't "fail" or "succeed", it just does "worse" or "better." As long as it does better than a coin flip, it's worth it.

Right now, it's pretty effective to block ads by just not loading them, but there's no universal law that says it will always be that way. That already doesn't work on YouTube, which serves the ads from the same domain as the content, meaning that most ad blockers don't work on it. If ad blocking keeps becoming more popular, tactics like that will become more common. Once the ad serving becomes strictly first-party, relying on JavaScript looks like an increasingly terrible idea, not because of the minuscule number of people blocking JavaScript, but because you can't trust the potentially-malicious client to defend against click fraud.


These replies about "uniqueness" are in response to me disclosing I use a text-only browser or some non-graphical client to access websites. Why should "uniqueness" matter to me. As I said, I am just trying to avoid the annoyances of graphical web browsers. I am successful in doing that.

The majority of web use for me is not shopping. Why should I use the same browser for shopping that I use for recreational web use.

As for YouTube, this has been brought up many times. I cannot speak for other users, but I see zero ads when using YouTube. I search and download videos from the command line. With very few exceptions I never need to use youtube-dl because the signature values are already in the web page. There is no need for a "Javascript video player" to submit HTTP requests. The Javascript-enabled behavioural tracking on the YouTube website is insane. I use tiny shell scripts to search and download. I am aware of "SponsorBlock" which suggests some videos have ads embedded in them however I have never seen such a video. Most videos I watch are non-commercial.

"Click-fraud" is IMO secondary to fraud on the part of Big Tech and Big Tech wannabes who induce advertisers to purchase online advertising knowing, but not adequately disclosing, that it suffers from such inherent technical flaws.


> These replies about "uniqueness" are in response to me disclosing I use a text-only browser or some non-graphical client to access websites.

And your disclosure was in response to a CSS-based fingerprinting demo. If being fingerprinted doesn’t even matter to you, and you use a text-only browser just because you prefer the UX, then why bring it up on this article in the first place?


Because when there is a thread about a demo, and for some users the demo does not work, it is common to see comments that the demo did not work.

Being fingerprinted does matter to me. It is one more reason why the large, complex, graphical browsers supported directly or indirectly by online advertising annoy me. It is nigh impossible for users to control those programs.

As it happens, using a text-only browser, the TCP clients and the use of a proxy to remove headers all make fingerprinting less useful. The fingerprinting techniques used by "tech" companies tend to rely on the features of the large, complex, graphical browsers supported directly or indirectly by online advertising. For example, CSS fingerprinting does not work with a text-only browser doing its own formatting and ignoring CSS. Although this is not the primary reason I use a text-only browser, TCP clients and a proxy, any attempted fingerprint of that setup would indicate a user who cannot see ads. What would be the use of the fingerprint then.


Thanks for this comment. This is basically my position. While I prefer that sites would respect "do not track", ultimately I just don't want to see all the shitty ads. I don't really care if you're trying super hard to track me, though I think you're a fool if you do based on what I do online (read HN, wikipedia; download academic papers, mostly?).

I sympathize with the people who are going for pure anonymity. If I could be anonymous and still have a usable web, then I would do that. If you really think you'll learn something about me by tracking me, then whatever. I have still never been served a relevant ad in my life, so, uh, great job there. But in the end I just don't want to see all the shitty ads.

I would be interested to know what these people think they know about me. Based on my experience as a cognitive scientist, I suspect they know an awful lot less than they think they know ... or at least claim in their sales pitches to advertisers.


"Thanks for this comment."

Likewise.

What these replies about "uniqueness" seem to ignore is that the majority of traffic on the internet is so-called "bots". In other words, it is traffic from clients that are not Chrome, Safari, etc. It is ridiculously easy to be mistaken for a "bot" when submitting requests manually, if one does not know what they are doing. For example, editing a single HTTP header is often enough. "Bot detection" is more often than not based on laughably crude heuristics. What happens if the user makes a single request manually and that header is missing. There is nothing to check. In almost all cases, nothing happens. There is no penalty for reducing the amount of information sent. In any event, it is rather easy to unintentionally "blend in" with the majority of internet traffic, which comes from "bots".

Those professing to have superior knowledge about user behaviour, including Big Tech, still cannot tell if someone is submitting requests manually or not.^1 (Absent keylogging on the users computer.) Their superior knowledge of user behaviour only applies to users who use "modern" browsers that place high emphasis on graphics. Chrome, Safari, etc.

No one is going to try to advertise to a "bot", i.e., a non-graphical client. It would be ineffective. The online advertising industry relies on graphical web browsers like Chrome and Safari.

Using a common browser with the default settings to try to "remain" anonymous comes at a cost. Default settings do not include installation of extensions, e.g., ad blockers.

1. Contrast this with the different question of determining whether or not a user is using a certain client, e.g., Chrome, Safari, etc. That is an easier question to answer. However detection of other clients is not done. I am never notified whether or not I am using, e.g., tcpclient, original netcat, socat, openssl, etc. How does one detect the difference. And assuming they could tell, then what. How will ads be served. The HN commenters replying about "uniqueness" fail to consider why there is so much effort to "fingerprint". It is driven by advertising which puts a monetary value on gathering user data. Using a modern browser, sending more data voluntarily to "blend in", feeds the online advertising industry and ensures such surveillance efforts will only increase. As checkyoursudo suggests, the data collectors will "claim in their sales pitches to advertisers" that they know a great deal about users, regardless of whether the data they have collected is truly accurate or usefully informative. Feeding the data collectors "fake" or "non-unique" data is one idea, but another idea is not sending the data at all. For HTTP requests, the latter works for me.


Do you generally have a good time on the web with that?


Yes. It is a relief from the times I must use the graphical browsers.

For example, I can search for and consume information much faster and more efficiently, free from distraction.

Perhaps one needs more than one client for the web. For example, I need a netcat-like program, some helper programs for working with HTTP and a text-only browser for viewing HTML. Plus a TLS proxy to deal with all the HTTPS. This is in addition to graphical, everything-but-the-kitchen-sink desktop or mobile browsers.

Perhaps a single, large, graphical browser directly or indirectly funded by advertisers is insufficient for all web use. Other clients may work better in certain situations. Certainly this is true for me. The smaller programs are faster, more robust and offer me greater flexibility.


Which browser are you using?


For text-only, I prefer Links. I use various versions compiled statically with personal patches.


I wonder why we are not seeing more completely server-side analytics. With tricks like this you can get by with no or only minimal JavaScript. You can also set a session cookie to be more accurate. If the cookie is necessary for the functioning of the site, you don't even need GDPR consent.

Practically, and I know this is cynical, but you don't really have to follow the GDPR, you just have to make sure there is no outside sign you are tracking users. And frankly, if I had a website, I would want accurate statistics, but I would want to avoid a cookie prompt, and I would want as little external JavaScript as possible. So it is weird that there is no black-hat stealth visitor analytics yet.
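For what it's worth, the mechanics are trivial. A minimal sketch of cookie-plus-log, server-side-only analytics, assuming Flask (the cookie name and logged fields are just for illustration):

  import uuid

  from flask import Flask, request

  app = Flask(__name__)

  @app.after_request
  def log_visit(response):
      # Log only what the request already carries: session id, path, UA, referer, IP.
      visitor = request.cookies.get("sid") or uuid.uuid4().hex
      app.logger.info("%s %s %s %s %s", visitor, request.path,
                      request.headers.get("User-Agent", "-"),
                      request.headers.get("Referer", "-"), request.remote_addr)
      response.set_cookie("sid", visitor, httponly=True, samesite="Lax")
      return response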


> Practically, and I know this is cynical, but you don't really have to follow the GDPR, you just have to make sure there is no outside sign you are tracking users.

Why bother, then?

I mean this in the relatively broad philosophical sense. There are few who seek pain, simply because it is pain, and there are few who implement complex tracking schemes simply to implement complex tracking schemes. They only go through this much work if there is some pleasure or profit to be gained from it.

If you start doing significantly-meaningful ad targeting, someone might notice the striking correlation. If you're a big enough organization for the practice to become dangerous, then even if nobody on the outside notices, someone might leak. What's the point in gathering the information if you can't use it for fear of someone figuring out what you did?


My fingerprint is 05340cc39b7388b85993bd6117d4ebbb


The fingerprint changes on every refresh for me. I have uBlock, Privacy Badger and similar extensions enabled if that matters.


The DuckDuckGo Android browser I'm currently using doesn't prevent me from getting consistently fingerprinted.


I just changed my preference for dark mode and got a different fingerprint. If it's not resilient to modest changes, it's not really that helpful at tracking anyone. There are already plugins to randomize some header data, so this type of thing has been known and protected against.


If you have past data collected on the users, correlating users despite fingerprint changes should be pretty easy. If the fingerprint itself encodes data and isn't just a cryptographic hash, you don't even need to have past data collected, you just need fingerprints.


It depends on which settings actually change your fingerprint and which don't. If only 10% of people use the dark mode toggle, that's still 90% of users being tracked, which could be worthwhile.


How practical would this method of recognising your fingerprint with CSS be?


This, and also: the power of tracking comes from being able to share data/track across domains. So the best way to mitigate this is not to dumb down our browsers but to fight cross-domain access to this data. This can be done either by regulation (see GDPR) or technical measures (disabling cross-site scripting, cookies, etc.).


They can simply exchange the data server side, no?


In theory yes, but they can't use the data for legitimate business while also complying with the regulations.


Well, for one thing, an IP address is always going to be the easiest way to track a user, but there is also user agent information, of which browsers happily give away too much.


> an IP address is always going to be the easiest way to track a user

Note that an IP address is not unique to a single user (NAT, CGNAT, mobile networks) and may frequently change.

> also user agent information

All users who use the same version of a browser on the same OS typically share the same user agent string (unless modified with a plugin or an extension). It's an indicator, but can't be used as a fingerprint itself.


> IP address is always going to be the easiest way to track a user

This is changing: Apple has rolled out Private Relay, and Chrome is planning some combination of willful IP blindness and near-path NAT.

> user agent information which browsers happily give away too much information

So is this: all the browsers are working on reducing how much they put in the UA.


Yes, for many. VPNs, mobile connections, and commercial work connections (using a proxy) will throw off IP address checks, as they report a single IP for large pools of users. You'd have to also use the MAC address, but mobile devices have MAC address swapping, etc.

I once wrote a tool that would capture all network requesters and reverse-fingerprint them through a combination of operating-system-specific responses to network oddities (e.g. fragmented TCP frames), location, the routers they connect through, etc., combined with the other browser signals available.


win11, brave/chrome, 1440p 165hz: 9b93344ef597a0b8ef7184d4a693d548


This doesn't seem to work very well. Want a better scriptless cross-site tracking mechanism? Check out https://xsid2-demo.glitch.me and https://xsid2-demo.easrng.net and note how they both get the same id.


Ok sure, but how does it work? I clicked around your website but there doesn't seem to be a description.


I haven't written a description yet. Here's a diagram I just made (I'm on my phone rn, please forgive any spelling issues) https://owo.whats-th.is/56opvAS.html


Thank you, interesting!


Neither of the above links do much for me.

There was an initial screen, then a redirect to another page containing the text below, and appearing to depend upon JS being enabled.

"Waking up

To keep Glitch fast for everyone, inactive projects go to sleep and wake up on request."

That redirect is different for the two initial URLs, but is of the form:

    https://xsid2.glitch.me/https%3A%2F%2Fxsid2-demo.glitch.me%2Fcb%2FHASH
where HASH is a sequence of hex characters, different for the two original URLs.

I'm using Firefox, with uMatrix having scripts and css disabled by default, but 1st party cookies enabled.


A second try loading each site in new tabs seemed to work, but this time each gave a different ID. One starting 3fa2, the other f918

Reloading each tab then gave two different numbers, 7eb6 and 6f83

Subsequent reloads did not change again.

Loading the sites into two more tabs gave yet another pair of numbers (1934 and 1667), and reloads of those tabs yielded another pair (b308 and 3df8).


Oh huh, I'm seeing this too. It worked at some point, not sure why it isn't now. I'll look into it.

Edit: it looks like this happens when the sites have gone down and are being restarted on load. I'll move them to my VPS, that should fix it.


Ok, it should work more reliably now.


This doesn't work very well.

On the glitch website, I get f9703...

On the easrng website, I get 8ecd6...


Hmm, what browser? I've tested normal Firefox, Chrome, and Bromite and it works on all of them.


I get different results for each domain on Firefox Android.


It works on Fennec, even in private mode. (Fennec is the F-Droid Firefox build.) Do you have any addons? It works with uBlock Origin on desktop, haven't tried mobile.


I'm on a desktop using FF with uBO, both with fairly locked-down configs, and it doesn't work for me. That is, both websites eventually lead to a broken page because of too many redirects, and the URLs in the URL bars of each tab do not contain the same ID.

Without digging too deep into what your websites are doing, I did notice that they're trying to set / read cookies, which is a) being blocked by uMatrix in general, and b) perhaps being blocked by me having configured FF to block third-party cookies in particular.


Oh yeah, it needs 1st party cookies enabled.


Can you try again? I've moved one of the parts off Glitch and onto my VPS which seems to have fixed the issue.



