DoH Privacy Enhancement: Do Not Set the User-Agent Header for DoH Requests (bugzilla.mozilla.org)
116 points by generalpass on Dec 17, 2019 | 59 comments



I don't see the benefit of the built-in DoH in Firefox, since it shifts the trust from my existing DNS provider to Cloudflare[1] (based in the US).

Today I discovered that I couldn't just kill the disqus.com requests with my usual method of sink-holing them in /etc/hosts, because for some reason Firefox ignored the /etc/hosts file. Chrome seemed to work fine, and local dig requests also returned

  disqus.com.  0 IN A 0.0.0.0
I was left dumbstruck.

I was able to override this in the about:config as described in [1].
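
For reference, the override described there boils down to a single about:config pref (a sketch from memory; assuming the TRR mode values haven't changed, 5 means "DoH disabled by explicit user choice" and 3 would mean DoH-only):

  network.trr.mode = 5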

I really don't like how Mozilla ignores what is defined in /etc/nsswitch.conf, because this breaks all the blacklisting rules a user has set up themselves. At least they should have produced a huge warning telling users that this ignores any local blacklists that are in place.

[1] https://ungleich.ch/en-us/cms/blog/2019/09/11/turn-off-doh-f...


Firefox is doing the correct thing here. Firefox cannot read /etc/hosts without breaking some applications since NSS is a series of black-box modules that Firefox can't assume work any particular way. There's nothing special about /etc/hosts on a modern Linux system.

The files module on your system could read /etc/dns or the blarg module could read /etc/hosts but in JSON format.

The thing you're annoyed about is that Firefox with DoH enabled doesn't use NSS at all when performing requests, but there is absolutely no way Firefox can do both -- it can either send requests through the NSS gauntlet and use whatever it returns, or not use it at all -- "respecting /etc/nsswitch.conf" is impossible to do correctly.

Linux systems don't provide the right facilities to supplement application level DNS in the way you want. If you want to do DNS in your app you must ignore libc DNS.
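
To illustrate the modularity point, a typical hosts line in /etc/nsswitch.conf looks something like this (module names and order vary by distro; only the "files" module happens to consult /etc/hosts):

  hosts: files mdns4_minimal [NOTFOUND=return] dns myhostname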


Firefox should not be ignoring libc DNS without prompting.


>At least they should have produced a huge warning telling users that this ignores any local blacklists that are in place.

It's not just here, it's unfortunately common with Firefox. I still can't believe that Firefox just silently disables add-ons if they are no longer supported instead of giving a giant colorful warning.


Is this true if you block the canary domain in /etc/hosts?

https://support.mozilla.org/en-US/kb/canary-domain-use-appli...


I just tested this with dnsmasq[1] and can confirm it works.

It's good to know that there is a method but it looks a bit strange imo (see this article[2]):

>> Now this seems a bit weird: DoH to a public resolver seemed intended to defeat an untrusted local resolver, but an untrusted local resolver can disable the use of DoH to the public resolver in this fashion. In fact, if you have content filtering in place on your DNS resolver, Firefox will also use that as a metric to decide whether or not to enable DoH to the default public resolver.

>> In other words: if the browser detects that one of the things that DoH would help protect against (censorship / content filtering) is in place, then it will not enable DoH. ?? Well, for you as the network provider, that's good news -- you remain in charge, as so often, it boils down to having to trust the things you have to trust -- if you (the end user) don't trust your network provider, you can't use their services to bootstrap a trusted environment.

[1] just add the line for the canary to /etc/dnsmasq.conf:

  address=/use-application-dns.net/
[2] https://www.netmeister.org/blog/doh-dot-dnssec.html


Yeah, this was a surprise to me as well. Spent a while trying to figure out why Firefox wouldn't show a local development instance even though curling it worked fine.


If you are using DoH, wouldn’t you need to ignore the local resolv/nsswitch settings? Otherwise your request would get sent out to whatever the configured DNS server is.

I’m sure there is some smarter way to do this, but when you’re handling network configuration at the application level, you’re going to miss local changes (including blacklists).

Maybe the best way to handle this would be to change the system level resolver to use DoH?
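
If you go down that road today, the closest off-the-shelf option on Linux is probably encrypted DNS in the system resolver; e.g. systemd-resolved can do DoT (not DoH) with roughly the following, a sketch assuming a reasonably recent systemd:

  # /etc/systemd/resolved.conf
  [Resolve]
  DNS=1.1.1.1
  DNSOverTLS=yes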


> If you are using DoH ...

Assuming you mean the user-space applications, I agree.

I am not against DoH itself. But the assumption being made here by Mozilla is that all users trust the US or US-based corporations by default. This might be the case for some people (including myself in some cases, but not as the default).

This could have (imo) been solved with a couple of UI tweaks that educate the user about the implications. My browser-extension ad-blockers still protected me on most sites, but this new default rendered my second layer of defense (/etc/hosts) useless. The worst thing is that I was unaware for several months that /etc/hosts had become useless.

I wish Mozilla would make it a core policy that any time a change makes a trust assumption on behalf of users in any country[1], this is communicated not just in the changelog or blogs but as part of a guided tour or other visual aids that really spell that trust assumption out.

> Maybe the best way to handle this would be to change the system level resolver to use DoH?

Agreed, but I also see the difficulty in making the FOSS world agree very quickly on how this is supposed to be implemented as some kind of default (it could take a decade or longer, while just pushing it in under the radar gives immediate results, at the cost that they need to form agreements with various providers, e.g. Cloudflare).

[1] They seem to be rolling out different features anyway depending on which jurisdiction you are in, as shown in the case of "Mozilla: No plans to enable DNS-over-HTTPS by default in the UK" (whose purpose isn't to give freedom[TM] but to comply with the UK nanny-law to ban free speech (and porn, but mostly free speech)). https://www.zdnet.com/article/mozilla-no-plans-to-enable-dns...


I'm glad this happened. I feel like it is one of those "if you don't do it now, you won't be able to do it later" situations. As more DoH implementations spring up, at least one of them will require the User-Agent for no reason.

Plus it is useless, may allow hacky discrimination, and wastes bytes.


I wonder if the best approach might be to specify the precise request format required for DoH and actively deny requests with extra information.

'Thou shalt provide exactly this, no more and no less'

Depends on whether we want to consider later 'improvements' to be silent extensions or an entirely new protocol version. Given how easily HTTP headers have been abused in the past, it might be worth the inertia to limit extensibility by design in order to limit MITM tracking.

But MITM middleboxes will of course start to rely on that...

I suppose that if TLS is broken for this part of the system you're rather screwed anyway, but defence in depth is always worth considering.


What's the point of using HTTP for DNS if you aren't going to use its abilities?


Its raison d'être is largely to mitigate the issues that limited DoT's (DNS over TLS) widespread adoption, specifically "Middlebox Bypass."

Meaning if you connect to a public WiFi network, many will only allow captive DNS and block other protocols including DoT. DoH is difficult to distinguish from other HTTPS traffic and is more likely to successfully resolve (since they cannot block HTTPS and still be useful).

Therefore you've created a secure, flexible, and reliable DNS replacement with only mild technical downsides on modern hardware (a lot of the complaints are ideological or relate to specific incompatibilities with existing solutions like the lack of HOSTS file support in the initial offerings).

So I guess everything I just said is "the point," rather than trying to use every single possible combination of HTTP headers.
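
As a concrete illustration of "it's just HTTPS", a DoH lookup needs nothing more than curl (a sketch; --doh-url requires curl 7.62+, and the JSON endpoint in the second command is Cloudflare-specific rather than part of RFC 8484):

  # resolve example.com over DoH, then fetch it, all over port 443
  curl --doh-url https://cloudflare-dns.com/dns-query https://example.com/
  # or query the resolver directly via its JSON API
  curl -s -H 'accept: application/dns-json' \
    'https://cloudflare-dns.com/dns-query?name=disqus.com&type=A'

To a middlebox, both look like ordinary HTTPS connections.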


DoH as a protocol has no issues with HOSTS files. The issue is browsers implementing DoH instead of operating systems. If browsers stayed in their lane, and let operating systems implement DoH, compliant with corporate IT design and systems like the HOSTS file, everything would be fine.

Microsoft has already committed to implement DoH in Windows itself. Browsers just need to stop trying to be their own DNS clients.


Mozilla's operating system bombed horribly, so the only leverage they have is with the browser.


Why is a browser developer trying to apply "leverage"? In that, we've already located the problem.


Spot on. This needs to be supported and implemented at the OS layer, not the application layer.


OSs didn't implement it, though. And they never would unless browsers forced it on them. The right approach is for browsers to bring it in and then, when the OS supports it, default to the OS version. But right now I don't know of any OS other than (I think) Android that supports it.


Yes, that's the problem.


> Meaning if you connect to a public WiFi network, many will only allow captive DNS and block other protocols including DoT. DoH is difficult to distinguish from other HTTPS traffic and is more likely to successfully resolve (since they cannot block HTTPS and still be useful).

Are you referring to HTTPS-intercepting proxies? (i.e., if you want to use that network, you'd also have to install a custom root CA cert)

On a typical "coffee shop" wifi (where users are not expected to install custom certs), shouldn't DoT and DoH be virtually indistinguishable to the network as they are both shielded by TLS?


> Are you referring to HTTPS-intercepting proxies?

No, I am referring to what I said. Public WiFi networks only allow their own DNS resolver and block all others and most other non-HTTP/HTTPS network traffic. This breaks DoT and even unencrypted UDP DNS to your choice of resolver (outside of the WiFi's captured DNS/Middlebox DNS).

A lot of free public WiFi today breaks DoT and works with DoH. DoH is indistinguishable from other HTTPS traffic, while DoT and DoH are highly distinguishable from one another. DoT is trivial to block (often without even targeting it specifically).
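
For what it's worth, DoT runs on its own well-known port (853/tcp), so a single firewall rule of roughly this shape is enough to break it, while DoH shares 443 with all other HTTPS traffic (a sketch, not taken from any particular vendor's config):

  iptables -A FORWARD -p tcp --dport 853 -j REJECT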


Out of curiosity, do you know how they do it? Just port blocking?


> Therefore you've created a secure, flexible, and reliable DNS replacement with only mild technical downsides on modern hardware

One (semi-ideological) concern I believe is valid is that this makes it increasingly complex to understand which HTTP features can or cannot be used somewhere - and when writing a naive implementation, there is the risk of accidentally enabling features that are supposed to be disabled.

E.g., when WebSocket was developed, there was some discussion about how much HTTP should be allowed in the initial handshake - specifically, whether or not redirects should be followed: Formally, following redirects would seem reasonable since, until the handshake is completed, you are speaking "ordinary" HTTP, which redirects are a part of. Practically, though, it didn't make a lot of sense, as the requirement would have added a lot of complexity to clients - even though there was no demand for the feature in the first place. In the end, I believe it was decided that the handshake is actually a tightly limited subset of HTTP and any server response except 101 is invalid.

However, last time I checked, browsers in the wild do send cookies and user-agent strings in the handshake, even though you could argue those have nothing to do with WebSocket either.

So with DoH, we now have another "dialect" of HTTP with its own subset of features: On DoH requests, the User-Agent header must be omitted and cookies must never be stored. I can already imagine future privacy exploits if some DoH implementations just used a standard HTTP library and e.g. forgot to disable cookies.
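
As a sketch of how easy that mistake is to make, here is a minimal RFC 8484 (GET, wire-format) lookup built on a general-purpose HTTP stack; the resolver URL and the use of dnspython are assumptions, and note that the library quietly adds its own User-Agent unless you override it:

  import base64
  import urllib.request

  import dns.message  # dnspython, assumed available

  def doh_query(name, rdtype="A", url="https://cloudflare-dns.com/dns-query"):
      # Build an ordinary DNS query and base64url-encode it per RFC 8484.
      q = dns.message.make_query(name, rdtype)
      b64 = base64.urlsafe_b64encode(q.to_wire()).rstrip(b"=").decode()
      req = urllib.request.Request(
          url + "?dns=" + b64,
          headers={
              "Accept": "application/dns-message",
              # urllib would otherwise send "Python-urllib/3.x"; forgetting to
              # blank this (or to disable cookies in a session-style client)
              # is exactly the kind of leak described above.
              "User-Agent": "",
          },
      )
      with urllib.request.urlopen(req) as resp:
          return dns.message.from_wire(resp.read())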

I believe using an actual custom protocol would make it a lot easier to understand which features are and are not available.


> DoH is difficult to distinguish from other HTTPS traffic and is more likely to successfully resolve (since they cannot block HTTPS and still be useful).

Considering that most users would keep the default setting and there are two browsers with two default DoH servers, you could just block a few IP addresses and effectively block DoH for >95% of users.
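
A sketch of what that would look like, with a placeholder address since the actual default endpoints are hostnames like mozilla.cloudflare-dns.com that may move between IPs:

  iptables -A FORWARD -p tcp -d <resolver-ip> --dport 443 -j REJECT

repeated for each default resolver's current addresses.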


> many will only allow captive DNS

That's an anti-feature, something one shouldn't even call internet access. Engineering entire protocol stacks around it sounds like enshrining it as if it were good practice.

It's reinforcing an inadequate equilibrium.


So you are left with the choice of making something that works right now and will provide enhanced privacy for everyone, or attempting to fight middleboxes and corporate configs and seeing a similar rate of adoption as IPv6.

Once absolutely everything is running through opaque https requests we might see middleboxes go away since they aren't able to do anything anymore.


> So you are left with the choice of making something that works right now and will provide enhanced privacy for everyone, or attempting to fight middleboxes and corporate configs and seeing a similar rate of adoption as IPv6.

Actually solving the problem is always better than enshrining it by providing an ugly workaround. Even if it takes more time.

Imagine how quickly IPv6 would have been adopted if NAT didn't exist. DoH is kinda like that, a kludge that ultimately hinders progress.


What you are most likely to see is more corporate deployment of middleboxes that MITM https requests and an overall reduction in privacy for those on corporate networks.


The User-Agent header was a mistake from the very beginning. It was meant to help servers cater to different browser capabilities, but it ended up being misused to segregate the net and drive non-standard features.

Without the User-Agent header, browsers would have been forced to standardize earlier.


The point, iirc, is to run it on port 443 so that it doesn't get blocked.


That would explain DNS over TLS on port 443, no HTTP needed.


Yes, but the initial handshakes would look foreign compared to regular port 443 traffic.


To stop people from blocking DNS over HTTPS.


If it comes to wasted bytes, there is also the Server response header, which is high on my list of things to disable whenever I get the chance.


To save everyone else from googling, DoH is DNS over HTTPS.


Not to be confused with DNS over TLS (DoT)


I feel like User-Agent in general is not well suited to the modern web. Most browsers pretend to be some version of Mozilla or Netscape. And most things that scrape sites, like Apple's messaging app, pretend to be something equally outdated. Deprecating User-Agent seems like the best course of action in the long run.


I'd be OK with getting rid of User-Agent if some kind of "capabilities supported" header were added. Something like

  Agent-Capabilities: 1; 97464bde5d94a54f6e199309489a4b60
where the 97464bde5d94a54f6e199309489a4b60 is a set of bit flags in hex, each representing some feature, and the 1 before the semicolon is a version number for the bit definitions. The assignment of these flags would have to be standardized. When new versions of HTML, CSS, or JavaScript are standardized, that would include assigning bits for their new features, and bumping the bit definition version number.
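
A server-side check against such a header could be as small as this sketch (purely hypothetical, since neither the header nor any bit assignments exist):

  def has_capability(header_value: str, bit: int) -> bool:
      # "1; 97464bde5d94a54f6e199309489a4b60" -> version, hex-encoded bit flags
      version, flags_hex = (part.strip() for part in header_value.split(";", 1))
      if version != "1":
          return False  # unknown bit-definition version; assume nothing
      return bool(int(flags_hex, 16) & (1 << bit))

  # e.g. has_capability("1; 97464bde5d94a54f6e199309489a4b60", 42)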

You might say that it is better to do run-time checks on the client for features you want, and have the client then load workarounds if necessary for missing features, and that indeed is a good way to go in many cases (maybe even most cases).

That method, though, only works going forward. Sometimes I want to know what capabilities past visitors to my site have had. For example, I'm going to be redoing some pages at work and want to know if I can require certain CSS and JavaScript and HTML features.

I could make a list of the features I want to use, add some JavaScript to those pages to test for those, and send the results back to the server...and then I'd have to wait weeks or months or longer to get a good idea of what the consequences would be of requiring those features.

With User-Agent strings, all I had to do was pull up the site logs. Our shopping cart logs the User-Agent string, and page flow through the cart. It was an easy matter to pull up the last 18 months of those logs, and write a script to find all the successful orders and make a table showing how many there were for each browser. (I limited it to successful orders, because those would almost certainly all be real people in regular browsers that are reasonably honest about the User-Agents). I could then look up the features I was interested in on https://caniuse.com/ and see which of the features were not on browsers that we still get significant traffic through.
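
That kind of tally is only a few lines of scripting (a sketch, assuming a made-up log format where successful orders contain an "order_complete" marker and the User-Agent is the last tab-separated field):

  import collections
  import sys

  counts = collections.Counter()
  for line in sys.stdin:
      if "order_complete" not in line:   # hypothetical success marker
          continue
      user_agent = line.rstrip("\n").split("\t")[-1]   # hypothetical UA field
      counts[user_agent] += 1

  for ua, n in counts.most_common():
      print(f"{n}\t{ua}")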

Dropping User-Agent without adding something to allow that kind of analysis, such as Agent-Capabilities, would be very annoying.


I'd be happy for the User-Agent string to die (or be frozen in amber), but I want the ability to have a rough sense of the capabilities of a browser in order to determine what JS I should send down to them. Like, if you're on IE11 I can send down code that's compiled to ES5 and has all the polyfills you need, and if you're on almost any other browser I just need some signal that you've got a reasonably up to date version and I can send down a much smaller and more optimal JS file.

Maybe runtime browser-side feature detection can be fast enough. I haven't benchmarked.

The other tricky thing that User-Agent strings are good for is working around browser bugs. Every once in a while a browser will ship with a totally broken language feature that's essentially impossible to detect at runtime.

A particularly bad example was from an old version of Edge, which had a JIT bug where class constructors could – in very particular circumstances – return the class object itself rather than an instance of the class. In that situation, if you want your site to work you have to detect that version of Edge and treat it like IE: send it the bigger, more compatible blob rather than code for modern browsers.



This got me thinking about how my stuff uses User-Agent. I'm not in a datahoovering biz. Still, we've always believed, rightly, that we need User-Agent to help us troubleshoot users' problems. Mostly, the questions we asked of it were "Internet Explorer or not? If so, what version?"

Let's throw down this challenge to the browser companies: MAKE YOUR STUFF SOLID ENOUGH AND STANDARD ENOUGH THAT WE DON'T NEED User-Agent.


The Cloudflare TRR endpoint is mapped to a subdomain that indicates the user agent, i.e., the browser, was distributed by Mozilla: mozilla.cloudflare-dns.com. The IP address for the subdomain is the same as for cloudflare-dns.com. Would this subdomain be leaked in SNI?^1 Presumably it serves to enable someone, maybe Cloudflare, to track queries as coming from Firefox.

Will the endpoints for other TRRs have special subdomains?

Perhaps another "privacy enhancement" would be to avoid using a browser-specific subdomain.

1. ESNI is still experimental and if I am not mistaken it is disabled by default in Firefox.


That's still better than sending the entire UA, though, which includes OS and browser version numbers.


Encrypted SNI is still experimental and disabled by default in Firefox, but it requires Cloudflare to work.


Just curious, would your employers let you use DoH? I was told to stop using it.


Employers are typically not afraid of the protocol, but afraid of not being able to control it. Companies can run their own filtered DoH just like they can with DNS.


> Instead of setting it to an arbitrary string it would be great if the UA header was not set at all.

Didn't someone post his experience getting more expensive flights because he used a Mac?

Yes! Please, take UA out.


Are we sure this wasn't a session cookie that just happened to pick experiment A or experiment B?

The user agent isn't very important. What we should start being worried about is that ISPs are now offering API access that maps your IP address to your phone number and demographic information. (Verizon does this right now, but you can opt out in your account settings.)

Maybe all those VPN providers have a point. Delete the user agent, share an IP with thousands of people, and maybe you can browse in peace.


Then you get stuck in ReCaptcha hell because you are not "human".


The sites still get your UA. This was about your DNS server getting your UA.


Destination websites/servers can still require that you have a UA. DoH moving away from requiring a UA wouldn't change that.


Then there's this: https://isc.sans.edu/forums/diary/Is+it+Possible+to+Identify...

Detection based on payload size alone.

Any implementations with a random padding ballast header?


It's all about money encapsulated by the "security" marketing buzzword. Yes, DoH is an improvement over old plaintext UDP DNS. But why should I use a nobody service over Cloudflare or Google, which update their databases within seconds all over the world and rarely if ever face downtime?


> It's all about money encapsulated by the "security" marketing buzzword. Yes, DoH is an improvement over old plaintext UDP DNS. But why should I use a nobody service over Cloudflare or Google, which update their databases within seconds all over the world and rarely if ever face downtime?

It is not so straightforward as always being an improvement. For example, a user can be uniquely identified through the session resumption session ID, whereas with plaintext UDP behind a firewall with many others doing lookups, unique identification of a user can be close to impossible.


Yes, that's one more reason to trust Cloudflare and Google over a nobody service, since their motives with their free DNS services (including DoH and DNS over TLS) relate more to having up-to-date synchronization, load balancing, etc. with the servers they host themselves. Spying on users via DNS queries by Cloudflare or Google is kind of a joke that can bait only the ignorant. Cloudflare terminates TLS connections and even issues TLS certs and can "theoretically" see all the sensitive data that could compromise both businesses and users, and yet they are trusted by countless public companies and startups alike. The same goes, if not more so, for Google with their own services or their GCP business.


What if there's a bug in Mozilla's implementation at some point and DoH servers have to return a slightly different response for certain versions of Firefox? How will they achieve that?


That seems less like a fix than a band-aid applied at the wrong level of the stack. It's perfectly normal for websites to serve slightly different pages to different browser versions, but I would submit that this is not a desirable property for a system to have but rather a symptom of our inability to properly standardize & implement something as complex as a browser.


Then Mozilla needs to fix their implementation. If the client is buggy, you don't fix the bug in the server.


Welcome to the real world.



