Stop Firefox leaking data about you (github.com/amq)
180 points by amq on June 25, 2015 | 91 comments



Seems like lots of FUD; how do Firefox Hello, Pocket and Geolocation "leak data about you" if you don't explicitly use them? How do DRM and Reader mode leak data at all?

Also, Safe Browsing, DRM, Search suggestions, Telemetry and Health report can be disabled in the preferences UI. Don't need sensationalist about:config protips for that.


Also, from the description of the Safe Browsing feature (as linked on the above page), it seems that it doesn't actually send (and thus leak) URLs; rather, it downloads a blacklist from Google periodically (~30min), and checks URLs against it locally... https://support.mozilla.org/en-US/kb/how-does-phishing-and-m...

(Though, for file downloads, some meta information seems to be sent if I'm reading correctly.)


I've worked with safebrowsing (v2) data/api before. It downloads and maintains lists of hash prefixes, which it checks locally against hashes of various versions of the url. When it gets a match, it downloads a chunk of all full hashes that start with that prefix, to see if the full url/variant hash is one of them. Usually that list is very short, just one or two full hashes for a particular prefix.
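
A rough sketch of the flow described above, in Python. The details are assumptions for illustration, not taken from the comment: SHA-256 hashes, a 4-byte prefix, and the hypothetical names `check_url`, `fetch_full_hashes` and `fake_server` (the last one stands in for the network request). A real client also canonicalizes the URL and hashes several host/path variants.

```python
import hashlib

PREFIX_LEN = 4  # assumed prefix length in bytes


def full_hash(url_expr: str) -> bytes:
    """Hash one (already canonicalized) URL expression."""
    return hashlib.sha256(url_expr.encode("utf-8")).digest()


def check_url(url_expr, local_prefixes, fetch_full_hashes):
    """Return True if url_expr is on the blacklist.

    local_prefixes: set of hash prefixes kept locally.
    fetch_full_hashes: hypothetical callable standing in for the
        network request; it only ever receives the prefix.
    """
    digest = full_hash(url_expr)
    prefix = digest[:PREFIX_LEN]
    if prefix not in local_prefixes:
        return False  # common case: nothing leaves the machine
    # Prefix hit: fetch every full hash sharing this prefix,
    # then compare locally.
    return digest in fetch_full_hashes(prefix)


# Toy data: a one-entry blacklist served by a fake "server".
bad = full_hash("evil.example/malware.html")
server_db = {bad[:PREFIX_LEN]: [bad]}
requests_seen = []  # what the server gets to observe


def fake_server(prefix):
    requests_seen.append(prefix)
    return server_db.get(prefix, [])


local = set(server_db)
print(check_url("evil.example/malware.html", local, fake_server))
print(check_url("good.example/", local, fake_server))
print(requests_seen)
```

In this sketch the server only sees prefixes, and only for URLs whose prefix is already on the local list.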


Note that it does indeed recheck against the remote copy on a blacklist hit.

So Google could indeed easily track a URL by adding it to the list served on the periodic check and then returning no match on the specific check.


I'm quite late with this, but this is incorrect. At no time is the URL sent to Google; in fact, at no time is even the hash of the full URL sent to Google. I'd suggest you re-read the safebrowsing protocol.

As ploxiln notes, if a hit is found with a matching prefix to the (canonicalized) URL, a request is made for all hashes of URLs beginning with that hashed prefix. The hash of the current URL can then be checked against that list locally.


I was oversimplifying, sorry. You are correct that the URL isn't ever explicitly sent.

However.

Generally there are only one or two URLs that start with the hash prefix that is explicitly sent to Google. Which means in practice it may as well be leaking the actual URL to Google.

Especially as there are multiple hashes per URL (5 worst-case?).

If Google wants to track a URL, they can do so.


I would prefer it if they stripped out Hello and Pocket. They don't need to be there and their inclusion makes me wonder what Mozilla's Firefox goals really are.


> I would prefer it if they stripped out Hello and Pocket. They don't need to be there and their inclusion makes me wonder what Mozilla's Firefox goals really are.

In the case of Hello, the goal is to provide a fully FOSS video chat client built entirely on top of HTML5 APIs. That's huge! Skype, Hangouts, Facetime, etc. are all proprietary. Firefox Hello doesn't even require anyone to sign up for an account!

With Firefox Hello, I could paste a link in this comment that any reader with a WebRTC-enabled browser could click on and start a video chat with me. They don't even need to be using Firefox - they could be using Chrome on the laptop, or even Firefox for Android[0].

People keep bringing up the Telefonica branding, but that's kind of a red herring. Basically, since not all devices have globally addressable IP addresses (yet), they can't have it be fully P2P (yet), and they need some server that can facilitate the initial connection. Telefonica sponsors these servers, so they get their name listed alongside Hello.

What would "stripping out" Firefox Hello provide? It's built entirely on top of HTML5 APIs (which is why it works in all browsers with WebRTC support), so it doesn't actually increase the browser surface area at all.

[0] Yes, I can use my phone's FOSS web browser to place a video chat using a FOSS web client. If that doesn't sound amazing, I don't know what is!


Yes. Hello sounds amazing, but it could have been an extension along with hundreds of other amazing extensions for Firefox. The point is there are extensions that I use and there are extensions that I do not use. There is no reason to stuff something down my throat, however amazing it sounds.


"They don't even need to be using Firefox"

Then it doesn't need to be an irremovable part of the binaries, right?

"What would "stripping out" Firefox Hello provide? It's built entirely on top of HTML5 APIs (which is why it works in all browsers with WebRTC support), so it doesn't actually increase the browser surface area at all."

The question to ask is not "why should we remove it" but "why should we include it". And if it is just "built entirely on top of HTML5 APIs" then why should it need its privileged position above that of an addon?


Mozilla is reliant on the money integrations bring. The search engine integration alone brings in millions.


That's the crux of this situation. Mozilla needs money. Unless they find another way to finance themselves, they will continue selling user data, directly or indirectly. This will not change however we cry out.


> Mozilla needs money. Unless they find another way to finance themselves,

Firefox Hello and Pocket are not attempts to "sell user data". In the case of the latter, Mozilla isn't even getting paid by Pocket, as they have said numerous times.

But yes, Mozilla is dependent on money, like all corporations. If you want to ensure that their funding sources are never in conflict with what users want, there's a very easy solution to that: https://sendto.mozilla.org/page/contribute/givenow-seq

(If every Firefox user gave $2, they wouldn't need their partnerships with Yahoo/Google for search integration, which has been their primary funding source for years).


> they will continue selling user data, directly or indirectly

Their main revenue comes from setting the default search engine. It's a bit of a stretch to say that is selling user data.


They are selling the information about what you are searching for to Yahoo, not directly but indirectly, by configuring their software in such a way that Yahoo can collect your data easily. That information is incredibly personal. This is the reason why Yahoo pays for it. There is no stretch at all in what I've said.


That's nonsensical.

You have to use some search engine. Whatever search engine you choose is going to get that incredibly personal information.

Mozilla sold the default choice position to Yahoo. Any user who considers Yahoo to be more nefarious than some other choice can switch with about 10 seconds of effort.


That it is possible to opt out does not change the fact that personal information is sold, indirectly.

You know what would be nonsensical? If Yahoo didn't collect data about you.

If Mozilla did not need to sell our data it could ask which provider we want to use or integrate technology like YaCy.


I do not follow your logic, at all.

Users use search engines. In fact it's pretty much a required feature to display a search bar proudly in the UI of a browser.

Users therefore give their search data to search engines. You can quibble about which corps are good corps and which corps are bad corps, but users cannot use search engines without giving search engines their search queries. Obviously.

Mozilla does no concomitant damage to users' privacy by allowing them to use their browser to use search engines. Mozilla, therefore, is not complicit in any wrongdoing which you ascribe to them.

If Mozilla made a deal with a manifestly worse option than the popular ones, measured either by results quality or by user abuse, then yes -- Mozilla would be reprehensible.

DDG is better, but it's not what users want.

> You know what would be nonsensical? If Yahoo didn't collect data about you.

Sure. Cool. That'd be neat.

> If Mozilla did not need to sell our data ...

Repeating that doesn't make it true. Mozilla does not sell your data. They sell placement of choice. We can agree that most users won't change the default choice, but we must also agree that almost no users will choose !google !yahoo !bing !ddg. In that order.

> ... it could ask which provider we want to use or integrate technology like YaCy.

The choice exists and is highly accessible. Are you suggesting a first-run dialog to ask the user to pick a search engine, a la Internet Explorer post-DOJ judgement? That's usability insanity.

YaCy doesn't even exist in Mozilla's user population's awareness. What's better? A good browser option or a dead browser?


Many of them can be found in Firefox Preferences, but providing a list of about:config entries seems more convenient to me than describing how to disable things in menus, which come in different languages and change over time.


> Seems like lots of FUD; how do Firefox Hello, Pocket and Geolocation "leak data about you" if you don't explicitly use them? How do DRM and Reader mode leak data at all?

You really don't expect much from your browser at this point any more, do you?

I want a browser that connects only to the website I asked it to display and let me configure how any (ANY) third party connection will be handled.

I shouldn't have to use lynx for that.


You can just use uMatrix or Policeman.


The point is, we just used Firefox for this before. Now we're becoming a niche market. That change should be terrifying to more than just me.


Another thing worth noting is that if you are using Debian-rebranded Firefox (Iceweasel), you have a very unique user agent that is easy to track.

There is a bug opened (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=748897), but as far as I know, no simple solution exists yet. You can change the user agent with an extension to keep it identical with the most popular Firefox version, but then you have to manually keep it up-to-date.


If you don't want to manually edit the user agent, oscpu and platform in "about:config", then blender can do it for you: https://addons.mozilla.org/it/firefox/addon/blender-1/

EDIT: it seems that this extension needs to be updated, sorry...

But I don't know, if you allow javascript maybe you will leak your real user agent?

I like to switch between iceweasel and seamonkey, and I'm "proud" of my different and unique user agent :) Speaking of privacy, actually I'm more concerned about referer, third-party sites, content delivery networks and so on.


> But I don't know, if you allow javascript maybe you will leak your real user agent?

The user-agent provided to JavaScript is the same one sent via HTTP.


Good to know, thank you!


Seems like a bad idea turning off phishing notifications and browser warnings

http://kb.mozillazine.org/Browser.safebrowsing.enabled Firefox 2.0 incorporates the Google Safe Browsing extension in its own Phishing Protection feature to detect and warn users of phishy web sites.


I think using a blocker extension like uBlock Origin restores (at least partially) this functionality without involving Google.


Right, but then you're just changing who you leak data to, you're not stopping the leak.

EDIT: wow, downvotes? Getting a list from EasyList is just as much a leak as getting a list from Google. Someone has your IP either way.


> Getting a list from EasyList is just as much a leak as getting a list from Google. Someone has your IP either way.

The problem with SafeBrowsing isn't downloading the list, it's that it sends data back to Google if it finds a match. Malware lists with AdBlock plugins don't do this.


My understanding was that Google's malware list is a two-pronged approach:

  1: An all-in-one lump download of blacklists
  2: Optionally, "Enhanced" also sends hashed URLs to Google in case specific sub-pages aren't on the list, etc

EasyList, of course, only offers #1. Firefox, by default, uses both, which is less private, but seemingly still configurable to use only #1.

Citations:

http://www.google.com/tools/firefox/safebrowsing/faq.html

https://developers.google.com/safe-browsing/firefox3_privacy...

http://www.pclinuxos.com/forum/index.php?topic=124878.0


Yeah. I personally don't care if something is from Google but other people do. I'm quite surprised this stupid list that doesn't even explain what each setting actually does is on page 1.

(Edit: Oh, it seems it was updated in the meantime.)


It does send your IP address to Google, when you download the filter. And I can't tell if "Enhanced Protection" is enabled which actually sends your URLs to Google.



I couldn't find Browser.safebrowsing.remoteLookups in my about:config. Does it still apply?

upd: I am searching for it in the Firefox sources now. Just out of curiosity: there are 117512 files in 8610 directories totaling 743 MB, and searching in files is really slow even on an SSD.


browser.safebrowsing.remoteLookups works indeed, I could confirm this using Fiddler. For some reason, Notepad++ couldn't find a mention of 'remoteLookups' in sources.


    > time egrep -Rni remotelookup . 
    ./toolkit/devtools/gcli/source/lib/gcli/types/selection.js:81:    spec.remoteLookup = (typeof this.lookup === 'function');
    ./toolkit/devtools/gcli/source/lib/gcli/types/selection.js:128:  if (this.remoteLookup) {
    egrep -Rni remotelookup .  2.01s user 0.32s system 99% cpu 2.346 total


A key that's not in the config file uses the defaults. You can just add the key by going into about:config , right clicking anywhere in the list, and choosing New, and in this case, boolean, and set it to false.
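
If you would rather not click through about:config, the same override can also go into a `user.js` file in the profile directory, which Firefox reads at startup. A small Python sketch that formats such a line; the helper name `user_pref_line` is made up for this example, and locating your profile directory and writing the file is left out.

```python
import json


def user_pref_line(name, value):
    """Format one preference as a user.js line."""
    # json.dumps renders Python False as false and quotes strings
    # with double quotes, which matches the user.js syntax.
    return f"user_pref({json.dumps(name)}, {json.dumps(value)});"


prefs = {"browser.safebrowsing.remoteLookups": False}
for name, value in prefs.items():
    print(user_pref_line(name, value))
# prints: user_pref("browser.safebrowsing.remoteLookups", false);
```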


Mozilla has an annoying pattern of removing items from the user preferences to "avoid user confusion", an excuse companies often use when deceiving customers. (Example: Microsoft dropping the "RT" designation. [1]) "Accept/reject third-party cookies", for example, doesn't always appear in the preferences any more.

Mozilla's new "social" features don't have a turn-off option in the Preferences. You can disable them by going to "about:config", creating the tag "social.enabled" (it doesn't even exist by default) and setting it to False. Mozilla provides no easy way to do that. This add-on takes care of those convenient little omissions.

Obviously, Mozilla is doing all this to tie users to their mothership and make it harder for them to leave. It's not like users were crying out for "Pocket" integration in the browser.

[1] http://www.winbeta.org/news/surface-2-no-longer-has-rt-brand...


Mozilla is "obviously" trying to lock in their users with these tactics? There are Pocket add-ons for every browser and OS. How does Pocket integration represent lock-in? You can certainly argue that these additions are bad for users, but I don't see anything in that list that represents an attempt at lock-in.


Yeah, Google never does this with their interfaces (/sarcasm)


OP said nothing about Google. What Google does isn't necessarily relevant to what Mozilla does.


Don't forget about this;

"media.peerconnection.enabled = false" WebRTC leaks IP when you use TOR/VPN, test it with ipleak.net

"beacon.enabled = false" Blocks https://w3c.github.io/beacon/ analytics.

Also recommend using plugins; uBlock, NoScript if you use VPN.


> "media.peerconnection.enabled = false" WebRTC leaks IP when you use TOR

Tor Browser disables this by default and if you're using Tor over standard Firefox you already have far bigger problems.


How can WebGL leak your IP? Did you confuse it with WebRTC?


Yes sorry ^_^ I meant WebRTC :) Edited!


> "beacon.enabled = false" Blocks https://w3c.github.io/beacon/ analytics.

Do you also disable <img>? Beacon is a performance optimization but it doesn't offer anything more than site owners could already see.


Just tried it on ipleak.net, it seems WebRTC only leaked a local IP address, so if you're behind a router, it's almost meaningless… Until there's an attack involving a device on your local network.


That is really disappointing. Particularly for firefox with their supposed focus on privacy, to enable a service that leaks ip when I run a vpn, contradicting the plain intention of the user, shocks me.


> That is really disappointing. Particularly for firefox with their supposed focus on privacy, to enable a service that leaks ip when I run a vpn, contradicting the plain intention of the user, shocks me.

Firefox doesn't necessarily know that you're using a VPN, so it's not really contradicting "plain intention". The problem is the way that WebRTC works; it has to have access to this information in order to function.

The real issue is that there should be a prompt guarding this information, the way there is when a website requests your location. Hopefully this can be fixed.

Also, for what it's worth, Firefox lets you disable this, but I don't think Chrome/Chromium let you disable it at all: https://productforums.google.com/forum/#!topic/chrome/gJ8HF-...


ff doesn't know I'm not using a vpn, so they shouldn't add features that leak ip addresses

and a setting buried in a pile of configuration is the same as it not existing for 99.99% of users


Please for the love of god do not disable the Google SafeBrowsing preferences. SafeBrowsing protects you from a lot of malicious websites, and does not leak much information to Google. For most people the security benefits of SafeBrowsing far outweigh the privacy concerns.

It is important to remember that malicious websites and malware in general may negatively impact your security and privacy in extremely harmful ways (malware compromises PII, website credentials, financial information, uses webcam and microphone to photograph/film/record you for blackmail/revenge porn purposes, ...)

For context, please see these relevant Mozilla bugs about SafeBrowsing privacy concerns: [0], [1]. tl;dr Firefox must set a cookie for SafeBrowsing, but it uses a separate cookie jar for SafeBrowsing so Google cannot tie the Safebrowsing activity to anything else you do related to Google or their services (which is the biggest concern here). They can learn a limited profile of your browsing activity, along the lines of "Random user x often uses their browser between 9am and 5pm on M-F".

The Safebrowsing implementation is specifically designed to be privacy-preserving. [2] It uses a Bloom filter to implement fast lookups in a minimally sized hash table of known malicious URLs. The only time a full URL (actually various hashes of multiple prefixes of the full URL, including the full URL) that you browse is sent to Google is when a prefix of it collides with a known malicious URL, in which case the URL must be sent to Google to resolve the question of whether the URL you are trying to visit is actually malicious or just a false positive from the Bloom filter. Yes, the hashes are unsalted so it would be possible for Google to check if you were trying to visit some pre-determined URL ("were they trying to visit www.thoughtcrime.org?") but only if it collided with a known malicious URL.
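
For readers who haven't met Bloom filters, here is a minimal Python sketch of the idea: a compact bit array that never gives a false negative but sometimes gives a false positive, which is why a local hit still needs confirming. Everything here (the sizes, the salted SHA-256 hashing, the names `BloomFilter` and `needs_remote_check`) is an assumption for illustration, not Firefox's or Google's actual implementation.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


known_bad = BloomFilter()
known_bad.add("evil.example/")


def needs_remote_check(url_expr: str) -> bool:
    """Only a (possibly false) positive triggers a request to the server."""
    return known_bad.might_contain(url_expr)


print(needs_remote_check("evil.example/"))
```

An item that was added always comes back positive; anything else usually comes back negative and never touches the network.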

It would be helpful to know what the average rates of collisions and false positives are to get a sense of how much of an average user's browsing history is leaked to Google through Safe Browsing - can anybody from Google comment?

[0]: https://bugzilla.mozilla.org/show_bug.cgi?id=368255 [1]: https://bugzilla.mozilla.org/show_bug.cgi?id=897516 [2]: https://code.google.com/p/google-safe-browsing/wiki/SafeBrow...


> Please for the love of god do not disable the Google SafeBrowsing preferences. SafeBrowsing protects you from a lot of malicious websites, and does not leak much information to Google. For most people the security benefits of SafeBrowsing far outweigh the privacy concerns.

I would never disable it for my mom, or any non technical friends. But I would hope the majority of HN users are pretty good at spotting, and steering clear of malicious websites.


They're designed to trick you, so I don't think any population, no matter how sophisticated, should trust themselves to correctly identify malicious websites 100% of the time.

Additionally, some sites may potentially contain exploits that run as soon as you visit the site (vulnerabilities in plugins like Java or Flash, drive-by downloads, etc.) in which case it doesn't matter if you correctly identify the website as malicious and hit the "Back" button - it's already too late. Much better to avoid loading the content at all, which is exactly what is achieved with SafeBrowsing.


> But I would hope the majority of HN users are pretty good at spotting, and steering clear of malicious websites

yeah, about that:

http://arstechnica.com/security/2013/02/web-forum-for-iphone...

http://arstechnica.com/security/2015/02/pwned-in-7-seconds-h...

etc etc


Safebrowsing protects you from more than tricky websites - it blocks sites and services that are known to serve malware. It doesn't matter how clever a user you are, if your browser (doesn't matter which one) navigates to a site that hosts malicious code that targets an unpatched vulnerability, you're hosed.


sure, you might not click on an advert or the such, but XSS isn't exactly visible, along with a whole host of other problems


In the history of the malicious site blacklist I've only ever tried visiting one or two sites that were flagged. For your average non-technical person sure, keep it enabled. For someone browsing HN and editing config values it's not a problem.


"separate cookie"? Just correlate it by IP address (or whatever) + time. While I'm not a database expert, I'm sure I could make something like this work:

    SELECT users.id             AS google_user_id,
           sb_hits.ip_addr      AS safebrowsing_update_ip_addr,
           sb_hits.request_time AS safebrowsing_update_time
      FROM all_page_hits                AS user_hits,
           all_page_hits                AS sb_hits,
           normal_google_accounts       AS users,
           safebrowsing_pseudo_accounts AS sb_users
     WHERE users.cookie = user_hits.cookie
       AND sb_users.cookie = sb_hits.cookie
       AND user_hits.ip_addr = sb_hits.ip_addr
       AND (sb_hits.request_time BETWEEN (user_hits.request_time - interval '1 hour')
                                     AND (user_hits.request_time + interval '1 hour'))

Any cookie at all betrays information (that's what it's for!), and once any sort of correlation is established, that "separate cookie" can be permanently tied to the real account(s).

The IP betrays information as well, but that's not a reason to make it even easier with a cookie.

"Random user x often uses their browser between 9am and 5pm on M-F"

That's exactly the important information that should be protected, to resist pattern-of-life analysis.

(apologies for any SQL errors; it's been a while since I did any serious db work)


I don't understand the skeptical scare quotes around "separate cookie". If you read the linked bugs, you would see that when SafeBrowsing was originally added to Firefox, it used the same cookie jar, which meant that SafeBrowsing requests included a cookie for safebrowsing.google.com (necessary for it to function) but also all cookies for *.google.com, which is clearly undesirable from a privacy perspective and has since been fixed.

If pattern-of-life analysis is a concern of yours, you should be using the Tor Browser and taking a whole host of other precautions. Fiddling with a bunch of prefs in about:config and using an ad blocker isn't going to cut it.

And again, it's not a zero-sum game. Safe Browsing provides some meaningful benefit in terms of protecting users from malicious websites, which on balance is probably worth the compromise to their privacy (which is comparatively minor and was minimized through careful and intentional engineering).

I agree that it's worthwhile to try to stop the trend towards increasing surveillance of Internet users using whatever techniques are available, but it's really at the core of the Internet's business model and some fundamental changes are necessary.


Sceptical scare quotes? I was quoting your previous post.


How exactly does reader.parse-on-load.enabled leak privacy? Isn't everything parsed locally?


I too am curious about this one.


While visiting Google every 30 minutes or so is a way of leaking, you aren't leaking much more than your IP and the fact that this IP is running Firefox.

Isn't reader an offline functionality?


https://en.wikipedia.org/wiki/Pattern-of-life_analysis

You're leaking when your computer is on/active, with fine granularity. Pulling a reasonable estimate of your daily schedule out of that data is easy.
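
To make "easy" concrete, here is a toy Python sketch on synthetic data. It assumes the observer has nothing but the timestamps of the periodic list updates; the function name `active_hours`, the threshold and the fake log are all made up for this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def active_hours(request_times, min_hits=2):
    """Return the hours of day in which update requests were seen."""
    per_hour = Counter(t.hour for t in request_times)
    return sorted(h for h, n in per_hour.items() if n >= min_hits)


# Synthetic log: one list update every 30 minutes from 09:00 to 16:30,
# Monday to Friday of a single week.
start = datetime(2015, 6, 22, 9, 0)  # a Monday
log = [start + timedelta(days=d, minutes=30 * i)
       for d in range(5) for i in range(16)]

print(active_hours(log))
# prints: [9, 10, 11, 12, 13, 14, 15, 16]
```

A one-line histogram over request times already recovers the working day; weekdays versus weekends would fall out the same way.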


You're missing that any blacklist hits are rechecked against Google.

So it leaks a heartbeat, and any hits against the blacklist.


Is that new? Back when they started implementing it they were saying that they did it as privately as possible, so that Google couldn't track your browsing. It seems unnecessary. You don't get hits very often, so they could just redownload the entire list when there is a hit if they feel like double checking.


Don't forget about WebRTC: https://github.com/diafygi/webrtc-ips

If you have WebRTC enabled, any website can determine both your local IP address (e.g. 192.168.1.1) and your globally-addressable IP address. The combination of these is essentially unique, and can even be better than cookie tracking or browser fingerprinting.

It's possible to disable WebRTC in Firefox, but AFAIK not in Chrome/Chromium[0].

As for Firefox Hello and Pocket integration, you can turn these off if you want, but I'm 99% certain that they don't actually send any data about you unless you actually use them.

[0] https://productforums.google.com/forum/#!topic/chrome/gJ8HF-...


You're really not going to like IPv6 are you?


Recommends turning on Firefox's built-in tracking protection[0] (which matured in Firefox 37 or so), but has anyone compared this to uBlock? I guess the first thing to measure would be number of trackers blocked, but then of course memory and CPU usage would be interesting as well. uBlock has done this comparison[1] against AdBlock Plus, Disconnect, etc, so it would be very interesting...

[0] https://support.mozilla.org/en-US/kb/tracking-protection-fir...

[1] https://github.com/gorhill/uBlock/#performance


Important changes:

- Reader mode is confirmed not leaking data. No need to disable it.

- There is a way to stop leaking the browser history to Google while keeping Safe Browsing.

* both tested using Fiddler



I guess there is a similar howto on various opt-out settings in Google account itself?

https://history.google.com/history/ and https://plus.google.com/settings/endorsements etc.?


I don't know, but the DRM stuff is actually cool with me. I guess you can't convince the lawyers of nearly all media to turn off DRM for a few decades to come. But I still want to use things like Netflix. With the new DRM stuff you can at least have it running on Linux instead of a Windows system. Step by step in the right direction, I'd say.


Maybe I'll update my Firefox configuration: https://bitbucket.org/snippets/cedricbonhomme/cbj6/firefox-c...


It would be awesome to turn it into an extension that makes it a single toggle.


I wouldn't use it since it:

- Disables the malware and phishing lists/warnings.

- Disables HTML5-video DRM (even if this has nothing to do with leaking data).

- Geo location already brings up a popup (and thus disabling it seems petty).


I never saw a warning for malware and phishing in the browser. I do receive phishing emails obviously, but they are easy to spot. I read mail with Thunderbird and sometimes I check the source just to see which strange URLs they're using this time :-) If you read mail in the browser maybe disabling those checks is not wise. Agreed.

Disabling HTML5 video DRM could be annoying and geolocation is solved by the popup, as you say.


> - Disables HTML5-video DRM (even if this has nothing to do with leaking data).

I would rather live in a browser without video support, than with DRM


Mozilla provides EME-free builds of Firefox: https://news.ycombinator.com/item?id=9534096


So we need a plugin where you can individually toggle each and every thing. (Just kidding)


Configuration Mania is the name of it. It is missing a couple of things, but most of those mentioned here are available as options.


Aside from the things you listed, it's ok to disable the other things?


Maybe. But several of the other things can be disabled via the normal settings UI, so disabling them via about:config seems unnecessarily indirect.


It could do with some updates (e.g. Pocket is missing), but most of these are set automatically (and can be toggled on/off en masse) by TinFoil:

https://addons.mozilla.org/en-US/firefox/addon/tinfoil/

https://github.com/cohjam/tinfoil


So how about a checkbox for each setting?


127.0.0.1 www.google-analytics.com

127.0.0.1 www.hosted-pixel.com

The political candidates are the worst.


Wow. What the fuck, Mozilla? Here I was, really hopeful that you were actually serious about honoring user desire for privacy.


Did you even check out the link? There is nothing sensational or exceptional about collecting/sending basic user data when using certain features, most of which can be easily disabled/not used.


"Easily disabled" for example like tweaking about:config? Yeah, that's super easy and accessible for the average user!


The fact that he only lists the about:config way doesn't mean that this is the only way to do it. Many of these options have GUI prefs.



