Facebook JavaScript SDK is often illegal under GDPR (markssoftware.com)
241 points by markarichards on June 23, 2018 | 92 comments



CVS Pharmacy definitely includes the facebook scripts and hooks into every damn button you click on.

Not sure what is "illegal" about the scripts themselves. I would, however, tend to suggest that the sites using these scripts may be using them in ways that are illegal (as in HIPAA for instance in the US). Under HIPAA the violator would NOT be Facebook, because they didn't install the script on other companies' sensitive sites, nor are they aware of such usages, and they didn't sign BAAs with them. The ones that would be doing something illegal are the ones that sign BAAs or otherwise are directly responsible for keeping health information secure.


This article isn’t about HIPAA, it’s about GDPR. If CVS does not have stores in the EU, then they’re most likely in the clear. Under Recital 23 [1], in order for the GDPR to apply to any site, the site must “envisage” serving EU customers. It would be a stretch to argue that a website designed to help customers of a chain of stores that has no presence in the EU does in fact “envisage” serving EU customers.

[1] http://www.privacy-regulation.eu/en/recital-23-GDPR.htm


It's not just about GDPR. It's about any regulated environment that requires systems and controls in place to maintain information security.


I don't know if you read it or not, but the article focuses mainly on GDPR. I’m apparently not the only one with this impression. The poster of this story on HN actually titled it “Facebook JavaScript SDK is often illegal under GDPR”.

edit: Loving the downvotes on every comment I make regardless of content guys, keep them coming! You have about 13,000 to go before I get to 0, and you've only taken about 60 this week so far. At that rate it's going to take you a while, but I know you'll get there!


Regarding the downvotes, it may have to do with the fact that the person you're accusing of not having read the article happens to be the author of said article. I think they know what's in the article and are a bit of an authority on its intent.

If you're getting a lot of signals that you're wrong, it's often worthwhile to stop and consider why, rather than dig a deeper hole.


Yes, I've added a focus on GDPR, but the article does state that this would often affect regulated contexts and I didn't intend the wording of that to be specific to the EU.

Finance, healthcare, etc. are all likely to include requirements for information security controls that go as far as demanding access controls and audit logs, which appear completely impossible to achieve with what Facebook offer with their SDK. As an example, ICH GCP applies internationally for pharma. However, these are much smaller areas of regulation that most of us are not exposed to.

Whether all countries have more general provisions, like GDPR, that would apply to a very wide audience of businesses I don't know: but I know GDPR is now recognisable internationally and has parallels in many countries outside of the EU, so is hopefully a trigger for those in other areas to check what their privacy laws demand.


The term "illegal" with respect to loading the Facebook JavaScript SDK these days is mostly a GDPR-invented issue. Since I doubt that Facebook is actually stealing data from the web pages its libraries reside on, it certainly isn't a HIPAA issue, and in the case of CVS, they aren't likely subject to GDPR at all (the comment I initially responded to was regarding CVS).

Your general sentiment in your comment here is correct: loading any third party JavaScript library may be problematic for some sites depending on what jurisdiction(s) they do business in and what those libraries actually do. Merely having the capability to take data from a given page wouldn't be enough to trigger violations of many privacy laws (at least in the US) - the library would actually have to be doing it. But the focus of the article is GDPR and Facebook JavaScript SDKs, and that is the context in which my comments should be viewed.


> Since I doubt that Facebook is actually stealing data from the web pages its libraries reside on, it certainly isn't a HIPAA issue

"Stealing" is a loaded term here. I've observed myself that at least some pages with the FB SDK installed send a tracking HTTP request with every page click - FB appears to be hooking into some very high level page event. Depending on what content is on the page, I could see that being a HIPAA violation: even if they aren't deliberately doing it they could well be logging confidential data.


> Depending on what content is on the page, I could see that being a HIPAA violation: even if they aren't deliberately doing it they could well be logging confidential data.

There are probably some instances where this is true - for example on a poorly designed site that includes confidential health data in URL query string parameters (e.g. "hasHIV=true" or similar). But if the site is designed in that way, they have much bigger problems than the security risks imposed by Facebook SDKs. Facebook JavaScript SDKs do not scrape and store page content as far as I've ever been able to tell, so it would take egregiously bad design to turn their use into a HIPAA violation.
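As a rough illustration of the query-string point above, here is a minimal sketch of how a site might strip query parameters and fragments from a URL before that URL is reported to any third-party endpoint. The function name `redactUrlForAnalytics` and the example URL are hypothetical, and this only covers the URL itself; it says nothing about what a loaded script could read from the page.

```javascript
// Hypothetical helper (not part of any SDK): drop the query string and
// fragment so values like "hasHIV=true" never leave the site in a URL.
function redactUrlForAnalytics(rawUrl) {
  const url = new URL(rawUrl);
  url.search = ""; // removes "?hasHIV=true" and any other parameters
  url.hash = "";   // removes "#..." fragments as well
  return url.toString();
}

// Example with a made-up URL:
console.log(redactUrlForAnalytics("https://pharmacy.example/results?hasHIV=true#top"));
```

A better design is not to put sensitive values in URLs at all; redaction like this is only a fallback.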


First, the rule is to not complain about downvotes. Particularly challenging folks to downvote you to oblivion.

Second, while the title mentions GDPR, it also mentions banking; this cannot mean providing an advertising company with unaudited, uncontrolled access to do whatever it likes. You seem to be excluding these cases as you repeat your argument about GDPR.

Likely you are being downvoted as your posts aren't fitting with the guidelines. Oh, and your comment "I don't know if you read it or not" is a reply to a comment made by the author. Please stop.


Once again, the article briefly mentions other privacy legislation but heavily focuses on GDPR.


Read the post and saw Facebook like button at the bottom. Was pretty amused


I find it amusing too.

I don't hold user data or regulated data... so I'm hopefully one of the cases that isn't illegal, but if I'm wrong then please let me know a worthwhile wordpress.com -> static site tool. With a baby abusing my free time, new hosting has been a low priority.

Update: I don't like that Facebook gets told what you read on my site, but I'm not sure it indicates much to them, maybe they'll sack Facebook employees who read this? Let me know.


Jekyll has a self-hosted wordpress [1] and a wordpress.com [2] import tool. Jekyll works on GitHub pages and GitLab pages, the GitHub option requiring very little effort to use. Posts are easy to write in Markdown, which is used in a basic form here on HN, and themes are readily available and easy to make.

[1]: https://import.jekyllrb.com/docs/wordpress/

[2]: https://import.jekyllrb.com/docs/wordpressdotcom/


If you don't mind losing the like counter, you can just switch to a static link for the like button. Here's a WP plugin that does essentially that:

https://wordpress.org/plugins/optimized-sharing/

(I haven't used that plugin, but it's similar to what I normally do in WP themes manually)
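For anyone wondering what a "static link" might look like in practice, here is a minimal sketch. It assumes Facebook's `sharer.php` share endpoint with a `u` parameter, which is a share link rather than a true like button, and the function names are made up for illustration. Nothing is loaded from Facebook until the visitor actually clicks.

```javascript
// Build a plain share URL; no third-party script is loaded on the page.
// Assumption: the sharer.php endpoint and its "u" parameter.
function buildStaticShareLink(pageUrl) {
  return "https://www.facebook.com/sharer/sharer.php?u=" + encodeURIComponent(pageUrl);
}

// Wrap the URL in an ordinary anchor tag (hypothetical markup).
function buildShareAnchor(pageUrl) {
  const href = buildStaticShareLink(pageUrl);
  return `<a href="${href}" target="_blank" rel="noopener noreferrer">Share on Facebook</a>`;
}

console.log(buildShareAnchor("https://example.com/post?id=1"));
```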


Among the interesting choices, this was posted this week I think https://blot.im/ (not self hosted)

I personally like Ghost, although it is not without its critics. Jekyll is great too.


I don't think anything is being loaded from Facebook there. When I check the network tab and umatrix, there are no network requests to Facebook.


Yeah I think this is a feature of Jetpack. Jetpack has the resources to check all the analytics on a URL in the background and then just the necessary data is brought over to the page.


Hmm no there is a request to graph.facebook.com so I don't think that assessment is accurate.


   If a website loads third party JavaScript into a page using a <script> tag then by default it loads with a security context of same-origin – this means that it can often do whatever JavaScript hosted from the website’s server can do, so likely:

    Read any content on the page it is loaded into
        Read your user details and often session cookies
    Modify (add/change/remove) any content on the page
        Add a username and password field, capture the values

I always* wondered why there aren't more data breaches out there. Most websites have trackers and shady scripts that can do a lot of harm... Even on bank websites or payment pages!

Thing is, I don't see why technically it's the company providing the website's fault. They are sending a webpage, and it's the user's browser that is sending its own data to facebook.com / google / twitter / metrics scripts / shady stuff... What would be illegal would be for the company to make a direct connection from their servers with your data.

* i.e. since I learned web development
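To make the quoted list above concrete, here is a minimal sketch of what a script running in the page's own context can reach. The `fakeDocument` object is a stand-in so the example runs outside a browser; in a real page, any script included with a <script> tag receives the real `document`. All names here are hypothetical, and cookies marked HttpOnly would not be visible this way.

```javascript
// What any same-context script can collect, given the page's document.
function collectWhatAScriptCanSee(doc) {
  return {
    cookies: doc.cookie, // only cookies not marked HttpOnly
    pageText: doc.body.textContent,
    formValues: Array.from(doc.querySelectorAll("input")).map((el) => ({
      name: el.name,
      value: el.value,
    })),
  };
}

// Stand-in for the browser's `document`, for illustration only.
const fakeDocument = {
  cookie: "session=abc123",
  body: { textContent: "Account balance: 1,234.56" },
  querySelectorAll: (selector) =>
    selector === "input" ? [{ name: "password", value: "hunter2" }] : [],
};

console.log(collectWhatAScriptCanSee(fakeDocument));
```

Whether a given third-party script actually does this is a separate question; the sketch only shows that nothing in the default setup prevents it.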


>I always wondered why there aren't more data breaches out there. Most websites have trackers and shady scripts that can do a lot of harm... Even on bank websites or payment pages!

They do, constantly. You just only hear about the massive ones at public companies. That's why we have GDPR now. The web has become a complete utter nightmare in terms of security. Users have absolutely no idea how critically dangerous it is to plop a third party CDN script into their pages.


You mean web developers, not users? I think devs and users don't feel concerned enough, and that's a shame. I am not for GDPR though; I think users should educate themselves and try to get to know which browser + extensions fit their privacy / security needs. We also need more benchmarking / consumer information so that we can select the best websites; competition will do the rest. It seems to be a niche market as of now.


I say “users” because most actual developers know better at this point. The real problem has come from the innumerate people using CMS systems that think nothing of dumping a script tag on the page they copied/pasted from some random plugin provider.


I always educate myself on any technical subject instead of relying on democratically enacted laws. I educate myself on biochemistry instead of relying on The State to keep my food safe. If I have offspring I will educate myself on teaching techniques so I can choose the right private school instead of relying on publicly funded ones. This is all highly efficient.


While I think that there are fair critiques of this post[1], I can definitely empathize with the overwhelming sense of drowning in ignorance and the limited energy I have to defend against goods and services that entail hidden compromises I would not consent to were I properly informed.

[1] My most available example stemming from:

> ... relying on democratically enacted laws.

I find these often lack the required subtlety at best, or are precipitated by general ignorance at worst, and while they are much better than anarchy, they can cause significant harm in their own right.


I don't deny there are hidden compromises all over (I think that is a good way to put it) and we need to educate ourselves all the time. I can't imagine a more efficient way to handle all of it than fostering a political tradition that is inherently critical of concentrated and unchecked power, whether private or governmental, and having individuals of the tradition succeed in democratic government and adversarial journalism.

The idea is meant to imply a fractal society of checks: the minimum amount of radically skeptical and power-focused individuals and campaigns per issue and scope would be needed to keep powerful people and groups from being able to get away with abuse. We have some pieces of this in place today, more in the U.S. than many other countries.


Most laws supposed to protect you actually give you a false sense of security; they create business entry barriers, distort market incentives and increase legal risks / burden. The costs are very well known but the benefits are doubtful. Governments profit from your fears.


There are ways to make this efficient, but 1) people need to be concerned about the issues at stake and 2) a lot of online business models also need to be refreshed (-> chicken and egg issue). The best thing about privacy / security scandals and laws like GDPR is that they bring these issues to light and they become a topic of discussion.

Side note: state-backed safety laws and inspections may do more harm than good, and I will never send my children to schools; it would be the best way to make them stupid.


> Thing is, I don't see why technically it's the company providing the website 's fault.

If a bank wrote code on their website button that told your browser to send your account username and password to an evil person, technically the bank is at fault.


I don't think there is a single fortune 500 company that was not breached in the last 10 years.


The degree matters a lot. Many have never had a serious breach


I have very serious doubts about that.



That might be useful for the 0.00000001% of the population that read your post, but wouldn't it make sense that sites using the script make changes?


That's ~0.76 people.


I multiplied wrong because the percentage thing :)


Assuming every person (worldwide) read the post.


No, that's assuming only .76 of a person read the post.


Ublock?


That's a client-side solution, not a server-side. The article is focusing on the problem from server-side.


Does not work on iOS.


someonewhocares.org is easier to remember

you can install this on your router too. but no way to do this on locked down ios or android, because of those ad dollars.


pi-hole works with phones IIRC


in some cases (only the very same as putting that dns file in your run-of-the-mill dsl/cable modem anyway). namely when you are at home using wifi, using an app that doesn't do its own dns, etc.


This has always worried me. My company works a lot with healthcare organisations and as a developer often my first task is to add google analytics to a page. But of course, this is dangerous and in the case of healthcare, should be avoided. Google could, if it so chose, scrape the data of every user whenever they wanted to.


Google has a disclaimer about this as well. It's probably best to involve a lawyer to answer the questions of where on a website you can and cannot use Google Analytics, or any analytics software, when HIPAA is involved.

https://support.google.com/analytics/answer/6366371?hl=en#hi...


To state more clearly what you only implied, this is a potential HIPAA violation waiting to happen.


FWIW Google will sign BAAs with HIPAA-covered entities. Several of their services are popular in the industry, including Google Docs.

And really, most established players in tech have HIPAA-compliant offerings and go the BAA route. It's too lucrative a sector to pass up.


BAA is not an acronym I am familiar with and the internet assures me it is a sound a sheep makes. I would appreciate clarification.

Thanks.


Business Associate Agreement/Addendum.

Basically, when you are a covered entity -- someone who is directly required to comply with HIPAA because of what you do (for example, you're a doctor, or a pharmacy, or a health insurance company) -- any services or contractors/subcontractors you use that might end up handling protected health information as a result of what they do for you have to sign a BAA with you outlining what information they'll be receiving/handling and how they'll be handling it, along with any specific requirements you each have to fulfill as part of your relationship.

So, for example, if you are a company in the health care industry (so you're a HIPAA covered entity) and you want to use AWS for some things that involve protected health information, you need a BAA with Amazon (and Amazon will happily sign one and take your money).

Google will also sign a BAA with you to let you use their cloud services, Google Docs, etc. Microsoft will sign a BAA with you. Sentry will sign a BAA with you so you can use it for monitoring on your systems. It's extra work, but health care is a big enough market to be well worth the trouble for these companies.


Apparently BAA is Business Associate Agreement: https://healthitsecurity.com/features/what-is-a-hipaa-busine...


In the US, quite possibly. I'm surprised more hasn't been made of this before. Rules in the UK are fairly strict, although I'm unsure of how strictly enforced they are, as mass-market online healthcare services are fairly few and far between since the vast majority of healthcare is provided by one provider: the state.


A good number of websites put random third party JavaScript on pages where they shouldn't. My favorites are pages where I'm entering my payment details.

Some, upon closer look, even send my payment total and what I bought to GA as extra data with a tracking request. (when I cancel the payment)

Some of these tracking solutions even let you see what the user is seeing on the website in real time, including his/her mouse cursor, etc.


That would be the Enhanced Ecommerce functionality of GA[1].

It's supported by default in most ecommerce platforms, and is one of the tools that enables really sophisticated performance analytics, A/B testing, and remarketing if you really leverage it.

But in return you're giving Google incredibly detailed insight into your business model and performance. Which would be really concerning if you were in an industry Google decided to come after.

[1] https://support.google.com/analytics/answer/6014841?hl=en


Tangential: Does anybody know or have a reference about whether the opt-out-or-can't-even-opt-out tracking in Android, Windows 10 and possibly iOS is GDPR compliant? My reading is that it isn't, but I'm not well versed on the subject.


I'm not sure I can answer this, but it might help anyone who could if you know an example of an application that does the nature of tracking you are concerned with?

I should mention, that demanding tracking may well be okay in GDPR, in necessary contexts: for instance a banking service may have to do some natures of fraud prevention using tracking, perhaps of recent internet facing IP addresses used and may have a regulatory need to do something like this.

Also bear in mind that GDPR isn't the only law here. If you want to access data stored on a user's terminal (mobile device, laptop, etc), then you likely need consent too under ePrivacy: for example "Article 5" https://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX...


> I'm not sure I can answer this, but it might help anyone who could if you know an example of an application that does the nature of tracking you are concerned with?

Using any map application (especially google maps) does not actually need to store my location for posterity to give me useful service, and I did not opt-in but android still does; Perhaps it's because I haven't been back inside Europe in the last month or so (and when I do, I'll get a "please opt in" prompt).


There are days when I wish all JavaScript was illegal... Step 1: Go to media website with Firefox on my mobile phone. Step 2: Mobile phone hangs, gets hot, jerky scrolling, delayed scrolling, unprompted scrolling (as ads load and get inserted and reflow everything), combinations of all of these. Step 3: Give up and use Firefox Focus for the same g.d. site, and it just works.

Some sites won't load at all though if you block JavaScript. They've ruined the internet.


Sites that won't load on the internet without JavaScript don't deserve to be on the internet.

Edit: left out the important without javascript


This is an outdated and flawed mentality in my opinion. JavaScript has great potential to enhance UX if used selectively with care. There are plenty of ways to ruin a website; JavaScript is but one.

You could also use a defibrillator to beat someone to death but surely discouraging violence and not condemning the defibrillator is the answer (this is a crappy analogy I know).

And the appropriate use of JavaScript is being encouraged more and more. Browsers have increasingly useful features against "user hostile" sites. And then there's the old "vote with your wallet", except this time it's vote with your visits - if a site uses JavaScript or anything else you find unpleasant then... don't go to that site.

Leave JavaScript alone already!


That's funny, I had to do this for the first time this week.

But I went with https://developers.facebook.com/docs/facebook-login/manually...

I guess since I don't load any external js, this is fine, right ?
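For context, a manually built login flow of the kind linked above typically starts by redirecting the browser to a login dialog URL instead of loading the SDK. The sketch below assumes the `/{version}/dialog/oauth` endpoint with `client_id`, `redirect_uri` and `state` parameters; the app ID, version string and callback URL are placeholders, so check the linked documentation for the current details.

```javascript
// Build the login dialog URL without loading any external script.
// Assumption: endpoint path and parameter names as described in the lead-in.
function buildLoginDialogUrl({ appId, redirectUri, state, apiVersion }) {
  const url = new URL(`https://www.facebook.com/${apiVersion}/dialog/oauth`);
  url.searchParams.set("client_id", appId);
  url.searchParams.set("redirect_uri", redirectUri);
  url.searchParams.set("state", state); // random value to guard against CSRF
  return url.toString();
}

// Placeholder values for illustration only.
console.log(
  buildLoginDialogUrl({
    appId: "123",
    redirectUri: "https://example.com/auth/callback",
    state: "random-csrf-token",
    apiVersion: "v3.0",
  })
);
```

No Facebook code runs on your page with this approach, although the user's browser still talks to Facebook once they follow the redirect.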


your local files still send data to their servers.


If you want to respect your users, use self-hosted shariff.

https://github.com/heiseonline/shariff


As a user, I despise social buttons and never trust or use them. I'm very curious if I'm in the minority on that or not.

(And that's considering only UX, not privacy issues.)


Firefox + uMatrix should take care of this if I am not mistaken


you only forgot the last factor: + 5 years of webdevelopment experience.


I don't see how this has anything to do with Facebook specifically as any 3rd party JS script can do this. Clickbaity title.


It's an extremely common and well-known library. Why is that clickbait? It's also helpful to study one concrete example at times. We know what Facebook collects and how they use it, and that closes the loop on whether this is scary vs. merely interesting.


[flagged]


Agree, I think most web devs should already know this, I am just surprised it's not a popular topic of discussion.

Also, the common argument for this technique is that a CDN lets you cache content so that users don't have to re-download it every time. But I think it could also be done at the HTML level by specifying a hash value of the file that we want to include (I stopped doing web dev for a few years, so I don't know if that has been added to HTML?)


https://developer.mozilla.org/en-US/docs/Web/Security/Subres...

The property is used to ensure the CDN is serving legit content that has not been tampered with.
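As a minimal sketch of how that integrity value could be produced, the following uses Node's built-in `crypto` module to hash a file's contents and format the result as `<algorithm>-<base64 digest>`. The function names, the example script body and the CDN URL are hypothetical.

```javascript
const crypto = require("crypto");

// Compute a Subresource Integrity value for the given file contents.
function sriHash(content, algorithm = "sha384") {
  const digest = crypto.createHash(algorithm).update(content).digest("base64");
  return `${algorithm}-${digest}`;
}

// Produce a script tag that the browser will refuse to execute if the
// file served by the CDN does not match the hash.
function scriptTagWithIntegrity(src, content) {
  return `<script src="${src}" integrity="${sriHash(content)}" crossorigin="anonymous"></script>`;
}

// Hypothetical library contents and URL, for illustration only.
const librarySource = "console.log('hello from a pinned library version');";
console.log(scriptTagWithIntegrity("https://cdn.example/lib-1.2.3.min.js", librarySource));
```

This only works for files that never change, which is why it is hard to apply to SaaS-style scripts that the vendor updates at will.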


What about your internet provider and belongers. Also sorry but nothing is free in our world. Mostly if it involve servers.


Google fonts only requests CSS.


Yeah, but look at the URLs that download the fonts. They contain trackers. These are hash keys, exchanging crafted information about your browser and your current «position» on the Internet. This information alone is enough to track everything in a unique manner. Our browsers leak many things: the fonts installed, the screen size, the external and local IP first of all. But also, nothing is perfect in our world, including machines: the same tests on identical machines don't match exactly, and there are many possible tests. All this information makes you absolutely unique on the Internet, and a single call to a server is enough. And I am not talking about cookies here.


it can only request css. but most common implementation is js


The wizardry is server side. See the css. You won't get the same hash. When called one time + ip + client browser hash, it is enough. Next website visited with a google font, you are tracked.

    /* latin */
    @font-face {
      font-family: 'Tangerine';
      font-style: normal;
      font-weight: 400;
      src: local('Tangerine Regular'), local('Tangerine-Regular'), url(https://fonts.gstatic.com/s/tangerine/v9/IurY6Y5j_oScZZ784Ox...) format('woff2');
      unicode-range: U+0000-(...)
    }

https://fonts.googleapis.com/css?family=Tangerine


That's sneaky, I didn't know about it. Thanks for pointing it out.


So, ability to commit a crime is illegal? Did I miss something in this article?


Facebook who have the ability are not necessarily criminal.

It is the websites that invite them into a secure context that are often illegal.

In the physical realm, is it okay for an advertising company to be invited into a bank safe or customer records storage without any business controls to audit, monitor or check their actions? Same is true on websites.


It's maybe not okay, but legal for sure.


From what I can tell, GDPR did not have any impact at all. It was supposed to end tracking without explicit consent. But did even a single big website end their tracking? Not that I know.


Does the EU plan to actually enforce the GDPR?


We have to wait and see, both Google and Facebook were sued on day 1 [0].

There have been many implementations: from outright banning EU sites [1], to companies such as Medium who tell you their privacy policy has changed and ask you to accept, to companies who give you a modal window on how to change your security settings or just click "OK".

What would be interesting to see are these stats:

1) How many users get upset about non-compliance and complain about GDPR. Just how many do actually follow up with the ICO?

2) How many users who see non-compliance but just don't want to bother and move to another "compliant" site?

3) How many users just don't want to bother, want to consume the content and click "OK". In effect, GDPR turns into the "cookie-law" effect. Where users become blind to it.

Also, to follow up on 1. How many complaints to the ICO are actually dealt with and enforced?

I think for now, we are in a holding pattern. This needs to be tested in the courts first. Google and FB are going to be the front line. Whatever happens there, will affect how things move from there.

[0]: https://www.theguardian.com/technology/2018/may/25/facebook-...

[1]: https://www.standard.co.uk/news/uk/gdpr-compliance-us-websit...


Wait a year to see. It is next year it gets serious, by which time Brexit will have happened.

There was a collective bit of Y2K style madness about it, I do wonder how big most people's mailing lists are after sending out emails they need not have sent. The law was never aimed at regular businesses wanting to update their customers, it was aimed at the Facebooks of this world.


If you need to load external JS, you have failed as a web developer.

Just from a performance aspect: An additional DNS resolve, additional TCP handshake, additional TLS, just to deliver a .js file that you could have easily served from the original website.

Not to mention the security aspect.


The performance risks may not be so bad, if you're using a common CDN/library as it may be cached and save on download speeds: just hope most of your users don't have secure browsers, wiping caches regularly.

But there is still a problem with loading third party JS, even beyond the SaaS type that you expect to change regularly (where SRI+CORS becomes difficult-impossible to control), just loading Bootstrap has risks.

Although a good web developer hopefully knows how to add SRI, uses a validated third party library, will pick a fixed version and use a CDN with qualities as good as their hosting solution... they are rare to find and I doubt any data protection controller/officer responsible for GDPR should allow this risk.

The developer might: forget to add SRI; pick a CDN that allows tracking (read https://www.maxcdn.com/dpa/); pick a CDN registered or running servers outside of the EU.

So, the data protection officer therefore has to check the terms of service of the CDNs, audit them regularly and then ensure there is appropriate staff training and QA to put in SRI and validate it as needed.

Meanwhile, if the developer or data protection officer changes, there has to be enough documentation and process around to transition these practices to the next staff.. it all adds up.

Manpower is often more expensive than CPU, so chucking the JavaScript in static hosting the company has control of is likely less for the DPO to worry about.


People will argue that ajax.googleapis.com will be loaded anyways and that its jquery JS file will be cached but I'd argue that there are so many different JavaScript frameworks with so many different minor versions, that the caching aspect isn't that good. There are also at least a dozen different popular CDNs for this and everybody is using a different one.

The only thing it has going for it is the bandwidth savings for the original website.


How else would you suggest loading social media?




You don't.


The downvotes again confirm that this is indeed a correct assumption.


Huh, isn't this what Node and React are based on?



