>> Dropbox gave us access to project-folder-related data, which we aggregated and anonymized[...]
Wait, Dropbox gave away non-anonymized data to a third party and they then anonymized it. Wow, what could go wrong? Just thinking of the endless possibilities of where all that data is now... It's deeply troubling how much unwarranted trust there is when it comes to handling of personal data.
Dropbox put Condoleezza Rice on their board, who supports warrantless wiretaps [1].
I deleted my account when they did that. Not so much because it would have any direct effect, but because it’s clear that we have differing views on how user data should be treated.
I'm surprised that people are shocked by them treating user data like this; it’s absolutely in character.
Truth be told I've also been dissatisfied with the price of the Plus and Pro subscriptions, relative to what they provide, with their support and their direction, so was looking for motivation to move.
Given the very public failure of Netflix to "anonymize" data, they shouldn't even be giving away anonymized data without user permission, since anonymized data is not actually anonymous.
This is deeply troubling. As a scientist who uses* Dropbox I gave no informed consent. I know they claim personally identifiable information was removed but still I gave no consent for this.
I can't speak to how informed you were when you gave the consent, but if you are using the service, you provided it.
"Law & Order and the Public Interest. We may disclose your information to third parties if we determine that such disclosure is reasonably necessary to: (a) comply with any applicable law, regulation, legal process, or appropriate government request; (b) protect any person from death or serious bodily injury; (c) prevent fraud or abuse of Dropbox or our users; (d) protect Dropbox’s rights, property, safety, or interest; or (e) perform a task carried out in the public interest."
I would assume that this research fell under the "task carried out in the public interest" clause.
Isn’t public interest a criterion for getting IRB approval? Having read all the recent AoIR threads on ethics, this doesn’t seem outside of the accepted norms.
"The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes." [1]
I imagine the work was carried out by a processor, which could be perfectly legal if the contract between the two entities had adequate data protection clauses. This is just a guess though, I'm sure it's much more complex than that.
If they asked consent specifically for these types of studies, then it's legal.
As in: a form asking the user if their information can be used in this way and giving them the possibility of opting out. Adding one more clause to the privacy policy doesn't count.
Opting out is not OK under the GDPR. Only opt-in is allowed, or at least that's my reading.
GDPR Recital 32:
Consent should be given by a clear affirmative act establishing a freely given, specific, informed and unambiguous indication of the data subject's agreement to the processing of personal data relating to him or her, such as by a written statement, including by electronic means, or an oral statement. This could include ticking a box when visiting an internet website, choosing technical settings for information society services or another statement or conduct which clearly indicates in this context the data subject's acceptance of the proposed processing of his or her personal data. Silence, pre-ticked boxes or inactivity should not therefore constitute consent. Consent should cover all processing activities carried out for the same purpose or purposes. When the processing has multiple purposes, consent should be given for all of them. If the data subject's consent is to be given following a request by electronic means, the request must be clear, concise and not unnecessarily disruptive to the use of the service for which it is provided.
You are correct, however this is for when consent is relied upon as the legal basis for processing.
My guess is that they are using provision of service as the legal basis for processing, whilst relying upon the "public interest" clause in the ToS to justify the sub-processing by the third party.
That doesn't work, you need to have a legal basis for all processing. It's hard to argue that operating the service requires this sort of research, so you need another basis.
There are some public interest exceptions, but to my knowledge it's not established that stuff like this would work under them.
Yes, you are correct. I think it would be extremely difficult to justify that this kind of processing was necessary for the provision of service.
It seems to me that an organisation the size of Dropbox would have a fairly watertight justification. However, if the legal basis for processing is neither consent nor provision of service, then they must have done a pretty good job of obfuscating all PII (as the article says, "...we and Dropbox employees could view no personally identifiable information"). If this is the case then this sharing of information may not even be in scope of GDPR.
I'm not sure if the public interest exceptions would be a safe route to go down. The EU has made it clear that, like 'Legitimate Interest', the get-out-of-jail-free justification is going to be highly scrutinised.
EDIT: I have just seen that the article has been edited to say that the anonymisation and aggregation was carried out by Dropbox before being transferred to the third party, which kind of kills the discussion.
They could be relying on the "public interest" part of their TOS, in which case they'd possibly argue that this processing was necessary as part of the provision of service, and therefore wouldn't require any further consent from the user.
For the record: I'm not suggesting that what they did was ok, just trying to think about it from a GDPR perspective. Anonymising account information is great and all, but how can you be sure you've obfuscated all PII from information saved to file storage, unless you audit all that information - which in and of itself seems ropey from a data protection point of view.
So it looks as though the article has been edited to say the anonymisation and aggregation was carried out before being transferred to a third party, which craps on our discussion a bit.
However, to answer your question anyway - I don't believe you could justify the work as being in the public interest. I think it would be an extremely tenuous link and I think you'd be a fool to try and rely on something as flimsy as public interest if you're not a government body, or processing data on behalf of one.
I suppose I was taking a stab at understanding what their thinking was to see if anyone else could provide me with something which I had not considered.
I wonder how they got approval from the Northwestern Institutional Review Board. Not having explicit consent from research subjects might indicate that they qualified for some form of exempt status. Did they sell them that Dropbox collects that data as part of their normal operation, therefore consent is not required? Did they say that Dropbox's anonymization was enough to guarantee subjects' anonymity? Did they say that Dropbox's user agreement already enrolls users into research projects?
As a Dropbox paying customer and never having heard of the Northwestern Institutional Review Board it's not them that pisses me off. I haven't reconsidered my usage of Dropbox for a very long time, since I made the decision to stay with them after their no-password fiasco. Today is the first day in a long time.
Imagine you have a folder in your Dropbox with 237 subfolders, and each of those subfolders has a certain number of files in it. The largest folder has 1,132 files, for example, the second largest has 916, the third-largest has 771, etc.
Then imagine you have a second folder with 117 subfolders with another pattern like above.
Now imagine that the first folder structure matches a torrent of embarrassing pornography and the second appears to be a superset of a project published to GitHub under your name (i.e. with some directories being gitignored)
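To make the scenario above concrete, here is a rough sketch of how that kind of matching could work. Everything in it is hypothetical: the `known` catalogue, the names `overlap` and `best_match`, and the 0.9 threshold are made up for illustration, and nothing here is claimed to be what the researchers or Dropbox actually did. It compares per-subfolder file counts as a multiset, so extra (e.g. gitignored) directories on the Dropbox side don't hurt the match.

```python
from collections import Counter


def overlap(anon_counts, public_counts):
    """Fraction of the public structure's per-subfolder file counts that
    also appear in the anonymized folder (multiset intersection)."""
    anon, public = Counter(anon_counts), Counter(public_counts)
    shared = sum((anon & public).values())
    return shared / max(1, sum(public.values()))


def best_match(anon_counts, known, threshold=0.9):
    """Return (name, score) for the best-matching known structure,
    or (None, score) if nothing reaches the threshold.
    `known` maps a label to a list of per-subfolder file counts."""
    score, name = max((overlap(anon_counts, counts), label)
                      for label, counts in known.items())
    return (name, score) if score >= threshold else (None, score)
```

With a catalogue like `{"torrent_x": [1132, 916, 771]}`, an "anonymized" folder whose subfolders hold 916, 1132, 771 and 4 files matches `torrent_x` with a score of 1.0, even though no file or folder name was ever disclosed.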
I've stored non-anonymized data on Dropbox as part of my own research. IRB gave me permission to keep that data and my consent form explained it to participants. We were all working under the assumption this type of sharing by Dropbox was impossible. My school's IRB does not allow the use of Google Drive for non-anonymized data storage based on just this type of concern.
Maybe the parent was performing a research project on Dropbox collaboration techniques and got scooped?
But seriously, as an example, I know people that share sensitive personal information with their accountants at tax time using Dropbox. Would suck for any of that to be made available to any third parties.
Given it was from universities, then prior disclosure of IP for patent applications could be a "harm". Even the directory names and structures could be key information in some applications.
The harm is that 1) data which seems anonymized can be de-anonymized due to carelessness or advances in analytical techniques, and 2) it’s mine. I have laptops in my house for example that I’ve not used in years and will never use again. That doesn’t give you the right to steal them even if there’s no explicit “harm” to me.
> Dropbox gave us access to project-folder-related data, which we aggregated and anonymized, for all the scientists using its platform over the period from May 2015 to May 2017 — a group that represented 1,000 universities. This included information on a user’s total number of folders, folder structure, and shared folder access
This seems like heaven for industrial espionage purposes. Just because there's some anonymisation doesn't mean that the metadata is useless. I sincerely hope they get GDPR'd over this.
If it's complete "folder structure" tree data with names removed, that could potentially be matched against public repositories that contain the same folders, like posted Zip files, GitHub projects or university web space, to identify some of the users.
From there, you could potentially identify other tools those people are using or embarrassing folder structures (e.g., deep folder tree structures people used to use to primitively conceal porn and other secret files, or signature folder structures for embarrassing repositories like erotica archives, collections of extremist literature, or piracy tools).
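A minimal sketch of the structure matching described above, assuming folder trees are available as nested dicts (folder name -> subtree). The helper names `shape` and `build_index` are invented for this example. It uses a standard canonical encoding for unordered trees, so two trees get the same string exactly when they have the same shape, regardless of what the folders are called.

```python
def shape(tree):
    """Name-free canonical encoding of a folder tree given as nested dicts.
    Children are sorted so sibling order does not matter."""
    return "(" + "".join(sorted(shape(child) for child in tree.values())) + ")"


def build_index(public_trees):
    """Map each canonical shape to the public projects that have it.
    `public_trees` maps a project label to its nested-dict folder tree."""
    index = {}
    for label, tree in public_trees.items():
        index.setdefault(shape(tree), []).append(label)
    return index
```

Looking up `shape(anonymized_tree)` in such an index returns candidate public projects with an identical layout. Small or common layouts will collide a lot, so this only becomes identifying for large, unusual trees, which is exactly the case being worried about here.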
Hopefully they'll release more info and some sample files (like for themselves).
> It is only data from universities, not real for-profit businesses.
Plenty of universities work (often in collaboration with national laboratories and/or with corporations) on work that's far more important and critical than many "real for-profit businesses".
A large part of what universities do is research in consultation with businesses and especially governments. It's not just teaching students; in many universities, that's only a small fraction of what the staff do.
The lack of consent requested is ridiculous. There may be some sort of 'obscure paragraph in the terms and conditions that says Dropbox can do whatever they want' but this is horrible for privacy and business security. I'm glad I've been client-side encrypting my Dropbox files.
For the past two years I've been using a free open-source encryption app called Cryptomator (https://cryptomator.org/) for my Dropbox folder without problems. The only caveat is the mobile apps aren't free.
Another Dropbox encryption app is BoxCryptor, but I quit using them when they went subscription-only.
Boxcryptor used to be a one-time purchase to allow an unlimited number of cloud providers and devices.
However they decided they needed a consistent revenue stream so they renamed their software "Boxcryptor Classic", stopped updating it, and now users have to pay $48/year to get features previously available as a one-time fee. This was about 2 years ago, by now they've probably scrubbed all references to the "Classic" version on their website.
To be fair the subscription version does have new group/admin features for multiple users or businesses.
Dropbox sharing your data without consent? Nothing new. I've been using Dropbox Paper for a year now and only recently found out that by default all docs are shareable. That means that if you log into Dropbox -> open a Dropbox Paper doc -> log out, you're not safe. Anybody with your browsing history can re-access the document you've been working on even after you log out. So essentially, if you're using Dropbox Paper on a shared or public/library computer and log out, people will still have access to the docs you've worked on. The only way to turn this off is to manually click the 'invite' button and uncheck the share option for each and every document separately. I had some personal/sensitive info on a few docs and was shocked to learn of this. Completely unacceptable from Dropbox!
There are a _LOT_ of services that treat the URL as a secret. If you're leaving these URLs in your browser history on a public computer other people can access them. For example, the images served up by Google Photos, that have the form lh3.googleusercontent.com/[kilobytes of base64-encoded spew], can be accessed by anybody having the URL. So if you use a public computer to access these, even though you need to be authenticated to browse Google Photos, and despite the fact that you conscientiously logged out, anybody with access to the history can still look at your photos. This is not by any means the only such example.
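For anyone unfamiliar with the "URL as a secret" pattern, a toy sketch follows. The `share`/`fetch` functions, the in-memory `_docs` store and the `example.invalid` host are all made up for illustration; this is not how Dropbox Paper or Google Photos is implemented, just the general idea. The point is that `fetch` never checks who is asking: possession of the URL is the only credential.

```python
import secrets

_docs = {}  # token -> content; stands in for a real backing store


def share(content):
    """Store content under an unguessable token and return a link to it."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe base64
    _docs[token] = content
    return f"https://example.invalid/d/{token}"


def fetch(url):
    """Return the content for a link, or None if the token is unknown.
    Note there is no session or login check anywhere in this function."""
    token = url.rsplit("/", 1)[-1]
    return _docs.get(token)
```

Guessing a token is infeasible, but logging out changes nothing: whoever finds the link in a browser history can call `fetch` just as well as the owner.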
I love Dropbox, but this kind of data-gathering without explicit permission is bananas. What I'd really love to see added to Dropbox is client-side encryption (i.e. I want to manage my own keys so nobody can monkey with my data). And yes, I know I can store an encrypted container inside Dropbox, but that defeats the purpose of easily accessing my data from every device.
Dropbox may not natively add client side encryption in the near future. The way Dropbox has stored data from the beginning is by deduplicating information across all its users to save space. So if you upload a book or a movie or a song and I upload the same, Dropbox stores one single copy of it for the both of us.
This is just a simplified explanation. The actual deduplication is done in smaller blocks.
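A toy sketch of block-level deduplication, to show why it clashes with client-side encryption. The `DedupStore` class and its 4 MiB default block size are assumptions for the example, not Dropbox's actual design. Each block is stored once, keyed by its SHA-256 digest, and a file is just a list of digests.

```python
import hashlib


class DedupStore:
    """Content-addressed block store shared by all users (illustrative only)."""

    def __init__(self, block_size=4 * 1024 * 1024):
        self.block_size = block_size
        self.blocks = {}  # sha256 hex digest -> block bytes
        self.files = {}   # (user, filename) -> ordered list of digests

    def put(self, user, filename, data):
        """Split data into fixed-size blocks and store only unseen blocks."""
        digests = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[(user, filename)] = digests

    def get(self, user, filename):
        """Reassemble a file from its block digests."""
        return b"".join(self.blocks[d] for d in self.files[(user, filename)])
```

If two users upload the same bytes, the second upload adds no new blocks. If each user encrypted with their own key first, the same plaintext would produce different ciphertext blocks, and the savings would disappear.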
If you want client side encryption with Dropbox, you have to add a layer before the Dropbox client sees your files on your system, using Cryptomator or Boxcryptor or encrypted volumes with Veracrypt, etc.
Or you could switch to other online backup/sync services that claim to have client side encryption, like SpiderOak and a few others.
What kind of junk statistics are these? HBR ought to be more discerning in what it publishes.
"How successful teams collaborate"... wait, I meant "the average number of users who update the same directories in Dropbox from institutions that tend to have influential research".
Sound insights. Make sure you're collaborating with no more than 2.3 people or else you'll have to move your research projects over to Yale.
Agreed, these sound like completely arbitrary measures.
I agree that senior researchers probably bring valuable experience and insight to research projects, but I don’t think you can validly arrive at that conclusion from the number of times they open a doc in Dropbox.
Yes, this was curious. I wonder if these insights (2.3 vs. 3 collaborators over 180 vs. 130 days, with the top person contributing x%) were really effective or just a coincidence.
As the linked article states, all personally identifiable information is removed, but you still want to be able to say "Alice worked with Bob in folder 1, and that same Alice worked with Charlie in folder 2", so you assign unique identifiers to each user, such that you can't tie Alice to "Prof. Smith at University of Chicago", but you can tie folder 1 and folder 2 to the same Alice.
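One common way to get identifiers like that is a keyed hash. The sketch below is only an assumption about how it could be done (the article doesn't say what Dropbox used), and `make_pseudonymizer` is a made-up name. The same user always maps to the same opaque ID, so folder 1 and folder 2 can be linked to the same "Alice", but without the key you can't recompute or reverse the mapping.

```python
import hashlib
import hmac


def make_pseudonymizer(key):
    """Return a function mapping a user identifier to a stable opaque ID.
    `key` is secret bytes held by whoever does the anonymizing."""
    def pseudonym(user_id):
        digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()[:16]  # truncated for readability
    return pseudonym
```

Note that this only hides the identifier itself. The collaboration graph built from those stable IDs is still there, and that graph is what the replies below are worried about.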
The GDPR has provisions for information like this, specifically to say that small pieces of information can together still constitute personal data. Consider that you can retrieve names if you map someone's professional interactions with this kind of detail.
Regardless of whether or not the GDPR applies to these people, it's a useful tool to illustrate why this kind of data is still wrong to share (especially without any kind of consent!).
If using a Mac, it’s pretty easy to encrypt a drive and store it in Dropbox, then mount it when you want to use it. Kind of negates the whole point of Dropbox (mobile access, small sync etc) but I started doing it for more sensitive things.
Doesn’t require any 3rd party addons.
I really can’t believe they shared this data. Universities do work for businesses all the time. Imagine a folder of research subjects organized by geo/age/sex then full patient name or SSN, under a folder called HIV survey or something. I mean really?
I am sorry, this might be a cliched post and doesn't add to the discussion, but if they do things like this to academics, what do they do with other people's private data that we don't know about?
This was a very poor read, and just listed correlations based on some numbers. I personally didn't learn anything that I, or anyone else, could apply. It's just a "correlation=causation" based list.