Another fun one. Go into Facebook, download your data, and look in the ads folder. There's a very helpful "advertisers_who_uploaded_a_contact_list_with_your_information.html" page that lists every advertiser who somehow obtained your contact data and ran targeted ads at you. There are a lot of sketchy companies on that list that I never had any kind of relationship with, yet they somehow managed to obtain my email or other contact data. It's a bit of an enlightening experience as to just how widely your personal data is being shared, even if you're somewhat careful about who you give it to.
The Facebook dump also includes email addresses and phone numbers that you have deleted from your account. After I removed my contact information from my account, I was curious why advertisers were still able to target me with these uploaded lists; I guess that explains it.
Facebook is currently involved in multiple legal battles in Europe about the GDPR, including at least one where it's the instigator. Facebook has different views about what's allowed under the GDPR compared to the people in charge of enforcing the GDPR.
That first link isn't the same thing as what GP is describing, at all. That's just a list of advertisers whose advertising dragnet you've been caught in (e.g. you're in one of their targeted demographics).
Apparently I'm doing something right because mine is empty.
"You have no available activity to show at this time."
Maybe it's because I've never had a habit of linking phone apps, web apps, etc. to Facebook. I tend to keep different accounts completely isolated, which is why Google really bugs me: I don't like that my YouTube, Gmail, etc. are all the same account.
You're absolutely correct.
It's only against the ToS to create another account if any of yours have been banned. Having multiple accounts is allowed for work/private purposes.
> Many people have more than one Google Account, like a personal account and a work account. Uses like that are fine.
> You're absolutely correct. It's only against the ToS to create another account if any of yours have been banned. Having multiple accounts is allowed for work/private purposes.
Huh. What happens if you already have multiple accounts, and one of them gets banned?
Any organized efforts to export and donate data? Data is not data in some important senses unless it's aggregated. A "watch-them-watch-us" dynamic can't exist if one side is aggregated and the other isn't.
Mind linking a project if you find it? Currently working on a tool that needs text conversations for data training and would be happy to donate if we can use the data as well.
Not only that - Facebook allows you to create "lookalike" audiences from e-mail lists. So as a marketer, if you can get your hands on a good e-mail list, you're golden.
I'm very curious about something: people here rave about Takeout, but does anyone actually use the thing?
I tried the Takeout option about a year ago when I wanted to switch from Google Photos to Apple Photos, and it was an absolute mess. It exported random zipped folders, all broken up (I think there were 14 of them all together), many images were duplicates, and somehow it managed to corrupt Apple-format pictures: I would end up with two files with the same name, one something like 24KB and another 1.2MB, etc.
If it weren't for the option (which they have since taken away) of syncing photos to your Google Drive and copying them out that way, I have no idea what I'd have done; I'd have lost about 120GB of photos and videos going back to 2008.
So seriously, do all of you just use Takeout and store the archive away without actually opening it up and trying to use it?
I think the breaking up is a ZIP limitation (the classic format maxes out at 4GB without ZIP64).
I'm exporting it regularly (although I certainly don't have 120GB of photos there). You can choose an option for regular exports (every two months) and a delivery method (e.g. Google Drive). Then I have a script that runs daily, mounts Google Drive, moves the Takeout archive locally if it's present, and removes it from Google Drive (so it doesn't take up space).
Then indeed, inside you have a mess, with some data in HTML, some in JSON, etc. But well, at least you can parse it... I have a library which I'm using as an API to various data exports, including archived Takeouts (so I don't even have to unpack them for access).
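For anyone wanting to replicate this, a minimal sketch of that daily job using rclone instead of a mount (the remote name "gdrive", the "Takeout" folder, and the paths are assumptions about your setup):

    #!/bin/bash
    # Hypothetical daily cron job: pull a finished Takeout archive out of
    # Google Drive, then delete it there so it stops eating Drive quota.
    # Assumes an rclone remote named "gdrive"; names/paths are illustrative.
    set -euo pipefail
    DEST="$HOME/backups/takeout"
    mkdir -p "$DEST"
    # rclone move = copy to the destination, then remove the source on success.
    rclone move gdrive:Takeout "$DEST" --include "takeout-*"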
"Then indeed, inside you have a mess, with some data in HTML, some in JSON, etc. But well, at least you can parse it..."
How is some regular schmuck who wants to move his data out of Google to another service supposed to determine what they actually have to parse? The user simply uploads pictures into the system but gets garbage out?
The scary part here is that Google makes it extremely easy to suck in the data, but for an average user it's extremely difficult to get it back out, and Takeout is absolutely not a good solution.
They seem to use standardized formats where possible: vCard for contacts, mbox for email, image files for photos, etc. I'll grant you that they do some not-nice things (like separating photo metadata into JSON sidecar files), but I'm curious what format for search or timeline activity would be useful for a "regular schmuck"?
If said person wants to view the data on their own time, HTML seems adequate. And JSON seems ideal if they plan on sending this data to a new service that ostensibly supports parsing Google's takeout.
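On the JSON-sidecar point: each exported photo typically comes with a small JSON file that holds the capture time, among other things. A minimal sketch for stamping that time back onto the image files, assuming the usual "photo.jpg" / "photo.jpg.json" pairing, GNU touch, and jq (field names may differ across export versions):

    #!/bin/bash
    # Restore file modification times from Takeout's JSON sidecars.
    shopt -s globstar  # enable ** recursive globbing in bash
    for meta in Takeout/Google\ Photos/**/*.json; do
        img="${meta%.json}"                  # photo.jpg.json -> photo.jpg
        [ -f "$img" ] || continue            # skip album-level metadata files
        ts=$(jq -r '.photoTakenTime.timestamp // empty' "$meta")
        [ -n "$ts" ] && touch -d "@$ts" "$img"   # epoch seconds -> mtime
    done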
I think a big part of the problem is that even if Takeout is using standard formats, none of the competing services or software platforms are set up to ingest those formats.
Like, mbox is fine for opening in a desktop client, but if you move from Gmail to Fastmail or Outlook or whatever, mbox might as well be a ClarisWorks spreadsheet file.
I just finished going through the Takeout process for photos last night: 39 2GB archives. The request took a couple of days to process, then I got an email with a bunch of links that they said were good for 7 days. I planned to load them into my Synology download manager, but the Takeout system seems to rely heavily on browser state, so Synology couldn't download them. Each took 5 minutes to download at home over WiFi, and the Takeout interface demands authentication every 10 minutes. It also reloads the page and resets my position in the download list on every click.
I've uploaded and expanded about half the archives on my Synology, and it's currently indexing everything, so I can't comment on the photo issues you've mentioned quite yet.
Overall, I'm happy there exists a mechanism to get my photos, but the quality of the experience is truly awful.
More likely, they just want to make sure they're legally covered. There is no added benefit for the company in making that experience better.
Source: I work on a product engineering team at a larger company. Why would we upend our roadmap for something that doesn't provide explicit value to our product offerings? We've got a list a mile long of improvements and new capabilities to deliver on, and no engineer likes jumping through legal hoops anyway.
I don't see how your statement contradicts your parent comment's. Looks like Google just did the bare minimum to cover themselves legally (as you said). But that shouldn't mean that they can't make the experience better.
As for no engineer liking working through legal hoops, my observations disagree. Many absolutely don't care. Give them a well-described Jira ticket and they'll happily chip away at it for a year if it's necessary.
My brother passed away very suddenly a couple of years ago. Google Takeout and similar services were a godsend for extracting and archiving everything from his accounts. As far as I know, everything exported fine, though he didn't use Google Photos.
Gzipped tarballs are an option now with what looks like a 40GB max.
$ ls -lh takeout*
-rw-r--r-- 1 ben ben 39G Sep 26 18:29 takeout-20200925T172738Z-001.tgz
-rw-r--r-- 1 ben ben 38G Sep 26 19:20 takeout-20200925T172738Z-002.tgz
-rw-r--r-- 1 ben ben 35G Sep 26 20:06 takeout-20200925T172738Z-003.tgz
I use this one every other month -- download everything and throw it in cold storage. Mainly keeping it for the scary possibility of being kicked out of my account.
I just used it to pull over 350GB of photos. Not fun. They really make it hard to reimport photos elsewhere. I wish their app just had a "download all" option.
As I wrote elsewhere previously: I agree it's a mess, but not the end of the world. Extract the files without preserving directory structure (unzip's -j flag). Now you just have everything in a single folder. Then delete the metadata files by file extension. This took all of a few seconds for each zip.
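Concretely, something like this (standard Info-ZIP tools; filenames illustrative):

    # Flatten every archive into one folder, then drop the JSON metadata files.
    mkdir -p photos
    for z in takeout-*.zip; do
        unzip -j -o "$z" -d photos   # -j: junk directory paths, -o: overwrite
    done
    rm -f photos/*.json              # delete the metadata files by extension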
Shameless self-plug: as part of ongoing privacy research, my colleagues and I developed a website that parses personal data exports/takeouts from Google, Twitter, Instagram, and Facebook and visualizes the data in a treemap and a timeline. We aim to increase awareness of personal data and the effects of online behaviour.
The data is not uploaded; it's parsed entirely in the browser.
For the Google Takeout, make sure not to include data from Photos, Gmail, YouTube, and Drive, as they make the export too big. Also select "JSON" for "My Activity".
I do a full takeout every month or so, both for myself and also my parents. It's largely so that we don't lose anything if Google does one of its awesome sudden account disablings.
I have this run every few months and then download and backup the archive. I like to think it will be helpful in case of the dreaded 'locked out of google for no reason and no recourse' situation.
The easiest way to figure out the ethics of a company is to see if their export tool is easy to use. If it is, it's safe to stay with them, because they're not doing it just for legal compliance. If not, run away from that company as fast as possible.
I periodically use the Takeout service and selectively copy to the 1TB of storage I have on OneDrive (which comes with an Office 365 subscription). Having my digital stuff backed up on two cloud providers is enough for my risk tolerance. I used to make a 3rd backup in Dropbox but don’t do that anymore.
I once tried to download my Google Photos, and it became super messy: I got tons of different folders, all with duplicates and other stuff, and it was pretty unusable.
One of the positives of the Google+ shutdown was that Google Takeout saw a major overhaul around February of 2019. Third-party tools (Alois Bělaška's Friends+Me was invaluable https://blog.friendsplus.me/) still proved very useful, and gave capabilities missing from Google's offerings.
Yes! I do the backup OneDrive backup too. (Not a typo, I meant to say backup twice.) And I actually do it for my live Google Drive as well.
I rsync my Google Drive folder to network storage, without deletion, and I have OneDrive syncing that. So I have a local and cloud backup of everything that's ever been in Google Drive.
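For anyone curious, the "without deletion" part is just rsync's default behaviour (you'd have to pass --delete to mirror removals); a sketch with made-up paths:

    # Archive-mode copy; no --delete, so anything ever synced stays on the NAS
    # even after it's removed from Google Drive.
    rsync -av "$HOME/GoogleDrive/" /mnt/nas/gdrive-archive/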
It's really hard to download from takeout.google.com these days. I have 86GB of data in my Google account and am trying really hard to export it.
- The export size is around 176GB, mostly photos.
- There's an option to move it to OneDrive or Box, but 100GB on Google becomes 200GB on OneDrive: images are copied into multiple folders to recreate the albums (note that Google Photos automatically creates albums for family and trips).
- Tried using 2GB zips to split the files. We have to click and download 100+ zip files, and even if one file is corrupt we're done. All this happens in a modal window, and we can't download more than 5-8 files at a time.
- Split it into 5GB zip files. Now the download count is manageable, but the network keeps dropping and we have to download again. We can only retry 3 times, making the entire set useless.
- No options to separate videos and photos.
- We only have a week to take out and test the whole thing.
TL;DR: it's designed to make sure that we don't actually take the files out...
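One partial workaround, offered as a sketch rather than a tested recipe: copy the authenticated download URL and session cookies out of the browser's network tab, then let curl resume across network drops (this may still hit Takeout's server-side limits on retries and re-downloads):

    # -C -      resume from wherever the previous attempt died
    # --retry   automatically retry transient network failures
    # $URL and cookies.txt come from your authenticated browser session.
    curl -C - --retry 10 --retry-delay 5 -b cookies.txt -o takeout-001.zip "$URL"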
I'm not sure why you're getting downvoted. I went to download my photos today from Google Photos after reading about the end of the unlimited storage. The interface is incredibly annoying. It split them into 2GB zip files, which is fine, but then it takes multiple clicks to get each download to start. Oh, and then it logs you out every 5 minutes so you have to re-enter your password.
It's pretty clear they were not super concerned about making it a user-friendly process.
If you say something negative about Google you’ll reliably get downvotes from Android fanboys even if it’s completely accurate. (Apple has similar fans, although they seem to be less defensive now that the company is doing so well)
It’s not worth worrying over except as a reminder that rating systems need to handle bad-faith voting.
This doesn't mirror my experience at all, and I think blaming it on fanboys isn't fair. I used Google Takeout recently, had it set to split into 50GB tar.gz chunks, and it worked perfectly fine. I wasn't logged out once, and downloaded the archive at around 90MB/s (720Mbps). It was a very smooth data export experience. I'm very much not a Google/Android fanboy and avoid Google's services wherever possible. The OP is being downvoted because their experience isn't representative.
EDIT: It does seem that there are more people than I expected for whom the experience isn't as good as the one I had, so maybe Google should test this with (and make it work better on) internet connections that aren't as good as their office lines.
I just finished the process last night for about 80GB of photos and GP's comment represents my experience very well. It took hours of tedious clicking and logging in dozens of times to get through it. Miserable.
I’ve also used it and the original poster’s experience rang truer than yours. Most people do not have gigabit internet connections and manually downloading the default smaller chunks is annoying.
Yeah, it's been a real hit-or-miss process for me, too. The process is way too manual, and increasing the chunk size from 2GB to 10GB causes a lot more failures of the individual downloads for me.
And I don't particularly want to hear "get a better network connection". My connection works just fine for everything else.
That said, if you're able to download everything, it's reasonably well-organized, although as stated elsewhere, there's a lot of duplication of data.
But can't you just download larger chunks over whatever connection you have? Or is it common for internet connections in (I assume) the US to be so bad that you can't download a 20GB or 50GB file without errors?
It’s really a question of how well it resumes: if you’re trickling in a 20GB file and get a couple of lost packets at 10GB, was that earlier transfer wasted? Most people use wireless, so it’s not hard to hit a transient failure that will disrupt a TCP connection but won’t last long in absolute terms.
Not sure why you're getting downvoted; I just wrote about the same issue I had about a year ago. It's an absolute mess for photo exports. I don't think people here actually use this service, or they use it without actually looking at what's in there.
Before downvoting, just try to use the new interface and see for yourself. I used Takeout about 2 years back and it was okay. Of course, I had just a few gigs at that time.
I have tried spinning up an Amazon EC2 instance to download everything and copy it to an S3 bucket. But it logs me out every few minutes, disrupting the downloads, and it won't allow downloading the same file multiple times. If one zip fails, the whole set is useless.
Those aren’t good solutions, they’re workarounds - and if you think about them even slightly you’ll realize they’re not very good:
1. The interface requiring multiple downloads prevents automation or simply waiting out a large transfer, and not having a robust automated retry mechanism ensures wasted time and increases the odds of data loss.
2. Few people have a high-speed free WiFi network nearby. You’re not getting better results at Starbucks or the local library, and Google’s campus networks require logins even if you are one of the few people who lives near one.
3. Setting up a VPS and running downloads from a web app requires money and skills most people don’t have, especially if you care about not accidentally leaking your personal data. If you have enough data to matter, you’ll also hit many providers quota limits or bandwidth charges. If you navigate all of those challenges, you still haven’t solved the problem of getting it home - at best you can now use rsync to remove the manual component of the second transfer.
GoogleGuest is an unsecured SSID at every campus, if you have one nearby. It generally reaches the parking lots (which are conveniently empty right now)
https://about.google/intl/en_us/locations/ ... not as isolated as I thought, waiting patiently for the geo-data people to work out an approximate census of people who live within a 50-mile radius of a google office. :-)
Even though I love using Google services, having my account blocked and losing access to everything from domains to personal photos to all the important docs in Drive is a concern that has been haunting me since I started seeing an uptick in these stories. Maybe it's just that I'm noticing them more, but in any case, reducing my Google dependence and having a strategy for an account-block scenario has become a need, given the large impact it would have on my life.
I really wish Google Takeout had an API to request takeouts weekly, or whatever works, instead of the bimonthly (once every two months) option currently offered. Then one could keep data in Google services without serious concern, since at most the loss would be a week's worth of data. Also, on this note: does anyone have any recommendations for domain registrars? I can't keep my domains on Google, even though it has a great UX.
I don't use Google a lot, but I have an actively used Gmail account. I never give its address to anyone; all incoming mail is forwarded from other addresses, and I don't send any mail from that account either.
I am somewhat worried that their great algorithms will one day decide there is some violation of their ToS and close the account. I had that issue even with a paid (well, a voucher that came with a PC, but still) Microsoft OneDrive account that I used in an atypical way (no sharing, all contents encrypted).
In the past I used IMAP for backing up my messages, but over time my scripts for doing so have fallen into disrepair... Would Takeout be a way to do somewhat regular backups? Or might that trigger their algorithms into deciding you're not a good customer? Has anybody read the ToS for whether anything is mentioned about Takeout?
Hey, I have about 17TB of data. Is it even worth doing Takeout? Are there any programmers out there who might want to take on directly transferring this data to a Google One unlimited account? I have seen some third-party companies, but I just don't trust them. I need someone who will write a script to run the download and upload offsite, somewhere with a fat pipe and fast data transfer.
"You have Advanced Protection switched on, which means that it could take days or even weeks for your files to be ready to download, but we'll email you when they're ready."
Same. Do you know a good way to add metadata to songs? I've been carrying around a library since I was a teenager, and when I moved it from Google Play to iTunes, a bunch of the metadata was missing.
I've used Takeout. Nice for them to provide a bundle. That said, I'd love the option to export my Messages content, and that's something I haven't seen anywhere. Has anyone successfully done this for Google Messages? I don't mean backing up to Google Drive, I mean a full-on export.
Yes it does (I've done this). However, until recently you could NOT export YouTube videos from any "sub" YouTube accounts. I.e., I have one Google account with 4 YouTube accounts under it, and Google support confirmed that you could only export the YouTube uploads for the primary account. As of about 6 months ago this was fixed, though, and you can now take out YouTube videos for all "child" accounts as well (I have confirmed this).
Although it is a bit janky (as is all of Takeout), since you need to use a separate YouTube/Takeout interface to do so.
I just deleted it all; it was easier. Now I just keep everything on my machine. No more takeout--home-cooked meals. I can see why that doesn't appeal to everyone, though.
I wasn't using Google for anything significant, so it was pretty easy to delete it. If I care about a picture, I'll print it and keep it in a book. I would never trust a corp that's only been in the data business for 10-15 years, especially one known for stealing your information for advertisers' gain.
You don't have to "trust" them. In fact you shouldn't. That's why you encrypt locally first. And if they lose your data you still have it locally. It is good to also have two local copies.
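A minimal sketch of what "encrypt locally first" can look like with stock tools (gpg in symmetric mode; filenames are illustrative):

    # Bundle and encrypt before anything leaves the machine.
    tar -czf - photos/ | gpg --symmetric --cipher-algo AES256 -o photos.tar.gz.gpg
    # Later, to restore:
    gpg -d photos.tar.gz.gpg | tar -xzf -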
Once the export is done and I see something I don't want to exist anymore, where do I request that data be permanently removed (GDPR style)? Asking as I honestly don't know.
> If you do not normally deal with data protection requests, please forward this email to your Data Protection Officer, or relevant member of staff. Please note that you have 30 days to comply with this request.
Options are GDPR which applies to European Union residents and CCPA which applies to Californians. Is there anything for people within the US who are residents of the other 49 states?
Exactly. Unfortunately, GDPR failed to realize that most of the data companies hold about us is inferred. Data takeouts usually provide all the data you willingly provided (uploads, reviews, likes, playlists, etc.), but the most interesting information is missing. How often did I watch each video? When did I open each e-mail? There's so much data that companies collect about our behaviour that is never given back to us.
It's entirely possible to collect information to identify a unique human without it being considered PII - combine it all together and maybe add a sprinkle here and there (perhaps public domain info, buying "anonymized" info) and boom, you know who it is.
Yet, if you're audited, it's just a series of IDs and numbers, nothing identifying there...Right?
If you ask Spotify for your data dump, you'll notice that in a lot of the JSON files the information is encrypted such that you can't understand it (it's just numbers). It's impossible to say whether it's actually stored like this or if they encrypt it before they provide the archive to you.
Metadata is almost impossible to legislate against and, as far as I can see, is entirely legal to collect and use as you see fit.
How many people in the world are on Hacker News, named "lopis", use Firefox 65, have an IP address in $country, use this screen resolution, etc., etc.?
You are identifiable by combining multiple factors, even if no individual factor is enough to identify you on its own.
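The back-of-the-envelope version of that argument: each attribute contributes some bits of identifying information, and the bits add up. Around 33 bits is enough to single out one person among roughly 8 billion (the per-attribute budgets in the comments below are made-up illustrations):

    # How many bits uniquely identify one human out of ~8 billion?
    awk 'BEGIN { printf "%.1f bits\n", log(8 * 10^9) / log(2) }'   # ~32.9
    # Illustrative budgets: browser+version ~5 bits, country ~5, screen
    # resolution ~4, timezone ~3, installed fonts ~10 ... it adds up fast.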
If I understand it correctly, the EU 'personal data' (PD) concept is much wider than the US 'Personally Identifiable Information' (PII) concept. You are touching one of the differences here.
For GDPR purposes, data is PII if it can be used in combination with any other data to identify an individual. Doesn’t matter if the individual data points are not themselves identifying.
One thing I've been curious about is whether AI and algorithms that can take a huge amount of anonymous data and "identify" a user (but not explicitly) only identify in the sense that the output of the AI was only possible by correlating individuals with enough granularity. I'm almost certain the answer is yes. I'm not clear on whether GDPR addresses that issue or not.
> For GDPR purposes, data is PII if it can be used in combination with any other data to identify an individual.
By that definition, all data is PII. There is no information available on this planet that has not been influenced by people.
I'm not trying to be obtuse. I worry about this problem a lot. Obviously we need to keep companies from doing stupid stuff like storing the first digit of a Social Security number (can't identify someone by that!) and then the second digit (also not uniquely identifying!), etc.
On the other hand, what if I have web log files that only store URL, timestamp, and status code? Is that OK? If I get hits for two specific pages within a couple of minutes of each other, and there's only one person on the planet who would know about both those pages, I know they were visiting my site at that time.
People influence the world around them and it feels like privacy laws are trying to prevent companies from understanding that influence. At the same time every other incentive is pushing those companies to understand more.
> By that definition, all data is PII. There is no information available on this planet that has not been influenced by people.
I think that is a step too far. For example, it seems quite clear that a dataset of daily average temperatures from the top of Everest is not personally identifying information.
Black hair = PII, address = PII, drives a black BMW = PII: any of this information together with other information could be used to identify an individual, and that is exactly the issue. It is like saying that one brick is a house just because multiple bricks can make a house. If you gather enough data, you can potentially point to a specific individual, just like with unique PC fingerprinting: gather enough data points so that the fingerprint is unique.
AFAIK according to the GDPR, knowing each individual fact is fine. Only the combination is PD.
Hence, installing a camera that counts black-haired people, another that counts people entering some location, and a third that counts people driving BMWs is perfectly fine. Merging the 3 recorded tapes to identify a person is not. Giving the 3 tapes to someone else is only OK if you somehow guarantee they won't do the merge.
Privacy laws are mainly aimed at allowing those whose data is being used to be aware of this, understand what is used for which purpose, and to elect to control this should they object.
My GDPR requests have usually included inferred data. I'm not sure if it was Facebook's or Tinder's that showed a giant list of categories they thought I fit into, which was, by the way, hilariously wrong (I'm a 30 y/o single male and I was categorised as a single mom, for example).
GDPR does cover inferred data. Source doesn't matter. Only whether this is data about a specific identifiable person and whether it's covered by the list of protected types of data.
Right answer: you can go to [1] and [2] and [3] and ask them to delete your information. It's important to retain written confirmation that they have removed your information: if a copy is ever found online (in a data breach or otherwise), you would be able to exercise legal rights as a result of their GDPR breach. I would encourage people who upload data to leave "fingerprints" in their accounts, such as certain photos, emails, and other data that you have ONLY created on this service (for example, email your own Gmail account a unique email; if it's ever leaked, you know where it came from).
It's the same way Spotify's GDPR tool does NOT give you all the information they store, yet if you ask via their DPO (usually privacy@) you get a lot more data; a rather sneaky way of hiding their true data collection.
ALWAYS use email or a physical letter, and ALWAYS get a reply from the organization when exercising your GDPR rights; your lawyer/legal authority will be very thankful ;)
AND NEVER EVER USE AUTOMATED TOOLS! The chances are, there is data that isn't included in them. For example, go ahead right this second and submit a SAR for "technical log information" to Google: this data is NOT included in their official tools, and you will be amazed how much they're storing!!
Google Takeout is a great solution to the problem of, how do we get Google programmers, without protest, to write functionality into their services to allow Big Brother to acquire a neat and tidy copy of all a user’s data?
I’m not complaining. I’m marveling at the clever solution to the problem of, “How do we get hundreds of product teams to support our legal obligations without feeling morally conflicted?”
Could you elaborate on that? Do you disagree that Google Takeout is used for government user-data requests, or that absent such functionality, programmers tend to balk at implementing such requests?
Google takeout is an inefficient tool for government requests, yes.
I'm sure it may get used but I highly doubt it's a primary tool in any way. PRISM revealed how much custom tooling is made specifically for governments and for giving back data to them, automatically addressing their requests etc. Takeout is slow, bulky, and its audience is the end user.
Legal processes take priority over virtually anything else; be assured that every large company has gotten and responded to valid legal process before these download tools were available.
Additionally, the scope of what's provided in response to legal process depends heavily on who and what was requested and so the software is far more complex than "download all the user's data". Investigators often aren't aware of what data these companies store and if it's not specifically requested then it's not provided. Lawyers basically copy and paste their last successful warrant/wiretap/whatever and send it to the judge because that's how the legal system works.
I don’t disagree with any of that. The Google Takeout interface makes it very easy for the user (or person responding to a search warrant) to pick and choose exactly what data to zip up.
Hey, I see you've got your tinfoil hat on, but seriously, what? Do you think it's really tough to find a group of 50 bean-counting programmers in all of Google who can't build this tool for Big Brother?
The functionality needs to be written into every product. Every PM would be aware of it, every software engineer would see the code. The story would leak, and articles would get written about the “pervasive privacy back doors written into everything at Google”.