
> I had GitHub claim my private git repos were not personal data. Because it was code, forgetting that my name and email are attached to every commit.

This brings to mind a few questions.

1. If a site stores multiple copies of a particular piece of personal data, let's say an email address, do they have to give you every instance when you ask for your data, or just tell you that they have your email address?

For example, if I use the email address as an account identifier, so that it is the primary key in the Users table in my database and a foreign key in my Purchased table, do I have to send something that says your email address is in 1 row of one table and 13 rows of another table?

2. If I have to give back copies of your personal data that is in content you uploaded, what if that content contains personal information of other people, too?

If you had let others commit to your private GitHub repo, for example, their personal data would be in there. If GitHub has to give your commits to that repo in response to your GDPR request, do they have to filter them so that they only return the commits you committed?

What if I submitted an issue, you committed a fix, and in the commit message you thank me by email address for diagnosing the issue? Does GitHub have to remove my email address from the copy of the commit message when they respond to your GDPR request?

3. What about services that provide storage but don't process the content of that storage except to keep redundant copies or backups to protect you from hardware failure, such as Dropbox or Amazon S3? If I ask Amazon for personal information on me, do they have to figure out that you uploaded your contact list to S3 and my name, email, and phone number are there and tell me about it?




IANAL

1. How the data is stored is irrelevant. Again, personal data refers to data connected to your account. So if they store a history of when and where you logged in, they have to provide it. When you upload stuff, they have to provide it. When you star a repo, they have to provide it.

2. They have to filter out data that you are not supposed to see. A repository is a special case, since it is not simply personal data. Think about giving a contractor temporary access to your repo, etc. GDPR tries to enforce reasonable data compatibility between platforms ("right to data portability"). This is orthogonal to personal data collection.

3. No. It's the responsibility of the services that use S3 to manage this. The operators of those services are the controllers in this case. They also have to ensure that Amazon does not process the data they store on AWS S3. They may even have to make this agreement part of the contract with the people they provide the service to.
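
To make point 1 a bit more concrete, here is a rough sketch of what "provide the data, not a row count" could look like for the Users/Purchased example from the question. The table names come from the question; everything else (the `name`, `user_email` and `item` columns, the `export_personal_data` function, the sample rows) is made up for illustration, and this says nothing about what a complete GDPR export legally has to contain.

```python
import sqlite3


def export_personal_data(conn, email):
    """Collect every row tied to `email`, whichever table it lives in.

    Hypothetical helper: the schema below is an assumption, not any real
    service's layout.
    """
    user = conn.execute(
        "SELECT * FROM Users WHERE email = ?", (email,)
    ).fetchone()
    purchases = conn.execute(
        "SELECT * FROM Purchased WHERE user_email = ? ORDER BY id", (email,)
    ).fetchall()
    # Return the data itself, not "1 row here, 13 rows there".
    return {
        "account": dict(user) if user else None,
        "purchases": [dict(row) for row in purchases],
    }


# Tiny in-memory database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us turn rows into dicts by column name
conn.executescript(
    """
    CREATE TABLE Users (email TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Purchased (
        id INTEGER PRIMARY KEY,
        user_email TEXT REFERENCES Users(email),
        item TEXT
    );
    INSERT INTO Users VALUES ('alice@example.com', 'Alice');
    INSERT INTO Users VALUES ('bob@example.com', 'Bob');
    INSERT INTO Purchased (user_email, item) VALUES ('alice@example.com', 'book');
    INSERT INTO Purchased (user_email, item) VALUES ('alice@example.com', 'lamp');
    INSERT INTO Purchased (user_email, item) VALUES ('bob@example.com', 'pen');
    """
)

export = export_personal_data(conn, "alice@example.com")
print(export)
```

The export is keyed on the account, so how many times the email address is physically stored does not matter to the person asking; they get their account record and their purchases, and nobody else's rows.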



