
I wonder how encrypted data can be de-duplicated. Do they use per-file encryption with no per-file salt?



Dunno about SpiderOak, but the way Tarsnap does it is that as blocks are encrypted and uploaded, the client keeps metadata about them locally (presumably a hash, size, etc.). That metadata is then also encrypted and uploaded to the server. When it wants to upload more blocks, it just looks at that metadata and skips duplicated blocks, updating only the metadata to point to the existing block.


All of that is correct, but more to the point: Client data is deduplicated before it is encrypted.
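
To make the order of operations concrete, here is a rough Python sketch of "deduplicate first, then encrypt". It is not Tarsnap's actual code: the fixed 4-byte block size, the `Client` class, the in-memory `server` dict and the `toy_encrypt` placeholder are all made up for illustration, and the placeholder cipher is not secure.

```python
import hashlib
import os

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the demo readable


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Placeholder cipher (NOT secure): random nonce + SHA-256 keystream XOR."""
    nonce = os.urandom(16)
    stream = b""
    counter = 0
    while len(stream) < len(block):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return nonce + bytes(a ^ b for a, b in zip(block, stream))


class Client:
    """Hypothetical client that deduplicates plaintext blocks before encrypting."""

    def __init__(self, key: bytes):
        self.key = key
        self.index = {}   # local metadata: plaintext block hash -> block id
        self.server = {}  # stand-in for remote storage: block id -> ciphertext

    def upload(self, data: bytes) -> list:
        manifest = []  # block ids that reconstruct `data`, in order
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.index:
                # New block: encrypt with a fresh nonce and store it.
                block_id = len(self.server)
                self.server[block_id] = toy_encrypt(self.key, block)
                self.index[digest] = block_id
            # Known block: nothing is uploaded, the manifest just points at it.
            manifest.append(self.index[digest])
        return manifest


client = Client(os.urandom(32))
first = client.upload(b"aaaabbbbaaaa")
second = client.upload(b"bbbbcccc")
```

Because the lookup happens on the plaintext hash in the client's own index, the ciphertexts can be randomized (the same block encrypted twice looks different) and the server never needs to compare anything.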


You can share files with other people, similar to the way you can with Dropbox. That might indicate that encryption is done per-file (which is actually a little less secure, so who knows).

I haven't researched it, but it could work like this: scan the local machine, find duplicates, upload unique files, and then create links to any place a file is duplicated. It all happens locally, so only encrypted data is ever uploaded. Some tiny bit of info about the structure of the file system might be transferred and known by SpiderOak, but I can't conceive of a situation where that matters.
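
Something like the following Python sketch is what I have in mind. To be clear, this is a guess at the general shape and not how SpiderOak is known to work; `plan_upload` and `file_digest` are names I made up, and the actual encryption and upload steps are left out.

```python
import hashlib
import os


def file_digest(path: str) -> str:
    """SHA-256 of a file's contents, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def plan_upload(root: str):
    """Scan `root` and split the work into unique contents and links.

    Returns (to_upload, links):
      to_upload maps content digest -> one representative path to encrypt and send
      links     maps every relative path -> digest of the content it refers to
    """
    to_upload = {}
    links = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest = file_digest(path)
            # Only the first file seen with this content is scheduled for upload.
            to_upload.setdefault(digest, path)
            links[os.path.relpath(path, root)] = digest
    return to_upload, links
```

Each entry in `to_upload` would be encrypted and sent once, while `links` is the "tiny bit of info about the structure of the file system" and could itself be encrypted before it leaves the machine.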



