
I suspect this 5GB quota is artificial: there is no need to actually copy a song to the user's virtual drive when the service can add a pointer to a copy already stored on the platform. If 1M users buy a popular song, there is no need to clone it a million times, right?

Having said that, there are smart people working on this platform at Amazon, so I am sure they don't make a physical clone. That raises the question already mentioned above: why limit storage to 5GB?
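
To make the pointer idea concrete, here is a minimal sketch, not a description of how Amazon actually stores purchases. The `Catalog` class, its method names, and the track ID are all hypothetical; the only point is that physical storage stays constant while the per-user "logical" total grows.

```python
from collections import defaultdict


class Catalog:
    """Hypothetical store: one physical copy per track, many per-user pointers."""

    def __init__(self):
        self.blobs = {}                     # track_id -> bytes, stored once
        self.libraries = defaultdict(set)   # user -> set of track_ids (pointers)

    def add_track(self, track_id, data):
        self.blobs[track_id] = data

    def purchase(self, user, track_id):
        # A purchase only records a reference; no bytes are copied.
        if track_id not in self.blobs:
            raise KeyError(track_id)
        self.libraries[user].add(track_id)

    def physical_bytes(self):
        # Bytes actually held on disk by the platform.
        return sum(len(blob) for blob in self.blobs.values())

    def logical_bytes(self):
        # Bytes the users would occupy if every purchase were a real copy.
        return sum(
            len(self.blobs[track_id])
            for library in self.libraries.values()
            for track_id in library
        )


catalog = Catalog()
catalog.add_track("popular-song", b"\x00" * 4_000_000)  # assumed 4 MB track
for n in range(1000):
    catalog.purchase(f"user{n}", "popular-song")
```

With 1000 buyers of one 4 MB track, `physical_bytes()` stays at 4 MB while `logical_bytes()` reaches 4 GB, which is why a quota on purchased music would not reflect real storage cost under this scheme.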




There is no cap on music purchased from Amazon MP3 (going forward). The cap is on uploaded files, since there is no way to link your uploaded copy of Bad Romance with my uploaded copy of Bad Romance and know that they are the same.


Surely there is in fact a way, at least much of the time. In general, I assume encoding a digitized song with a given codec and quality settings is deterministic, so the files have some chance of being identical. Next, if the track were identified using a common process (e.g. CDDB) it might easily be possible to identify them as being identical in origin and then perform some tests to see if they are in fact identical and convert one into a pointer to the other.


Hashing is the right (and only) way to do this.


My suggested approach assumes hashing as the baseline (identifying files that are actually identical, which Dropbox is apparently already doing) and goes beyond it (identifying files that are identical in source and intent, but differ owing to random trivial errors).
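
One narrow case of "same source, different bytes" can be sketched as follows: two MP3 files whose audio is identical but whose ID3 metadata differs. This is an illustration under assumptions, not anything the services discussed are known to do. The function names `strip_id3` and `audio_fingerprint` are hypothetical, and the sketch does not handle files whose audio bytes themselves differ, which would need some form of acoustic fingerprinting instead.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return the payload with a leading ID3v2 tag and trailing ID3v1 tag removed."""
    start, end = 0, len(data)
    # ID3v2: 10-byte header "ID3", version (2 bytes), flags (1), size (4, synchsafe).
    if len(data) >= 10 and data[:3] == b"ID3":
        size = (
            ((data[6] & 0x7F) << 21)
            | ((data[7] & 0x7F) << 14)
            | ((data[8] & 0x7F) << 7)
            | (data[9] & 0x7F)
        )
        start = 10 + size
        if data[5] & 0x10:      # footer-present flag adds another 10 bytes
            start += 10
        start = min(start, end)
    # ID3v1: fixed 128-byte block at the end of the file, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def audio_fingerprint(data: bytes) -> str:
    """Hash only the audio payload, ignoring ID3 metadata."""
    return hashlib.sha256(strip_id3(data)).hexdigest()


# Two synthetic "uploads" of the same audio with different tags.
audio = b"\xff\xfb" + bytes(range(256)) * 4
copy_a = b"ID3\x03\x00\x00" + b"\x00\x00\x00\x14" + b"A" * 20 + audio
copy_b = audio + b"TAG" + b"B" * 125

same_audio = audio_fingerprint(copy_a) == audio_fingerprint(copy_b)
same_file = hashlib.sha256(copy_a).digest() == hashlib.sha256(copy_b).digest()
```

Here `same_file` is false while `same_audio` is true, so a whole-file hash alone would miss the match that the tag-stripped hash finds.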


There is, if the MD5s match.
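
A rough sketch of what matching on MD5 could look like, assuming a simple in-memory store. `DedupStore` and its methods are made up for illustration and are not Dropbox's or Amazon's actual mechanism. MD5 is used only because it is the digest named above; it is not collision resistant, so a real system would more likely use something like SHA-256.

```python
import hashlib


def md5_of(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


class DedupStore:
    """Hypothetical content-addressed store keyed by MD5 digest."""

    def __init__(self):
        self.blobs = {}    # digest -> bytes, stored once
        self.owners = {}   # digest -> set of users pointing at that blob

    def upload(self, user, data):
        digest = md5_of(data)
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data   # first copy: actually store it
        # Every uploader, first or not, just gets a pointer to the blob.
        self.owners.setdefault(digest, set()).add(user)
        return digest, is_new


store = DedupStore()
song = b"identical bytes of the same encoded track"
first = store.upload("you", song)
second = store.upload("me", song)
other = store.upload("me", song + b" with one extra byte")
```

The second upload of the same bytes returns the same digest with `is_new` false, so only one copy is kept; the third upload differs by a few bytes and is stored separately.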


Of course; Dropbox uses the same mechanism.



