Things like this are super annoying to me, especially when Apple is engaging in it as well. Just like deduplication on cloud services, this "feature" is literally only beneficial for the company so they can reduce your file storage use and make your data less secure and have an excuse to learn more about your data, which they can then use for other purposes.
> this "feature" is literally only beneficial for the company
Is this actually true? I don't know. I don't think consumer file hosting services would be viable at their present price points if they were storing every single file byte-for-byte.
Certainly some of the value capture belongs to the company, but if you're not particularly concerned about security it seems like a win-win.
If you are concerned about security, encrypt your files and everyone else with deduplicated files will subsidize the cost of your service.
It's not that they were simply smart and didn't de-duplicate identical songs; at some point they simply replaced instances of various recordings of songs with a single one. I could not be further from being an audiophile, yet I still understand the anger this could cause. Even at a basic level, there can be relatively major differences.
I don't think this has anything to do with de-duplication.
It's likely referring to iTunes Match, which is a paid service that allowed you to upload your own MP3 files to Apple, who would then serve them to you alongside any purchased/streamed music.
I believe that by default (though it could be disabled) your songs would be "upgraded" to the master-quality equivalent in Apple Music if they were identical. But over time artists have been replacing those songs with different versions/mixes.
> Just like deduplication on cloud services, this "feature" is literally only beneficial for the company so they can reduce your file storage use and make your data less secure
Every distributed object store I've ever seen takes your data, splits it into fixed-size chunks, hashes each chunk, and then distributes copies across multiple nodes.
So deduplication is intrinsic to the architecture and not some business decision layered on top.
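To make that concrete, here is a minimal sketch of a content-addressed chunk store, assuming fixed-size chunks keyed by their SHA-256 digest. The names `ChunkStore` and `CHUNK_SIZE` are hypothetical and the chunk size is tiny for illustration; replication across nodes is left out, and this is not any particular provider's implementation.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny for the demo; real systems use KB to MB


class ChunkStore:
    """Toy content-addressed store: each chunk is keyed by its SHA-256 digest."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def put(self, data: bytes) -> list[str]:
        """Split data into fixed-size chunks and return the list of digests."""
        manifest = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk that is already present is not stored a second time.
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        return manifest

    def get(self, manifest: list[str]) -> bytes:
        """Reassemble the original bytes from a manifest of digests."""
        return b"".join(self.chunks[d] for d in manifest)


store = ChunkStore()
alice = store.put(b"abcdabcdxyz")  # chunks: abcd, abcd, xyz
bob = store.put(b"abcdabcdxyz")    # same bytes uploaded by someone else
```

Because the key is the hash of the content, the second upload adds nothing to `store.chunks`: deduplication falls out of the addressing scheme rather than being a separate step.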
I imagine they mostly all do it, with the exception of maybe Tresorit and possibly ProtonDrive, but I am not super familiar with them. It's actually a "feature" of convergent encryption, which Apple uses in lieu of true zero-knowledge e2ee like Tresorit's.
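For anyone unfamiliar, the general idea of convergent encryption is that the key is derived from the plaintext itself, so identical files encrypt to identical ciphertexts and the server can deduplicate them without holding the key. A rough sketch follows, with a loud caveat: the XOR keystream built from SHA-256 is a toy stand-in for a real cipher, the function names are hypothetical, and this is not Apple's or anyone else's actual scheme.

```python
import hashlib


def derive_key(plaintext: bytes) -> bytes:
    # Convergent step: the key is a hash of the content itself.
    return hashlib.sha256(plaintext).digest()


def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 in counter mode). Illustration only, not secure.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Return (key, ciphertext). The user keeps the key; the server gets the ciphertext."""
    key = derive_key(plaintext)
    stream = _keystream(key, len(plaintext))
    return key, bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    stream = _keystream(key, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


key_a, ct_a = encrypt(b"same song bytes")
key_b, ct_b = encrypt(b"same song bytes")   # a different user, same file
key_c, ct_c = encrypt(b"other song bytes")
```

Here `ct_a` and `ct_b` come out identical, so a server can store one copy. The flip side is the privacy cost: anyone who already has a given file can compute its ciphertext and check whether you have stored it too.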