Post author here! I wanted to make it clear that the original technique for breaking Spotify DRM is not mine - it was developed by Wang et al. in their excellent paper, Steal This Movie: Automatically Bypassing DRM Protection in Streaming Media Services. I just thought it would be a nice showcase of PANDA's capabilities. In particular, we can avoid the crazy optimizations in that paper because we can operate on a replayed execution rather than doing it live.
On the subject of cracking streaming music DRMs, a realization I have been sitting on for a while about what people can do with it... Considering the common wisdom that:
- storage will get exponentially cheaper
- data transfer speeds will get higher
It makes me think that eventually there will be illicit torrents of all the world's music, plus the index, plus the metadata, plus the interface/app for browsing it. In other words, people would not only pirate individual songs or movies, they would download their own complete copies of Spotify / Netflix. It isn't feasible now but it could be sometime in the next 5-15 years, depending on bandwidth speeds.
I'm not sure how many people see this coming or take it seriously but I wonder what the effect could be and what the remedy attempts would be.
I very much agree. I hope that capitalism, ironically enough for DRM, is probably always the solution to overstepping DRM remedies, regardless of the issue of piracy, which I am not advocating.
For instance there will always be a market for custom computing and "PCs", and so non-locked PCs will (hopefully) always exist in a capitalist environment. That market I think ultimately circumvents any attempt at ubiquitous control of hardware. The same thing is at play with software. And hypothetical new methods of connectivity may be able to circumvent many attempts at central control of the net.
> For instance there will always be a market for custom computing and "PCs", and so non-locked PCs will (hopefully) always exist in a capitalist environment.
Unfortunately, that market is slowly becoming the minority, and because those "more free" devices may have limitations that make them incompatible with a lot of proprietary content (which is the majority) and circumventing those limitations could be illegal and difficult, there will be fewer users of them.
Given the move away from the carrier-subsidized model for mobile devices, I am optimistic about the technical future of general purpose mobile computing. What I am not optimistic about is the ability to seamlessly access content across devices from data that is privately stored in the cloud, mainly because not enough people would be willing to pay for/fund the development of such a service/piece of software to make it worth developing.
> It makes me think that eventually there will be illicit torrents of all the world's music, plus the index, plus the metadata, and plus the interface/app for browsing it.
It seems to me that the a-grade private torrent trackers are exactly that. Larger catalogue, more comprehensive metadata, and better interface than any commercial service.
> In other words, people would not only pirate individual songs or movies, they would download their own complete copies of Spotify / Netflix. It isn't feasible now but it could be sometime in the next 5-15 years, depending on bandwidth speeds.
If you have the infrastructure, and are really dedicated, you can definitely do it today.
I don't think you see what I'm saying. Imagine a single torrent called "music", 15 terabytes large. Opening the application inside brings up a local copy of Spotify.
> It seems to me that the a-grade private torrent trackers are exactly that. Larger catalogue, more comprehensive metadata, and better interface than any commercial service.
Regardless of metadata quality and collection size, which I think are debatable (although I haven't seen one), they are a la carte, so very different.
> If you have the infrastructure, and are really dedicated, you can definitely do it today.
No infrastructure is required, just multiple terabytes of storage (which most households do not have... yet), and bandwidth that makes downloading terabytes an attainable feat. Needless to say, the latter is not viable today either. But eventually both of those things will become commonplace.
15 is a high number too. The set of music one would possibly be interested in hearing (out of the set of all music which has already been produced) may be closer to 3-5 terabytes. Movies, of course, are a different, larger set of numbers.
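To put rough numbers on the bandwidth side of this, here is a minimal sketch of the arithmetic. The 5 TB and 15 TB sizes come from the comments above; the link speeds and the function name `download_days` are assumptions picked purely for illustration, and the calculation ignores protocol overhead and swarm availability.

```python
def download_days(size_tb: float, link_mbps: float) -> float:
    """Days needed to move size_tb (decimal terabytes) over a link_mbps (megabit/s) link."""
    size_bits = size_tb * 1e12 * 8             # terabytes -> bits
    seconds = size_bits / (link_mbps * 1e6)    # megabit/s -> bit/s
    return seconds / 86_400                    # seconds -> days


if __name__ == "__main__":
    # Link speeds below are illustrative assumptions, not measurements.
    for size_tb in (5, 15):
        for link_mbps in (20, 100, 1000):
            days = download_days(size_tb, link_mbps)
            print(f"{size_tb} TB at {link_mbps} Mbit/s: {days:.1f} days")
```

At a sustained 100 Mbit/s, 15 TB works out to roughly two weeks of continuous downloading, which is why the timescale depends so heavily on typical household bandwidth.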
It's always fascinating to me how far the tools of the trade have come on since my heyday. They surpassed my own custom tools of yore a little while back, and things like this were things I could obviously never do at the time - nowhere near enough storage, for a start, but I also didn't know about the compressed/encrypted distinction (that would have saved me a lot of time!).
That you can do it in such an automated fashion now, where I just paged through the disassembly and hexdumps until I saw something that leaped out at me, is stunning. I'd coded a few helper routines, and my debugger was completely stealth and I could rewind a couple thousand instructions (the secret of my success! <g>), but never anything like that back then.
Not the OP here, but briefly, the technique uses a little statistical cleverness and the ability to examine a running program to help identify which part of the Spotify program produces the unencrypted bytes of the track it is playing. It then taps into that part of the program, and copies these bytes to a file. That file then contains playable audio. (This data must be unencrypted in order to meaningfully feed the digital-to-analog converter of the playback computer.)
- Records (with PANDA) every instruction run and every piece of data read or written by that virtual machine for half a minute while it's playing audio. (!)
- Analyses that recording and uses some very clever statistics to identify functions that read chunks of data that look encrypted and write chunks of data that look compressed (yes, you can tell the difference, compression is imperfect).
- Out pops one likely candidate, which sure enough is the decrypter.
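As a rough illustration of what "looks encrypted" versus "looks compressed" can mean, here is a minimal sketch using two common randomness statistics: byte entropy and a chi-squared test against a uniform byte distribution. This is an assumption about the kind of statistic involved, not a reproduction of the article's script. The function names are made up for this example, pseudo-random bytes stand in for ciphertext, and `zlib` stands in for an audio codec.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 means perfectly uniform)."""
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def chi_squared(data: bytes) -> float:
    """Pearson chi-squared statistic against a uniform distribution over 256 byte values."""
    counts = Counter(data)
    expected = len(data) / 256
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


if __name__ == "__main__":
    rng = random.Random(0)
    words = [b"spotify", b"panda", b"replay", b"audio", b"stream", b"codec", b"trace"]
    text = b" ".join(rng.choice(words) for _ in range(400_000))
    compressed = zlib.compress(text, 9)
    # Pseudo-random bytes of the same length stand in for encrypted data.
    random_like = bytes(rng.getrandbits(8) for _ in range(len(compressed)))
    for name, blob in (("random", random_like), ("zlib output", compressed)):
        print(f"{name}: entropy={byte_entropy(blob):.4f} bits/byte, chi2={chi_squared(blob):.1f}")
```

Both buffers typically show entropy close to 8 bits per byte, so entropy alone tends not to separate them well. The chi-squared statistic is the more sensitive of the two: for uniform data it stays near 255 (the degrees of freedom), while compressed output usually drifts well away from that.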
Small correction: the magic of record-replay is that you don't have to record every piece of data read/written, only the non-deterministic events (interrupts, accesses to devices, and a couple instructions like `rdtsc`). This lets the overhead during recording stay low, while still getting all the benefits of a full execution trace. This is why we're able to store the Spotify trace [1] in only 263.1 MB.
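The idea can be shown with a toy model. This is only a sketch of the principle, not PANDA's implementation or API: `run`, `record`, and `replay` are hypothetical names, and the "machine" is just a deterministic counter that occasionally mixes in an external event.

```python
import random


def run(steps, next_event, log=None):
    """Toy deterministic machine; next_event() supplies the only non-deterministic input."""
    state, trace = 0, []
    for step in range(steps):
        if step % 1000 == 0:                # rare non-deterministic event, e.g. an interrupt
            event = next_event()
            if log is not None:
                log.append(event)           # recording stores only these events
            state ^= event
        # Deterministic work: fully reproducible from the state alone.
        state = (state * 6364136223846793005 + 1442695040888963407) % 2**64
        trace.append(state)
    return trace


def record(steps, seed=None):
    """Run live, logging only the non-deterministic events."""
    rng = random.Random(seed)
    log = []
    trace = run(steps, lambda: rng.getrandbits(32), log)
    return log, trace


def replay(steps, log):
    """Re-run the machine, feeding back the logged events instead of live input."""
    events = iter(log)
    return run(steps, lambda: next(events))


if __name__ == "__main__":
    log, live_trace = record(10_000)
    assert replay(10_000, log) == live_trace
    print(f"logged {len(log)} events to reproduce {len(live_trace)} states")
```

The log holds 10 values, yet replaying it regenerates all 10,000 intermediate states, which is the same trade that keeps a real recording small while still allowing full-trace analysis afterwards.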
Much of the technology here was invented by Brendan and others at MIT Lincoln Laboratory, which is where I work. We have been very lucky to have Brendan join us for a few summers while he was completing his PhD at Georgia Tech and he gave a great showing at RECON. Brilliant guy. If you're interested in reverse engineering, his most recent papers are essential reading: http://www.cc.gatech.edu/grads/b/brendan.
In addition to some of the automated RE work, we've also got multi-million dollar research efforts hacking the Linux kernel and reverse engineering/analyzing embedded systems. Lots of fun stuff. You get to work on really exciting problems and you'll have the funding and the skilled coworkers you need to execute successfully.
If you find this type of stuff exciting, you should drop me a line at sally@ll.mit.edu. We're always hiring^. We've got great benefits too, like a pension, unlimited sick leave, 13 holidays, 20 vacation days, and free classes at MIT.
^One caveat is that because of how we are funded, we are only able to employ US citizens.
> One caveat is that because of how we are funded, we are only able to employ US citizens.
Just wanted to say thank you: just being aware of limitations like this and letting people know up front makes it a whole lot less annoying. Also, the circumstances make it understandable.
MIT Lincoln Labs is funded by the Department of Defense. They do a lot of classified work, and may have large areas that are off-limits to non-US citizens. I had a summer internship at a similar institution, where my work wasn't classified, but was sensitive enough and close enough to classified work that only US citizens were allowed past the keypad lock to my desk area.
Since MIT Lincoln Laboratory's establishment, the scope of the problems has broadened from the initial emphasis on air defense to include programs in space surveillance, missile defense, surface surveillance and object identification, communications, homeland protection, high-performance computing, air traffic control, and intelligence, surveillance, and reconnaissance (ISR).
Lincoln Laboratory conducts research and development pertinent to national security on behalf of the military services, the Office of the Secretary of Defense, and other government agencies. Projects focus on the development and prototyping of new technologies and capabilities. Program activities extend from fundamental investigations, through simulation and analysis, to design and field testing of prototype systems. Emphasis is placed on transitioning technology to industry.
Essentially, the Lincoln Lab is one of several labs that are part of a strategy to make best use of creative people (who can think up things, but don't necessarily want to productize/weaponize them), industry (who can productize/weaponize things, but don't want to be part of a war effort) and military.
To see why they would not hire non-US citizens, consider the first author of the "Steal this Movie" paper, who now works at a Chinese university (presumably on similar problems to the ones that people in LL and at UCSB are tackling).
Interesting. The DoD funds research which enables them to hack into other systems, which means devices and our communications, and in return we get the tools to hack DRM and everything else.
I think maybe you misunderstood the statement. Not funded as in "my startup just got funded." More like, "we are funded by US government grants that preclude us from hiring aliens." Just my guess.
That's lovely - but that sounds kinda like what those who work with the NSA TAO say when they want to recruit independent contractors to find 0days, and we now know that many of those are not fully aware of what happened to their work.
Could you comment on whether your participants are in any way, directly or indirectly, working to find vulnerabilities for the NSA to exploit, to your knowledge?
Although if you wanted realtime audio decryption, I'd imagine just writing a sound driver or hooking the OS's sound functions would be far more direct.
That would miss the point. The approach in the article does not hook the audio driver, but rather the decryptor. This gets you the original compressed audio file, not a stream that has been uncompressed (and would have artifacts when re-compressed).
That's fair. Although a lot of BluRay rips are re-encoded and seem quite usable. Even the YiFY stuff at 1GB/hour or so.
Edit: Why can't a compressor perfectly re-compress the decompressed audio? It's obviously possible since the compressed data exists producing that specific decompressed data.
> Why can't a compressor perfectly re-compress the decompressed audio?
> It's obviously possible since the compressed data exists producing that specific decompressed data.
It's not a given that a compressor c1 which, given A, produces Az (which decompresses with d1 to A') can easily find any Ax that compresses to Az, or equivalently can easily find Az given A'.
Formulated like that it doesn't seem quite so obvious: finding Az from A' amounts[1] to finding A from Az, i.e. lossless compression.
> Although a lot of BluRay rips are re-encoded and seem quite usable. Even the YiFY stuff at 1GB/hour or so.
Usable at any given viewing/listening set-up != actually remotely "good enough". I always say people shouldn't buy more expensive hi-fi gear than what they can actually tell apart -- the one problem with that (apart from people not being honest with themselves, opting for the more expensive stuff anyway) is that when you're used to listening to crappy audio, you stop being able to tell the difference.
It's like listening to an FM radio that's slightly off station -- after a few hours, you probably don't notice anything wrong, until a new person walks into the room and adjusts it to be better.
Another point: while BluRay certainly isn't lossless, when you're talking about the kind of compression/quality differences you mention (not sure what regular BluRay films are, but if they max out at 48 Mbit/s for AV, that's by my calculations about 20 GB/h, so roughly 1:20), I think you'd be hard pressed to notice any "additional" artefacts. It would be like comparing a raw/FLAC audio file compressed first to 320 kbps VBR MP3 and then compressed down to 16 kbps MP3, versus just compressing straight to 16 kbps MP3 (well, the order of magnitude is correct; obviously this is going to be mostly cutting into the video data, but still). Just something to keep in mind.
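The bitrate arithmetic is easy to check. A minimal sketch, assuming decimal units (1 GB = 10^9 bytes) and a constant bitrate; the helper names are made up for this example.

```python
def gigabytes_per_hour(mbit_per_s: float) -> float:
    """Data volume in decimal gigabytes produced by one hour at a constant bitrate."""
    bytes_per_second = mbit_per_s * 1e6 / 8
    return bytes_per_second * 3600 / 1e9


def mbit_per_s_for(gb_per_hour: float) -> float:
    """Constant bitrate (Mbit/s) that yields the given number of gigabytes per hour."""
    return gb_per_hour * 1e9 * 8 / 3600 / 1e6


if __name__ == "__main__":
    disc = gigabytes_per_hour(48)     # the 48 Mbit/s ceiling assumed in the comment
    rip = mbit_per_s_for(1)           # the "1GB/hour" rips mentioned earlier
    print(f"48 Mbit/s is {disc:.1f} GB/h; 1 GB/h is {rip:.2f} Mbit/s; ratio about 1:{disc:.0f}")
```

So 48 Mbit/s comes to 21.6 GB per hour, and a 1 GB/hour rip averages a little over 2 Mbit/s, which is consistent with the "about 20 GB/h" and "1:20" figures.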
With vanilla JPEG, you should be able to redo the DCT and find the quantized values exactly as they were, which means that you could losslessly reverse the JPEG compression: not in the sense of recovering the original image, but in the sense that you get a compressed version that decompresses to the same lossy reconstruction.
With deblocking filters in MPEG2 and later, this is not necessarily the case, because you try to smooth things over in decompression and so can't necessarily reconstruct the compressed version.
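The JPEG point rests on quantization being idempotent when the same step size is reused. Here is a minimal sketch of just that step, with made-up coefficient values and an assumed step size of 8. It leaves out the DCT itself and the rounding and clamping to 8-bit pixels, which can break exact recovery in a real decoder.

```python
def quantize(coeffs, step):
    """Map each coefficient to the index of its nearest quantization level."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct the (lossy) coefficient values from their level indices."""
    return [q * step for q in levels]


if __name__ == "__main__":
    original = [103.7, -41.2, 12.9, 3.4, -0.6]   # illustrative transform coefficients
    step = 8                                     # assumed quantizer step size

    levels = quantize(original, step)            # what the compressed file stores
    decoded = dequantize(levels, step)           # what the decoder reconstructs

    # Re-encoding the decoded values with the same step recovers the stored levels.
    assert quantize(decoded, step) == levels
    print(levels, decoded)
```

The decoded values differ from the originals, but feeding them back through the same quantizer gives back the same stored levels. Anything the decoder does after dequantization (such as a deblocking filter) moves the values off that grid, which is where the guarantee stops holding.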
Because the media uses lossy compression, the uncompressed version of the compressed file still contains artifacts of compression. If you then re-compress the file, you add even more artifacts.
A good example of this is the guy who re-uploaded the same video to YouTube many times.
The GP's question isn't whether current re-encoders are capable of recovering the original compressed data, but rather whether it's theoretically possible to write a compressor that, given the parameters to the psychoacoustic model that's introducing most of the artifacts, is able to produce a compressed file that still has artifacts, but no new artifacts.
Yes, it sounds theoretically possible, but it may involve searching a huge search space and may be computationally infeasible.
If I'm being fair, audio encoders are more advanced than video encoders, and we can probably shave more off with psychoacoustics than we can with our current understanding of psychovisuals. On the other end of the scale, video encoders have to process orders of magnitude more data, in more dimensions!
For example, Opus's CELT encoder uses lapped transforms and keeps about the same level of constrained-energy. Combined with the (hybrid) voice codec in some very advanced ways, it makes for the most advanced audio codec around by quite some margin.
You look at video, and nothing's that good yet, not even HEVC. The only thing that leaps to mind is Xiphophorus's Daala project - https://www.xiph.org/daala/ - which is hoping to do for video what Opus did for audio and develop a royalty-free, awesome video codec (rather than being donated a royalty-free, okay two or three) - and that's I'd say one or two generations ahead of HEVC, but of course, still very very early work.
That's true, unless the original audio is 24-bit and your audio driver is sending 16-bit audio to your sound card, or similar sorts of loss are happening post-decompression.
Many devices have hardware acceleration for mp3 decoding, which also helps with battery life. I'm not sure how many devices have hardware accelerators that can be used/repurposed for Vorbis decoding in the Spotify case. I believe it's less common for hardware decoders to be useful for FLAC decoding, so there may be a battery life penalty for re-encoding mp3 or Vorbis audio as FLAC. (I also presume everyone realizes re-encoding lossily-compressed audio using a lossless codec likely pays a size penalty without any quality increase.)
I saw the headline and thought the same. I'm a little disappointed now :/ On the bright side, maybe someone will use PANDAS for defeating DRM, and blog about that... :)
The script [1] that computes the statistics on the in-memory data actually does use numpy, and perhaps could benefit from using pandas. Sadly I haven't had time to look into it though!
Uh what? Wow, I thought for a minute it said you took a 30-second recording and I was wondering what you were going to do to get the file format right. Use a raw encoding to get close to audio levels? Then I read on, noticed something and had to read back. Indeed, you didn't record the audio, you just recorded and replayed the operating system. Say what!
Yeah, it's next-level stuff, here. He said around 12 billion instructions which sounds like a lot, but with our current processors, not that much work for the CPUs.
More specifically, a billion instructions is one giga-instruction. A 2 GHz processor can execute roughly 2 billion instructions per second (this is a very rough estimate, thanks to superscalar, pipelining, uops, yada yada). So 12 billion (typical) instructions will take around 6 seconds to run.
This sort of thing is handy when profiling: see a function taking a billion instructions? Half a second of CPU time. And this ratio hasn't changed all that much in quite a while (what has changed, of course, is how many threads can execute simultaneously).
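The rule of thumb above is a single division. A minimal sketch: the 2 GHz clock comes from the comment, while the instructions-per-cycle factor `ipc` is an assumption added here so the estimate can be adjusted, and the function name is made up.

```python
def cpu_seconds(instructions: float, clock_hz: float = 2e9, ipc: float = 1.0) -> float:
    """Rough CPU time for an instruction count; ipc is an assumed average, not a measurement."""
    return instructions / (clock_hz * ipc)


if __name__ == "__main__":
    print(f"12 billion instructions: about {cpu_seconds(12e9):.1f} s")
    print(f" 1 billion instructions: about {cpu_seconds(1e9):.1f} s")
```

At one instruction per cycle this gives the 6 seconds and half a second quoted above; a superscalar core sustaining a higher `ipc` would finish proportionally sooner.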
I don't know, perhaps I'm totally wrong, but this just depresses me. I can see the point from an engineering point-of-view, this is a riddle to be solved, but why would you want to help people circumvent one of the (to me) reasonable ways of enjoying and paying for music? I know the author stopped short of giving the full solution to getting it to work, but still.
Is this what we've come to? No one should get paid for anything if we can enjoy it for free regardless of the hoops we will have to jump through to not pay?
Sorry if I am too dramatic. I can often see the point of pirating things, but in this case I just don't get it.
Edit: I would appreciate an explanation of the downvotes.
There are far, far easier ways of getting pirated music if that's your goal.
I strongly doubt this will significantly affect the level of piracy in music; instead it's a really interesting application of a pair of interesting things, PANDA and the difference in 'randomness' between encrypted and unencrypted streams.
I wonder if we'll start to see more crypto methods that deliberately avoid looking like encrypted data to make this sort of attack harder. ISTR agl's pond did something like this, but I could be imagining it, and I've not seen it widely implemented.
> but why would you want to help people circumvent one of the (to me) reasonable ways of enjoying and paying for music?
Because DRM sucks. Why can I only listen to Spotify through their player? Why am I restricted to using it only on computers/devices that they've ported their player to?
It's not exactly the same, but back when iTunes sold DRM encrusted music files, I used to strip all the DRM from the files I legally purchased, so that I could play them on my Linux system, stream them to my Philips wifi speaker system, etc.
Being locked in to a single vendor's ecosystem sucks. This is part of the reason why I currently refuse to purchase video from iTunes or Amazon. When the industry wises up and drops DRM from video, I will happily start patronizing them.
Spotify is a bit different: it's more like Netflix than iTunes or Amazon. Still, if I'm paying for it, it's frustrating to only be able to use their players [1].
[1] My TiVo has a built-in Netflix player, but it really sucks. Much of the newer content doesn't actually decode properly. Being locked into their system means I'm at their mercy. If I could get at the stream I could use a computer somewhere to transcode it into something the TiVo could reliably play... But the DRM prohibits me from doing that, and that's frustrating.
Frustrating to use their players? Not for 99% of their customers who don't even know what you're talking about. Don't like the restrictions? Don't buy it. Isn't America great?
I understand where you're coming from (also, piracy repulses me).
However: it's worth knowing whether content protection is implemented soundly. Contrary to overwhelming popular opinion, there are content protection schemes that work. Generalist engineers mistakenly believe that content protection schemes must be unbreakable to provide value. They don't: all they have to do is cost more to break than the value of the content they protect (across all the users who might subsequently get access to it).
For an example of a content protection scheme that worked extremely well, see modern satellite TV smart cards.
Given that there are ways to implement content protection soundly, there's validity to research that determines whether a given content protection scheme is sound.
I generally agree, but I think you're conflating two kinds of content protection.
One prevents the attacker from accessing something they aren't subscribed to, and relies on crypto and secure subscriber identity mechanisms. This is completely possible to implement soundly.
The other prevents the attacker from copying something they can see or listen to, and relies on bizarre mechanisms designed to prevent the user from learning the state of their hardware.
I find the latter awful, because it's an infinitely losing battle (you can always point a camera at your display in the end) which erodes consumer freedoms and encourages walled gardens.
Satellite TV is a funny example - since the communication is strictly one-way, the hardware state needs to be protected or it can be cloned, but I still think it's fundamentally a question of the transport protection variety rather than the copy protection one.
Indeed, the goal of DRM is to transfer the liability to someone else. If you make a product that plays back DRM'd content, you don't want to get sued when someone figures out how to use your system to pirate stuff. So you ask the copyright holder what to do, and they say "buy solution XXX that's already approved", and then you do. Then you're happy, because you don't get sued, the chip manufacturer is happy, because you have to buy their chips (and whatever other fun add-ons you may need for your product) from them, and the copyright holder is happy, because the chip company's lawyers convinced their lawyers that nobody can pirate stuff.
Meanwhile, it all works because nobody cares to break your DRM because audio is already widely distributed in lossless DRM-free formats (CDs), and Blu-Ray DRM is already broken. The pirates don't want Netflix's 5Mbps stream when they can just buy a Blu-Ray and get a 50Mbps copy instead. (Similarly, they don't want Spotify's Vorbis stream because they can just source material from the uncompressed CD.)
First, I'm a bit starstruck by getting a reply from tptacek, I feel validated that I wasn't totally off base. Silly, I know. Thank you for your reply and you are of course right, I appreciate you also putting this into context (tv smart cards). As I said, I was probably overly dramatic.
My point was that normally we explain piracy by "it is an easier way of enjoying ..." and I can agree with that, but few things are easier to use (and pay for) than Spotify. I can see your point about researching their content protection.
Edit: Wow, learned my lesson, don't express admiration. I guess I just don't get the downvoting without leaving a comment.
Having strong content protection on a system, and securing that system for its owner are not reconcilable goals. DRM and "trusted computing" is Big Brother Inside.
Are there protection schemes that "work" on properly user-controlled hardware? DirecTV and Xbox and the like can get away with this since they can ship a big block of hardware. It requires some level of work (not click and play) to crack such a device, even if you know how to do it.
Whereas any software-based mechanism, without buy-in from the OS/OEM, is essentially going to always be one click away from being cracked, as far as end-users view it.
> I can see the point from an engineering point-of-view, this is a riddle to be solved, but why would you want to help people circumvent one of the (to me) reasonable ways of enjoying and paying for music?
I have a PPC computer running Linux. It is impossible for me to use Spotify on this computer. I am happy to pay for the service. Is it irresponsible for me to reverse-engineer the protocol so that I may use a service I have paid for on a device they do not support?
Reverse-engineering the protocol doesn't necessarily mean I want to pirate the content.
From my point of view, if you are paying for Spotify, feel free to use any way you can to actually use the service. If you are not paying for it, don't use it. I did not understand the article as a way of enjoying the service on unsupported OSes, but I can agree that is a good thing as long as people pay for the service.
I just checked the web player, and it requires Adobe Flash (gnash is not enough). The fun thing is, even with DRM coming to browsers natively, there probably won't be a module for Linux/PPC. DRM must be closed-source for obscurity, and the market for Linux/PPC is too small.
Well - I've got some stuff that I bought from Audible. But I just rent it. Worse, I cancelled my account because of this crappy DRM in the past (got a new one now) and that seems to mean that I will never again be able to listen to the books I purchased.
Right now I'm listening to an audiobook. I would LOVE to listen to that on my Linux machine. How..?
I would DEFINITELY prefer not to lose that access (again?), when I decide that the subscription for crappy DRM'd content isn't worth my money.
It's a long time since I cancelled my Audible account.
There was a lot of stuff that annoyed me about Audible, the fact that you couldn't roll over credits to the next month and the difficulty of cancelling being two examples.
However in fairness to them I can still log in and download from my library there.
Also the one human I interacted with there was very helpful in getting me cancelled.
For what it's worth, they do allow credit roll-over now (up to some limit).
Still, I'm annoyed at the DRM, I can't listen to what I want where I want, and there are DRM-free alternatives, so I'm also seriously considering cancelling my account. Glad to hear they still let you access your previous purchases (it was my main worry).
It can be seen as a form of protest against the mere idea of the practice of DRM. I.e. it doesn't mean authors advocate piracy, but they make a stand against unethical overreaching preemptive policing and its undemocratic off-shoots (DMCA-1201).
This is just some really neat reverse-engineering research using Spotify as an example. He doesn't want to harm Spotify.
Also, DRM is not necessarily intrinsically linked with payment. For example, people release DRM-free games on the Humble Bundles, and they make millions for charity. There are some business models that don't work well without it, yes (Spotify's particular model of streaming, for example), but plenty that don't need any DRM at all (digital radio stations, for example).
> There are some business models that don't work well without it, yes (Spotify's particular model of streaming, for example)
I would even argue that Spotify only needs to include DRM to pacify the labels it has deals with. Because you can get basically any specific mainstream song via Youtube and if you want to have huge collections of some music genre there are torrents for that, too. The main selling point of Spotify is its implicit promise of rewarding the authors and the comfort of music selection it offers.
Digital Restrictions Management turns a computer against its users. It uses proprietary software to attempt to prevent users from sharing. It's unethical and needs to be circumvented wherever it is found.
There are semi-legitimate reasons to break Spotify's DRM. One artist I follow was about to release an album, but their record label cancelled it at the last second. Luckily, it still got uploaded to Spotify, so I was able to download a copy of it.
I'm curious, was any motivation given for the album's cancellation? It seems strange for a company to abandon a product that's basically done (though I've seen it happen for various reasons).
The band wanted to release both the album through the record label they were signed to and an acoustic version by the singer through their own record label. Somewhere along the line, the major label started getting really pissy at the band and both (A) wouldn't release the album at all unless the acoustic version was cancelled and (B) refused to send the band preorder copies.
Spotify is not the one reasonable way of paying for music.
Purchasing drm-free downloads, physical CDs, cassettes, and vinyl are the reasonable ways.
It's a damned convenient service though.
My iTunes library is huge. I still end up listening to the same albums with my (paid) Spotify subscription for the pure convenience alone.
Edit: I've just remembered the boxes upon boxes of CDs I have in storage in the UK. That's five years, and I doubt I've given them a second thought for at least the last four.
I recently loaded my CDs into the bin after ripping them all - though even the ripped cloud library is rarely touched. The removal of hundreds of CDs was fantastic. There were a couple of live performance CDs I have which were recorded at concerts I attended, I kept them. The mass of physical media is now gone.
If you want to pay for music / support artists you love, then seeing them perform (and paying for it) is the best way of getting cash into their pockets and not the record company's.
I know very little about encryption aside from general principles on what it is, why we have it, and what its goals for existence are. That being said, piracy, love it or hate it, is here to stay for the time being.
I see the "problem" as a simple one. If there is a chance that a system's DRM can be broken by merely one person, all efforts by that provider, and now most other providers, at least those that share methodologies, are pointless.
If a song, book, or movie is locked down, only legitimate subscribers can use that media under the terms the owner or distributor of that media defines. But if only one person of the potentially millions is able to break that encryption, one person has now made all the work and man hours put into said encryption completely worthless.
If it takes a few weeks to break encryption that took months of man hours to create, was that time wisely spent? Once it's broken, the data is now open to everyone. That being the case, I feel we're in a "why bother" scenario.
The main reason I see illegally pirated material not proliferating even more is the technical barrier to acquiring the files. Oftentimes, applications and protocols like BitTorrent, Tor, VPNs, SSL links, etc. have too high a technical barrier for the common user to start pirating their media.
Once that burden is removed, it's game over and everything will be free at a click of the mouse, until something new comes along or a better business model that works with the fast changing technology world we are in.
When you are waging a war of one against millions and the one can actually win without putting themselves in harm's way, you have a war that many will think is a surefire win for the millions. A million against one is a pretty good ratio. But in this case, the one has a very good chance of annihilating the millions.
Eventually we will learn there is no point. In the meantime, people do this type of reverse engineering for a number of reasons I imagine. Curiosity, the challenge, making a political statement, and most importantly, to learn.
From this, perhaps something new is learned. And from that a new library is made which then gets adapted to detect intrusions on our personal computers and embedded devices. Who knows what may come of this research.
In the end, my position on piracy doesn't matter. What I do know is it will happen, encryption will be reverse engineered or broken, and those who can't learn the technology to use the new research will continue to find the barrier to piracy too high. Others won't, and will get their media freely, sans any lockdown from what you want to do with what you purchased.
Cars can go well over the speed limit. Some to speeds that make no sense to even exist. But they are legal, and most people follow the speed limit laws. The cost and barriers of breaking those laws are too high so people follow the rules. With DRM, the cost is negligible, so the rules will be broken.
But mainly, I think the answer to your question is that curiosity drives a lot of this. Aside from curiosity it could be a lack of trust. Researchers want to know what is going on with their media and hardware.
How do you know an encrypted file is not up to nefarious deeds? The common user never will. But a researcher like this would discover that the file was up to more than just DRM, and that could potentially help those millions of people become more aware of security, and learn to be more skeptical of what they buy.
I am wondering if someone who understands statistics better than I do could explain conceptually how encrypted data is distinguishable from compressed data. I always assumed that Shannon's paper says that perfectly compressed data should be indistinguishable from random data (which is indistinguishable from encrypted data). Is mp3 compression not sufficiently perfect? Is my understanding wrong?
Media compression usually isn't perfect. For starters, data is framed with synchronization bits to recover from bad transmissions and the like.
Also, audio codecs usually use prepared Huffman codebooks for the final step.
For most codecs they're defined as part of the format; a few (such as Vorbis) prepend the codebooks to the data stream so compressors can use different codebooks if they choose to.
For Vorbis there used to be an experimental tool that optimized the Huffman coding of an existing stream, giving another 4% or so of compression with otherwise identical data (i.e. no quality change).
That 4% plus the framing is probably the entropy differential they're exploiting here.
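The framing half of that differential can be sketched with a small experiment. This is only an illustration under stated assumptions, not the method from the article: the frame layout (a 2-byte sync word before every 64 bytes of payload) and all function names are made up for the example, and seeded pseudo-random bytes stand in for encrypted data. It does not model the Huffman codebook inefficiency at all.

```python
import math
import random
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (max 8.0)."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def chi_square(data: bytes) -> float:
    """Chi-square statistic of the byte histogram against a uniform one."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def framed_stream(payload: bytes, frame_size: int = 64,
                  sync: bytes = b"\xff\xfb") -> bytes:
    """Put a fixed sync word before every frame (hypothetical framing)."""
    frames = (payload[i:i + frame_size]
              for i in range(0, len(payload), frame_size))
    return b"".join(sync + frame for frame in frames)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
# Stand-in for encrypted data: uniformly distributed bytes.
random_like = bytes(rng.getrandbits(8) for _ in range(1 << 16))
# Stand-in for a framed compressed stream: same bytes plus sync words.
framed = framed_stream(random_like)

print("entropy    random:", byte_entropy(random_like), "framed:", byte_entropy(framed))
print("chi-square random:", chi_square(random_like), "framed:", chi_square(framed))
```

Even with an otherwise random payload, the repeated sync bytes should pull the entropy slightly below that of the unframed data and push the chi-square statistic far above it, which is the kind of signal a statistical test can pick up.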
Isn't music available through DRM-free sources most of the time anyway? Just don't use Spotify if you are against DRM. It will also be a vote with your wallet against it. By using it you implicitly support DRM proliferation.
My comment wasn't really aimed at the authors of this tool. Their tool is a form of public protest and is appropriate. It was more directed at those who actually use Spotify. For them, complaining while actually using these kinds of services (and helping them spread more DRM in the process) is strange.
Smokers complain, and now we have e-cigarettes which may be less harmful while still allowing the enjoyment of nicotine and the fun of smoking.
I'm not sure why it's insincere. I keep buying ThinkPads, but I hate the new designs with a passion. (And they hate me; they literally cause me RSI where the earlier ones didn't.) It doesn't make my arguments against the new ThinkPads any weaker.
> and now we have e-cigarettes which may be less harmful while still allowing the enjoyment of nicotine and the fun of smoking.
So, we can have digital goods sold without DRM, so we can enjoy them without police state methods attached.
> I'm not sure why it's insincere.
Because by buying from those who push DRM on them, users support and prolong the use of that DRM. Complaints won't persuade DRM Lysenkoists. Loss of profits can.
Except with Spotify you are not buying music. You are paying a subscription to their streaming services. DRM is the only way they can protect the artists. On top of that, if you are paying 10 dollars a month and ripping the DRM off their streams you are stealing, and completely missing the value proposition of streaming music.
Confusing Spotify with Amazon or Apple DRM would only display a lack of understanding on the part of the consumer. The truth is that with streaming services you are NOT buying music!!!
> Except with Spotify you are not buying music. You are paying a subscription to their streaming services. DRM is the only way they can protect the artists.
1. Renting does not make sense for digital goods (I already explained in the past why).
2. DRM doesn't protect anything, i.e. it can't enforce the renting paradigm. This very article demonstrates that DRM is broken and can't stop piracy. Therefore there is no need to use it, ever, even if we assume that it's an ethical practice (it is not). All DRM does is punish paying customers by crippling the usability of the product (limiting supported devices / platforms / players / formats and so on), while having zero effect on pirates, who pirate the same stuff DRM-free.
Did you even read the article? It is a lesson on how to use PANDA to examine the memory of running processes. The Spotify / music DRM is just an example to demonstrate its power.
Pretty sure that is the only reason this hasn't been fleshed out into a fuller anti-DRM system. It is easier to get it DRM-free from other sources, and they get it sooner.
Would it have been possible to figure this out without the statistics by playing something very specific, like a 440Hz tone (yes, there's Spotify 'music' that does just that)?
I'm pretty sure you used to be able to just copy .ogg files from Spotify's local cache at one point. Am I remembering right? It's a soup of 10-2000kB files now.
At one very early point in its history, yes, I understand that was accurate. Then they encrypted them, but the key was easy to find, and then they started doing more complex stuff.
It's of idle academic interest to me. Never used Spotify, but I don't wish them harm either.
As mentioned in another comment, if you do this you get the original compressed file. If you capture the audio and then want to recompress it, you'll introduce artifacts.
Although, it seems to me that an intelligent compressor could perfectly recompress the audio back to the original form.
This presumes that downsampling in either bit depth or sample rate isn't happening somewhere (such as your audio drivers) between decompressing the audio and where you're capturing the audio.
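The bit-depth caveat can be illustrated with a short sketch. It assumes a hypothetical capture path that truncates 16-bit samples to 8 bits and pads them back out; the function names and the 440 Hz test tone are chosen for the example only. Once the low bits are gone, no encoder fed the captured audio sees the samples the original encoder's output decoded to.

```python
import math
from typing import List


def sine_pcm16(freq_hz: float, sample_rate: int, n_samples: int) -> List[int]:
    """Generate a half-amplitude sine tone as signed 16-bit PCM values."""
    return [round(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
            for n in range(n_samples)]


def to_8bit_and_back(samples: List[int]) -> List[int]:
    """Simulate a capture path that drops to 8-bit depth, then pads to 16-bit.

    The low 8 bits of every sample are discarded and replaced with zeros.
    """
    return [(s >> 8) << 8 for s in samples]


original = sine_pcm16(440.0, 44100, 4410)  # 0.1 s of a 440 Hz tone
captured = to_8bit_and_back(original)

# Count how many samples differ and how large the worst error is.
changed = sum(a != b for a, b in zip(original, captured))
max_error = max(abs(a - b) for a, b in zip(original, captured))
print(f"{changed} of {len(original)} samples changed, max error {max_error}")
```

A sample-rate conversion in the driver would have a similar effect, which is why grabbing the compressed stream before decoding avoids the problem entirely.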
At the very least it makes a good demo for a technique that could be applied in other situations. Or there's just the hack value (see the Jargon File). Many things have value in some indirect way.
"Sec. 103(f) of the DMCA (17 U.S.C. § 1201 (f)) says that if you legally obtain a program that is protected, you are allowed to reverse-engineer and circumvent the protection to achieve the ability the interoperability of computer programs (i.e., the ability to exchange and make use of information)."
Though I wouldn't want to have to test that in court. IANAL.