From the 2011 kernel summit, "The attack turns out to have been part of a widespread credential-stealing network that has been operating for some years now; it is clear that the site had been owned by this network for some time before it was discovered. What also seems to be clear is that this was not a targeted attack; kernel.org was just another on a long list of broken machines."
So let's speculate about what the article almost-but-not-quite proposes:
The NSA, or related parties, was responsible for the breach. There was an investigation and postmortem, but because of an NSL or other gag-type order, they couldn't accurately publish what they discovered. So they figured that not releasing a report was better than releasing a report that either intentionally misled or pretended not to have figured out what happened.
I know, this is a pretty big leap. But regardless -- what does it mean? What are the ramifications if this is what happened?
What are the ramifications if this is what happened?
I strongly suspect they were able to get a copy of the kernel source code... They could be doing anything with it.. Porting it to a new platform.. Compiling it with unsafe GCC flags.. Or worse..
There was a discussion about this recently saying that it was highly unlikely. All the source was in Git and every git commit references the previous commit, making it highly challenging to modify an old commit without also modifying the commit id. More details: http://archive.is/Khq7R
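To make the chaining argument concrete, here is a minimal sketch in Python. The blob ID function follows Git's actual blob hashing (SHA-1 over a "blob <size>" header, a NUL byte, then the content). The commit function is a deliberately simplified, hypothetical stand-in: real commits also record author, committer, and a full tree object, and here the blob ID is used in place of a tree ID. The file contents are made up for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object ID Git assigns to a blob: SHA-1 of 'blob <size>', a NUL byte, then the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def toy_commit_id(parent_id: str, tree_id: str, message: str) -> str:
    """Toy commit ID (hypothetical, simplified format; not Git's full commit encoding)."""
    body = f"tree {tree_id}\nparent {parent_id}\n\n{message}\n".encode("utf-8")
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def build_chain(file_versions):
    """Return the toy commit IDs for a sequence of file contents, oldest first."""
    ids = []
    parent = ""  # the first commit has no parent
    for i, content in enumerate(file_versions):
        commit = toy_commit_id(parent, git_blob_id(content), f"commit {i}")
        ids.append(commit)
        parent = commit  # each commit embeds the ID of the one before it
    return ids


original = build_chain([b"int x = 1;\n", b"int x = 2;\n", b"int x = 3;\n"])
tampered = build_chain([b"int x = 1; /* backdoor */\n", b"int x = 2;\n", b"int x = 3;\n"])

# Changing the oldest commit changes its ID and, through the parent links,
# every ID after it.
changed = [a != b for a, b in zip(original, tampered)]
print(changed)  # [True, True, True]
```

Anyone holding an older clone would see the mismatch in commit IDs on the next fetch, which is why silently rewriting history is hard unless the attacker can also produce content with an identical hash.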
Yes, it's unlikely they modified the source in git.. But it's possible they were able to download a copy and modify it locally... Possibly adding comments to document certain blocks of code.. Or adding unofficial patches for zfs support... Or worse..
1. Contribute a driver to the kernel. A network driver would be ideal, but any driver will do. Include a binary firmware blob, because including binary firmware blobs in drivers is how linux kernel devs roll (or how they used to roll). When creating the binary firmware blob, use a SHA1 preimage attack to actually create two versions with the same SHA. One is benign, and one runs your back door.
2. Root git server. Replace object containing firmware blob with the alternate version.
This attack should be detectable by comparing blobs from old git clones with blobs from new ones, using a hash that has better preimage resistance than does SHA1.
There are other ways to get your SHA1 colliding binary blob into a git commit in ways that are unlikely to be noticed. I demonstrated one (without actual SHA1 preimage attack) here: http://github.com/joeyh/supercollider
But a binary firmware blob is pretty much ideal.
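The detection idea mentioned above (compare blobs from an old clone with blobs from a new one, using a stronger hash than SHA1) could be sketched roughly as follows. This is only an illustration under stated assumptions: it compares checked-out working-tree files of two clones at the same revision rather than reading Git's object store, it uses SHA-256 as the stronger hash, and the function names, file names, and directory layout are hypothetical. The demo files simply differ in content; in the attack described they would also share a SHA1, which this check does not rely on.

```python
import hashlib
import os
import tempfile


def sha256_of_files(root: str) -> dict:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory; only checked-out files are compared.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def differing_files(old_root: str, new_root: str) -> list:
    """Paths whose SHA-256 differs between the two trees, or that exist in only one."""
    old = sha256_of_files(old_root)
    new = sha256_of_files(new_root)
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


# Demo with two throwaway directories standing in for an old and a new clone.
with tempfile.TemporaryDirectory() as old_dir, tempfile.TemporaryDirectory() as new_dir:
    for root, blob in ((old_dir, b"benign firmware"), (new_dir, b"backdoored firmware")):
        with open(os.path.join(root, "firmware.bin"), "wb") as fh:
            fh.write(blob)
        with open(os.path.join(root, "driver.c"), "wb") as fh:
            fh.write(b"/* unchanged */\n")
    suspicious = differing_files(old_dir, new_dir)

print(suspicious)  # ['firmware.bin']
```

Both trees must be checked out at the same commit for the comparison to mean anything; any file listed is one where the two clones disagree despite Git reporting the same object IDs.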
His comment reads as "ok, do all these simple things then just put in your backdoor but make sure it has the same SHA1 hash as something benign, and that's it!"
This reminds me of a friend's response to the idea that the TouchID in the new iPhone could be a way for the NSA to get your fingerprints: "Just imagine the shitstorm if they put a camera in there. Or a microphone."
There's no good way yet to uniquely identify a person based on a (dodgy) picture of them or sample of their voice. Whereas fingerprints are used for this daily* and a quickly searchable database of these (or just their "hashes") would be incredibly useful to _somebody_.
*I'm not raising the issue of whether they _should_ be or not here, just that they are.
There's no good way yet to uniquely identify a person based on a (dodgy) picture of them or sample of their voice
Surely this is sarcastic. We can easily identify someone with great certainty from a relevant set... as Facebook does, for instance. You have a relevant set if you are one of many governments (the local government, and in many cases Israel via AMDOCS and its intelligence allies - primarily the US, but in some cases possibly their intelligence allies) or a motivated attacker (e.g. with an insider, or hiring an insider via a private investigation firm), because you have the device call/messaging/physical location records from which to cross-match. Even if it's a land-line. You also have easy access to additional voice data (voicemail recording, 'this call may be recorded' records at large companies such as banks or wings of government, etc.). Public data sets on non-secret government telephone interception frequency even in 'free-ish' countries like Australia suggest extremely broad cultures around acceptable collection. (For .au I recently read a raw statistics report that I can't seem to relocate, but http://www.smh.com.au/technology/technology-news/be-careful-... is a good overview.)
I think the point was, why hasn't the report been released yet, 2 years later?
Could be they're lazy. Could be they're embarrassed. Could be they're legally prohibited from reporting it -- which in turn could be due to an NSL.
Lots of "could be". But not entirely crazy to list all the possibilities... while waiting for the report, which we can probably all agree ought to be released by now.
At this point, I think you basically HAVE to assume it was NSA involvement. Most of the people who were considered paranoid before seem to have been underestimating things based on what we now know.
Let's just agree that from now on the NSA is responsible for all future security and privacy problems. If proven otherwise, we will assume the NSA is willing to share the blame. Then we can move forward with the rest of the discussion.
The report hasn't been disclosed because they are busy with normal operations.
The report hasn't been disclosed because they were pulled away from it to do some more urgent business, and since then it has been forgotten and/or important evidence has been lost.
The report hasn't been released due to turnover in whatever group would be responsible for producing the report, which has caused the loss of important tribal knowledge regarding the event.
As an outside observer, it's hard to say which it would be. However, I do find it unlikely that there would be a gag order - there is little surveillance data to be directly obtained from kernel.org, and attempting to backdoor the kernel itself would have a high risk of exposure and an extremely high risk of blowback from other governments who use Linux. Even if the NSA or CIA or some other TLA were behind it, they would have taken steps to ensure their identity would not be exposed in any investigation in the aftermath of a successful backdooring; there's no sense then effectively telling people who they are with a gag order (and then risking that someone might choose to violate the gag order).
A feature of civilian security is that "It was restored from Git" doesn't immediately spark a concern that Git could be compromised.
I'm not saying that it is, but compromising Git is certainly the sort of thing which would occur to a state sponsored espionage agency. And if one were seeking to compromise the Linux toolchain, it would certainly be a very attractive link. So attractive that not including it in a multi-vector attack might be considered grossly unprofessional.
Version control systems are a bad target. They are too simple, too deterministic, and too networked. You can steal their data, but if you insert something, you will get caught.
Yeah, there are exceptions, all of them proprietary. There is no reason to trust Git less just because some companies can make even version control hard.
Even assuming that Git is unassailable with a billion dollar budget:
How long has the Linux kernel been under development?
How long has it been version controlled using Git?
How long has it been a potential target of state sponsored espionage agencies?
The potential adversaries have been taking cryptography and security seriously since long before the Linux community. They have larger budgets and significant expertise backed by patriotism and economic rewards.
Compared to pulling a nuclear submarine wreck from the depths of the Pacific, Git might not appear so difficult.
It's a shame this article doesn't have anything new to add, although I'm glad someone is reminding the world we still don't know what happened. Before the Snowden revelations this summer I'd have assumed it was a simple drive-by or incompetence. Now it's hard to see this as anything other than a deliberate attack by an Advanced Persistent Threat.
OFF-TOPIC: why is it that when I hit back on Ars, it creates about 10 pages in my history (I didn't click anything in the page itself)? This is a UX nightmare, and the 3-4 articles I've looked at in the past week made me cringe when trying to leave the page.
Weird parallel between the NSA revelations and the Global Warming movement: every odd weather event is attributed to global warming, every odd security event is attributed to the NSA.
That said, as I recall the "hack" was a lot less impressive than it seemed (some folks in Google's Linux team were administrators of kernel.org). I do wonder about the lack of a definitive online after action report though. Seems someone dropped the ball on that one.
That's not a weird parallel, it's confirmation bias. There are both AGW sceptics and AGW believers who draw broad conclusions from localised data when it suits their view. It's a very normal human foible.
I think that he (she?) was comparing some extra-alarmists' (Al Gore-type) tendencies to attribute every abnormal weather pattern to global warming. We should all agree that crying wolf weakens legitimate claims about climate change and ocean acidification.
That's the pre-9/11 world you're living in. No, seriously: a derp is the best case, and careful people plan for scenarios other than the best-case failure.
I have enough respect for the kernel maintainers to assume that they would be able to muster the courage to confess to derping up their file permissions in a time span less than two years.
- Jon Corbet reporting on a talk by H. Peter Anvin, https://lwn.net/Articles/464233/