This smells really bad to me, as if Intel pressured CERT into removing language that could have caused their market value to instantly vaporize as every consumer for the last 20 years joins a class action suit...
People were mocking them for it on Twitter, and one of the researchers (Anders Fogh [0]) called it "funny". It could just be that people were misled into believing that such hardware currently exists, and that CERT decided there would be less confusion if they stuck to currently-possible mitigation techniques. :)
In any case, I doubt Intel would pressure anyone to remove the generic imperative "buy a new CPU".
>> Fully removing the vulnerability requires replacing vulnerable CPU hardware.
Imho Intel would much rather they keep this language, which is why they removed it. There is no drop-in non-Intel replacement for an Intel CPU. Telling everyone that they need to replace CPUs is basically a mandate for them to buy whatever replacement Intel can cobble together. Having to replace all those chips would see Intel's stock price skyrocket. The reality is that chips don't need to be replaced asap and customers have time to perhaps choose non-Intel chips.
That doesn't make any sense. Do you think when Takata had to recall all those defective airbags that were killing people their stock price jumped because they were able to sell replacement airbags?
A product manufacturer with serious defects almost always ends up eating the costs of hardware repair or replacement.
That says nothing about the possibility of patching the missing page table isolation with a microcode update. They have enough spare registers for a separate CR3; they could work around the TLB flush. Ever since the FDIV bug, HW problems are not purely HW problems anymore.
I don't get these calls for a class action suit... If it had been intentional, a suit could have a reason. But I just don't get this attitude when they are having a really bad day after someone discovered a new type of attack on their chips.
Intel isn't having a bad day. The people who are stuck with chips that enable severe security exploits are having a bad day. Actually they are going to have a bad year till the design flaw is fixed in hardware. Or maybe even 5 years. Who knows.
Intel is 100% liable to face a lawsuit over this. Consider if a major car brake manufacturer discovered that there was a design flaw in the brakes that prevented them from functioning in certain situations. It'd be facing multiple lawsuits by now, whereas Intel is going with the "our chips are the most secure ever" line.
I think you are confusing security and safety. Security is about dealing with malicious attackers, while safety is about making sure random events and mistakes won't kill you.
Your example with the brakes is about safety, i.e. making sure the car won't kill you during normal operation. Normally, unless their CPUs start bursting into flames, this is not a problem for Intel.
The problem here is about security. A car analogy would be that to start your car, you need a code and that code can be found by measuring how long it takes to process the input, making life easier for thieves.
As for liability, I don't think you can be liable in court if you didn't plan for something that wasn't known at the time and isn't trivial.
The car analogy is a very good one. The infamous Samsung Galaxy Note 7 also comes to mind. I agree that such a security hole should warrant a replacement.
I don't believe the car analogy is good. And neither is the Note one.
In both of those cases the products can cause serious harm without any third-party involvement. The brakes would just fail, or the battery would explode, during normal operation.
However, in the Intel case there has to be an attacker that actively exploits an issue in design.
To me this would be like making a class action suit against all lock vendors because they can be bypassed with the right set of tools. The fact that this affects everyone (Intel more than others) and that it took 10 years to find grants them some excuse. Also the architecture is not secret as far as I know so anybody could have audited this. They most probably did do so and found nothing until now.
Now, I do not like Intel communication around this and if it comes out that they knew this for years and decided to sit on it then it would be a different story.
Class action lawsuits are useful when there is negligence or bad intent, but in this case what could one possibly solve?
> In both of those cases the products can cause serious harm without any third-party involvement.
Sorry, but in the 21st century world of the internet, cracking needs to be taken as a given. In many cases, "normal use" for a computing product means exposing it to use and therefore potential attack from anywhere on the internet. CPUs certainly fall into this category.
You don't have to inflict intentional injury to be liable for something like this. Intel's customers aren't getting what they paid for, so it seems pretty reasonable for the company to compensate them.
If you bought a car that was advertised as having 300 horsepower, and then the manufacturer realized it was unsafe unless they made a software change limiting the horsepower to 200, wouldn't you expect some compensation?
The time from purchase of the product to the time of knowledge of the flaw defines the time-frame for filing lawsuit(s).
Example: Intel rushed a product to compete against AMD Threadripper. That work happened after the issue was known. Thus Intel kept investing in bad practices and selling known-bad products instead of investing in fixing the hardware flaw.
A major flaw in a car prevents selling the car until the flaw is fixed. Why shouldn't this also apply to the computers that run the cars and other products?
It appears the retpoline fixes don't work in Skylake or later (it's smart enough to speculate out of it?) and will require new support for IBRS/IBPB in the microcode to mitigate.
I was wondering the same thing earlier. This doesn't feel like a disclosure that's had anywhere near ~6 months put into it.
Did the vendors ignore the disclosure initially and begin to change tactics later in the game? Based on how certain vendors have been characterizing this in their PR, I wouldn't be surprised if they didn't take the problem seriously originally.
The Ubuntu page that was on HN earlier [] claims that they were notified in early November. I have no idea if kernel people (as opposed to distro people) got notified earlier.
Especially microcode updates. Microcode is just a giant obscure binary for everyone outside of Intel. If there was a mitigation possible via a microcode update this could have been published months before disclosure without any meaningful risk.
IIRC Intel employs people to work on the Linux kernel on behalf of Intel. Either Intel fumbled, or it isn't that easy to circumvent the problem plaguing Intel's processors with a software hack.
That’s easy for you to say. You’re not the person having to admit to a billion dollar mistake.
Everybody stalls for time when the stakes are this high. How long can I reasonably spend trying to turn this into a small problem before I have to go public with it?
Saying it’s a bigger problem than it turns out to be is a PR nightmare of its own. If there was a cheap fix then you cried wolf and killed your reputation just as dead.
Exactly this. Apparently, the details of the attack have been published in official paper(s) before the security teams of major OSes could prepare and make publicly available mitigating patches for the users. There is no patch for Debian 8.0 (Jessie), or for Qubes OS, for example.
The chatter is all about how CPU manufacturers screwed up, but there is a much more alarming issue here, I think: the apparent irresponsibility of the people who published the flaws before the security teams and the users could mitigate them. Perhaps there was a reason for accelerated public disclosure, but so far this makes no sense to me.
> Note: IBRS is not required in order to isolate branch predictions for SMM or SGX enclaves
Perhaps this microcode update exposes a feature which was originally to protect these two modes? But that would mean that Intel did think about leaks through the branch predictor, only didn't make the logical leap that this could be an issue also for normal ring0/ring3...
Maybe, maybe not. I looked around a bit and found [1]"that the Intel SGX does not clear branch history when switching from enclave mode to non-enclave mode", which suggests either that the SGX designers were unaware of the dangers of not separating branch prediction between privilege levels, or that Intel intentionally weakened SGX so as to not reveal the similar flaw in their ring0/ring3 separation.
I'm disheartened by the number of commenters here taking the stance that Intel has idiot designers or that management doesn't care about security. This attack is very clever and unexpected. Even though side-channel attacks have been talked about for a while, even the guy who developed Meltdown was surprised that it worked. It just seemed like an "in theory" security hole, not an exploitable one.
AMD isn't vulnerable to Meltdown not because they foresaw this issue, but probably because they simply weren't as aggressive as Intel in allowing speculative execution. For years people have preferred Intel over AMD CPUs due to their performance advantage, which is due in part to the higher sophistication of their pipeline.
Or to recast it, nobody is hating on AMD right now, but AMD CPUs do allow a user process to learn some things about the kernel via timing attacks. If next month a researcher develops Meltdown2 for AMD, are AMDs designers now suddenly idiots for missing an obvious security hole?
> AMD isn't vulnerable to Meltdown not because they foresaw this issue, but probably because they simply weren't as aggressive as Intel in allowing speculative execution.
You don't see why being "aggressive" with speculatively loading data over a _protection boundary_ could be considered irresponsible? I for one, think AMD has the right to gloat if they want. It's not just AMD, besides the latest version of ARM it seems all the other CPU vendors decided to not be "aggressive" with their users' protected data (sparc, mips, amd, power, s390x).
Does it mean all those vendors and architects had PoC for years for this and were sitting on it? No but they could have had a hunch not to go that route. Just like a sane developer might have a hunch over opening a wide API surface to a server that contains sensitive data. It doesn't mean they know there is security vulnerability in one of the API endpoints, it's just sane practice.
> If next month a researcher develops Meltdown2 for AMD, are AMDs designers now suddenly idiots for missing an obvious security hole?
But who called any developers idiots here? I think you were the only one.
Side channels are really hard to protect against. Caches, buffers (maybe you can check when they are full), branch prediction, sound, vibrations, timing, electricity consumption, em waves, temperature.
Things leak all over the place.
Such bullshit - there are loads of people, maybe not on SO, but on reddit etc. who called them "idiot CPU developers", possibly misinterpreting what Linus said.
> I'm disheartened by the number of comments here who are taking the stance that Intel has idiot designers or that management doesn't care about security.
I think you are being unfair as the GP didn't call anybody an idiot for not caring about security.
It just calls them out for insisting that this is not a flaw or a bug.
I think it makes sense in a really pedantic way: "flaw" and "bug" have always been, in my observed experience of usage, terms used to refer to the consequences of oversights.
This wasn't an oversight; this was more like... whatever you call the fact that we're still, today, choosing to employ (and even design new!) hash functions that quantum computers could probably break easily. We're making an intentional design choice, based on the perceived difficulty and current infeasibility of a particular known class of attack against that design. That current hashes are vulnerable if-and-when a quantum computer comes along to crack them isn't really a "bug" or a "flaw" in our hashing algorithms; it's a known property of our hashing algorithms.
Or, for another analogy: there was a point in history when the peak of warfare was ships shooting guided missiles at other ships, and the targeted ships shooting smaller "countermissiles" that attempted to get in the way of the incoming missiles before they could hit anything important. Every missile had a faint heat signature, making it visible to infrared optics—this was an unavoidable consequence of the fact that missiles need engines to make them move. But for a long time, the idea of a heat-seeking countermissile was just infeasible or un-economical to implement, so little work was done to hide the emissions signatures of missiles. The emissions signature certainly wasn't a "bug"—it wasn't the result of an oversight; and it's a bit strange to call it a "flaw", insofar as there was no such thing as a missile that didn't have said "flaw" while still being a missile. It was a known property of the missile technology of the time. Or, if you want to think of it on a higher level, "missiles" themselves—anything that you might call a missile—had a categorical flaw.
In the same way, anything we might call a modern-day CPU is now known to have the categorical flaw of leaking at least some amount of information through speculative execution. You can minimize it (like you can minimize a missile's heat signature), but you can't get rid of it without making something we wouldn't even call a CPU any more (most things without speculative execution are, these days, considered microcontrollers.)
In that sense, I can understand Intel's insistence that they didn't make a flawed product: they made a perfectly good instance of a "computer processor"—it's just that "computer processors", as a category of product, have a problem.
You wouldn't blame the missile manufacturer for making missiles with visible emissions signatures, before heat-seeking countermissiles were invented. They didn't introduce a flaw. They made their product to order, and the order—the requirements, the demands of the customer—themselves contained the flaw, contained the supposition that it was okay to make a particular trade-off because it wasn't currently exploitable.
In the missile manufacturer's case, it was the government that said "sure, heat doesn't matter, just make it go fast"; and when heat-seeking countermissiles were invented, it was the government whose (lack of) intelligence foresight was to blame for not changing their requirements to anticipate that exploit.
In Intel's case, some customer could have foreseen the exploit and shifted the market toward demanding non-speculative-execution CPUs. Intel was just making what the customers asked for, and right up until the end, they were asking for the categorically-flawed product.
Design choices don't need patches to fix them. Flaws and bugs do.
You seem to think that this issue is inherent to speculative execution - it is not. It is due to intel performing speculative execution in a flawed way. In particular, an incorrect branch prediction should have no detectable effect on the system, whereas here it does.
> an incorrect branch prediction should have no detectable effect on the system, whereas here it does
Branch prediction is not scoped for that. Branch prediction will always change microarchitecture state, which is always detectable at some level or another. The key takeaway for designers should be that even though microarchitecture state is not exposed in the datapath it is not secured from side channel exposure.
Your argument seems reasonable, until we bring to the table all the shit that Intel did to beat AMD out of the x86 market. These deeds were from their marketing department, so no relation with chip design per se, but as a company they were still actively trying and almost succeeded in killing any diversity in the x86 market.
It’s basically like saying “we are building stuff customers wanted, we just also beat to death any other potential alternative they could want as well”
You sound like a lawyer and I don't mean that as a compliment. But alright, if this isn't a flaw, then breaking it wasn't an achievement, certainly quantum computation wouldn't be an achievement - or rather foreseeing and avoiding the consequences isn't, which is easier because of different time scales, but still.
The problem is that there are a lot of us trying to get people to take these "completely unlikely" attack vectors seriously. It's like talking to a dog or a wall. Too many humans are hardwired to respond only when confronted with an actual situation. We get frustrated because our "in theory it works..." attacks work in reality eventually and then the rest of the world is all "oh who could have predicted that ____".
We did. We, the people you called "paranoid" while we quietly try to fix things. We're the ones trying to make sure that people don't die when cyber vulnerabilities are exploited by shitty actors.
I have a theory that this heavily relates to the feedback loops and signals in play. New features are positively observable and their impact is observable from release onwards.
When defending against unknown unknowns, security is unobservable. It's observable only in its absence. All that's left are heuristics and synthetic signals like pentesting.
I wrote a multi-thousand word essay on the topic, but for an internal audience. I don't know if I could properly share it.
AMD did things the right way competing for performance without sacrificing security. What's happening right now are the consequences of Intel's actions.
AMD did things the "right" way, not because they understood the security implication, but because it was very hard to achieve such high level of speculation, and I think the performance gains didn't justify the effort for them.
Have you got a reference for that statement? Because my interpretation of what has been said so far is that AMD deliberately chose not to allow speculative dependent loads that crossed privilege boundaries by enforcing permission checks on all reads whereas Intel chose to permit all loads & rely on the fixup at the retirement stage of the pipeline to enforce privilege boundaries.
I think it's worth noting that it's entirely possible that, if you are a CPU execution pipeline designer, you think about memory loads/stores, L1 cache, branch prediction and speculative execution, and it occurs to you that the cache gets polluted by spec exec, and that the branches can be security checks.
But the solution is simple. If the branch is important, wait for it. (Load it into L1 cache before the branch - use a memory barrier.)
The fact that this is not in any ISA docs is a likely pointer toward the possibility that it in fact hadn't occurred to them.
These attacks are the same as any new invention. Easy to see once you've grasped the concept.
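To make the "wait for the branch" idea above concrete, here is a minimal sketch in C. It assumes x86 and that LFENCE is honored as a speculation barrier (which is how Intel's and AMD's variant-1 mitigation guidance uses it, as I understand it); the function and variable names are made up for illustration, not taken from any real codebase.

    #include <stddef.h>
    #include <stdint.h>
    #include <emmintrin.h>   /* _mm_lfence() */

    /* Hypothetical lookup guarded by a bounds check. The lfence makes the
     * core resolve the comparison before anything after it runs, so a
     * mispredicted "taken" path cannot speculatively read table[idx]. */
    uint8_t guarded_read(const uint8_t *table, size_t table_len, size_t idx)
    {
        if (idx < table_len) {
            _mm_lfence();       /* speculation barrier: wait for the branch */
            return table[idx];  /* only reached once the check is resolved */
        }
        return 0;
    }

The obvious cost is that every guarded branch now stalls the pipeline, which is exactly the performance-vs-security trade-off being argued about in this thread.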
I think you are missing the point. What the commenter is saying is this... right now you are saying AMD did things the right way, but if a loophole is exposed next month, will you say the same thing?
I'm neither supporting nor refuting the commenter. Just explaining to you the meaning of the comment.
There is a difference between following good security practices (gratuitous isolation, defensive design) and having exploits ready to show the world how valuable it is. No one was claiming AMD knew of such exploits beforehand.
If I observe fact X which is "bleeding obvious", then it's not my responsibility to tell the world about X.
Of course this stuff isn't "bleeding obvious", but I'm going to assume that the AMD engineers thought that it was "obvious enough" to not explicitly to tell the world about it.
Besides... do you have any idea what kind of NDAs those engineers are going to have to sign?
Perhaps whoever thought about it assumed it was obvious enough that others would have realized it as well. It may simply have never occurred to them that it was worth sharing.
Intel grabs tens of billions in revenue year after year; people of course expect the best of the best from them. Playing victim here doesn't work when you are an industry leader. You can't have your cake and eat it too.
The latest AMD processors are on par with Intel in terms of IPC; there is no evidence showing AMD's pipeline design is any less smart than Intel's. AMD processors do not have the issue because they don't speculatively load data across the defined boundary. It is _NOT_ an "in theory" security hole; it is a security hole that managed to lower the computing performance of the entire world due to Intel's market share.
Intel designers should of course be blamed for such issues - full-price-paying customers are now suffering performance penalties of up to 30% on some workloads, when each new generation of Intel processors has given only a 5-10% performance boost over the last 5-10 years. Sure, the issue is a surprise for everyone, but if they are designing processors to power billions of devices, they are expected and required to be exceptional.
Let's make it perfectly clear - Intel designers don't have to be smarter; they can give up their market share to AMD. It is a privilege to design chips for the entire world, not a right. When things go wrong, they need to admit the mistakes and fix their crap; sadly, Intel is putting too much effort into its PR rubbish ATM.
The poster you responded to never said that Intel engineers are idiots. Intel made a huge hole for themselves in the way they implemented ME.
The reason for the strong reaction over these flaws is not the severity of this issue, nor does anyone believe Intel engineers are idiots. They are getting blowback because they implemented a closed solution for a powerful feature - the Management Engine.
They lost a lot of trust, which makes it far harder to recover when new issues occur.
Did Intel promise in their docs that side channel attacks are not possible? If not, then OS writers are equally to blame for making incorrect assumptions. I don't think this is Intel's fault alone.
On a grander note, there are probably hundreds of even more esoteric side channel attacks all across the system since every process changes the system state. This is more like the beginning of a new style of attacks now that one is shown to be practical, rather than any particular entity's fault. Hardware designers will need to consider informational and physical isolation in a more rigorous way, and there may be theoretical limits that bound the performance-security tradeoff when you share resources.
They do not. And more problematically for the "It's all Intel's fault" story, in the SGX documentation they explicitly state that their chips do not protect against side channel attacks and that it's up to the software developer to handle it. As SGX is a part of the CPU effectively that means their official chip docs rule out side channel protection as a feature.
I don't know if playing the blame game here is going to be productive. All CPUs are vulnerable to side channel attacks of various kinds and focusing on Meltdown specifically seems like missing the forest for the trees - especially as ARM has the same issue in some of their designs.
meta comment - but when I first saw this comment it was in reply to carwyn's top level comment but now it is a top level comment unto itself? that makes a lot of the replies not make sense now - was this comment moved?
Well the first problem is obvious with hindsight. Consider all the state changes that occur when an instruction is speculatively executed. Are any of these rolled back?
In this case, the cache is not being rolled back. Neither (presumably) is the branch predictor. And then what of the performance counters? (do they count if an instruction is not retired?). I see many potential attack vectors opening up and it's much harder to prove that any of these state changes can't be exploited.
I disagree. The attacks are clever in focusing on speculative execution but are otherwise fairly simple (timing attacks are well known). I'm surprised such attacks have not been discovered earlier, as speculative execution seems quite broken. Security is likely not a top priority for Intel and probably not something their verification teams are targeting. There is perhaps a gap in the market for a vendor focusing on secure CPUs.
For an architecture that has been around for so long, if the attack was only discovered recently, then is it fair to call it 'simple'? In retrospect, anything can be obvious.
An obvious flaw doesn't become more or less obvious depending on whether it has been found. So it might be that some blackhat knew about it before, but there are a lot of smart, sufficiently-pale-shade-of-gray people out in the world for obvious problems to be found in less than decades. So I don't think it's an obvious problem.
It seems that at least some ARM might be affected by both Spectre and Meltdown.
So far, I have only seen negative meltdown tests for older AMD cores. Is there anything known for Ryzen except for the PR by AMD (and the kernel patch, which might be based on the google project zero information about older AMD cores)?
While meltdown is "easy" to fix by not reading memory if unprivileged to do so, spectre is a lot harder. Even if the caches are made safe, for example by having "speculative" cache lines which will be renamed into the "true" cache when the speculative thread is actually accepted and retired: It's not the only place where there is hidden state. For example, the branch prediction might be affected, and might give a timing signal.
We know (or can guess) approximately when the project zero team discovered the issue, but I think your parent comment meant that we don't know when _someone_ discovered it first. Maybe the project zero team were the very first to discover it, or maybe some state actor discovered it a decade ago and has been using it since then.
Right - but as always with a vulnerability, especially one that's borderline-undetectable through any kind of log analysis, the question becomes "were Google really the first to think of this?" and the tinfoil kingdom builds itself from there.
It's simple to describe and all the pieces are big red flags even on their own: speculative execution has side effects (e.g. the cache is not rolled back), speculative execution omits security checks, and timing attacks can be used to determine what is in the cache. It can even be exploited using JavaScript, no hand-crafted CPU instructions required.
Perhaps it took so long to find because it's only relatively recently that companies have been paying people to break the hardware?
> AMD isn't vulnerable to Meltdown not because they foresaw this issue, but probably because they simply weren't as aggressive as Intel in allowing speculative execution
What do you mean by 'aggressive' here? Cause frankly, I think aggression isn't a good trait to be exploiting in business - it's not the same as competitiveness.
Is it 'they would have done this if they could but they didn't try hard enough', or is it 'they could have done this but didn't have the nerve to take the risk'?
In the latter case, I'd argue that a lack of willingness to trade off security against performance is a Good Thing in line with engineering ethics. In the former case, it seems like you're assuming business competition is always construed as a zero-sum game - is this correct?
They weren't making a trade-off because it was not a known risk. No one had any reason to think that speculative jumps would lead to a large security hole.
My understanding is that it was a known risk, see the various citations already provided throughout this thread. Meltdown is a bugged design decision because it failed to consider the consequences of asynchronously verifying permission for speculative executions that reach into kernel space, which allows the Spectre attack to affect kernel memory too. That's the limitation of the reach of the "Intel bug" as far as I know, the rest is generally applicable.
The Spectre attack is a side effect of performing speculative execution without wiping caches, something that was, until yesterday, an intentional and clearly-chosen industry design direction, standard across almost every commercial CPU produced in the last 20 years, despite the known risk of "theoretical" timing attacks.
The only reason for Intel to make the decisions that led them to be vulnerable to Meltdown was to sacrifice correctness/safety for performance, and failing to consider the potential side effects of that sacrifice (cache heat). They obviously made a bad risk tradeoff there (though I don't necessarily fault them).
AMD could definitely make the argument that Meltdown was an irresponsible "benchmark cheat" from Intel.
EDIT: And let me further clarify, ARM was "cheating" and doing the permission check asynchronously on some models too (i.e., some of their chips are also vulnerable to Spectre in Kernel-space aka Meltdown). It's not solely an Intel issue.
Speculatively executing memory reads across security boundaries is not a trade-off (except if you can prove that you will not further act on them in any data-dependent way that is observable). That's mostly common sense.
Spectre does not need that -- but OTOH is more difficult to exploit; it is in a whole different category. There is a reason they have different names.
"This is not a bug or a flaw in Intel products. These new exploits leverage data about the proper operation of processing techniques common to modern computing platforms, potentially compromising security even though a system is operating exactly as it is designed to."
It's like we spent 20 years building houses out of wood, and then suddenly someone discovered fire. "This is not a flaw in Intel lumber supplies. This new 'fire' leverages chemical properties common to all plant products, potentially compromising structural integrity even though the lumber is operating exactly as it was designed to."
(Edit: It seems this analogy may be overly kind to Intel. Read all the replies to this comment for more information.)
This is not really like that. The dangers of speculative execution were known before, and, separately, the processor itself provided memory isolation capabilities. That the processor would ignore its own memory isolation principles when speculating about the next instruction to execute is a critical base of these vulnerabilities.
So maybe a better analogy would be that they built fire retardant layers into buildings, but neglected the fact that heat could still pass through the layers and start fires on the other side without burning the layer itself?
This one's good, because, like asbestos, there's not really a good alternative to this on current platforms other than giving up some of the performance value of branch prediction in exchange for long-term safety. As far as I know, asbestos remains an unparalleled material for fireproofing, and it is still used sometimes under controlled conditions, even though we've been forced to accept other materials for general use.
AFAIK, mineral wools and silica fabrics have replaced much of the high temperature insulation that asbestos used to do. I believe they do the job just as well, but they're more expensive because they're manufactured rather than mined.
A better analogy would be that they presented specs that showed that the building was supposed to be fire-proof, but still insisted on deviating from the specs and building best practices by cladding it with highly flammable material to improve its heating performance.
Unlike "neglecting the fact that heat could still pass through the layers", speculative execution doesn't merely allow something that was previously possible. It allows something that was previously impossible, and while doing so undermines another, more critical, feature of the processor.
Taking it ad absurdum: so that they could advertise the home as more efficient, faster to heat, and less wasteful, they leaned on a lucky phenomenon peculiar to their furnace design when run beyond specification. Instead of overheating one room, or paying for expensive distribution, the furnace gets you really quick hot water and the excess energy isn't wasted; compared to fully internally insulated homes, the benefits just keep adding up. E.g. installing plenum cable would be inefficient because this system dumps the excess heat quickly through the house, its ability to convect nicely preventing past occurrences of furnace fires. To be the most efficient inhabitant (kernel process), Intel recommends you enter and exit the over- and under-heated rooms frequently to manage your body temperature. This design is excellent for your hyperactive residents; please refer to our release notes on the use of rooms either exclusively or during long operations such as Netflix consumption.
If this analogy were apt I think I would be on their side - but we're talking about "wood" they designed, and the "fire" is not some new phenomenon but an actual thing they've supposedly designed against (unauthorized access of memory). [from my reading as a lay-person, please let me know if this is incorrect]
The analogy is a bit tortured, but in this case both companies build houses out of wood because it has the best performance, and both cover them with fire retardant materials. One company discovered that they could make better houses if the fire retardant material was held on with glue instead of screws, but failed to account for the fact that fire might melt the glue.
> It's like we spent 20 years building houses out of wood, and then suddenly someone discovered fire.
Goodness you just described my current work situation that's causing me to rip my hair out. This is a deeply human problem, and I've spent a year trying to mitigate it, but each time the discussion comes up I leave the conference room feeling like certain actors have a personal financial stake in wool fibers.
At work the issue-tracker has open bugs from 2011 detailing how to do SQL injection attacks using some common search interfaces, but noooooo, the sales team just has to update the visual style of the generated contract PDFs... <sigh>
Which is half true. All of them are vulnerable to Spectre. So far as I've heard, only Intel (and maybe ARM) are vulnerable to Meltdown, because Meltdown exploits a lack of security checks on speculative execution that, so far, only Intel's designs exhibit. It's ambiguous whether ARM is vulnerable; I've only heard people saying either "maybe" or that they haven't validated that.
Edit: Elsewhere I found a link to ARM's page where they break down exactly which of their processors are vulnerable to Meltdown and Spectre, and it looks like quite a few of them are vulnerable to one or both.
I recognize the positive spin; the fact of hardware vulnerability is still little understood, and the trust placed in the cloud ecosystems is almost absolute. If I were Intel, I would be planning a Microsoft-like security commitment - the sheer resources available to Intel, if concentrated on silicon features capable of e.g. detecting a miscreant hypervisor, look like the next wave of products and marketing. To achieve this spin, however, Intel needs to say much more than they have. They can maintain their current line but score points with sceptics by demonstrating how some attacks work on Intel processors in the wild. Coordinated with industry-wide patches - making more of the point, missed here, that Microsoft and others patched branches back in November - the wrap-up is to offer all customers running on vulnerable silicon a migration path and a performance bump. This is not a cheap process, but the political climate almost demands resolution at this level.
To me this feels like saying, "My program doesn't have a bug, it's executing the code that I wrote correctly." The actual bug is that the code isn't correct.
In Intel's case, even though the operation is correct, it's the actual design that's flawed.
I think what Intel is trying to say is that this feature makes the chips insecure by design. It's kind of like in Python, where you can get around the weak protection for classes, methods, and variables that are made pseudo-private by putting underscores in front of them.
Except that these CPUs were not designed to be insecure. CPU makers spent decades marketing their architectures as able to support the implementation of secure operating systems. All of a sudden it is clear that this is not possible without a heavy dose of software stopgaps. It is a fundamental flaw in CPU architecture that was not disclosed by these companies.
Until a few days ago, researchers couldn't prove that this was possible because there was no example of such exploits. But there was the idea that this could be possible. Also, this doesn't mean that people with other interests didn't have their own versions of this exploit and kept it secret.
I'm scarred by the CISC-RISC wars, which were serious for my business at the time. So in a year when I'm anticipating OpenVMS on x64, having an indelible memory of the memory-space rings and system calls depending on explicit separation, I remember thinking to myself: well, only four rings in Alpha are critical to VMS, and the Intel Pentium has that many rings... so why didn't they port straight to the main Intel platform? Is this leaking mode the reason?
If you fall 10 floors and die, it's not the architect's fault for not putting a handrail around the balcony. It's your fault for improper use of gravity.
The problem is they land with legs extended down from floors 1 to about 7 (bad: impact is transmitted up through shoulders and hips), while higher than about floor 7 they spread their legs out and attempt to parachute (better: impact is uniform over entire ventral surface, terminal velocity is lower, mortality rate drops).
Very cool thank you! Also prewitt and Travis above and below..
Mortality rates..
Got it :-$ So, marketing 7th-floor homes with a kitty life policy in the service charge bill is technically not the blatant fraud I assumed, and if the building manager had run into trouble, the lives of many injured kitties, luckily lookalike to the residents, could make for the hardest sentencing hearing for any animal-lover judge. English cities badly need dog licence reintroduction. I was asked to help a former neighbour, now a tramp, beg the court to return his dog. The second I met the man, who certainly was denied due process and on paper atrociously mistreated, he introduced the dog, an unknown and potentially dominant bulldog mongrel, getting it to lock its jaw on his agitated arm. I passed him again, alone in the rain, dog coat sodden; he was high and drunk, careless; no train still disgorges the fool he requires. Dogs need sorting out in London before Brexit chaos.
I mean, in cities I keep imagining a much greater extent of elevation, the moment we figure out how. Above the smog line... I'm sure I was in '94, but in '16, same place, in the middle, ugh.
Could it become unlawful to rent out (or live in, generally) homes without harmful-particle filtration?
I'm thinking of renting out my filters just before tenant viewings, because this eliminates the obvious fresh-air chill, takes out fat or any food smell beyond the kitchen, even lets a visitor smoke... (a friend closed a rental deal by offering a cigarette, excused by the filters: it's your residence now). But from flu season to the decoration next door, the cost needs an artificial boost. Wish there was a swapHN channel...
The Register does a pretty fine job of shredding Intel's doublespeak: "We translated Intel's crap attempt to spin its way out of CPU security bug PR nightmare"
Ironically, a chip company, which deals in ones and zeros all day long, committed a very serious logical fallacy: appealing to common practice.
One way to interpret the statement is that the chip design which they utilized is widely available and understood by everyone in the chip community to be good chip design, thus we aren't at fault, because everyone is doing it.
To reiterate: those types of arguments are a very serious logical fallacy and an unsound argument, called appeal to common practice.
This is not by or for technologists, it is by PR folks for stock-market analysts.
Far from being at risk of being fired, their whole PR department is probably swelled by the ranks of excruciatingly expensive corporate emergency consultants and experts paid precisely to output this kind of menial drivel.
> To reiterate: those types of arguments are a very serious logical fallacy and an unsound argument, called appeal to common practice.
How were they supposed to avoid an exploit that no one would discover for years? And exactly how is "appeal to common practice" a fallacious or unsound argument?
Exactly how much performance -- meaning, how much market share -- is Intel supposed to sacrifice to avoid the possibility of introducing unforeseen bugs?
People did discover it early on, as it's inherent in the design of the chips; it just seemed unlikely at the time. Now Intel has to deal with a huge potential liability if something bad happens because of it. It was just bad business.
Also, appealing to common practice doesn't make logical sense.
For competitive markets to remain competitive, actors must receive the right disincentives. Intel reaped decades of competitive advantage by forgoing security for performance. As you said, they decided that the cost of mitigating these known vulnerabilities wasn't worth the benefit. Now it has come to light, and without due punishment, that math will skew even further toward exploitation over safety in the future.
These technologies were created at a time when it seemed very difficult/near impossible to exploit them for the purposes we're seeing today. Of course Intel should have changed course when processor speed and newer technologies were introduced. Rather, they decided not to worry about this.
A malicious process allocates a 256 member array. Then it creates a conditional where the speculatively executed part writes a byte at offset array + kernel_memory_value. The speculative branch is executed but then backed out, but the byte in the array was touched so it is in cache now. Then the malicious process reads all of the members of the array and looks for one that returns much faster than expected (is in cache) and they know the value of that byte. Rinse and repeat to read the rest of the kernel memory. It's not going to give you MB/s of throughput, but it's plenty fast to read some key material or process tables or anything like that.
It's a very impressive attack. My hat is off to whomever thought it up.
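For anyone who wants to see the cache-timing half of this concretely, here is a small self-contained C sketch of the flush-and-reload probe (x86, GCC/Clang intrinsics). It only demonstrates the covert channel: the secret-dependent access is simulated with an ordinary read rather than a real speculative read past a privilege check, and the 4096-byte stride, names, and values are illustrative.

    #include <stdio.h>
    #include <stdint.h>
    #include <x86intrin.h>   /* __rdtscp, _mm_clflush, _mm_mfence */

    #define STRIDE 4096      /* one page per value, to sidestep the prefetcher */
    static uint8_t probe[256 * STRIDE];

    /* Time a single load; a small result means the line was already cached. */
    static uint64_t time_read(volatile uint8_t *p)
    {
        unsigned int aux;
        uint64_t t0 = __rdtscp(&aux);
        (void)*p;
        uint64_t t1 = __rdtscp(&aux);
        return t1 - t0;
    }

    int main(void)
    {
        uint8_t secret = 42;   /* stand-in for the byte the attacker wants */

        /* Touch every page once so later timings aren't dominated by faults. */
        for (int i = 0; i < 256; i++)
            probe[i * STRIDE] = 1;

        /* 1. Flush: evict every probe line from the cache. */
        for (int i = 0; i < 256; i++)
            _mm_clflush(&probe[i * STRIDE]);
        _mm_mfence();

        /* 2. "Transmit": in the real attack this access happens speculatively,
         * using a value read across the privilege boundary; here it is done
         * directly, purely to show the channel. */
        *(volatile uint8_t *)&probe[secret * STRIDE];

        /* 3. Reload: the index that reads back fastest reveals the value.
         * (Real PoCs probe in a shuffled order and repeat many times.) */
        uint64_t best_time = UINT64_MAX;
        int best = -1;
        for (int i = 0; i < 256; i++) {
            uint64_t t = time_read(&probe[i * STRIDE]);
            if (t < best_time) { best_time = t; best = i; }
        }
        printf("recovered value: %d (%llu cycles)\n",
               best, (unsigned long long)best_time);
        return 0;
    }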
I didn't read it as being about whether the attack is impractical (because it is clearly quite practical) -- the parent was questioning whether it's practical (or not) that such an attack would be "planted" as a backdoor by an agency like the NSA. The attack comes off as quite impractical for something like a plant (sensitive to e.g. compiler output, requiring locally executing code to snoop is already a red flag, and the 'bug' enabling this has really been considered a feature of every mainstream CPU for like 15 years, and not considered by many to be any kind of attack vector.)
Or maybe the idea is speculative execution itself was a dream of the NSA that was Inception-planted into the brains of CPU designers in the 90s; who knows what the theory-of-the-hour is regarding 3-letter-agencies and their capabilities.
Ultimately I think what we're really learning is that guarding against things like microarchitectural attacks on contemporary superscalar, OoO CPUs is going to be an uphill battle we never anticipated, due to incidental complexity (among other reasons), and this will serve as a new class of attacks. Who knows how long this bug class will exist; we've killed some before. What's also likely is that, like most security failures in the industry, this is a result of things like a basic lack of forethought and ill-considered design, as opposed to plants (3-letter agencies aren't responsible for the vast majority of security failures you see; it's simple mistakes). But peddling conspiracy theories involving them gets you upvotes, so, you know...
The Meltdown paper cites 500 KB/s average throughput when transactional memory extensions are available on the Intel CPU. It's not MB/s, but it's still pretty fast.
So it speculatively indexes into the array by the value stored at that memory address and writes the byte. Then to figure out what the value is you just have to see which element in the array is cached.
To elaborate on this, the write to the array isn't what's being read here.
array[value_of_kernel_memory_byte] = 1;
This assignment gets rolled back like it's supposed to. It's when reading the array after the rollback that the exploit measures that a read to array[value_of_kernel_memory_byte] is faster than the rest because that index is already in the cache.
I replied to a comment saying "Or NSA had them do this intentionally"; my comment is not about whether this is a vulnerability, or whether exploitation is viable (it absolutely is), but rather about whether this is a viable backdoor.
If they designed it like this, and it works as they designed it, is their PR spin opening them up to legal challenge? Presumably they've just been misrepresenting what their system was capable of, if they claim this wasn't news to them.
Is it a valid legal defence to say, that was just marketing lies so don't take it seriously?
Well, that's true (at least for the spectre attack). It's not a bug in Intel products. It's a bug in any CPU that uses branch prediction (please correct me if I'm wrong, this is my current understanding)
The effect on Intel is far more severe than on other architectures. There were three vulnerabilities, two grouped together as "spectre" and one called "meltdown." Intel products are uniquely vulnerable to the meltdown attack, while many CPUs are vulnerable to spectre. The summary AFAIUI is that for most CPUs you can attack userspace processes with these techniques (think Javascript running in a browser), while for Intel CPUs you can also attack the kernel. Intel CPUs also seem to be somewhat easier to attack than others because of a higher-bandwidth side channel.
> Intel products are uniquely vulnerable to the meltdown attack
More precisely, there is only a PoC for Intel at this time. AMD processors are believed to not be vulnerable. Some ARM processors _are_ believed to be vulnerable.
> for most CPUs you can attack userspace processes with these techniques
I think that's almost but not quite exactly right.
Spectre variant 2 attacks vulnerable indirect jump code patterns that exist in the kernel (or some other process), but doesn't require running the attacker's code.
Spectre variant 1 allows you to infer the contents of memory in the same address space, so that's the one where you'd use eBPF to attack the kernel.
Meltdown (variant 3) if I understand correctly can infer memory contents of other address spaces without relying on any assumptions about the code running in the other address space.
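For concreteness, the variant 1 ("bounds check bypass") pattern described above is essentially the gadget from the Spectre paper; the array1/array2 names follow that paper, and the 4096 multiplier just spreads values across cache lines so a later timing probe can distinguish them. This is only the victim-side pattern, not a working exploit.

    #include <stddef.h>
    #include <stdint.h>

    uint8_t array1[16];
    size_t  array1_size = 16;
    uint8_t array2[256 * 4096];

    /* Victim code with an attacker-influenced x. After the branch predictor
     * is trained with in-bounds values of x (and array1_size is evicted from
     * the cache), a call with an out-of-bounds x is speculatively taken past
     * the check: array1[x] reads beyond the array, and the dependent access
     * to array2 pulls a secret-dependent cache line in, which can later be
     * detected with a timing probe. */
    void victim_function(size_t x)
    {
        if (x < array1_size) {
            volatile uint8_t tmp = array2[array1[x] * 4096];
            (void)tmp;
        }
    }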
I'm curious about AMD not being vulnerable, mostly because this page: https://www.kb.cert.org/vuls/id/584653 makes me think that AMD has admitted to being vulnerable.
That page talks about both Meltdown and Spectre, as far as I can tell. AMD is vulnerable to Spectre; everyone agrees on that. According to https://www.amd.com/en/corporate/speculative-execution AMD claims it's not vulnerable to Meltdown (aka "Variant Three").
A minor correction, ARM's upcoming A75 core looks like it will be vulnerable to Meltdown too. But since it isn't intended for use on server workloads the performance impact of the fix shouldn't be very significant.
That's like if Takata claimed it wasn't a production issue in Takata airbags, it was an issue that affected any airbag that was designed like Takata airbags.
It's a bunch of bullshit trying to dance around the fact that Intel shipped faulty products for years. The fact that other similar products may also be faulty isn't a valid excuse.
I'm not sure that calling it faulty is exactly fair. I feel like the attack is actually pretty brilliant, and has been non-obviously "vulnerable" for years.
The attack is pretty brilliant; but it's also not quite novel. Cache-timing attacks aren't new; certainly CPU suppliers should have known the general issues at hand for... at least a decade? See e.g. Colin Percival's somewhat related attack on hyperthreading way back in 2005: http://www.daemonology.net/papers/htt.pdf
Actually exploiting the information leakage isn't easy, and it's compounded by the secrecy surrounding CPU internals. So I think they definitely deserve some blame here. Yes, the PoC is new. But the attack surface was widely known more than a decade ago, and they chose to punt the issue onto software; a solution that was unlikely to really hold water.
> Information leakage through covert channels and side channels is becoming a serious problem, especially when these are enhanced by modern processor architecture features. We show how processor architecture features such as simultaneous multithreading, control speculation and shared caches can inadvertently accelerate such covert channels or enable new covert channels and side channels.
The same can be said for many security vulnerabilities, but that doesn't make them any less faulty. Shellshock exploited a bug that had been in Bash for 25 years (!), but being non-obvious isn't a free pass.
I am not angry at Intel and in general think they do a good job, but trying to dodge blame here comes off sounding pathetic.
It would be like Takata claiming their airbags are not a problem because all airbags can kill infants. There is a problem that uniquely affects one manufacturer's product line, and another problem that affects many manufacturers. Intel has been shipping chips that are uniquely and more severely affected by these attacks for many years now. They were ignorant, sure, but they did have a uniquely vulnerable design.
That's like if Master Lock claimed it wasn't a production issue in Master locks, it was an issue that affected any lock that was designed like Master locks.
It's a bunch of bullshit trying to dance around the fact that Master Lock shipped faulty products for years. The fact that other similar products may also be faulty isn't a valid excuse.
IF the bug is present in all processors that use speculative execution AND Intel products use speculative execution THEN it is a bug in Intel products.
It would only be correct to say "This is not a bug or a flaw exclusive to Intel products"
My interpretation is that they are not calling this a bug or a flaw, but a logical consequence of the design, and that it does not represent a flaw because the original specification didn't specify that the implementation be resistant to this kind of attack.
I don't agree. If a spec leaves out details, the implementors are free to make reasonable decisions on how to implement it. It's more likely the spec didn't really have much to say about promises on the visible effects of speculative execution (the specs are often highly detailed about what side effects can be visible; this is a consequence of modern complex processors, which have very subtle designs).
A bug would occur in the case where the specification specified that there were no visible side effects from these mispredicted speculative executions, and the processor implementor failed to implement that part of the specification. This is a big deal because if it's a bug, Intel is liable.
It's likely that for all processors with these kinds of features, the specs will get updated to be more specific about these kinds of side effects.
I think the specification has a problem if a conformant implementation has a problem, particularly if it's a security one. There should be as little as possible left free to the implementors.
The KRACK attack from a couple of months ago is due to the fact that the WPA2 specification was ambiguous about what values to accept. Most implementations allowed decrypting traffic, and a few even allowed hosts to impersonate other hosts, but they were perfectly conformant. I would say there is a flaw in the WPA2 specification.
There are always going to be unintended consequences but this one about effects of branch prediction seems, ironically, quite predictable.
From the perspective of Intel, as well as most of the microprocessor industry, yeah. Even "broken in reality" is arguable: while this is indeed an exploit, choosing performance over security (when the resources to implement a chip are finite) is a legitimate design tradeoff.
Right, so the mismatch here is whether or not the CPU is being pitched as wholly secure. If it is "mostly secure with insecure performance enhancements" or really "mostly secure with performance enhancements that have an unspecified level of security", then... It's "not a bug".
It's a huge lapse in customer trust, for sure. But if you're just going to play that semantics game...
I think the good point here is whether or not Intel engineers knew of patterns like this (they should have) and is it negligent or unethical to release products that have these vulnerabilities built in. Insecure (or "undefined") by intention or by coercion.
It feels like you should be able to make a version of branch prediction with speculative execution that behaves the same as the non-speculative, for-real execution. As in, accessing memory you can't actually access doesn't cause a load and does not speculatively execute further instructions with data you weren't supposed to be able to load in the first place.
This is after all what the real execution does: when you try to load from a memory address that you are not allowed to, it does not cause a memory load, it does not affect the cache, and it does not continue executing instructions but instead generates a page fault. We skip the page fault thing in speculative execution, obviously, but we shouldn't continue normal control flow.
I'm sure that is a lot more difficult to implement in silicon and may end up negating the performance benefits of speculative execution, but right now it feels like not having the memory protection in place in speculative execution is a performance hack that exploded in all our faces.
In spectre, the victim process is coerced (via branch misprediction, in their first example) to speculatively access its own memory, resulting in side-effects that you can measure to determine the memory contents. No violation of memory protection needed.
How is that supposed to work? You would have to flush the entire cache on every branch, which would mostly defeat the point of the cache in the first place.
(The exploit involves reading an off-limits location in the speculative branch, then reading a legal location based on the off-limits value; so unless the entire cache was flushed the attack would still be possible.)
Could you somehow mark cache lines that are filled by speculatively executed instructions so they only become visible to normally executed instructions when the speculative instructions are committed?
I guess that would require a lot more logic and more cache tag bits.
The solution may lie in a new "non-speculative load" instruction that delays execution until preceding branches are confirmed.
This would allow keeping the performance of speculative execution in the vast majority of cases, while also fixing the kind of potentially leaking double-indirections that Spectre variant 1 can exploit.
Spectre variant 2 needs to be fixed by isolation of the branch prediction structures, i.e. tagging the BTB and others with the PCID and privilege level.
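As a point of comparison with a new instruction: a software-only mitigation that has been used for variant 1 (the Linux kernel's array_index_nospec works along these lines) is to clamp the index with branch-free arithmetic, so that even a mispredicted path can't form an out-of-bounds address. A rough sketch with made-up names, assuming an LP64 compiler where right-shifting a negative signed value is arithmetic:

    #include <stddef.h>
    #include <stdint.h>

    /* Returns all-ones when idx < size and all-zeroes otherwise, computed
     * without a conditional branch the predictor could speculate around.
     * Assumes 0 < size <= LONG_MAX. */
    static size_t index_mask_nospec(size_t idx, size_t size)
    {
        return (size_t)(~(long)(idx | (size - 1 - idx)) >> (sizeof(long) * 8 - 1));
    }

    uint8_t table[256];

    uint8_t safe_read(size_t idx, size_t size)
    {
        if (idx < size) {
            idx &= index_mask_nospec(idx, size);  /* mis-speculated OOB idx becomes 0 */
            return table[idx];
        }
        return 0;
    }

Unlike a full barrier, this keeps the load speculatable but makes the speculated address harmless.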
> The solution may lie in a new "non-speculative load" instruction that delays execution until preceding branches are confirmed.
That's sadly not enough.
There are also potential speculation-based side-channels that are unrelated to cache: timing the idiv operation whose latency "depends on number of significant bits in absolute value of dividend." Furthermore there are many ways to measure contention on the execution ports which can leak information about the speculative execution.
Well, there seems to be a lot of hysteria right now. You mention speculation-based side effects in general, but keep in mind that only data-dependent side effects matter. There aren't that many of those outside the memory system, if any.
You mention idiv, which is an interesting example. Is it actually observable? Without hyper-threading, almost certainly not, because the execution unit reservations should be dropped as soon as the speculated branch is resolved. With hyper-threading? Perhaps, but it shouldn't be too hard to fix if a non-speculated thread always takes precedence over a speculated one, which makes sense anyway.
In both cases there may be leaks in practice, but those should be fixable relatively easily in hardware, and we've already established that hardware fixes are required anyway.
Another case to think about are dependent conditional branches, since those could affect the instruction cache.
It does seem like a good idea to have a strict barrier instruction against speculative execution just to have a fallback mitigation.
They are saying this to protect future products that are too far along to change. A flaw must be fixed, and this thing will exist in near-future chips for years until they redesign. So long as they can call it a software issue, they need not stop production.
So people can leverage the shitty design of your lock to steal goods? And it's okay because the lock is functioning as it should?
It shows how deeply embedded the idea of maintaining proper design and processes is, over going back to the drawing board and designing a secure lock.
It's an interesting debate. But it's clear in the statement they provided that they are laser-focused on design above all else.
Literally, prioritize proper design over security and maybe even performance in some cases. Which is interesting because you'd figure at some point, the customer would be considered in the design.
Like are Intel chips designed by robots or something?
Our service works just as expected, without any flaws. Our ushers check for proper tickets after the presentation has finished; that way we know who should and should not have access to the theatre.
I read this as saying "our lawyers do not want us to preemptively strengthen the case for legal liability for damages", and as being completely independent of plans for mitigation or future product changes.
And now people have found (made?) two workarounds for those hardware design specifications: Spectre and Meltdown.
So people made workarounds for those two workarounds: software patches.
Maybe yet more people will find (or make?) workarounds for those software patches?
Will we witness a semi-endless cycle of workarounds until the current design specifications slowly become worthless?
Or will we suddenly witness new updated (patched?) design specifications (with some extra free features we never knew we wanted) and all buy new hardware?
This ridiculous double-speak is just their lawyers attempting to minimize exposure to claims for replacement processors due to not performing to specifications.
I find it very interesting that most of their official communication on this matter seems to be for their shareholders rather than the people who are directly impacted by using their products, as if they already completely gave up on the latter. Or maybe they think we wouldn't see through this bullshit PR speak, in which case I feel quite insulted.
The majority of the people pushing the stock price down have no clue how bad this is: they are just making a bet (and many of them will make a lot of money even if this turns out to be an elaborate hoax). Sure, a lot of people disagree, but if you read all the threads here you quickly get a sense that even among people who understand what this means on a technical level (those who read Hacker News), there's no agreement on how bad it is.
For those like me who are running Linux and asking how to update the BIOS, Lenovo provides an ISO file. Quoting from [1]:
"The BIOS Update CD can boot the computer disregarding the operating systems and update the UEFI BIOS (including system program and Embedded Controller program) stored in the ThinkPad computer to fix problems, add new functions, or expand functions as noted below."
Yeah, basically all BIOSes have that. Some simply need a FAT USB key inserted with a specifically-named file that the BIOS looks for instead of a "true" bootable media.
And how do I get these firmware updates and microcode updates on Windows?
Checking ASRock, there aren't any BIOS updates.
Also:
> Customers who only install the Windows January 2018 security updates will not receive the benefit of all known protections against the vulnerabilities. In addition to installing the January security updates, a processor microcode, or firmware, update is required. This should be available through your device manufacturer. Surface customers will receive a microcode update via Windows update.
>In addition to installing the January security updates, a processor microcode, or firmware, update is required. This should be available through your device manufacturer. Surface customers will receive a microcode update via Windows update.
This appears to be the case at least on my Windows 10 laptop.
I've installed the hotfix for Windows, but when I run the PowerShell script to determine whether mitigation is active, the script tells me that it's not active, due to lack of hardware support. The script then goes on to give the recommendation to "Install BIOS/firmware update provided by your device OEM that enables hardware support for the branch target injection mitigation."
It's a 1-year-old ASUS laptop, and I would be surprised if their technical support even gives a sane response to my question (I doubt they will even know what I'm talking about).
There has been recent news about critical security issues in Intel CPUs, requiring a firmware update for all laptops and motherboards with Intel chips.
The vulnerabilities include the potential for malicious websites to read sensitive system memory, including passwords and encryption keys.
I have model XXYY-ZZZZ, do you have any information on when an update will be available, and where I can access it?
If not, can you attempt to escalate this ticket? The security issues are starting to make the rounds in the news, and more information can be found at https://meltdownattack.com
I saw the same thing on my Lenovo laptop: installed the Windows update, the PowerShell scripts said it's missing HW support. Installed a new BIOS update from Lenovo (released two weeks ago, btw), now the PS script says I have the needed HW support and is now protected. So on this laptop, the needed microcode appears to have come from a new BIOS and not via Windows.
I have another Gigabyte MB that I suspect is too old for BIOS updates anymore, so I am really hoping that at some point these microcode updates do come through Windows and not just via BIOS updates.
Somehow I doubt Sony's going to be updating the BIOS for this VAIO laptop I'm typing this on, given that the last update was in 2012... and they don't even make computers anymore.
> Intel has developed and is rapidly issuing updates for all types of Intel-based computer systems — including personal computers and servers — that render those systems immune from both exploits (referred to as “Spectre” and “Meltdown”) reported by Google Project Zero.
Now I really wonder how they managed to patch both? Does someone know how you could patch something like that from Intel's side of things?
The first example usage given there is very illustrative:
As an example, modules such as branch predictors and speculative execution units can be turned off with a variant of the “chicken bits”, control bits common to many design developments to control the activation of specific features.
All they have to do is disable memory fetching during speculative execution so there are no side effects. That probably wouldn't affect performance much, and some unusual workloads might even get faster.
> that render those systems immune from both exploits (referred to as “Spectre” and “Meltdown”)
The language used seems to imply that they have patched all three identified vulnerabilities.
Is it possible that if this update is applied to a system running an unpatched Linux kernel on an Intel CPU that the system is no longer vulnerable to Meltdown?
Or does this microcode update complement work done by the kernel developers implementing KPTI as well as by the browser developers mitigating some portion of Spectre?
(My initial take is that the efforts by CPU manufacturers and OS/Software developers are orthogonal to provide better coverage to affected parties, such as those on CPUs where no update exists yet)
I'd also be interested in an answer here. Right now it seems like this might just add support for OS vendors and software developers to mitigate this down the road by selectively disabling pipelining features in their code, or at least that's what my reading of it seems to imply.
I'd be surprised if Meltdown was worth less than $2B. And I wouldn't be surprised if it was worth over $100B.
Of course that is raw value, however in practice it would be EXTREMELY hard to launder such a large amount of money into and out of an online transaction.
The question then becomes, what kind of person is the hacker? Are they in a position where they can wait and find a good deal or would they rather wash their hands of it over $100M or so?
For this kind of exploit, I don't think you'd have to worry about the transaction. The NSA/(insert spy agency here) would gladly send a truck full of cash to your house for it (assuming they didn't know about it already).
I like to think the NSA/TLA folks are sitting in an office somewhere incredulous that someone found this bug they'd never have even dreamed up, whilst a catalogue of simple buffer overflows and mandated backdoors that no one has yet found sits on the shelf ready for offensive use.
Side channels in general have been known for a long, long time in mil and intel communities, so it is likely that folks there long ago identified and understood that it's a bad idea to share a single x86_64 core between multiple tenants in a public cloud, for example, even if this specific attack vector was not explicitly tagged & bagged.
I was under the assumption that it's worse than just sharing cores or servers. Even machines that do not share cores but share a common hypervisor are cross-vulnerable.
From Wikipedia [0]:
Spectre has the potential of having a greater impact on cloud providers than Meltdown. Whereas Meltdown allows unauthorized applications to read from privileged memory to obtain sensitive data from processes running on the same cloud server, Spectre can allow malicious programs to induce a hypervisor to transmit the data to a guest system running on top of it.
However I have also read that Spectre is confined to userspace, so I'm not sure who is correct.
I was under the impression that the changes for Clang with retropipe was for specter. Mitigation of it then can be done by recompiling... everything.
> Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.
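For the curious, the thunk that retpoline substitutes for an indirect jump is tiny. A sketch of its published shape (GCC-style file-scope asm in a C file; real compilers emit their own named thunks, and the symbol name here is made up, not the kernel/LLVM one):

    /* Instead of "jmp *%r11", retpoline-compiled code jumps here.
     * Architecturally this still ends up at *%r11 (the mov overwrites the
     * return address before ret executes), but the return-stack predictor
     * sends any speculation into the pause/lfence loop instead of an
     * attacker-trained target. */
    __asm__(
        ".globl my_retpoline_r11\n"
        "my_retpoline_r11:\n"
        "    call 1f\n"
        "2:  pause\n"
        "    lfence\n"
        "    jmp 2b\n"              /* speculative path is trapped here   */
        "1:  mov %r11, (%rsp)\n"    /* replace return address with target */
        "    ret\n"
    );

Hence "recompile everything": every indirect call and jump site has to be rewritten to load its target into a register and go through a thunk like this.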
Yes, but then you'd just end up with having a vulnerable application running inside of the VM. The mitigations in the compilers are for making programs compiled by those compilers immune(ish?) to Spectre, not to make it impossible for those programs to use Spectre to attack other processes.
You can, but Spectre only affects your own process, and if you can compile/run your program, it's unlikely that exploiting anything to read the memory of your own process is any danger.
> Additionally, how does this interact with VMs? It makes the most sense to me that I'd need to microcode patch the hosts, but what about the guests?
Microcode is for the physical CPU; it is ultimately stored in on-die volatile memory in the CPU. Microcode can be delivered to the CPU on boot by the BIOS/UEFI (stored in its firmware) and/or by the host OS on boot (it doesn't even matter what the OS is: Windows, Linux, etc.). This is secure because microcode is cryptographically signed.
As for software, the OS kernel and hypervisor patches for VMs... From everything that has been said so far, it sounds like these bugs can't be solved entirely in microcode, so software patches are needed as well, I believe.
For Linux this is typically handled as an extra 'initrd' that the bootloading process hands off to before giving control over to the next (and real) initrd...
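If you want to see which microcode revision actually ended up loaded (regardless of whether it came from the BIOS or the early initrd), the kernel exposes it in /proc/cpuinfo. A trivial C sketch of that check (you could equally just grep for the field; assuming the standard x86 "microcode" field name):

    /* Print the "microcode" field from /proc/cpuinfo (x86 Linux).
     * Every core normally reports the same revision, so stop at the first hit. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        FILE *f = fopen("/proc/cpuinfo", "r");
        if (!f) { perror("/proc/cpuinfo"); return 1; }
        char line[256];
        while (fgets(line, sizeof line, f)) {
            if (strncmp(line, "microcode", 9) == 0) {   /* e.g. "microcode : 0xc2" */
                fputs(line, stdout);
                break;
            }
        }
        fclose(f);
        return 0;
    }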
Out of interest, do you know if the process of uploading the microcode via the OS is independent of the BIOS? E.g. on something with an unusual boot ROM like a Mac or a Pi (pretend it had an x86), would the kernel not care? I.e., is it just a CPU instruction?
I always assumed that the CPU itself would have an internal signature verification mechanism built in. But after searching, it seems the OS or BIOS is responsible for quite a lot... I haven't found anything clearly stating if and where cryptographic signatures are verified. One source at least suggested that it's only present on post-2013 CPUs!
Ah, you are correct, I got confused watching some video of someone hacking away at the built-in ROM on a de-capped chip... suffice it to say, overwriting the permanent internal copy of the microcode does not normally happen. The advantage of updating the ROM in this crazy way is purely to defeat the verification of the signature.
I find CPUs fascinating to be honest. While they are complex, they are not impenetrable. Playing with small assembly snippets and simulators helped me a lot to understand them.
Also, other architectures such as MIPS and M68K have simpler to understand ISAs which accelerate the process of understanding these small wonders.
These exploits are the result of the games we play to get high performance when memory is slow but processors are fast (speculative execution; caches).
It's time to figure out how to make RAM dense, cheap, AND fast, so we can finally eliminate the cache layer from the ridiculous hierarchy of storage technologies.
It will take a LONG time before we have tens of gigabytes of memory running with the same bandwidth and latency as L1 cache. According to a StackOverflow question [1], an L1 cache hit takes just over 2 nanoseconds, while RAM takes upwards of 100 nanoseconds.
I don't think that's realistic; registers and SRAM are a lot more expensive than DRAM and consume a lot more power. I think having some combination of fast, expensive RAM and slow, cheap RAM in any system larger than a tiny microcontroller is just a fact of life.
A different way to attack this problem is to make timing attacks harder by disallowing programs from reading the current time or inferring the passage of time by monitoring non-deterministic interactions between threads.
Pure Haskell code (i.e. code that runs outside the IO monad) is an example of an execution environment that conceals timing from the running code.
There are certain kinds of problems that aren't amenable to running in a completely side-effect and non-determinism free sandbox, but maybe for some applications it's a viable solution.
Flops are cheap. Bandwidth is expensive. Latency is physics.
We aren't at full light-speed latencies with RAM technologies yet, but we're closer than you may realize, and it's unlikely we can do substantially better any time soon without revolutionary advances.
Curious. So you're suggesting that if we substituted SRAM for DRAM (assuming it was just as cheap and dense), that the absence of CAS latency, refresh, and capacitors wouldn't make much difference because offchip speed-of-light delays dominate?
[Edit] I realize it would always be slower than on-chip cache, but I thought it would still be at least an OOM faster than conventional DDRx DRAM.
More precisely, the electromagnetic waves on the waveguides of the motherboard (bus) move at around 2/3 the speed of electromagnetic waves in vacuum. Not as fundamentally separated as one would imagine.
They will update 90% of the chips 5 years old or newer, which I assume includes Ivy Bridge and later. But the bug affects older chips, too. Important to keep that in mind.
I don't know why you're being downvoted; it's crucial that Intel patches this deficiency for all CPU models which are currently in use, which is pretty much all of them, since some systems work nearly forever. We're talking about going back 15 years or so?
I mean this is like the Y2K bug, in that every system needs to have its CPU verified to ensure that it's covered by the patches, otherwise the system should be shut down and a new CPU provided. Since many of these CPUs have sockets for which CPUs are not produced anymore, whole new systems will have to be purchased.
Of course, this might be the result Intel wants to achieve.
I imagine most laptop and motherboard vendors have precisely zero interest in locating/dusting-off whatever toolchains are required to mint new BIOSes for long discontinued products.
> Intel has already issued updates for the majority of processor products introduced within the past five years. By the end of next week, Intel expects to have issued updates for more than 90 percent of processor products introduced within the past five years.
The post is silent on what will be done with older chips; it doesn't say that they are not going to be updated. It makes sense that Intel prioritizes the newer chips.
Some careful wording there. I'm guessing that "introduced" means when they initially went on sale or something like that. I'd like to know what percentage of systems sold to consumers 2-3 years ago had processors that were "introduced" 5 or more years ago.
It's similar to the absurdly short supported lifetime of many phones, they support them for "2 years after release" but in many cases people are still buying them brand new 2 years after release.
It makes sense but 5-10 year old laptops aren't uncommon nowadays.
Had this occurred 20 years ago nobody would have cared because computers would get replaced super fast anyway. Nowadays, even a geeky household can have 5+ year old CPUs lying around and still used.
Anecdotally: I game, compile, transcode, and do any other number of CPU-intensive tasks on my machine. My household is definitely "geeky".
I still run an i7 2600k, which is a Sandy Bridge CPU introduced 7 years ago.
I don't (didn't? may not?) have any pressing reason to upgrade, although I was considering it. Funnily enough, even before now I was considering the AMD Zen architecture, and although I'm still undecided I'm being pushed further and further towards AMD through ethical considerations alone.
You won't need a microcode update. The microcode update is only needed for Broadwell and newer (for those, retpolines don't work). Also, hello, fellow 2600k owner :D
https://newsroom.intel.com/wp-content/uploads/sites/11/2018/... - processors older than 5 years don't need a microcode update, and probably won't even benefit. The new instructions for preventing variant 2 are inferior to retpolines, even according to Intel...
> Note that the insertion of LFENCE must be done judiciously; if it is used too liberally, performance may be significantly compromised.
Wow, as if we could press some magic button and it would pinpoint all the places where a branch condition involves user-controlled data. Of course Intel is downplaying this one to the max, the performance implications for a thorough fix are devastating.
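Concretely, "inserting LFENCE judiciously" means finding every pattern like the one below and adding a serializing fence after the bounds check so the dependent loads can't run ahead of it. A minimal sketch (the function and array names are made up; _mm_lfence() is the standard compiler intrinsic for LFENCE):

    #include <stddef.h>
    #include <stdint.h>
    #include <emmintrin.h>          /* _mm_lfence() */

    uint8_t lookup[256 * 64];

    uint8_t checked_read(const uint8_t *array, size_t len, size_t untrusted_index)
    {
        if (untrusted_index < len) {
            _mm_lfence();           /* later instructions may not execute until
                                       the bounds check has actually resolved */
            return lookup[array[untrusted_index] * 64];
        }
        return 0;
    }

The hard part is the "judiciously": nothing tells you mechanically which indices are attacker-controlled, and fencing every bounds check would be brutal for performance.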
> including personal computers and servers — that render those systems immune from both exploits (referred to as “Spectre” and “Meltdown”) reported by Google Project Zero.
You can't in conventional software; this is almost certainly a microcode update. It remains to be seen at what cost. This change may well just totally disable features that are normally heavily used in pipelining.
Like Intel didn't spend tons of money over the past 20 years on TV commercials playing a unique set of musical tones to convince you to buy a computer with a flawed chip inside.
If it is (very likely, because only Intel can issue them thus far), they probably figured out how to turn off the branch predictor.
The performance drop will still be there, but better than the "recompile everything to not use indirect jump and call instructions"[1][2], at any rate.
Note: I was obviously being facetious, but it made me wonder how far back branch prediction goes, not being a microarchitecture expert myself... Surprised to find that even the old 68k utilises it! Not that they necessarily suffer the same fate through the concept alone, of course.
I wouldn't bet on that lasting, though. At this CCC, someone essentially reverse-engineered the microcode on 5-10 year old AMD processors. I wouldn't be surprised if they figure out how to do it on newer processors and Intel processors.
Intel's microcode updates are signed by a 2048-bit RSA key and some unknown 160-bit hash function (might be SHA-1, but no one publicly knows) for older models, with newer ones using a 256-bit hash function. Unless there's a very clever exploit, I doubt third-party microcode updates will ever be possible:
Even this PoC was limited to 2000 bytes a second; even a little noise may well render the attack largely theoretical, or at least make it so slow that other (expensive) mitigations only need to be applied very rarely.
Yeah, it's not a brilliant solution, but it would help.
Even one byte per minute would be enough to cause serious issues. Imagine if some ad served up by a torrent site could read information from another tab rendered by the same process?
Not good, but still thousands of times better. And don't forget that you still need to guess where to read from, which better be a very very accurate guess at that rate. Making such an accurate guess is difficult when your runtime is as complex as a browsers, and so even without that extra timing noise no PoC exists (AFAIK?) attacking browsers.
But you know... you're not wrong ;-). I'm just not particularly worried about the likelihood of this attack hitting anything I care about anytime very soon.
Google does have a PoC for Spectre that attacks Chrome via JavaScript. That said, I am not super worried either. It's not like all my personal data isn't already out there for all to see (thanks, Equifax).
Side-channel timing attacks work on websites by executing MANY similar requests. In a similar way, in this case, while individual requests may not appear to adhere to a timing profile, over time they likely will still be discoverable.
Cf. the random number generator in "The Mythical Man-Month" that only guaranteed any one number would be random, not a sequence of numbers.
"According to the latest update from Intel, a microcode is required to completely fix the bug. The microcode release date is, at this time, scheduled for an undisclosed confidential unacceptable late date"
The attacks require co-scheduling with the address space (thread/process) to be attacked. So, exclusive process affinity is a working mitigation against these attacks and other per-core micro-architectural attacks.
So if your Linux distribution provides a new release of linux-firmware package, you may get new versions of WiFi firmware packages as well with no actual changes to those. This seems to be the case with at least RHEL: https://access.redhat.com/errata/RHSA-2018:0015
The RHEL linux-firmware update contains an AMD microcode update that involves Spectre mitigations. AMD microcode is mentioned at https://access.redhat.com/articles/3311301 and the linux-firmware package changelog says:
* Wed Dec 27 2017 Rafael Aquini <aquini@redhat.com> - 20170606-57.gitc990aae
- Add amd-ucode for fam17h
(Intel CPU microcode is part of microcode_ctl package, updated separately)
Thank you very much for the response; I didn't realize RHEL would ever version-bump a package without anything actually changing!
For the sake of my own sanity, I tested the iwlwifi-1000 firmware package iwl1000-firmware-39.31.5.1-57 and "-58" and can see no changes when hashing the files.
I wonder if the terminology of blocking specific exploits vs fixing vulnerabilities is intentional.
Even the "facts" page consistently uses "these exploits" and says stuff like: "These exploits, when used for malicious purposes, have the potential to improperly gather sensitive data. Intel believes these exploits do not have the potential to corrupt, modify or delete data."
Is this lawyering so they can say they never claimed to have addressed the vulnerabilities?
It's easy to fix Meltdown with a microcode update. Do we know if Intel (or AMD or ARM for that matter) are also attempting to fix Spectre? Spectre seems much more difficult.
I wonder what Linus thinks about all of this. Does he still believe this is no different than a random UI bug, as he used to say not too long ago about security bugs?
Can we avoid the Windows 10 update? Frankly, I don't want my gaming & machine learning rigs slowed down (whatever you may say about negligible impact) and would like to opt out of this update on some of my systems. Is that possible? Or am I forced to waste, at worst, 30% of my CPU cycles on mitigating this issue, even if the machine never runs a browser for more than a few seconds on very specific trusted domains?
These patches should only affect syscall entry/exit performance. So if you're not doing things that do enormous amounts of syscalls, you should be more at the 3% end of the scale. The 30% number comes from doing things like calculating the file sizes of a directory with a lot of very small files, where the performance overhead of a single syscall matters. In most performance sensitive applications like games, you probably won't see any syscalls in a hot loop since that'd be too slow anyway. Even if you're doing heavy IO, then that is most likely also done in batch and blocked on disk IO speed, not on syscall overhead.
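If you want to see where the overhead actually lives, a syscall microbenchmark makes it obvious. A rough Linux-only sketch (the 10M iteration count is arbitrary, and getpid is just a conveniently cheap syscall):

    /* Measure raw syscall round-trip cost: each iteration enters and leaves the
     * kernel once, which is exactly the path that KPTI makes more expensive. */
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        const long n = 10 * 1000 * 1000;
        struct timespec a, b;
        clock_gettime(CLOCK_MONOTONIC, &a);
        for (long i = 0; i < n; i++)
            syscall(SYS_getpid);            /* cheap syscall: cost is almost all entry/exit */
        clock_gettime(CLOCK_MONOTONIC, &b);
        double ns = (b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec);
        printf("%.1f ns per syscall\n", ns / n);
        return 0;
    }

Run something like this before and after enabling PTI and the per-syscall cost jumps; a game doing a few thousand syscalls per frame barely notices, while a stat-heavy workload doing millions per second very much does.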
There are reports on Reddit that NVidia's driver is affected due to heavier usage of the patched functionality, which would hurt both gaming and deep learning. In both, a 5-10% drop is pretty unacceptable, as that is often basically the performance gap to a lower-tier card (e.g. non-Ti vs. Ti).
That's just the PTI fix. On top of that come runtimes recompiled with retpolines. The question is whether the microcode update contributes an additional slowdown once used or makes the other mitigations cheaper.
I wonder if an optional microcode update disabling usermode access to the cache flush and rdtsc instructions would make it impossible to do the low-level timings required to make the measurements.
(Of course it would break other stuff, but maybe being able to enable these process by process might be appropriate.)
Thinking about it more, a patch that masks out (or sets) the N LSBs of the value returned to user-space rdtsc accesses might do the trick; it wouldn't break anything, and time would still be monotonic.
The recent CCC paper on how to write your own microcode (for AMD CPUs) means that AMD owners can actually experiment with a fix for this on their own CPUs... now if only Intel would let us hack the microcode on our own CPUs too...
Just masking the lower bits is not enough: you can easily reconstruct a high-precision timer from that by observing the clock edges (waiting until rdtsc increases): wait for a clock edge, do your experiment, then increase a counter in a busy loop until the next clock edge. The counter now correlates strongly with the amount of time taken by the experiment (see the sketch after this comment).
Also, while cache flushes are the easiest way to trigger this, I am not convinced that there aren't other ways to do it. For example, you could attempt to clear the cache by spamming it with new data, or you could try to use one of the other side channels (I've seen ALU ports mentioned as one alternative: instead of accessing cache lines, you do some heavy math operations that occupy the ALU ports of the CPU and then observe how long ALU operations take in userspace).
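In sketch form, assuming a hypothetical rdtsc whose low 12 bits are hidden (the masking idea from above; everything here is illustrative), the reconstruction looks like this:

    /* Recover sub-tick timing from a coarse clock by counting how much of the
     * tick interval is left over after the experiment finishes. */
    #include <stdint.h>
    #include <x86intrin.h>          /* __rdtsc() */

    static uint64_t coarse_now(void)
    {
        return __rdtsc() & ~0xFFFull;       /* pretend the bottom 12 bits are masked */
    }

    uint64_t leftover_spins(void (*experiment)(void))
    {
        uint64_t edge = coarse_now();
        while (coarse_now() == edge) ;      /* 1. align to a tick edge            */
        experiment();                        /* 2. run the code being timed        */
        uint64_t now = coarse_now();
        uint64_t spins = 0;
        while (coarse_now() == now)          /* 3. count spins until the next tick */
            spins++;
        return spins;                        /* fewer spins => experiment ate more of the tick */
    }

Averaged over many runs, that counter gives back most of the resolution the masking was supposed to take away.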
In at least one of the variants, I believe the branch predictor is primed in the attacker program with some value, say 0x00ABCDEF, by jumping there consistently in certain circumstances.
Then the CPU switches to running another program that is allowed to access different memory. The CPU encounters a branch, predicts that it will land at 0x00ABCDEF, and briefly begins running attacker-chosen code. Since it's working on behalf of the victim program, that code can access the victim program's data. After several cycles the CPU realizes it made the wrong prediction and discards those instructions. However, the malicious code has already stored its results by shuffling things around in the cache, and the cache isn't rolled back because the CPU doesn't treat it as state that needs undoing.
Therefore, the exploit can access memory from other programs. This is my limited understanding of Spectre. Reading the white paper, it must be much more complicated.
Apparently the CPU would normally stop accesses even speculatively when running from the malicious program (except for Meltdown, where it doesn't even stop that).
No, Spectre can only read its own process's memory, which makes it boring for native programs but interesting for anything running interpreted code, e.g. JavaScript, because that code can suddenly peek outside its sandbox.
Meltdown goes one step further in that it can read kernel memory (but not memory of other processes either, unless it was explicitly mapped in).
As I understand it, you can extract memory from another process by controlling input data to that process: first sending data to train an exploitable branch-prediction point toward the predicted path, and then sending data that uses a misprediction to extract some data.
The paper cited has an example C program that works within a single process, but it sets the stage to show how this could be achieved across process boundaries (roughly the shape sketched below).
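The gadget in the paper is basically a bounds check followed by a dependent double load, roughly this shape (a paraphrased sketch, not the authors' exact code; the array names and the *512 spacing follow the paper's example):

    #include <stddef.h>
    #include <stdint.h>

    uint8_t array1[16];
    size_t  array1_size = 16;
    uint8_t array2[256 * 512];
    uint8_t temp;                            /* keeps the loads from being optimized away */

    /* Called many times with in-bounds x to train the branch "taken",
     * then once with an out-of-bounds x chosen so array1[x] lands on the secret. */
    void victim_function(size_t x)
    {
        if (x < array1_size)
            temp &= array2[array1[x] * 512]; /* speculative read leaves a cache trace */
    }

Cross-process, the trick is getting something of this shape to exist in the victim and to run on attacker-chosen input; in-process (e.g. inside a JS engine), the attacker can often just build the gadget themselves.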
This branch-prediction gadget needs to be mapped into the address spaces of both attacker and victim (i.e. a shared lib). But even if the attacker can get this code to run on both sides and examine the dependent timings to find out the underlying values, the victim code can't be tricked into sequentially reading-by-speculation all of its internal data.
Does the instruction cache get poisoned as well on a mispredict and carried over?
Yes, they will. Microcode is hot-loaded. There is no way to permanently update a CPU's microcode; it is uploaded to the CPU on each OS boot. They come packaged in the appropriate Windows update or macOS update, or for example on Ubuntu you'll get them the next time you upgrade with apt. So to answer your question, the next time the user updates their operating system, provided the OS manufacturer has included the new firmware, the user will automatically get the new microcode which will get patched onto the CPU on each boot of that OS from then onwards.
Average user? Even in the average corporate I never see anyone do firmware updates unless it's to solve a specific issue.
It also tends to be bad for the above-average users here; most firmware patches tend to rely on Windows software to install, which is suboptimal when you've installed Linux on the machine.
I'm not sure I'm following you, but I'm saying they're releasing a fix, as if that really deals with the problem. How many people are actually going to install the fix? How many people out there even know about these issues?
Going forward there need to be easy ways to update all of this stuff. These flaws won't stop coming out. It would be great if they got installed with OS updates, but I can understand if there's a security problem there.
Maybe I'm using big words I don't understand. What is it that Lenovo and such are distributing as part of this? I know they're labeled as BIOS updates but I assumed it's more than that since the BIOS is a different component.
EDIT: So looking into it, it seems that CPU microcode is indeed the firmware, and I presume that the firmware is where a fix such as the one released here would be.
Somebody mentioned a Lenovo update elsewhere in comments. The recent ME updates required an update from Lenovo. I assumed the same would apply here but thankfully not. I wonder why the OS can't update ME firmware the same way.
Updated processor microcode has to be loaded to the processor on every boot, and this can be done either by the system BIOS/firmware or by the OS (or both).
Yeah, and it's not just about new computers. It's a huge portion of all existing computers. Even my 10-year-old Core 2 Duo laptop will get slower after I update it (and I'm not too happy about that).
> Fully removing the vulnerability requires replacing vulnerable CPU hardware.
Proof: https://webcache.googleusercontent.com/search?q=cache:rzc6iQ...