These results look pretty encouraging as to the general quality of the various browsers, until you read the fine print: "Only security bugs were counted in the results (doing anything else is tricky as some browser vendors fix non-security crashes while some don’t)." So, the numbers they show don't count random crashes, never mind incorrect behavior.
And who'd have thought that some vendors aren't even interested in fixing "non-security" crashes; I wonder which vendors those are? And I wonder how many "non-security crashes" their fuzzer found in each browser?
Even worse, they laud MS for the MemGC technology they use in their browsers, but MemGC is just a last-minute hack to try to keep use-after-free bugs from being security holes (by ignoring frees to anything that still has a pointer to it), without having to actually fix them. So, lots more bugs that really ought to crash instead go on to other undefined behavior, which hopefully doesn't amount to a security problem (never mind simply incorrect behavior).
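For anyone unfamiliar with the idea, here's a minimal C++ sketch of how a deferred-free scheme like that works, as I understand it. To be clear, it's a toy, not Microsoft's actual MemGC: the DeferredHeap class and the explicit root list are made up for the example, whereas the real thing conservatively scans the stack and heap for anything that still looks like a pointer to the block.

    // Toy sketch of the deferred-free idea behind MemGC-style mitigations.
    // NOT Microsoft's implementation: a real scheme conservatively scans the
    // stack and heap for pointer-shaped values; here the "roots" are passed
    // in explicitly to keep the example short.
    #include <cstddef>
    #include <cstdlib>
    #include <iostream>
    #include <vector>

    class DeferredHeap {
     public:
      void* allocate(std::size_t n) { return std::malloc(n); }

      // A "free" only records that the caller is done with the block;
      // nothing is reclaimed yet.
      void deferred_free(void* p) { pending_.push_back(p); }

      // Reclaim only the blocks that no known pointer still refers to.
      // A use-after-free on a still-referenced block therefore reads stale
      // but still-allocated memory instead of a reallocated object.
      void sweep(const std::vector<void*>& roots) {
        std::vector<void*> still_pending;
        for (void* p : pending_) {
          bool referenced = false;
          for (void* r : roots) {
            if (r == p) { referenced = true; break; }
          }
          if (referenced) {
            still_pending.push_back(p);  // keep it alive for now
          } else {
            std::free(p);                // truly unreferenced: reclaim
          }
        }
        pending_ = std::move(still_pending);
      }

     private:
      std::vector<void*> pending_;
    };

    int main() {
      DeferredHeap heap;
      int* node = static_cast<int*>(heap.allocate(sizeof(int)));
      *node = 42;

      int* dangling = node;              // a pointer somebody forgot to clear
      heap.deferred_free(node);          // logical "free" of the object

      heap.sweep({dangling});            // not reclaimed: dangling still points at it
      std::cout << *dangling << "\n";    // stale read, but not a reallocated object

      dangling = nullptr;
      heap.sweep({dangling});            // now the block is actually freed
    }

The upshot is visible in main(): the first sweep can't reclaim the block because a dangling pointer still refers to it, so the stale read hits old-but-still-allocated memory rather than a freshly reallocated object. That's exactly the "mitigated but not fixed" behavior described above.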
So, it seems that their fuzzer is finding lots and lots more bugs, and they're just not telling us how many there actually were. Maybe the title should indicate better that they're limiting themselves to security vulnerabilities. I suppose there's a call for that sort of measurement, but it would be nice to also hear about the general quality of the software under test. The notion that they'd rate a browser that did exactly nothing as having a 100% perfect rating of zero bugs kind of rankles.
Project Zero is exclusively concerned with security: when they say "bug" they mean "security bug"; when they say "fuzz" they mean "fuzz with an intent to find exploitable security vulnerabilities", etc.
IMO this restricted sense of "bug" is clear from the second sentence of the article ("in the recent years the popularity of those kinds of bugs in targeted attacks has somewhat fallen in favor of Flash (which allows for cross-browser exploits) and JavaScript engine bugs (which often result in very powerful exploitation primitives)"). Even the first sentence becomes questionable if we interpret "bugs" more broadly (Historically, DOM engines have been one of the largest sources of web browser bugs); the example given is a CVE, etc.
So IMO I wouldn't call it the fine print. And in this context, the remarks about MemGC also make sense (Microsoft considers these bugs strongly mitigated by MemGC and I agree with that assessment […] useful mitigation that results in a clear positive real-world impact): "mitigation" is the term for something that makes it hard to exploit a [security] bug, not for something that makes the bug go away.
A similar question about general bugs (as in "the last bug in TeX…") would be very interesting, but I don't know if we have a way of measuring it.
You’re missing his point. The issue is not the purpose of the effort; that’s clear, and everyone agrees. The question I infer is something closer to: if crashing bugs are inadvertently found in software, from whatever source, what would you expect the interest/motivation/urgency to be in fixing them?
It’s a good question but I’m not sure the answer is clear. On small projects I’d make the case that 100% of reproducible crashes should be fixed, always. However, when the “project” approaches operating-system scale and complexity, pure-principle policies like that tend to break down for practical reasons. Most decisions, not just bug fixing, come down to trade-offs and cost-benefit considerations, due to the simple realization that you can’t boil the ocean.
"Boil the ocean" is an expression meaning to do the impossible, so I don't think it matters much whether we have a responsibility to do something we are incapable of. It does matter whether we do the next best thing, and that's where the trade-offs come in.
"And who'd have thought that some vendors aren't even interested in fixing "non-security" crashes"
That doesn't surprise me too much -- it comes down to prioritization, and often the crashes a fuzzer finds will be weird corner cases that don't happen in the 'real world' unless somebody is deliberately exploiting them. Since 'browser crashes safely' isn't a very interesting outcome for the deliberate attacker or a major issue for the user, it seems reasonable to me that a vendor might prefer to prioritize fixing other crasher bugs that telemetry says users are actually hitting, or fixing bad UI that annoys users, or any of the other thousands of issues in their issue tracker.
No software is bug free and no dev team has unlimited resources, so any decision to fix a bug is implicitly a decision not to fix some other bug. "We found this with a fuzzer" isn't an automatic ticket to the front of the queue.
> And who'd have thought that some vendors aren't even interested in fixing "non-security" crashes;
While I can't answer that, I can tell you that I was very surprised Firefox showed little interest in fixing this:
https://files.hboeck.de/crashff.html
(Linux only, crash via the Notification API, bug open for a year)
I can reproduce it on Firefox 55 and 56 in Ubuntu MATE; it crashes after I allow the notification. It'd be useful to have a bug link so I can see what the status is. It could be an Ubuntu issue for all I know.
Usually, but not always, I hear this from game developers.
To be fair, of course, I'm not doubting that they're correct when the libraries are used in games. The problem is that graphics libraries are increasingly used in security-critical contexts like browsers.
I bet there are a few security vulns lurking in games that involve user-provided data (e.g. maybe you can customize your clan by uploading a logo). The boundary between trusted and untrusted in games with mods is especially porous (e.g. https://security.gerhardt.link/RCE-in-Factorio/).
I was in a meeting today where somebody dismissed vulnerabilities in a video codec as not being a problem because it isn't "on the internet like a web server".
I wish they had included a bit more information on the bugs' severity. I've been using Safari on macOS, and this might make me seriously consider dropping it entirely in favor of Firefox.
What we need is stronger website isolation and permissions. Depending on a site's functionality and how frequently I use it, I'll modify its permissions. Some browsers allow you to restrict a couple of features, but I'd argue it's not granular enough.
Why should any random site be allowed to use WebSockets, WebRTC, WebGL, etc? I'd want to turn off all those features and more on sites that I don't think should have them.
I'm actually a bit annoyed with Safari 11, which removed the ability to conditionally disable WebGL. In my experience, most websites that were using WebGL had absolutely no good reason to do so.
I would bet they were all memory-corruption related. They were using AFL to manage their test corpus, and I am pretty sure their definition of "bug" is "unique crash", which in this context means memory corruption. Probably the only way they knew they had a bug in such a scenario was from an instrumented process crashing in the first place; I doubt they were checking for much else in terms of instrumented-process behavior.
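To make that concrete, here's a minimal sketch of what an AFL-style file harness looks like. It's a generic illustration, not Project Zero's actual setup (the parse function and its length-byte bug are invented for the example); the point is that the only "bug" signal the fuzzer ever sees is the instrumented process dying, e.g. under ASan, so silently wrong output never gets counted.

    // Toy AFL-style harness (illustrative; not Project Zero's actual setup).
    // The only "bug" signal a setup like this produces is the target process
    // crashing, which is why the findings skew heavily toward memory corruption.
    //
    // Hypothetical usage, assuming AFL/AFL++ and clang with ASan are available:
    //   afl-clang-fast++ -fsanitize=address -g harness.cc -o harness
    //   afl-fuzz -i corpus/ -o findings/ -- ./harness @@
    #include <cstddef>
    #include <cstdio>
    #include <cstring>
    #include <vector>

    // Deliberately buggy "parser": it trusts a length byte from the input.
    static void parse(const unsigned char* data, std::size_t size) {
      if (size < 2 || data[0] != 'Z') return;    // most inputs return here
      unsigned char claimed_len = data[1];
      char out[16];
      // Bug: claimed_len can exceed both sizeof(out) and the real input size.
      std::memcpy(out, data + 2, claimed_len);   // ASan aborts here -> "unique crash"
      (void)out;
    }

    int main(int argc, char** argv) {
      if (argc < 2) return 1;
      std::FILE* f = std::fopen(argv[1], "rb");
      if (!f) return 1;

      std::vector<unsigned char> input;
      unsigned char chunk[4096];
      std::size_t n;
      while ((n = std::fread(chunk, 1, sizeof(chunk), f)) > 0)
        input.insert(input.end(), chunk, chunk + n);
      std::fclose(f);

      parse(input.data(), input.size());
      return 0;
    }

AFL then saves and roughly deduplicates the crashing inputs it finds, which is more or less where a "unique crash" count comes from; anything that merely produces wrong output without crashing sails straight through a harness like this.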
Reading the article, it looks like they did test on Mac hardware, just doing the initial distributed fuzzing on cheaper hardware first and then confirming each crash individually on more expensive Mac hardware. Or at least that's how I read OP.
It'd be way more expensive; they might as well just fuzz with WebKitGTK+, since it produces the results they desire, then validate the crashes on Safari.