This seems to be the main definition of deep vs shallow:
> It takes months of scrutiny by a dedicated few to develop confidence that you've winkled them all out.
By this definition, bugs like Heartbleed were indeed deep: they were not randomly found by a million eyeballs in a few days, they were found by months of careful scrutiny by experts.
The fact that a bug can be fixed with a one-line change doesn't mean that it's a shallow bug. A one-line bug can induce rare, hard-to-observe behavior that only shows up in corner cases. Even if users are hitting it, the bug reports can be entirely useless at face value: they can look like a crash that someone saw once with no idea what was special that one time (I bet you would find OpenSSL bug reports like this that were symptoms of Heartbleed going back much longer).
This is why you need a complex and time-consuming QA process to even identify the long tail of deep bugs, which no amount of casual eyeballs will replace.
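To make the one-line point concrete, here is a toy Python simulation of a Heartbleed-style over-read. It is not OpenSSL's code: the real bug was in C, and `PROCESS_MEMORY`, `PAYLOAD_LEN`, `heartbeat_buggy`, and `heartbeat_fixed` are names made up for this sketch. The only thing it assumes about the real bug is the widely reported shape: the handler echoed back as many bytes as the request claimed to contain, without checking that claim against the actual payload size.

```python
# Toy model: the 4-byte heartbeat payload sits in memory right next to
# unrelated secret data. Everything here is hypothetical illustration.
PROCESS_MEMORY = bytearray(b"ping" + b"SECRET_PRIVATE_KEY")
PAYLOAD_LEN = 4  # number of payload bytes the client actually sent


def heartbeat_buggy(memory, actual_len, claimed_len):
    # Bug: trusts the length field supplied by the client and never
    # compares it with actual_len.
    return bytes(memory[:claimed_len])


def heartbeat_fixed(memory, actual_len, claimed_len):
    # The small fix: silently drop requests that claim more bytes
    # than were actually sent.
    if claimed_len > actual_len:
        return None
    return bytes(memory[:claimed_len])


# An honest request behaves identically in both versions, which is why
# ordinary use and ordinary testing never surface the problem.
print(heartbeat_buggy(PROCESS_MEMORY, PAYLOAD_LEN, 4))
print(heartbeat_fixed(PROCESS_MEMORY, PAYLOAD_LEN, 4))

# A dishonest request over-reads into the neighbouring secret.
print(heartbeat_buggy(PROCESS_MEMORY, PAYLOAD_LEN, len(PROCESS_MEMORY)))
print(heartbeat_fixed(PROCESS_MEMORY, PAYLOAD_LEN, len(PROCESS_MEMORY)))
```

Note that nothing crashes and nothing looks wrong from the user's side in the buggy version; you only see the leak if you deliberately send a request that lies about its length.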
> By this definition, bugs like Heartbleed were indeed deep: they were not randomly found by a million eyeballs in a few days, they were found by months of careful scrutiny by experts.
> The fact that a bug can be fixed with a one-line change doesn't mean that it's a shallow bug. A one-line bug can induce rare, hard-to-observe behavior that only shows up in corner cases. Even if users are hitting it, the bug reports can be entirely useless at face value: they can look like a crash that someone saw once with no idea what was special that one time (I bet you would find OpenSSL bug reports like this that were symptoms of Heartbleed going back much longer).
> This is why you need a complex and time-consuming QA process to even identify the long tail of deep bugs, which no amount of casual eyeballs will replace.
This is the point in dispute. As far as I can see Heartbleed did not require any special knowledge of the codebase; a drive-by reviewer taking a look at that single file had just as much chance of finding the bug as a dedicated maintainer familiar with the specific codebase. The fact that it was discovered independently by two different teams, at least one of which was doing a general security audit rather than specifically targeting OpenSSL, supports that.
The fact that it was only found 2 years after being introduced (by non-attackers at least), in one of the most used pieces of software in the world, suggests that it wasn't actually shallow by any definition.
I don't think it's relevant that it could have been found by anyone. We know empirically that it just wasn't. It was found by security auditors, which is just about as far as you can be from a random eyeball.
Edit: An even more egregious example is of course the ~20 year old Shellshock family of bugs in bash.
> The fact that it was only found 2 years after being introduced (by non-attackers at least), in one of the most used pieces of software in the world, suggests that it wasn't actually shallow by any definition.
Or that few people were looking.
> I don't think it's relevant that it could have been found by anyone. We know empirically that it just wasn't. It was found by security auditors, which is just about as far as you can be from a random eyeball.
It was found by security people with security skills. But those were not people closely associated with the OpenSSL project; in fact as far as I can see they weren't prior contributors or project members at all. That very much supports ESR's argument.
> It was found by security people with security skills. But those were not people closely associated with the OpenSSL project; in fact as far as I can see they weren't prior contributors or project members at all. That very much supports ESR's argument.
It doesn't. ESR's argument suggests OpenSSL should not hire security researchers to look for bugs, since all bugs are shallow and people will just quickly find them - the Bazaar approach.
What Heartbleed has shown is that the OpenSSL project would be much higher quality if it had taken a more Cathedral-like approach, actively sought out security researchers to work on it, and made them a part of its release process. Because it didn't, it shipped a critical security vulnerability for more than a year (and there are very possibly many others).
Especially in security, it's clear that some bugs are deep. Any project that cares about security actually has to follow a cathedral-like approach to looking for them. Releasing security critical code early is only making the problem worse, not better.
This is a "Seinfeld is unfunny" situation. People don't remember what development practices were like prior to this essay.
> ESR's argument suggests OpenSSL should not hire security researchers to look for bugs, since all bugs are shallow and people will just quickly find them - the Bazaar approach.
It suggests they should release early and often to allow outside researchers a chance to find bugs, rather than relying on internal contributors to find them. Which seems to have worked in this case.
> Especially in security, it's clear that some bugs are deep. Any project that cares about security actually has to follow a cathedral-like approach to looking for them.
No it isn't. I still haven't seen examples of bugs like that.
> Releasing security critical code early is only making the problem worse, not better.
How?
> Which seems to have worked in this case.
How is missing a critical security issue that endangers all encryption offered by OpenSSL for more than a year (giving attackers access to your private keys via a basic network request) "working in this case"?
> No it isn't. I still haven't seen examples of bugs like that.
If you think finding Heartbleed after a year in OpenSSL was the process working, how about Shellshock, which was hiding in Bash for more than 20 years? Was that still a shallow bug, or is bash a project that just doesn't get enough eyes on it?
> How?
By letting people expose their data for years with a false sense of security. By encouraging projects to think security is someone else's problem, and they don't need to really worry about it.
Rather than releasing support for TLS heartbeats that steal your private keys for a whole year, it would have obviously and indisputably been better if the feature had been delayed until a proper security audit had been performed.
That the bug was eventually found is in no way a merit of the model. People find security bugs in closed source software all the time. The NSA and the Chinese and who knows who else usually find them even earlier, and profit handsomely from it.
> How is missing a critical security issue that endangers all encryption offered by OpenSSL for more than a year (giving attackers access to your private keys via a basic network request) "working in this case"?
The fact that it was found by people outside the project is the system working.
> If you think finding Heartbleed after a year in OpenSSL was the process working, how about Shellshock, which was hiding in Bash for more than 20 years? Was that still a shallow bug, or is bash a project that just doesn't get enough eyes on it?
Yes it's a shallow bug. I mean look at it. And look at who found it.
> Rather than releasing support for TLS heartbeats that steal your private keys for a whole year, it would have obviously and indisputably been better if the feature had been delayed until a proper security audit had been performed.
How much auditing do you realistically think a project with a grand total of one (1) full-time contributor would've managed?
If the code hadn't been publicly released we'd still be waiting for the bug to be found today.
> The fact that it was found by people outside the project is the system working.
This happens all the time to Windows, to Intel's hardware architecture, even to remote services that people don't even have the binaries for. There is nothing special about people outside the team finding security bugs in your code. After all, that's also what attackers are.
> Yes it's a shallow bug. I mean look at it. And look at who found it.
If a bug that hid from almost every developer on the planet for 20 years (that's how popular bash is) is still shallow, then I have no idea how you define a non-shallow bug.
> How much auditing do you realistically think a project with a grand total of one (1) full-time contributor would've managed?
That's irrelevant to this discussion. Per the essay, even a company as large as Microsoft would be better off releasing anything they do immediately, instead of "wasting time" on in-house security audits.
> If the code hadn't been publicly released we'd still be waiting for the bug to be found today.
I'm not saying they shouldn't have released the code along with the binary, I'm saying they shouldn't have released anything. It would have been better for everyone if OpenSSL did not support heartbeats at all, for a few more years, rather than it supporting heartbeats that leak everyone's private keys if you just ask them nicely.
This is the point of the Cathedral model: you don't release software at all until you're reasonably sure it's secure. The Bazaar model is that you release software as soon as it even seems to work sometimes, and pass on the responsibility for finding that it doesn't work to "the community". And the essay has the audacity to claim that the second model would actually produce better quality.
> There is nothing special about people outside the team finding security bugs in your code.
That supports the point.
> If a bug that hid from almost every developer on the planet for 20 years (that's how popular bash is) is still shallow, then I have no idea how you define a non-shallow bug.
A bug where you think "yeah, no-one except the x core team could ever have found this". A bug where you can't even understand that it's a bug without being steeped in the project it's from.
> That's irrelevant to this discussion.
Disagree; that the Bazaar can attract more contributions is a big part of the point.
> This is the point of the Cathedral model: you don't release software at all until you're reasonably sure it's secure. The Bazaar model is that you release software as soon as it even seems to work sometimes, and pass on the responsibility for finding that it doesn't work to "the community".
Few people were thinking about security at all in those days, at least the way we think about it now; the essay isn't about security bugs, it's about bugs generally. The claim is that doing development in private and holding off releasing doesn't work, because the core team isn't much better at finding bugs than outsiders are. The extent to which a given project prioritises security versus features is an orthogonal question; there are plenty of Cathedral-style projects that release buggy code, and plenty of Bazaar-style projects that release low-bug code.
It did the literal opposite: the TLS Heartbeat Extension was itself a bazaar (and bizarre) random contribution to the protocol. The bazaar-i-ness of OpenSSL --- which has since become way more cathedralized --- was what led to Heartbleed, both in admitting the broken code and then in not detecting that code regardless of the fact that it's one of the most widely used open source projects on the Internet. It comprehensively rebuts Raymond's argument.
I remember what development practices were like prior to this essay, which is part of why I feel so strongly that it's overrated. Several other people on this thread were working in the mid-late '90s too.
If "few people were looking" at OpenSSL, one of the most widely-used pieces of open source software in the entire industry, Eric Raymond's point is refuted.
The whole thesis is that the open source userbase forms the army of eyeballs that will surface all bugs --- they're part of the development process. So no, this dodge doesn't work either; it doesn't cohere with what Raymond said.