Why was Tanenbaum wrong in the Tanenbaum-Torvalds debates? (programmers.stackexchange.com)
112 points by nkurz on March 23, 2012 | 54 comments



In short, he wasn't wrong. His arguments were correct, but a combination of happenstance, intransigence, and a vast infusion of research dollars kept Wintel dominant on the desktop, while the world moved away from it.

The micro- v. macro-kernel debate is moot; all of the interesting development is happening elsewhere in kernel-land. On that, he was wrong. But RISC will dominate CISC unless Intel manages to pull a miracle out of their ass, <s>and GNU will dominate, at least in terms of #installs, through Android.</s> nope, I stand corrected.

So Linus was right about what mattered in the 90's, and Tanenbaum had his finger on the pulse of history. Both knew what they were talking about, and both were right in their own way.

Although reading the Usenet postings, Linus does come across as more of an arrogant upstart, and less of a dickish master than usual. That alone is worth the read.


He was "right" in that he made a prediction that did not come anywhere near true?

Sorry, he was wrong.

As to the 2nd point about x86 vs "RISC" processors, it turns out he was so massively wrong that the very basis of his understanding was incorrect. CISC processors are dead today, they stopped being a substantial part of the market in the late 90s. I'm sure they still exist somewhere in new devices (RAD hardened Pentiums, perhaps) but for the most part the CISC vs RISC battle is over, and RISC won overwhelmingly. But Tanenbaum thought that a necessary consequence of that would be that the x86 architecture would die due to the weaknesses of CISC processor designs. But what actually happened is that Intel (and later AMD and others of course) started making processors with a RISC core that support the x86 (IA32) instruction set through transparent op-code translation. Every Intel cpu since the Pentium-Pro has worked that way (and every AMD CPU since the Athlon).

You can't just wave your hands and say "yeah, but he was fundamentally right in some ways though there were some things he couldn't have foreseen". That's part of the deal, there's always something that you can't foresee. Imagining that CISC's weaknesses are identical to the weaknesses of the x86 architecture is just the sort of naivety and shallow reasoning that can lead you to make woefully wrong predictions.


Recently on a flight I sat next to a Ph.D. who ran a chip design consulting firm. In this fascinating discussion, he talked about the internal design of AMD and Intel processors and said exactly this. AMD and Intel haven't had hardware-based x86 computation for a very long time.

Instead, they have a RISC-style pipeline with an op-code translation layer that decodes x86 instructions into smaller RISC-like instructions that get run through the pipeline.

So really, modern x86 processors are more like x86-compatible processors.
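For a rough sense of what that translation layer does, here is a toy sketch in C; the micro-op names and the exact decomposition are invented for illustration, since Intel's and AMD's internal micro-op formats are not public. The idea is that one read-modify-write x86 instruction gets cracked into a short load/compute/store sequence of RISC-like operations.

    /* Toy illustration of CISC-to-micro-op cracking.  The micro-op names
     * and the decomposition are invented for clarity; real Intel/AMD
     * decoders use proprietary internal formats. */
    #include <stdio.h>

    typedef enum { UOP_LOAD, UOP_ADD, UOP_STORE } uop_kind;

    typedef struct {
        uop_kind kind;
        const char *dst, *src;        /* operands, as text, for printing */
    } uop;

    /* Crack "add [mem], reg" (read-modify-write) into RISC-like steps. */
    static int crack_add_mem_reg(const char *mem, const char *reg, uop out[3]) {
        out[0] = (uop){ UOP_LOAD,  "tmp0", mem    };   /* tmp0 <- [mem]      */
        out[1] = (uop){ UOP_ADD,   "tmp0", reg    };   /* tmp0 <- tmp0 + reg */
        out[2] = (uop){ UOP_STORE, mem,    "tmp0" };   /* [mem] <- tmp0      */
        return 3;
    }

    int main(void) {
        static const char *names[] = { "load", "add", "store" };
        uop uops[3];
        int n = crack_add_mem_reg("[rbx]", "rax", uops);
        printf("add [rbx], rax cracks into %d micro-ops:\n", n);
        for (int i = 0; i < n; i++)
            printf("  %-5s %s, %s\n", names[uops[i].kind], uops[i].dst, uops[i].src);
        return 0;
    }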


"CISC processors are dead today [...] Intel (and later AMD and others of course) started making processors with a RISC core that support the x86"

That's a pretty far-fetched argument. RISC/CISC is about instruction set, and the x86 instruction set is CISC.

Of course since the big RISC/CISC battle the implementations have converged a lot on the microarchitecture level, primarily because the transistor budget sweet spot targeted by RISC melted away and the amount of chip area saved in instruction decode and ISA simplicity was later dwarfed by out-of-order machinery, caches etc.

So an equally valid argument (as "CISC is dead") is "RISC is dead" since RISC chips today have brainiac instructions and pipelines like divide/multiply, unaligned access, variable length instructions (Thumb on ARM), out-of-order execution etc.


That's sort of like saying that any RISC machine becomes a CISC machine as soon as you install the JVM on it (with the caveat that I don't know if the JVM has a CISC instruction set)


If all the end users use the CISC layer and no person uses the RISC layer, then I would feel comfortable calling it a CISC machine.


For those of you, like me, who are curious about the difference between RISC and CISC, here's a gentle introduction:

http://www-cs-faculty.stanford.edu/~eroberts/courses/soco/pr...


Right. I think what Tanenbaum overlooked was the massive transistor budgets that were on their way thanks to Moore's law. Where early CISC chips had to spend a substantial fraction of their transistors on instruction decoding, the absolute number of transistors required for that didn't increase nearly as fast as the total on the chip. So now we have quite complex instruction decoders that translate CISC to RISC, and they're still just a small fraction of the chip.


He was squarely wrong in that he made concrete predictions and these fell dramatically short.

He said in the early 90's that x86 would be dead in 5 years. If that's not a wrong prediction I don't know what is.

As for microkernels being superior to macrokernels, the trend has been to evolve into hybrid kernels (which Linux now is, in practice). CISC vs RISC: same outcome. Hybrid approaches have come out on top. Modern x86 processors are RISC inside, CISC outside, and this produces concrete advantages by needing less memory bandwidth. Even if the instruction set were nominally RISC, we'd do more instruction pipelining, producing a similar result. It's got to the point where the difference is nominal. We still call them x86 but they are fundamentally different processors. We still call them RISC and they do instruction compositing now.

He was WRONG with capitals, as "religious" and opinionated people usually are. There's rarely black and white in the real world. It's shades of grey.

Also, being off by 400% on the time scale, even if the outcome is similar to what was predicted, is being wrong, no ifs or buts. Predictions like this mean total practical failure in any decision making.


GNU will dominate, at least in terms of #installs, through Android.

Android is not GNU, although its kernel is Linux. So much so that it is an example cited by GNU's GNU/Linux FAQ in order to make the case that Linux and GNU/Linux are different.

http://www.gnu.org/gnu/gnu-linux-faq.html#linuxsyswithoutgnu


Thanks, fixed.


I believe in this debate "GNU" refers to the Hurd kernel, not the userland. Hurd did turn out to be a spectacular failure.


Yes, this is exactly right on both counts. He's not making an argument about a generic open source POSIX operating system being dominant, he's making a specific argument about Hurd, the GNU kernel. Which, incidentally, still hasn't been released.


Hard to call it a failure, because it is not over yet. They have been happily failing for 20 years now. :)

http://www.gnu.org/software/hurd/


The fact that they have been failing for 20 years does not mean they're not a failure.


"... Linus does come across as more of an arrogant upstart, and less of a dickish master than usual. ..."

There was a follow-up to this email joust detailed in "Just for Fun" [0] with this remark, "Maybe a year later, when Linus was in the Netherlands for his first public speech, he made his way to the university where Tanenbaum taught, hoping to get him to autograph Linus's copy of Operating Systems: Design & Implementation, the book that changed his life. He waited outside his door but Tanenbaum never emerged. The professor was out of town at the time, so they never met."

The book also detailed that the main reason for the spat was that Tanenbaum was publicly commenting. Hence the response.

[0] https://en.wikipedia.org/wiki/Just_for_Fun


Maybe Linux will dominate, but Android is definitely not GNU. They don't use the GNU userland, libc, or many other GNU libraries (if any).


Tanenbaum wasn't wrong and neither was Torvalds. The fact is that these are complicated matters that you can't make black-and-white.

    Microkernels are the future
And they are, the future just hasn't arrived yet. But microkernels see more and more adoption every day. They offer a degree of reliability that is unprecedented. But they also come with a performance penalty that is for a lot of people enough of a drawback that they would rather have 'good enough' than 'perfect'.

For software that needs to be 'perfect' microkernels are the way to go, and in fact in the embedded world there are more microkernel varieties that you can choose from now than ever before. Once performance penalties are no longer important and people start to demand software that does not crash with every change of the weather, I believe microkernels will see another wave of increased adoption. As far as I'm concerned this can't come soon enough. Userland drivers are so much better than a monolithic kernel.
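As a rough sketch of what a userland driver in this style can look like: the driver is just an ordinary process sitting in a receive loop, and a crash kills only that process, which a supervisor can restart. The message layout and the ipc_recv/ipc_reply calls below are invented stand-ins, not the API of QNX, MINIX 3 or any other real microkernel; they are backed by a tiny in-process queue only so the sketch compiles and runs on its own.

    /* Hypothetical sketch of a userland block-device driver under a
     * microkernel.  The message format and ipc_recv/ipc_reply are
     * invented stand-ins, simulated with a small in-process queue. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    enum { MSG_READ_BLOCK, MSG_WRITE_BLOCK, MSG_SHUTDOWN };

    typedef struct {
        int      type;       /* request kind                 */
        uint64_t block;      /* block number                 */
        char     data[16];   /* payload (tiny, for the demo) */
        int      status;     /* 0 on success                 */
    } msg_t;

    /* Pretend IPC: a queue standing in for kernel message passing. */
    static msg_t pending[4];
    static int   head, count;

    static void ipc_recv(msg_t *m) {
        *m = (head < count) ? pending[head++] : (msg_t){ MSG_SHUTDOWN, 0, "", 0 };
    }
    static void ipc_reply(const msg_t *m) {
        printf("reply: block %llu, status %d, data \"%.16s\"\n",
               (unsigned long long)m->block, m->status, m->data);
    }

    /* The driver: an ordinary process sitting in a receive loop. */
    static char fake_disk[8][16];              /* stand-in for the device */

    static void driver_main(void) {
        msg_t m;
        for (;;) {
            ipc_recv(&m);                      /* wait for work */
            switch (m.type) {
            case MSG_READ_BLOCK:
                memcpy(m.data, fake_disk[m.block % 8], sizeof m.data);
                m.status = 0;
                break;
            case MSG_WRITE_BLOCK:
                memcpy(fake_disk[m.block % 8], m.data, sizeof m.data);
                m.status = 0;
                break;
            case MSG_SHUTDOWN:
                return;                        /* a crash here kills only this
                                                  process; a supervisor could
                                                  restart it */
            default:
                m.status = -1;                 /* unknown request */
            }
            ipc_reply(&m);                     /* answer the client */
        }
    }

    int main(void) {
        /* A "client" queues a write and then a read of the same block. */
        pending[count++] = (msg_t){ MSG_WRITE_BLOCK, 2, "hello", 0 };
        pending[count++] = (msg_t){ MSG_READ_BLOCK,  2, "",      0 };
        driver_main();    /* in a real system this runs as its own process */
        return 0;
    }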

    x86 will die out and RISC architectures will dominate the market
And in fact, in the mobile arena this has already come true. And the way Apple is moving, I would not be surprised to see an ARM chip powering an Apple laptop one day.

    (5 years from then) everyone will be running a free GNU OS
I think both parties underestimated the strength of the windows lock-in here. And many people still underestimate the strength of this lock-in, even here on HN the demise of Microsoft is announced with some regularity.


the future just hasn't arrived yet

As far as microkernels go, I'd say that the future has arrived. We don't call them microkernels, of course -- we call them hypervisors. But they're fundamentally the same thing.


True, but the things that run inside the hypervisors are usually still the same old monolithic operating systems. I think that once those are also microkernel-based there will be a real shift in perception.


You say potato, I say potato. What's the meaningful difference? Inside even the purest microkernel you're still running "processes" with unified address space subject to awful crashes and memory corruption. The process is itself an abstracted machine, after all. How far down the abstraction hole do we have to go before we reach purity?


The difference between the Linux kernel and a microkernel (such as, for instance, QNX, but there are plenty of others) is that everything is a process, and everything but that tiny kernel runs in userland.

It's the difference between 'potatoes' and 'mashed potatoes' ;)


No, you miss the point. I understand very well what a microkernel is. I'm asking you what the conceptual difference is between running a bunch of "macrokernel" systems inside a hypervisor and running a single microkernel with a bunch of processes. There is none: they are the same technology. The difference is in the label you stick on it. Which is a very poor thing to start an argument about.

(edit: I should clarify "same technology" to mean "same use of address space separation". Microkernels don't need to use virtualization technology like VT-d instructions because their separated modules don't need to think they're running on unadulterated hardware.)


> I'm asking you what the conceptual difference is between running a bunch of "macrokernel" systems inside a hypervisor and running a single microkernel with a bunch of processes. There is none: they are the same technology.

Complexity is the difference.

Hypervisors "won" because they were easier to implement; they only had to add another layer to the stack, rather than fundamentally change the structure of the OS.

The outcome is a more baroque collection of code, though. Worse truly is better.


The difference is the purpose of the system. In the former, the purpose is simply to multiplex the hardware into multiple logical systems performing different tasks. In the latter, the purpose is to build a single unified system. It has more communication between the systems, and duplication of work is minimized. Only one process has any FS drivers in it. Another only worries about display. And more importantly, it's more fault tolerant: if the display process goes down, all the other processes are generally built to wait on it coming back up. Whereas you cannot have a monolithic kernel go down and not take an application process, a file system process, and a network process with it.


I don't buy that at all, it's just semantics. Why can't multiple OS images be a "unified system"? That's what a web app is, after all.

And the fault tolerance argument applies both ways. That's generally the reason behind VM sharing too. One simply separates processes along lines visible to the application (i.e. memcached vs. nginx) or to the hardware (FS process vs. display process).

Potato, potato. This simply isn't something worth arguing over. And it's silly anyway, because there are no microkernels in common use that meet that kind of definition. Find me a consumer device anywhere with a separate "display server", or one in which the filesystem is separated from the block device drivers. They don't exist.

(edit rather than continue the thread: X stopped being a userspace display server when DRM got merged years ago. The kernel is intimately involved in video hardware management on modern systems. I can't speak to RIM products though.)


> Find my a consumer device anywhere with a separate "display server", or one in which the filesystem is separated from the block device drivers.

BlackBerry, every computer running 'X'.


> x86 will die out and RISC architectures will dominate the market

It wasn't just Tanenbaum who was wrong about that. Billions of dollars were dumped into RISC architectures on the assumption that x86 wouldn't scale. Microsoft committed to an expensive rewrite of Windows (or OS2) to make it portable. Apple considered x86 and decided to bet on RISC instead.

So this wasn't just some wacky college professor opinion, the industry thought RISC was a sure thing. (Linus of course didn't really care, he just wanted something to run on his 386 clone.)

edit: it bothers me that this debate is always presented without context. Torvalds is a PhD student busy reinventing a 20-year-old unix kernel design, and Prof. Tanenbaum is pointing out he isn't advancing the state of the art, which is totally correct. The fact that Linux turned out to be really useful and popular is mostly beside the point - the advancement was Torvalds's open source management, not the kernel design.


> ... that x86 wouldn't scale

The ACE Consortium, formed in the early 90s, picked MIPS as the chosen processor and included Microsoft and SCO.

http://en.wikipedia.org/wiki/Advanced_Computing_Environment


Every RISC vendor had their own little PC 'consortium' (except perhaps Sun SPARC). They never sold that well, and when the Intel Pentium Pro came out, it beat them on most specs, so the whole idea of a RISC PC died around 1996 (outside of Apple).


Tangential to the main topic:

But microkernels see more and more adoption every day. They offer a degree of reliability that is unprecedented. But they also come with a performance penalty that is for a lot of people enough of a drawback that they would rather have 'good enough' than 'perfect'.

For software that needs to be 'perfect' microkernels are the way to go and in fact in the embedded world there are more microkernel varieties that you can choose from now than ever before.

I'm looking into this space a bit for some personal projects. Would you be able to point me to some examples/good resources on this?


QNX (now owned by RIM iirc) is one of the most mature microkernel-based OSs out there.


Thanks. I have heard of QNX but didn't realize it is a microkernel.


I've used it for years on a very large message switch and in my experience it was rock solid, very easy to develop on and extremely responsive. For hard real time stuff from userland you'd still have to tweak things a bit but even that is possible.


    x86 will die out and RISC architectures will dominate the market
IIRC Intel's processors now translate x86 instructions into a series of RISC micro-instructions on the fly.


>And they are, the future just hasn't arrived yet. But microkernels see more and more adoption every day. They offer a degree of reliability that is unprecedented. But they also come with a performance penalty that is for a lot of people enough of a drawback that they would rather have 'good enough' than 'perfect'.

Correct. The future of computing is mobile and the weakness of the Linux kernel's monolithic architecture is highlighted by Android's numerous design and implementation issues as well as Android's numerous maintainability, upgrade, reliability and performance problems.

Tanenbaum was actually right.


Sounds like an interesting thesis. You have a link to one of these design or implementation "issues" and how it's a reflection of the lack of address space separation and/or IPC design of the linux kernel?

No? Yeah; sounded like a content-free platform flame to me too.

Actually: I'd be curious to hear some more knowledgable folks on this. My understanding of the iOS kernel is that it's a microkernel only via historical label: the PVR driver stack, network devices and filesystems live in the same address space and communicate with userspace via single context switched syscalls. Is that wrong?


Here's one for you genius. This would not be such a hard problem to solve on a Hybrid/microkernel OS. And you wonder why some Android devices don't get updates?

The Android kernel code is more than just the few weird drivers that were in the drivers/staging/android subdirectory in the kernel. In order to get a working Android system, you need the new lock type they have created, as well as hooks in the core system for their security model.

In order to write a driver for hardware to work on Android, you need to properly integrate into this new lock, as well as sometimes the bizarre security model. Oh, and then there's the totally-different framebuffer driver infrastructure as well.

This means that any drivers written for Android hardware platforms, can not get merged into the main kernel tree because they have dependencies on code that only lives in Google's kernel tree, causing it to fail to build in the kernel.org tree.

Because of this, Google has now prevented a large chunk of hardware drivers and platform code from ever getting merged into the main kernel tree. Effectively creating a kernel branch that a number of different vendors are now relying on.

Now branches in the Linux kernel source tree are fine and they happen with every distro release. But this is much worse. Because Google doesn't have their code merged into the mainline, these companies creating drivers and platform code are locked out from ever contributing it back to the kernel community. The kernel community has for years been telling these companies to get their code merged, so that they can take advantage of the security fixes, and handle the rapid API churn automatically. And these companies have listened, as is shown by the larger number of companies contributing to the kernel every release.

But now they are stuck. Companies with Android-specific platform and drivers can not contribute upstream, which causes these companies a much larger maintenance and development cycle.

http://www.kroah.com/log/linux/android-kernel-problems.html

For your 2nd question

In Mac OS X, Mach is linked with other kernel components into a single kernel address space. This is primarily for performance; it is much faster to make a direct call between linked components than it is to send messages or do remote procedure calls (RPC) between separate tasks. This modular structure results in a more robust and extensible system than a monolithic kernel would allow, without the performance penalty of a pure microkernel.

http://developer.apple.com/library/mac/#documentation/Darwin...


The Greg KH link is very stale. All that stuff got merged. And you're interpreting it wrong anyway. Android introduced some new driver APIs, they didn't completely change the kernel. Check the .config file on an actual device and count the number of drivers that are absolutely identical to desktop linux.

And how exactly does having a microkernel fix the problem of having a stable driver API? Drivers must be written to some framework. Windows NT derivatives are microkernels too, and they're on, I believe, their third incompatible driver architecture.

And did you actually read that second link? It's drawing a single "kernel environment" with all the standard kernel junk in it. That is not a microkernel.

Sigh. I probably shouldn't have gotten involved.


>And how exactly does having a microkernel fix the problem of having a stable driver API? Drivers must be written to some framework.

At this stage I will refer you to, ironically, Andy's book:

http://www.amazon.com/Modern-Operating-Systems-3rd-Edition/d...


Tanenbaum wasn't that far off.

1. In terms of pure Micro-Kernels he was off. It was tried, the benefits didn't outweigh the drawbacks, so people moved to Hybrid Micro-Kernels (Windows NT, OS X, iOS et al), so from that perspective he was about 50% right.

2. Given the way ARM is trouncing everybody in the mobile space, unless Intel manages the biggest comeback since Lazarus the future is almost certainly RISC. Whether this feeds back to the desktop space remains to be seen.

3. Unlikely to happen, although the future is most likely Open Source in some form or other. GPL v3 has largely ruled out GNU dominating, as vendors that previously shipped GNU components replace the GPL components with software under other Open Source licenses because they find the new terms a bit much.


the future is almost certainly RISC

What does RISC even mean anymore? Seriously, I remember the debates during the early 90s (back when MIPS and friends were going to destroy Intel) and the RISC of then is very different from the RISC of today. Then the merit of RISC was that you literally reduced the instruction set to the minimum possible, putting the demand on the compiler to gang them to do even rudimentary work. The idea was that the simpler silicon would be easier to scale up (frequency scaling was a major problem), and the compiler would have more insight into the operations of a product giving such a product a performance advantage.

The MIPS of the 90s had about 45 instructions, total, and a corresponding simplicity of implementation. The 8086 had 114, providing higher level, much more complex silicon, and has grown since then.

How many instructions does ARMv7a provide (this is actually a hard question to answer)? It has floating point operations, SIMD / NEON, virtualization support, and on and on and on. I do know that while it once featured just 25,000 transistors (ARM2), a modern ARM9 design like the Tegra2 hosts 26 million transistors for just the cores (not the GPU).

I realize that I'm stepping into a linguistic landmine, and various contrived "this is the differentiator" definitions will appear, but the original intent of RISC versus CISC was exactly what I described above. Today the meanings are absolutely nothing like that.


While one of the characteristics of a RISC processor was a simpler instruction set, it's not the only one. There is also uniform instruction length, to make instruction decoding logic simpler and quicker, and the rule that a single instruction doesn't take longer than a single clock cycle.
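To make the decoding point concrete, here is a minimal sketch assuming a MIPS-style fixed 32-bit R-type layout: every field sits at a known bit position, so decode is a handful of shifts and masks, and the next instruction is always at pc + 4, with no length parsing at all.

    /* Minimal sketch of fixed-width instruction decoding (MIPS-style
     * 32-bit R-type layout).  Each field is a fixed shift-and-mask. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint32_t opcode, rs, rt, rd, shamt, funct;
    } rtype_t;

    static rtype_t decode_rtype(uint32_t insn) {
        rtype_t d;
        d.opcode = (insn >> 26) & 0x3f;   /* bits 31..26 */
        d.rs     = (insn >> 21) & 0x1f;   /* bits 25..21 */
        d.rt     = (insn >> 16) & 0x1f;   /* bits 20..16 */
        d.rd     = (insn >> 11) & 0x1f;   /* bits 15..11 */
        d.shamt  = (insn >>  6) & 0x1f;   /* bits 10..6  */
        d.funct  =  insn        & 0x3f;   /* bits  5..0  */
        return d;
    }

    int main(void) {
        /* add $t0, $t1, $t2  ==  0x012a4020 (opcode 0, funct 0x20) */
        rtype_t d = decode_rtype(0x012a4020u);
        printf("opcode=%u rs=%u rt=%u rd=%u shamt=%u funct=0x%x\n",
               d.opcode, d.rs, d.rt, d.rd, d.shamt, d.funct);
        return 0;
    }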

But you are correct that the water today is somewhat muddy especially as CISC processors borrowed stuff from the RISC processors and vice versa. I think someone in the late 90s coined the term CRISP (Complex Reduced Instruction Set Processor) to describe these beasts, although I haven't seen the term mentioned in recent years.


Just a few quick things to say on RISC vs. CISC.

Back when this debate was happening CPU design teams were a lot smaller, meaning that any given feature hadn't had enough effort put into it to get as far into the realm of diminishing returns, so there was a much bigger payoff to be had in reducing the number of features you implemented.

You also weren't devoting most of your die to huge arrays of cache, so adding - say - more addressing modes would tend to mean you couldn't have as many pipeline stages. Any given feature will still make the overall design more complicated and so will make it more difficult to add any other feature you want, but the issue isn't as pressing as it used to be.

One area where RISC does still have a big advantage is instruction decode. When you run into an x86 instruction you have to read a lot of bits to figure out how long it is, and it's not self-synchronizing, so you could read an instruction stream one way if you start at byte FOO, but if you start at byte FOO+1 you can find an entirely different but equally valid sequence of instructions.[1] So decoding N bytes of x86 instructions grows in complexity faster than linearly. In fact, I suspect that modern processors have to use some sort of "guess the three most likely solutions and throw out the results if we're wrong" approach to get the performance needed.

If I were to design an ISA I'd probably want some sort of UTF-8-style variable-length scheme, where you can always tell where an instruction boundary is without reading from the beginning, but with the space savings from having the most common instructions be shorter than the least common ones.
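As a toy illustration of that idea (an invented framing, not any real ISA): mark the first byte of every instruction with one prefix pattern and all continuation bytes with another, UTF-8 style. The length is known from the first byte, any byte can be classified on its own, and a decoder dropped at an arbitrary offset can skip forward to the next instruction start instead of mis-parsing a phantom instruction stream.

    /* Toy UTF-8-style instruction framing (invented for illustration):
     * first bytes start with 0 (1-byte form) or 11 (longer forms),
     * continuation bytes start with 10, so the stream is
     * self-synchronizing and the length is known from the first byte. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    static int is_continuation(uint8_t b) { return (b & 0xc0) == 0x80; }

    static size_t insn_length(uint8_t first) {
        if ((first & 0x80) == 0x00) return 1;   /* 0xxxxxxx            */
        if ((first & 0xe0) == 0xc0) return 2;   /* 110xxxxx 10xxxxxx   */
        if ((first & 0xf0) == 0xe0) return 3;   /* 1110xxxx 10xxxxxx.. */
        return 4;                               /* 11110xxx ...        */
    }

    /* Resynchronize: from an arbitrary offset, skip continuation bytes
     * until we land on the first byte of an instruction. */
    static size_t next_boundary(const uint8_t *code, size_t len, size_t pos) {
        while (pos < len && is_continuation(code[pos]))
            pos++;
        return pos;
    }

    int main(void) {
        /* A 1-byte insn, a 3-byte insn, a 2-byte insn (payload arbitrary). */
        const uint8_t code[] = { 0x12, 0xe3, 0x81, 0x82, 0xc5, 0x86 };

        /* Start decoding at byte 2 -- deliberately mid-instruction. */
        size_t pos = next_boundary(code, sizeof code, 2);
        while (pos < sizeof code) {
            size_t n = insn_length(code[pos]);
            printf("instruction of %zu byte(s) at offset %zu\n", n, pos);
            pos += n;
        }
        return 0;
    }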

[1] This apparently also annoys my security researcher friend.

EDIT: Found the link to that really good explanation Mashey had on RISC vs. CISC: http://userpages.umbc.edu/~vijay/mashey.on.risc.html


> In fact, I suspect that modern processors have to use some sort of "Guess the three most likely solutions throw out the results if we're wrong" solution for current processors to get the performance needed.

IIRC, it's way more sophisticated than that.

As I understand Intel's trace caches, their guess is basically the result of decoding the next N instructions, accounting for branch prediction.

And yes, it includes detection/recovery for writes into the instruction memory that would invalidate that guess.


I thought this comment was cute. This person is in a conversation with two of the biggest authorities in OS design and he asks for a reference.

    Can you recommend any (unbiased) literature that points out the strengths and weaknesses of the two approaches?


Both RISC vs x86 and monolithic vs microkernel were just dichotomies of their times; after that, the disadvantages and advantages melted away through changing constraints and cross-pollination. Other market factors have since dwarfed these technical arguments.


I find it very strange to see no reference to Minix3 http://www.minix3.org/

Tanenbaum is putting his money where his mouth is with the project (or at least other people's money he acquired), and yes, the traction is slow and it might fail, but I would really enjoy the day I'm running an operating system based on a microkernel that does everything Tanenbaum promises with Minix3.


Well currently there are quite a few successful microkernel OSs, like QNX and VxWorks.

And Mac OS X and Windows are actually hybrid kernels. http://en.wikipedia.org/wiki/Hybrid_kernel


Aren't micro- or nanokernels living on as hypervisors? If so, they are the backbone of all this "cloud" hullabaloo.


Interesting to see Torvalds' attitude when he wasn't a rock star developer...

Linus "my first, and hopefully last flamefest" Torvalds

:-)


"PS. I apologise for sometimes sounding too harsh"

:D


Maybe he's always been just an arrogant grad student:

>Re 2: your job is being a professor and researcher: That's one hell of a good excuse for some of the brain-damages of minix. I can only hope (and assume) that Amoeba doesn't suck like minix does.


Who said Tanenbaum was wrong?

The fact that he doesn't swear, doesn't make his arguments weak.




