For people who have a hard time seeing the image, here's the text of the comment:
Rob Pike - It's a different kind of mess, and for different reasons, but the Unix/POSIX/Linux systems of today are messier, clumsier, and more complex than the systems the original Unix was designed to replace.
It started to go wrong when the BSD signal stuff went in (I complained at the time), then symlinks, sockets, X11 windowing, and so on, none of which were added with proper appreciation of the Unix model and its simplifications.
So let the whiners whine: you're right, and they don't know what they're missing. Unfortunately, I do, and I miss it terribly.
Of all the things he mentions, signals are surely the worst. To do something as simple as delivering a tiny bit of information to a process, the implementation and semantics of signals are so bad that you have to use atomic variables, deal with interrupted system calls, write reentrant functions, and so forth.
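To make that concrete, even the "right" way to catch a signal and reload a configuration in portable C looks something like this minimal, illustrative sketch: the handler only touches a volatile sig_atomic_t flag, and every blocking call has to be restarted by hand on EINTR.

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* The only thing the handler does is set a flag of type sig_atomic_t --
     * about the only object you can touch safely from async-signal context
     * without further synchronization. */
    static volatile sig_atomic_t got_sighup = 0;

    static void on_sighup(int signo)
    {
        (void)signo;
        got_sighup = 1;
    }

    int main(void)
    {
        struct sigaction sa = {0};
        sa.sa_handler = on_sighup;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;               /* no SA_RESTART: reads may fail with EINTR */
        sigaction(SIGHUP, &sa, NULL);

        char buf[512];
        for (;;) {
            ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
            if (n < 0 && errno == EINTR) {     /* interrupted by the signal */
                if (got_sighup) {
                    got_sighup = 0;
                    fprintf(stderr, "reloading configuration\n");
                }
                continue;                      /* restart the call by hand */
            }
            if (n <= 0)
                break;
            write(STDOUT_FILENO, buf, (size_t)n);
        }
        return 0;
    }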
He wasn't complaining about signals. Signals were in the original Unix. He was complaining about "the BSD signal stuff", which changed the semantics of signals incompatibly, in order that a program could possibly handle externally-sent signals without race conditions, and also added a bunch of signals for goofy stuff like SIGWINCH.
Signals are just a software interrupt mechanism, doubling as a way to kill runaway processes. All the things you are complaining about are problems with interrupts in general. Yet all computers since the 1960s still have interrupts, because they allow you to do things you can't do without them. And that's still true when you're talking about signals. Signals allow you, for example, to implement preemptive multithreading (with SIGALRM) or transparent transactional persistence (with SIGSEGV) or buffer overrun checking (ElectricFence, with SIGSEGV) in a userspace library.
However, they've clearly been expanded far beyond what they are necessary or even good for, and lamentably are still not usable for the primary use of hardware interrupts: I/O.
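For the curious, here is a rough sketch of the SIGSEGV trick alluded to above, the same basic mechanism ElectricFence and transactional-persistence libraries rely on: write-protect a page, catch the fault, record it, unprotect, and let the kernel retry the faulting instruction. Linux/BSD-flavored C, with the details (dirty-page bookkeeping, error handling, the fact that mprotect isn't formally async-signal-safe) simplified away.

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static char *page;              /* one page we deliberately write-protect */
    static size_t page_size;

    /* SA_SIGINFO handler: si_addr tells us which address faulted.  We would
     * mark the page dirty, then unprotect it and return -- the kernel retries
     * the faulting store, which now succeeds. */
    static void on_segv(int signo, siginfo_t *info, void *ctx)
    {
        (void)signo; (void)ctx;
        char *addr = (char *)info->si_addr;
        if (addr >= page && addr < page + page_size) {
            /* a real library would record the page in a dirty list here */
            mprotect(page, page_size, PROT_READ | PROT_WRITE);
        } else {
            _exit(1);               /* a genuine crash: don't loop forever */
        }
    }

    int main(void)
    {
        page_size = (size_t)sysconf(_SC_PAGESIZE);
        page = mmap(NULL, page_size, PROT_READ,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        struct sigaction sa = {0};
        sa.sa_sigaction = on_segv;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        strcpy(page, "caught the write");   /* faults once, then succeeds */
        puts(page);
        return 0;
    }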
My major concern is that there is a huge gap between the actual usefulness of Unix signals, which is mostly limited to killing processes or conveying tiny bits of information (like sending a signal to reload a configuration), and their semantic complexity.
I must admit it is not trivial to replace this system with an equivalent one without changing a lot of how Unix works. For instance, signals are used to interrupt a program: if you make this untrappable (like kill -9), then programs can't recover or prepare in any way before quitting. An alternative would be to allow a signal handler to fire, but one that never returns.
From this point of view you could even have just two signals, with the sole goal of stopping processes: one hard (SIGKILL) and one soft (SIGINT), the soft one firing a signal handler (but the process terminates when the handler returns). This way you don't need to handle interrupted syscalls or write "safe" signal handlers. As you said, signals are interrupts; if the interrupt handler can never return, a lot of the problems disappear.
All the other tasks currently performed with signals should use a more general and saner message bus, with a select(2)-able, file-like interface.
Plan 9 uses a system called notes in place of Unix signals. It's pretty much the same idea, but instead of sending an integer you can send arbitrary strings by writing to /proc/n/note. Non-trappable messages are sent to /proc/n/ctl.
This is a more flexible system than Unix integer signals, but in practice the extra flexibility isn't often used, as a process can simply mount a ctl file into the namespace or post a pipe in /srv.
I always thought it'd be cool if all programs implemented external APIs this way. Kind of like how AppleScript works, but for a mail program, for instance: write "Hacker News" to /proc/n/query and it'd return the data. Or write 'message 1' to /proc/n/open and it'd open or return it.
You wouldn't put those sorts of things in /proc, but that's basically how things work. And since it's a filesystem, you can export your API effortlessly over the network.
It's funny that you should bring that up... During this whole back-and-forth I can't help but think that Apple's libdispatch solves most of the issues brought up by both sides:
* Use a dispatch queue and dispatch_async a block to handle computation. Since the queues work from a shared thread pool and are managed at the kernel level, a long-running Fib calculation won't hold up other blocks from processing on any concurrent queue (it would still hold up a sequential queue...obviously).
* Use dispatch_sources to do your IO. Now you don't have to worry about clumsy callbacks, and your IO operations are perfectly happy co-existing with your long-running computations.
* Use dispatch_read/dispatch_write to persist to disk and you don't have to worry about different latency and throughput characteristics of files vs sockets
...and to the point about signals:
* Use a signal dispatch_source to do things in response to signals that you would never have thought possible. This is possible because libdispatch sets up a trampoline to catch the actual signal and then turn around and queue your dispatch_source block. It's not a complete panacea, but it's close...
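For reference, a signal dispatch_source looks roughly like this (OS X, compiled with clang and blocks support; a minimal sketch rather than production code):

    /* Compile with: clang -fblocks sighup_source.c -o sighup_source */
    #include <dispatch/dispatch.h>
    #include <signal.h>
    #include <stdio.h>

    int main(void)
    {
        /* Tell the traditional signal machinery to stay out of the way. */
        signal(SIGHUP, SIG_IGN);

        dispatch_queue_t q =
            dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_source_t src =
            dispatch_source_create(DISPATCH_SOURCE_TYPE_SIGNAL, SIGHUP, 0, q);

        /* The handler block runs on an ordinary queue, not in async-signal
         * context, so it may call anything -- no reentrancy worries. */
        dispatch_source_set_event_handler(src, ^{
            printf("got SIGHUP x%lu, reloading config\n",
                   dispatch_source_get_data(src));
        });
        dispatch_resume(src);

        dispatch_main();    /* park the main thread and let the queues run */
    }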
Internally, libdispatch is built on a system call named "kevent" (part of FreeBSD since 4.1, which means it dates to 2000). This system call will block while waiting for an event.
When a system call blocks, it is subject to a signal interrupting it, in which case you need to check for the EINTR error condition to restart the system call manually, unless you have SA_RESTART set on your signal handler /and/ your system call is one of a small set of "standard I/O" calls (note: kevent is not on this list[1]).
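A defensive call site would look something like this hypothetical helper (illustrative only, not libdispatch's actual code):

    #include <errno.h>
    #include <sys/event.h>   /* kqueue/kevent: BSD and OS X */

    /* Wait for events, restarting by hand if a signal interrupts the call. */
    static int kevent_retry(int kq, struct kevent *evlist, int nevents)
    {
        for (;;) {
            int n = kevent(kq, NULL, 0, evlist, nevents, NULL);
            if (n >= 0)
                return n;
            if (errno != EINTR)      /* real error: let the caller see it */
                return -1;
            /* EINTR: a signal arrived mid-wait; just go around again. */
        }
    }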
Unfortunately, looking at libdispatch's source code, one will note that none of the usages of kevent() correctly handle EINTR. I noticed this, for the record, while debugging the "dispatch_assume_zero(k_err)" that I had determined was being hit while running some (buggy) code of mine inside of libdispatch (it checks for EBADF, and that's it).
tl;dr signals are insidious and libdispatch didn't care ;P
Epoll is limited to file descriptors. Kqueue also supports waiting on process events (process exit, process fork, process exec, etc), on filesystem events, and a bunch of other things. Epoll also does not support batch updates and requires multiple system calls to apply updates for multiple file descriptors, while kqueue supports batch updating and polling in a single system call. All in all kqueue is just superior to epoll. See also http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod for various ways in which epoll is broken (search for "The epoll mechanism deserves honorable mention as the most misdesigned of the more advanced event mechanisms")
Not to mention that the kevent structure contains the number of bytes available to read, or the number of connections waiting to be accepted, which can save you a system call that would return EAGAIN on a non-blocking socket.
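Roughly like this (a simplified sketch; error handling omitted):

    #include <netinet/in.h>
    #include <sys/event.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Register two descriptors and wait for events in one kevent() call;
     * ev.data tells us how much is already readable, so the follow-up
     * read()/accept() won't come back with EAGAIN. */
    void serve(int listen_fd, int client_fd)
    {
        int kq = kqueue();

        struct kevent changes[2];
        EV_SET(&changes[0], listen_fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
        EV_SET(&changes[1], client_fd, EVFILT_READ, EV_ADD, 0, 0, NULL);

        struct kevent events[2];
        /* One syscall: apply both registrations *and* poll for events. */
        int n = kevent(kq, changes, 2, events, 2, NULL);

        for (int i = 0; i < n; i++) {
            if ((int)events[i].ident == listen_fd) {
                /* data = length of the backlog: connections ready to accept */
                for (long j = 0; j < (long)events[i].data; j++)
                    accept(listen_fd, NULL, NULL);
            } else {
                /* data = bytes already buffered, so this read won't block */
                char buf[4096];
                long want = (long)events[i].data;
                if (want > (long)sizeof buf)
                    want = (long)sizeof buf;
                read(client_fd, buf, (size_t)want);
            }
        }
        close(kq);
    }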
Plan9 was a project that intended to be more Unix than Unix. For example, when Unix says "Everything is a file... except sockets, windows, etc." Plan9 says "no really, everything is a file." Here is a list of things that are not in Plan9 http://c2.com/cgi/wiki?WhatIsNotInPlanNine Oh, and did I mention Pike was one of the original leaders of the Plan9 project?
Ultimately, today's UNIX is not that far from this ideal. In Linux, you can make pretty much everything be an fd: signals can be delivered over a descriptor (signalfd(2)), filesystem change notifications can be delivered over a descriptor (inotify(7)), and so on. Everything ends up being something you can pass to select (or a variant), and so you don't really need to handle the different concepts in different ways: a modern Linux application reads from a bunch of fds and takes action based on what it receives.
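For example, a Linux program can fold signal handling into its ordinary event loop like this (a minimal sketch using signalfd(2) and epoll; error handling omitted):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/epoll.h>
    #include <sys/signalfd.h>
    #include <unistd.h>

    int main(void)
    {
        /* Block SIGTERM/SIGHUP and receive them as plain data instead. */
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGTERM);
        sigaddset(&mask, SIGHUP);
        sigprocmask(SIG_BLOCK, &mask, NULL);

        int sfd = signalfd(-1, &mask, 0);

        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = sfd };
        epoll_ctl(ep, EPOLL_CTL_ADD, sfd, &ev);
        ev.data.fd = STDIN_FILENO;
        epoll_ctl(ep, EPOLL_CTL_ADD, STDIN_FILENO, &ev);

        for (;;) {
            struct epoll_event out;
            if (epoll_wait(ep, &out, 1, -1) < 1)
                continue;
            if (out.data.fd == sfd) {
                struct signalfd_siginfo si;
                read(sfd, &si, sizeof si);       /* the signal, as data */
                if (si.ssi_signo == SIGTERM)
                    break;
                printf("got SIGHUP, reloading\n");
            } else {
                char buf[256];
                ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
                if (n <= 0)
                    break;
                write(STDOUT_FILENO, buf, (size_t)n);
            }
        }
        return 0;
    }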
Work is being done in this direction, with mount and UID namespaces. The main concern is that allowing users to make these kinds of bind mounts may confuse legacy suid apps.
Which really is the problem with modern Unixes -- legacy features that expect things done in a broken way, leading to complexity and ugliness where none should exist. The new features and simplifications continue to exist in a complex world, have arcane restrictions and interactions with the old system, and generally end up increasing the complexity, instead of making things simpler.
No, that's the problem with software designed to work in the real world. Software has to work in a legacy environment to be useful. This is why Unix gets used and Plan 9 doesn't. It's also why TCP and the Web, just to name two examples, survive.
I'll quote jwz: "Convenient though it would be if it were true, Mozilla is not big because it's full of useless crap. Mozilla is big because your needs are big. Your needs are big because the Internet is big. There are lots of small, lean web browsers out there that, incidentally, do almost nothing useful. If that's what you need, you've got options... "
Even in 6th Edition Unix, devices were files in the filesystem as well. And in 6th Edition Unix, you could open and read a directory as a file; that was how you found out what its children were. There was no EISDIR. I believe this behavior persisted until at least SunOS 4.
Adding sockets and windows outside the filesystem is one of the things Rob is specifically complaining about here, and in fact complained about at the time as well.
The philosophy is that everything "looks like" a file, and this is achieved through the standard file API (open/read/write/close). AFAIK sockets can be used with this API, as can directories under Linux. The abstraction leaks a fair bit sometimes, though.
It's DRY writ large. With that one command you get free text history for every program you use. You don't have to include history handling in each program individually.
Doug McIlroy (inventor of pipes, original author of diff) was on my undergrad thesis committee. I think he would agree with Rob here.
While the design of these additional features violates the "UNIX way," it doesn't violate pragmatism. Too often in our field, perfect is the enemy of good enough. Is BSD's model and implementation of sockets perfect? Surely not. Is it good enough? From the purist's perspective, maybe not. From the pragmatist's perspective, absolutely. I probably wouldn't be typing this today (on my macbook pro) without the implementation of BSD sockets.
I recall this missive from Linus Torvalds on the design of Linux:
"If you want to see a system that was more thoroughly _designed_, you should probably point not to Dennis and Ken, but to systems like L4 and Plan-9, and people like Jochen Liedtk and Rob Pike. And notice how they aren't all that popular or well known? "Design" is like a religion - too much of it makes you inflexibly and unpopular."
That is a somewhat silly comment from Linus given that ken and rob designed and built Plan 9 together. (For example, ken designed UTF-8 while rob helped write the code: http://doc.cat-v.org/bell_labs/utf-8_history )
OS X design is still a horrible mess, a monolithic BSD kernel bolted on top of a monstrous Mach 'micro'-kernel. Not to mention things like the new XML-based init system, property list files, hacks around 'extended attributes' (or whatever they call them) and many other aberrations.
Property list files: a big f*n win compared to the ad hoc mess in a Linux/BSD /etc directory.
Hacks around extended attributes: any reason not to like those? Or just because in 1977 a file was just a file, and that's the way it should be, god damn it?
XML-based init system: a sane init system. And XML added in for standardization.
OS X is a mess in several ways, but those are not it. And the "monolithic BSD kernel bolted on top of a Mach 'micro'-kernel" sounds like a win-win situation. Monstrous why? Because it doesn't fit some idealistic model?
Last year I even discovered a kernel bug in their file descriptor passing implementation that, AFAIK, still isn't fixed. Various signal handling properties are more buggy than on Linux.
TCP_NOPUSH is broken, too - if you enable it and then later disable it with setsockopt, it doesn't immediately send any pending data (as it does on FreeBSD, and as TCP_CORK does on Linux).
IIRC, a lot of their code came from FreeBSD back in the day. I wonder why they didn't keep it in sync?
It's a mess because there are several ways to do the same things, and no two things are done in the same way. It feels like any time I want to do something new in OS X I need to learn a new system of some kind. That is bad design. It should just be obvious.
If "obviousness" is the criteria for a well-designed OS, I'm not sure we've seen a (successful) well-designed OS in the history of computing. OS X certainly doesn't deserve to be called out. :)
Actually, I think it does. OS X is a horrendous frankenstein of several different systems with different design philosophies. It is really, truly awful. (Although it works - I'm using an OS X machine to type this now.)
One of the great things about Plan 9 is that everything implements the same interface. If you can interact with one thing, you can interact with everything. Using new parts of the system becomes obvious because you already know the interface.
A concrete example of why this works: In Plan 9, process information is available via (guess what) the file system, so the Plan 9 debugger just reads the state of running processes from those files. Because file systems are automatically exported over the network (also part of the file system), you can debug a running process on another machine without the debugger knowing anything about the network. Nobody had to implement network debugging - it just worked right away because of good design.
You know you have a good design when these kinds of complex behaviors just "fall out" without any additional work.
On the other hand, signals were just a poor design. They are much nicer now on Linux with signalfd(2), which gives you a file descriptor to read them on, a much nicer interface than all the old syscalls for signals.
Well, I'm not sure signalfd(2) was feasible back when signals were added. Was it?
Today it's usual to structure your main() as some kind of select()-like loop; even if you don't need one for your main workflow, at the worst you can spawn an I/O thread and put it there. But back then, you didn't have threads. Lots of programs didn't read your input but still wanted to catch signals, e.g. to exit cleanly on kill -15. Many others read input, but without an event loop - they would just try to readchar() periodically when they had nothing else to do.
An occasional readchar could be replaced with an occasional read from a signalfd. The problem is blocking calls: you need to not block for too long in case a signal arrives, which is messier, it is true.
Blocking calls are not the problem - you simply block on select({stdin, signalfd}) instead of blocking on read(stdin). The problem rather is slow IO calls that nonetheless never block - the canonical example being a read() from a disk file. In this case (eg. /bin/tar) you would have to poll your signalfd after every read() - but this breaks the file abstraction for programs like /bin/cat that don't care whether the file descriptor they're reading from is the blockable type or not, so they'd have to do both.
This is now starting to look considerably inelegant, and we haven't even talked about implementing the equivalent of synchronous signals provoked by a program's own action (SIGILL, SIGFPE, SIGSEGV, SIGBUS...).
I can't help but juxtapose this current dialog (which now includes one of the Unix forefathers) with the idea of "Worse is Better" (http://www.jwz.org/doc/worse-is-better.html). Maybe it's the Jersey in me but at the end of the day working, shipped software is all that concerns me.
The interesting bit here (I don't wanna say "problem") is that we're actually a few steps further down this staircase of "let's ship". Unix wasn't exactly the purist ivory tower to begin with, then it merged with the partially compatible BSD, then we got some not-quite-Smalltalk GUIs on top of that, and now we're building elaborate web stacks on top (and/or instead) of that. And I don't even want to talk about mobile apps that, in an almost unholy ceremony, combine all of that again…
It's a Babylonian tower made of mud (some would say camel poo). But it's brightly painted and has a good view, and all your friends live in it.
Personally, I don't wanna move out either, but I still have to turn my head every time I see the paint come off somewhere…
I'm not technical, but even I can appreciate some of the very many layers beneath my web-browsing.
But, taking me typing into this text-box on HN as an example: What would the Unix way be? (If every tool does one thing and does it well, with text as input and output, and pipes to join it all together.)
Would I really have an unholy long command-line of a bunch of tools piped together (but accessed by clicking an icon)?
I'm really not the expert to ask here, my personal style is a bit too tainted for the hardcore V7/P9 fans. But it's interesting to think about, so I'll give it a try.
Let's cut to the chase: Why would you have a textbox and a specific site for what amounts to a simple discussion? There's really no big conceptual reason why we couldn't do this via email, nntp or some similar protocol.
Leaving that aside, the more difficult question is how you'd get the textbox on your screen, i.e. what's the "True Unix" way of "web browsing"? There aren't that many examples of rather graphical, highly interactive programs that believers would classify as really Unix-like. Maybe something roff-like, where you have a pretty universal display language and different (server-side?) tools are used to create anything that would be too hard to express in the language as is (cf. tbl, pic), but then how would you make that interactive without doing the same stuff as HTML/CSS/JS? I was quite fond of the concept of NeWS, where you'd distribute your application over the net as PostScript, and I think Pike's Newsqueak went in a similar direction.
And how would you handle things on the server side? I don't think it would be that much different from what we're doing now. HTTP is quite resource-oriented and thus maps closely to a file system (the path-like nature of URLs is no accident). A CGI/PHP model for simple "files" would suffice, and you could have an almost arbitrarily complex application that appears as a file system, just like you can have that now appearing as a bunch of HTTP resources. People wrote "big" C applications, even though they could've theoretically done it all with a bunch of shell scripts, awk and ed. The "one thing and one thing only" mantra never was that religiously adhered to.
So, in conclusion, I don't think we're that far off right now, especially on the server side. If you look at what Pike's complaining about, it's mostly how you do it. Quite often the wheel is needlessly reinvented or has too many spokes; it's not that driving somewhere is wrong.
I'd argue that our current system is closer to "Unix" than it is to Lisp Machines, Smalltalk and other more homogeneous systems.
No, the web server (one tool) would instantiate another tool (e.g. a python/ruby/arc interpreter) which renders the HTML page and pipes it back to the server, which sends it to you. You fill out the form, press submit, and the web server pipes those parameters back into the python/ruby/arc interpreter, which ... etc.
BSD sockets have somehow made their way through the entire computing world and we're never going to get rid of them. Even the tiniest of embedded systems have decided that sockets are the one true way of network programming.
In Plan 9, by contrast, the network stack is itself a file system (mounted at /net), and applications do networking by opening and writing those files. This means, for example, that Plan 9 applications that do networking are not tied to a specific network stack: you can mount multiple network stacks concurrently, you can have 'virtual' network stacks (for example running in user space and proxying to a remote host, or doing other neat tricks), and the apps don't need to care.
Also, when IPv6 was added to the Plan 9 stack, no application code had to be modified, because the API nicely abstracts network addresses.
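To give a flavor of that file-based API, here is a rough sketch of what Plan 9's dial() does under the hood. It is written with POSIX-style calls for familiarity, name resolution via /net/cs is skipped (so the address is assumed to be numeric), and error handling is omitted.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Open an outgoing TCP connection purely through the file system.
     * addr is something like "204.178.31.2!80". */
    int net_dial(const char *addr)
    {
        char dir[64], msg[128], data[128];

        /* 1. Open the clone file; reading it yields a fresh connection number. */
        int ctl = open("/net/tcp/clone", O_RDWR);
        int n = read(ctl, dir, sizeof dir - 1);
        dir[n] = '\0';                          /* e.g. "4" */

        /* 2. Ask that connection to connect; the request is just text. */
        snprintf(msg, sizeof msg, "connect %s", addr);
        write(ctl, msg, strlen(msg));

        /* 3. The byte stream now lives in /net/tcp/<n>/data.
         *    (Keep ctl open for the connection's lifetime.) */
        snprintf(data, sizeof data, "/net/tcp/%s/data", dir);
        return open(data, O_RDWR);
    }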
What's the alternative? OS/2 had nice features; it got crushed. BeOS had some nice features; it got crushed. (Yes, I'm aware that they're probably still around in some OSS version.) Plan9 keeps being mentioned, but do regular users understand it? (Is there a "this is why we do things, and this is why that's good" written for people who haven't written their own compiler?)
Are people working on experimental new OSs to "fix problems" in existing systems, or have we gone way past the point of no return? Is there any work on new microprocessor architecture? How much stuff in my modern OS is there because of legacy 8086 stuff?
I think, but I'm not sure, that the length of time it's taken people to get (for one example) IPv6 rolled out shows that change is not likely.
> Plan9 keeps being mentioned, but do regular users understand it? (Is there a "this is why we do things, and this is why that's good" written for people who haven't written their own compiler?)
This will give you a good overview of the basic design decisions in the system and their rationale; for further details on how Plan 9 deals with issues from toolchain design to authentication and security, see the rest of the papers: http://doc.cat-v.org/plan_9/4th_edition/papers/
They are a wonderful read even if you never touch Plan 9: they are full of insights, ideas and criticisms of existing approaches, and many even include discussion of how to apply them to existing *nix systems (sadly most of this has gone almost completely ignored by the *nix community).
>> Plan9 keeps being mentioned, but do regular users understand it? (Is there a "this is why we do things, and this is why that's good" written for people who haven't written their own compiler?)
> Yes, you can start with the main Plan 9 paper: http://doc.cat-v.org/plan_9/4th_edition/papers/9
So, he asks about regular, ipad-toting, angry-birds-playing, how-do-I-turn-on-my-printer users and you link to a technical paper containing buzzwords such as "compilers", "internet gateways", "distributed systems", "POSIX", and "remote procedure calls". Do you see the problem with this?
The problem is that he thought the "regular users" referred to people who use the operating system itself rather than your iPad users.
Do your regular iPad users understand Unix, iOS, BeOS, or OS/2? No, they don't. They understand point and click interfaces on top of them. The operating system is irrelevant to people who want to play Angry Birds. Users who can't turn on their printer are irrelevant to a discussion of the merits of one operating system over another. The user interface on top of the system and its programs is a separate topic.
As a programmer, I can make my programs do whatever I please. If I want to autosave all of my documents, I can write an Emacs script to save after every hundred keypresses. If I want greater security, I can write a tool to encrypt my files and network traffic, and a firewall to prevent unwanted connections. If I want to sync my files over the net, I can write a dropbox-clone. If I want to get better battery life, I can set my disk to spin down, reduce my screen brightness, or disable wireless. I am also more likely to know about existing versions of these specialized tools, because it's me and my friends who build them.
But as an "iPad user", I don't know any of that. If the feature isn't included in the OS, "turtles all the way down" and enabled by default, then I will never even know that such a thing is possible (and even if I did, wouldn't know how to get it).
So it's really exactly for the non-expert users that OS development is so important. Everyone else (and by that I mean the minority), is informed enough to figure these things out no matter what OS you give them.
Actually, he asked about people who haven't written their own compiler. I believe there's a large middle ground between people who've written their own compiler, and the users you're describing.
(a) He is qualified to speak about these issues in so far as he is the designer of node.js and is a programmer.
(b) Rob Pike's comments are a "big deal" because Rob Pike is one of the "original neckbeards" that worked on Unix, Plan 9, and has recently been developing the Go language at Google. i.e. He helped develop the OS that Ryan is complaining about.
(c) "right or wrong" depends heavily on what you are trying to accomplish. Much like anything in the real world, there is no right or wrong answer.
'Rob Pike is one of the "original neckbeards" that worked on Unix'
Do you know that 'neckbeard' is a slur? It refers to someone who tries to grow a beard but can only grow hair below their jawline, which is typical of adolescents.
FWIW, Rob has never had a beard. Ken, dmr, and bwk all have/had full beards, not neckbeards.
I am kind of surprised to find myself writing this post, but here I am.
I think "neckbeard" may have gotten confused with "greybeard," which is used to refer to the elders of ages since past, whom have bestowed upon us mortals the eternally fecund gifts of unix, posix, C, etc.
You forgot BeOS, which lives on in Haiku. One of my favorite "alternative" operating systems of the time. I still wish I had more hours in the day so I could hack on Haiku once in a while.
This is not sarcasm by the way. I like being nostalgic and apparently I like system related stuff as opposed to web programming. But keep doing what y'all did.
Symlinks are bad. They break the natural semantics of a hierarchical tree-like filesystem, and turn it into a messy half-assed graph.
What happens if you move a symlink? Will it still point to the same object? That depends on whether it's relative or absolute. What if you move the directory containing it? Do you have to recursively check the correctness of links every time you operate on a directory? Sounds unreasonable to me... What if it links to a location which isn't mounted? What's the definition of '..'? If you cd into a linked directory and do an 'ls ..', should that list the parent of the target or of the link? How would you implement that?
The point is, symlinks are a messy kluge which probably wasn't thought out very well. You can look at what Rob Pike has to say about it himself on http://cm.bell-labs.com/sys/doc/lexnames.html
You have a file at /some/path, then you put a newer version of that file at /some/path, replacing the old one, and the symlink still points at the same address; it doesn't matter what actually happened to the file itself.
It's useful enough to warrant its own abstraction.
I can understand why he wouldn't like symlinks. They weren't introduced until 4.2BSD and they seem like a bit of a hack. Compare them to hard links. If you move the destination file the hard link remains, but the symlink is broken and there's no simple way to figure out where the destination file was moved to. Also, hard links include reference counting, so you know that someone is pointing to your file. Not so with symlinks. And a hard link shares the permissions of the file itself (it is the same inode), whereas a symlink's own permissions are meaningless.
Note that the history shows up in the ln command which originally only created hard links, and then got the -s parameter when symlinks were introduced.
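A toy C program makes the difference above concrete (illustrative only; error handling omitted): after renaming the original, the hard link still resolves while the symlink dangles.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("original.txt", O_CREAT | O_WRONLY, 0644);
        write(fd, "hello\n", 6);
        close(fd);

        link("original.txt", "hard.txt");      /* second name, same inode */
        symlink("original.txt", "soft.txt");   /* a file containing a path */

        rename("original.txt", "moved.txt");   /* move the original away */

        printf("hard link: %s\n",
               access("hard.txt", R_OK) == 0 ? "still works" : "broken");
        printf("symlink:   %s\n",
               access("soft.txt", R_OK) == 0 ? "still works" : "dangling");
        return 0;
    }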
Could be worse. Could be Windows desktop shortcuts. Fuzzy logic of target resolution? Now that's messy.
Symlinks may be ugly and/or dangerous relative to the original file system concept, but they solve problems. They are not "all good", but life would be worse without them. E.g. - application-by-application implementation of location aliases a la IIS virtual directories (or whatever they call them) for web applications.
I tried using hard links back in the late nineties when I was new to linux. Instead of creating a second link to the file it copied the file to the new directory. I discovered the problem when I modified one copy and found the second copy, which was supposed to be a hard link, wasn't changed. I don't know what went wrong, but with symlinks available, I just never bothered trying hard links again.
ADDED: Note that I avoided hard links after that, because I was worried about what would happen if I modified one file and deleted it, thinking it was a link to another, when it was actually a copy. A person could lose a lot of work if that happened. At least with a symlink, I know that something is or is not a link.
That's rather the point. When presenting a view of the file system under some particular directory, it's nice to not have to hard-link every file into place, and instead include wholesale directory trees. That's my most common use case for symlinks.
It's mitigated in that you can't hard link a directory (except in recent MacOS). So I suppose you would always know the real cwd of your shell, for instance.
Every command that operates on files needs an extra flag to tell it whether it should follow symlinks, operate on the target, operate on the symlink itself, etc.
Any program that manipulates or traverses file trees is affected, and a few more besides; in an OS built around the concept of hierarchical file systems, that is quite a few programs.
>Symbolic links make the Unix file system non-hierarchical, resulting in multiple valid path names for a given file. This ambiguity is a source of confusion, especially since some shells work overtime to present a consistent view from programs such as pwd, while other programs and the kernel itself do nothing about the problem.
etc etc.
Filesystems are a mess, and it'd be great if something could fix them.
He said they weren't added with proper appreciation of the Unix model and its simplifications. So he could mean that symlinks can be implemented in a more Unixy way?
Using single-purpose programs glued together with a poorly integrated scripting language breaks the whole idea of "do one thing and do it well": there are now n things that do the same thing but with different trade-offs. We have n flags, and massive unmaintainable messes of bash, POSIX sh and other variants. Sun's shot at a 'managed' OS with JavaOS was more than ahead of its time.
I would guess the problem with sockets is in the whole creation process. Instead of just calling open() on a magic pathname (like how you deal with devices in Unix), you call socket(), then bind()/listen()/accept() on the server side, or connect() on the client side. Luckily they didn't totally screw it up, and it's just a regular file descriptor after that point... except on Windows, where they did screw it up.
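For comparison, the minimal BSD "creation dance" for a listening TCP socket looks like this (error handling omitted); the hypothetical open() on a magic pathname described above would collapse all four steps into one call.

    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Four distinct calls before you get a connected descriptor. */
    int tcp_listen(unsigned short port)
    {
        int s = socket(AF_INET, SOCK_STREAM, 0);           /* 1. make the object */

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);
        bind(s, (struct sockaddr *)&addr, sizeof addr);     /* 2. give it a name  */

        listen(s, 16);                                      /* 3. mark it passive */

        return accept(s, NULL, NULL);                       /* 4. finally, an fd
                                                               you can read/write */
    }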