In QNX, everything is a message, including files. The basic primitive is MsgSend, sent to another process which has a MsgReceive outstanding. The other process sends a reply back with MsgReply, which unblocks the MsgSend and returns data. Any amount of data can be sent. The POSIX file primitives, open/close/read/write, are small functions that make MsgSend calls.
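To give a feel for the shape of it, here's a minimal sketch of that exchange using the QNX Neutrino calls (channel/connection setup and all error handling omitted):

    /* Rough sketch of QNX Neutrino message passing; error handling elided. */
    #include <sys/neutrino.h>

    /* Server: block for a send, service it, reply. */
    void serve(int chid) {
        char msg[256];
        for (;;) {
            int rcvid = MsgReceive(chid, msg, sizeof(msg), NULL); /* blocks */
            /* ... handle the request in msg ... */
            MsgReply(rcvid, 0, "done", 5);  /* unblocks the client's MsgSend */
        }
    }

    /* Client: MsgSend blocks until the server replies; reply lands in rbuf. */
    int call(int coid) {
        char rbuf[256];
        return MsgSend(coid, "request", 8, rbuf, sizeof(rbuf));
    }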
MsgSend is a more useful primitive than a file. It's a subroutine call across processes, a better fit to anything that isn't stream I/O. In most OSs, you want a subroutine call, but the OS gives you an I/O operation. Linux has to stand on its head for operations such as getting all the properties of a pluggable device, because returning a variable length array of fixed-format records as an atomic operation isn't a Linux primitive.
Microkernels have a bad reputation because interprocess communication in Mach was botched. Mach was built on BSD. To do this right, you have to design CPU scheduling and message passing together, and they need to be tightly coupled. The key to interprocess communication is arranging the normal path for a message pass to throw control from one process to another without a trip through the scheduler. Get that wrong and your microkernel will be sluggish.
sendmsg isn't a synchronous operation, and there is no lock-step operation of sender and receiver supplying and retrieving a result. Sure, you can make something QNX-like, but the parent's point was much more general than that: most OS APIs and primitives are not easily expressed in terms of bytestreams.
Yes, and when you express them as bytestreams, you end up with framing and marshalling and serialization. People start sending JSON. Then they want to put everything in one process because the performance is so low.
Are they symmetrical philosophies? As a layman, this appears to mirror the financial debate between everything being stocks (“the quantum of an economy is a dollar of assets!”) and everything being flows (“the quantum is a transaction!”), or physics’ particle-versus-wave formulations.
You've just pushed me into 20 minutes of silent pondering and arguing with myself. My girlfriend asked me if something bad happened because I was just staring wordlessly into the air.
Of course "vi" has all its bases covered, by alternating between the cursor being between two characters in "insert" mode, and over one character in "normal" mode:
>In insert mode, the cursor is between characters, or before the first or after the last character. In normal mode, the cursor is over a character (newlines are not characters for this purpose). This is somewhat unusual: most editors always put the cursor between characters, and have most commands act on the character after (not, strictly speaking, under) the cursor. This is perhaps partly due to the fact that before GUIs, text terminals always showed the cursor on a character (underline or block, perhaps blinking). This abstraction fails in insert mode because that requires one more position (posts vs fences).
>Switching between modes has to move the cursor by a half-character, so to speak. The i command moves left, to put the cursor before the character it was over. The a command moves right. Going out of insert mode (by pressing Esc) moves the cursor left if possible (if it's at the beginning of the line, it's moved right instead).
>I suppose the Esc behavior sort of makes sense. Often, you're typing at the end of the line, and there Esc can only go left. So the general behavior is the most common behavior.
>Think of the character under the cursor as the last interesting character, and of the insert command as a. You can repeat a Esc without moving the cursor, except that you'll be bumped one position right if you start at the beginning of a non-empty line.
>Make VIM normal-mode cursor sit between characters instead of on them
>I would really like it if the VIM cursor in normal mode could act like it does in insert mode: a line between two characters. So for example:
>- Typing vd would have no effect because nothing was selected
>- p and P would be the same
>- i and a would be the same
>Has anything like this been done? I haven't been able to find it.
>Answers:
>The idea that the cursor is always on a line and on a character position or column is inherent in Vim's design. If you were to try to change that, many of Vim's operations would behave differently or would not work at all. It's not a good idea. My advice would be that you learn and become accustomed to Vim's basic behavior and not try to make it behave like some other editor. – garyjohn Feb 5 '11 at 23:55
>What you want is not Vim, I'm afraid. – romainl Feb 6 '11 at 7:15
Not sure what API approach they use, but in the area of microkernels:
- GenodeOS is reportedly used in production by some commercial clients, though mostly undisclosed IIRC; it is a higher-level layer (~OS) compatible with numerous microkernels
- Fuchsia OS - in dev; it's a recent development by Google; as much as Google is officially silent about it (I understand they're not sure how the experiment will work out for them), observers assume it's most probably hoped to be used as a successor to Android
- Redox OS - in dev; no concrete news of mainstream usage plans I know of, but has some mindshare among developers
- Minix 3 - used in production; infamously by Intel in their IME
The Nintendo Switch's kernel, Horizon/NX, is an example of a microkernel, tailored for their specific use-case, and is in wide use[0]. It is, sadly, closed source, however it has been reverse engineered. Their IPC API is, I believe, pretty smart and elegant:
- There is a per-thread IPC zone of 0x100 bytes. When doing IPC, the request is serialized and put into this zone. If more than 0x100 bytes of data are necessary, pointers are passed around, and the kernel maps the memory into the process servicing the call.
- svcSendSyncRequest is used to call an IPC. It is a synchronous API that will block until the process servicing the call replies to it.
- svcReplyAndReceive is used to reply to an IPC request and then wait until a new one is received. The syscalls are "merged" into a single one to avoid syscall overhead: almost every svcReply would be followed by an svcReceive, so merging them into a single call makes a lot of sense.
You can find more information about the SVCs at [1] and the IPC layout at [2].
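To make the merged syscall concrete, here's a rough, hypothetical sketch of a service's main loop. Names and types follow libnx/SwitchBrew conventions; dispatch() and the session-handle bookkeeping are invented for illustration:

    /* Hypothetical sketch of a Horizon service loop, libnx-style names. */
    #include <switch.h>

    void service_loop(Handle *handles, s32 num_sessions) {
        Handle reply_target = INVALID_HANDLE;  /* nothing to reply to yet */
        for (;;) {
            s32 index;
            /* Reply to the previous request (if any), then block for the
               next one: a single syscall instead of a reply followed by
               a receive. */
            Result rc = svcReplyAndReceive(&index, handles, num_sessions,
                                           reply_target, UINT64_MAX);
            if (R_FAILED(rc)) break;  /* closed session, etc. */
            /* The request was marshalled into this thread's 0x100-byte IPC
               buffer; unpack it, dispatch, marshal the response back. */
            dispatch(handles[index]);
            reply_target = handles[index];
        }
    }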
A little note about this: Horizon/NX actually traces back to the Horizon OS on the Nintendo 3DS. The IPC marshalling was significantly simpler back then[1]. In all honesty, I'm not sure what made Nintendo think the IPC marshalling on the NX was a good idea; to me it just looks like a hastily-designed mess (cf. "This one is packed even worse than A, they inserted the bit38-36 of the address on top of the counter field." on the [2] page linked by the parent comment). However, the NX incarnation was designed with "naturally" wrapping C++ methods in mind and it does a fairly decent job at that.
It's a low-level RPC mechanism: it reschedules to the new process immediately without a trip through the scheduler when it's used, it has similar blocking semantics, etc.
It's also worth following the "bus1" work for linux which may end up being quite similar as well.
QNX is mainstream as hell, just not anywhere you'd expect there to be an OS. It's an RTOS and gets used in automotive, medical, and network technologies where software latency needs to be tightly controlled. You wouldn't use it in the same place you'd use Linux anyway.
This is true. I suppose I'm just curious about more general (server/desktop/mobile) use cases, because the IPC and scheduling issues the parent mentions must be extremely difficult to predict with a wide array of workloads.
It's really not designed for those use cases though. We were using it for a GUI, and our finding was that without constraining the resource usage of the GUI using Adaptive Partitioning, user-driven workloads could stomp all over the lower-priority processes. This effectively meant that a poorly-configured system would eventually trip the watchdog due to user input.
QNX does include features to control this but you have to know enough about it to realize you want them. And POSIX priorities are not enough to prevent this from happening because they are solving a different problem.
You might find it useful in servers where controllable latency is more important than responsiveness, but I wouldn't use it in a user-driven workload like a desktop OS without very carefully configuring it. Out-of-the box it can be very unstable.
I remember using the single disk QNX demo back in the day and being highly impressed with the functionality they crammed into that disk. Sad that it never went anywhere as a desktop/server OS.
I ran QNX as a desktop OS for three years when I was working on a DARPA Grand Challenge vehicle. The vehicle ran QNX, so the development systems did, too. It had an early version of Firefox, the Eclipse IDE, Thunderbird mail, and all the GNU command line tools. Sadly, QNX under Blackberry no longer offers the desktop environment. And QNX's open source/closed source/open source/closed source transitions angered the development community so much that people stopped offering QNX versions of UNIX/Linux software.
The most striking thing about running QNX on the desktop was the absolutely consistent response. It doesn't page, so everything is in memory. Going back to Linux felt so laggy.
You pay about a 10%-20% overhead cost for all that interprocess communication. It's not a big deal on modern processors, because when you send a message, the receiving process is running on the same CPU and the data is usually in L1 cache since the sender just put it there. So copying is cheap.
Incidentally, all the Boston Dynamics robots run QNX. They're coordinating all those limbs and valves from one CPU. It's not distributed; that would be too uncoordinated. Valve update is at 1KHz; balance update is at 100Hz.
> And QNX's open source/closed source/open source/closed source transitions angered the development community so much that people stopped offering QNX versions of UNIX/Linux software.
Was it ever really open? I can't find copies of any version (regardless of how old)
Well, the kernel's impact on application performance is often over-estimated, especially for non-server software.
xnu is a hybrid of Mach and FreeBSD. To be super-concrete, it has both sets of syscalls, with Mach syscalls as negative integers, and BSD syscalls as positive integers. A given running executable has a dual existence as both a BSD pid_t and a Mach task_t. At one point it was even possible to have a Mach task without a pid.
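You can see the dual identity from userland; a tiny program like this prints both faces of the same process:

    #include <stdio.h>
    #include <unistd.h>     /* BSD side: getpid() */
    #include <mach/mach.h>  /* Mach side: mach_task_self() */

    int main(void) {
        pid_t pid = getpid();                 /* BSD identity */
        mach_port_t task = mach_task_self();  /* Mach identity: a task port */
        printf("pid_t %d, task port 0x%x\n", pid, task);
        return 0;
    }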
Most executed syscalls are BSD: read, write, select, getpid, etc. However Mach still plays a role in the VM system (`vm_copy`, etc.) and crucially via the IPC workhorse `mach_msg`.
It's hard to overstate the importance of mach_msg. It underlies the great macOS IPC primitives: xpc, notify(3), etc. Linux looks anemic by comparison with its limited pipes, creaky SysV message queues, and DBus (hope you never have to use it). But it also appears that Apple is preparing to drop Mach: mach_msg / MIG is buried, xpc is conspicuously designed to be separable from Mach, etc.
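For flavor, a bare-bones mach_msg send looks something like this; a minimal sketch assuming `port` is a valid send right (real code goes through MIG or xpc rather than raw headers):

    #include <mach/mach.h>

    /* Send an empty message to a port we hold a send right for. */
    kern_return_t send_empty(mach_port_t port) {
        mach_msg_header_t msg = {0};
        msg.msgh_bits        = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, 0);
        msg.msgh_size        = sizeof(msg);
        msg.msgh_remote_port = port;   /* destination */
        msg.msgh_id          = 1234;   /* arbitrary message id */
        return mach_msg(&msg, MACH_SEND_MSG, sizeof(msg), 0,
                        MACH_PORT_NULL, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
    }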
They pulled some of the low-level I/O drivers into the kernel because the IPC overhead was too great. They call it a "hybrid" kernel rather than a pure microkernel.
To my understanding (please correct if wrong!) it's not only drivers, it's most of a BSD-style kernel that XNU runs in the same memory space as Mach. I'm not sure what difference that makes in practice versus the more conventional BSD kernels — I think a lot of IPC happens through Mach ports? … and I think Mach shows in the realms of signal delivery, what resources are waited on with `select`, etc.
>"I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning -- it took a while to see how to do messaging in a programming language efficiently enough to be useful)." -Alan Kay
NeFS -- aka NFS 3.0 -- used a PostScript interpreter as the file system API.
Same, I think the file 'abstraction' is only valid for data flow. But whenever you have system components represented as files, you end up with a bunch of side APIs bolted on (kinda like ioctl in Linux).
Plan 9 had a different view IIRC, something like everything has a protocol.
Plan 9's rio window system expected most graphical clients to use the draw functions, which operate by writing to /dev/draw. /dev/draw wasn't even a framebuffer—it was a control file that you wrote messages into. It clearly wouldn't have met Carmack's requirements, though, as the thread you link to points out, the Plan 9 team would have been happy for him to map the framebuffer into the address space of his process.
If we invoke "OOP" here, why not invoke classes? Or rather "interfaces" in modern OOP-speak, or "traits" in other parts of the landscape.
There's a common trait among many objects: you can read a sequence of bytes from it. Disk files, sockets, various input devices, random number generators, etc. There's another common trait, writing a stream of bytes. Another, even more common pair of traits, is "opening" and then "closing" something.
All of them can be described using interfaces / traits to clearly communicate which sets of operations are applicable to which objects.
This can be neatly used, e.g., to describe the "many things are a file" idea:
File passwords_file = Filesystem.make("/etc/passwd");
Readable password_readable = passwords_file.openRead(); // Could be openWrite().
Process zcat_process = Processes.make("zcat /tmp/something.gz");
Readable zcat_stdout = zcat_process.stdout.openRead(); // Can only be openRead().
ClientSocket sock = Network.makeClientSocket(host, port, options); // Unlike a file.
Readable sock_readable = sock.openRead();
for (Readable r in [password_readable, zcat_stdout, sock_readable]) {
byte b = r.read(); // Read one byte from each, no matter what it is.
}
But this would require a very different language, way more powerful than C (but also not C++ please). Unlike in 1969, now we have languages like that, e.g. Rust and ATS-lang. Likely we need another 10-15 years for a viable, somewhat widely used OS kernel written in such a language to emerge.
The problem with classes is that an object's available methods can change over time. E.g. a file object should not have a write method after having been closed. Ongoing research is trying to fix this problem (see Scribble, mailbox types for unordered interaction), but until then I'd say it's probably best to stick with dynamic messaging.
Another approach is that object's available methods cannot change over time, but this requires move semantics: `close(file)` consumes `file`, so you can't invoke anything else on it.
Rust has this built-in (lifetimes and move / borrow semantics); I suppose C++ can model this, too.
Yes!
Changing available methods and consuming data terms are virtually the same - the former can be modelled as having methods consume their object and return a copy of it that has another type.
My background is in distributed systems and there linearity has a big cost, which is why the research I mentioned needs to be done.
You don’t need the OS to be written in that language. It’s enough to wrap the APIs into these interfaces. CPUs are much faster these days, for many tasks modern PCs are often IO bound, count of kernel calls rarely becomes the bottleneck.
Windows does just that with their PowerShell; for example, Copy-Item can copy a file on a remote server over a PSSession.
As long as this gets type-checked at compile time, the types info can be erased during compilation and be absent at runtime.
Since traits / interfaces can't have their methods overridden, you don't need the dynamic dispatch, per-instance VMT, and such; you can have fixed offsets in a method table per class, both in userland and in kernel. (Some of the indirect calls can probably be made direct if the class is exactly known at compile time.)
Windows does it nicely: everything is a securable (with ACL) object. Process, thread, file, socket, console, service... is an object, in a namespace. The only exception I've encountered so far is network management stuff, like routing tables. WinObj is a nice tool to inspect how Windows handles this (https://docs.microsoft.com/en-us/sysinternals/downloads/wino...)
> In Windows, you have 15 different versions of "read()" with sockets and
> files and pipes all having strange special cases and special system calls.
That's not correct. ReadFile can be used for all kinds of objects, including sockets: https://docs.microsoft.com/en-us/windows/desktop/api/fileapi... Yes, it behaves differently depending on the type of the object, but so does read(2) (e.g., you'll never get SIGCHLD when reading from a file).
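For example, something along these lines should work for a connected socket, modulo overlapped-I/O caveats (a sketch, not production code):

    /* Sketch: reading from a connected TCP socket with the generic ReadFile. */
    #include <winsock2.h>
    #include <windows.h>

    DWORD read_some(SOCKET s, char *buf, DWORD len) {
        DWORD got = 0;
        /* A SOCKET is a kernel object handle; ReadFile accepts it like a file. */
        if (!ReadFile((HANDLE)s, buf, len, &got, NULL))
            return 0;  /* error handling elided */
        return got;
    }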
The concepts of objects and ACLs are in the NT kernel itself. So things have worked like this on all windows versions built on the NT kernel: from WinNT 3.1, to Win2k, ... to this day. Windows TCPIP stack has improved since then, but I'd be _very_ surprised if ReadFile didn't work on sockets on XP (or anything NT-based, actually).
In any case, he's very rash to call out Windows when Linux has its share of special-purpose syscalls that do the "same" thing (e.g.: send, sendto, sendmsg, sendfile, sendmmsg, …) with myriads of options to alter their behavior.
> The whole point with "everything is a file" is not that you have some
> random filename (indeed, sockets and pipes show that "file" and "filename"
> have nothing to do with each other), but the fact that you can use common
> tools to operate on different things.
I hear there is quite some complexity in SQLite to work around quirks of the file system API, in order to avoid data loss, yet there are still ways this can happen:
How many of those characteristics identified on that page are really shortcomings rather than, as the parent comment says, freedom-enhancing (if not performance-enhancing) features?
The workarounds seem to have more to do with SQLite's use as an embedded library than with filesystem behavior.
All of section 1 is about rogue overwriting, absent a particularly fine-grained filesystem permission model or enforced locking. The former might only have "freedom" implications without any performance penalty, but perhaps not the latter.
Section 2 is about locking, so see above. Even if it's advisory, that just moves any slowdown from the kernel to the application. Of course, broken locking is clearly a shortcoming, and this is one of the quirkiest (maybe newest?) aspects of the filesystem. If anything qualifies under the "bad sign" characterization, this does. However, it's also not exactly fundamental to the filesystem, as even section 2.3 describes what SQLite will do if POSIX advisory locking is not supported over NFS (i.e. it is possible for locking to be unavailable).
The remaining sections all appear to be about bugs, either hardware, thread library, elsewhere in the kernel, SQLite itself, or in configuration/usage (disabling corruption protections).
POSIX does constrict you. That is, it doesn't let you get the full performance you could get by talking directly to the hardware. It doesn't offer abstract features that would be useful to people making rocksdb, sqlite, etc. In fact, it does a rather brutal job of constricting you.
Of course, disk drive protocols also constrict you, relative to having direct control of firmware and visibility over raw flash devices, drive head position, etc. For example, it would be neat if you could send block A and block B to the disk at the same time, only to have the disk compute the checksum of block A and put it at offset k of block B before writing block B.
The difficulty in getting things correct is a constriction for users of the API. And the overly strong API guarantees are a constriction on the implementer.
POSIX I/O is like C. It's super powerful. It's everywhere. It's not going away. But it has sharp edges. And we can sacrifice a little efficiency on the small scale to grab huge gains in other areas like productivity and reliability.
I agree. In fact, I find myself looking at Classic Mac OS's resource fork and feeling profoundly sad that it fell by the wayside as *nix took over the world.
The resource fork was a beautiful idea; structured data built into the filesystem itself, and I wonder what the computing world would have looked like if it was used by the dominant operating system of the Internet instead of a sideshow.
Being able to operate on everything with the same tools is not a goal that everyone desires all the time. For instance, look at all the things with torx screws instead of phillips heads. Or, type systems in computer languages. Or, support for non-variadic functions. Or, file systems with forks.
This might just be because I'm European, but phillips head screws really are the worst. They are so easy to strip because it is so easy to use an over- or undersized screwdriver. Torx or hex screws are good because you can't really use the wrong sized screwdriver and the screws don't get stripped as easily.
My read on it is that it was to avoid anything taking damage, especially the screw (and screw head, other than, perhaps, the slots themselves, insofar as they could be used for future driving purposes).
Why not? Treat the command FIFO as a write-only pipe. Query responses can come through a corresponding read-only pipe. Add an additional file to give you read/write access to GPU memory somehow.
What if you want to modify the color on vertex 473? What position in the file do you seek to? A dumb file interface would force the client to simulate the internal state of the GPU engine so it could reverse engineer the necessary commands to modify the state.
GP suggested a command queue. So the answer is that then you send another message to modify the vertex or however else you would prefer. I would imagine that VBOs could be represented as additional files and then they could be mmapped.
A file is a byte stream with some metadata - e.g. permissions. With such an API you can really do anything.
Yes, I suppose maybe you'd probably have a file per VBO or whatever. Create a dynamic one: it's like a pipe, write- and append-only. Create a static one: it's like a file, and you can mmap it (like glMapBuffer), or fseek+fwrite to it (like glBufferData), and maybe fseek+fread from it too (because why not).
The command queue would be a write- and append-only pipe, consisting of state-setup and rendering commands, in some format that could be efficiently transformed by the driver into the actual commands that would get sent to the GPU, since this hypothetical queue would probably be GPU-agnostic. (I think this was how D3D9 worked internally, and that may still be true for more modern versions...)
(You wouldn't have commands for modifying buffers. That would be just something you do yourself by accessing the buffers. As the creator of the buffers, you know where the data is inside them, so you'd just replace the data in question with new stuff.)
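To sketch the hypothetical (every path and command string below is invented for illustration):

    /* Hypothetical file-based GPU API -- none of these paths exist anywhere. */
    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    void draw_something(void) {
        /* A "static buffer" file: create it, mmap it, fill in vertex data. */
        int vbo = open("/dev/gpu0/buffers/mesh42", O_RDWR | O_CREAT, 0600);
        ftruncate(vbo, 4096);
        float *verts = (float *)mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                     MAP_SHARED, vbo, 0);
        verts[473 * 3] = 1.0f;  /* tweak vertex 473 directly, no command needed */

        /* The command queue: a write/append-only pipe of GPU-agnostic commands. */
        int cmd = open("/dev/gpu0/cmd", O_WRONLY | O_APPEND);
        const char *ops = "bind buffers/mesh42\ndraw triangles 0 1024\n";
        write(cmd, ops, strlen(ops));
        close(cmd);
    }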
Anyway, even if in practice this wouldn't be optimal, it could probably be made to work well enough. A file-based approach is at least not ridiculous to imagine.
Writing to a (non memory-mapped) file takes a call to write(), whereas setting pixels in a framebuffer might just be a memcpy().
I think, given the post we are commenting on, it is OK to distinguish these things. Historically, framebuffers are packed arrays of pixel values, and files aren't necessarily even seekable. These are well established norms.
You could argue that the real interface for writing to an image/screen should involve two-dimensional locations, instead of a linear offset. This is what Plan 9's draw facility proposed, along with alpha compositing operators. That is certainly also a question of interface, but one at a higher level.
So I don't agree that "write X at offset Y" is the ultimate, universal highest-level abstract standard for drawing on a raw raster image. It's more like an implementation detail from my point of view, actually :D
See my other comment about Plan 9. You can certainly organize to be able to mmap a file that represents a framebuffer, but that has nothing to do with the comment I was replying to. And it is exactly the kind of thing Torvalds is objecting to, quite reasonably: if the framebuffer device file exists only to be mmapped, there's no reason for it to exist as a file.
Plan 9 used the file metaphor to achieve things like network transparency, which is a situation that precludes the use of mmap. (Though you could organize an OS to make it work that way.) If you just want to get access to a framebuffer as a memory-mapped resource, old style, there's no good reason for files to be involved.
If we posit that memory mapping is a thing that one can generally do to files, then a ton of other abstractions break: how do you memory map a socket? Not a fixed piece of a socket's buffer, but communication on the socket itself?
"But what would you _do_ with them? What would be the advantage as compared to the current situation?"
Here is one answer (2008):
"This paper presents PipesFS, an I/O architecture for Linux 2.6 that increases I/O throughput and adds support for heterogeneous parallel processors by (1) collapsing many I/O interfaces onto one: the Unix pipeline, (2) increasing pipe efficiency and (3) exploiting pipeline modularity to spread computation across all available processors. PipesFS extends the pipeline model to kernel I/O and communicates with applications through a Linux virtual filesystem (VFS), where directory nodes represent operations and pipe nodes export live kernel data. Users can thus interact with kernel I/O through existing calls like mkdir, tools like grep, most languages and even shell scripts. To support performance critical tasks, PipesFS improves pipe throughput through copy, context switch and cache miss avoidance. To integrate heterogeneous processors (e.g., the Cell) it transparently moves operations to the most efficient type of core"
I first learned of Rob Pike when starting to dabble with golang. Now I notice he's been involved in more or less everything I already use or stumble over. Busy guy!
Having everything as X isn't as flexible and performant as specialized types. "When all you have is a hammer".
This results in systems being built on top of the narrow file-as-everything service, providing a wider interface to real data (events, sockets, pipes, messages, async IO).
Of course, people don't like it and build libraries to bypass it (and even the kernel layer itself: https://blog.cloudflare.com/kernel-bypass/) just because the interface is inherently limited and inflexible.
I always wanted /dev/zero, which is used to mmap zeros into memory, to be more general and use the device minor number to define which byte gets mapped, so you could mknod /dev/seven with a minor number of 7, to provide an infinite source of beeps!
Don't they have a special purpose USB keyboard with one big easy to press squishy sculpted and iconically colored button that generates just that character?
If you really want this and do not have to be bothered about portability, just write a kernel module and load it.
It will be about 40-60 lines of code.
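Roughly like this, using the misc-device helper; an untested sketch against recent-ish kernel APIs, picking up the /dev/seven idea from above:

    /* Hypothetical /dev/seven: an endless stream of 0x07 (BEL) bytes. */
    #include <linux/fs.h>
    #include <linux/miscdevice.h>
    #include <linux/module.h>
    #include <linux/uaccess.h>

    static ssize_t seven_read(struct file *f, char __user *buf,
                              size_t count, loff_t *ppos)
    {
        size_t i;
        for (i = 0; i < count; i++)     /* simple, not fast */
            if (put_user(0x07, buf + i))
                return -EFAULT;
        return count;
    }

    static const struct file_operations seven_fops = {
        .owner = THIS_MODULE,
        .read  = seven_read,
    };

    static struct miscdevice seven_dev = {
        .minor = MISC_DYNAMIC_MINOR,
        .name  = "seven",
        .fops  = &seven_fops,
    };

    module_misc_device(seven_dev);  /* registers on load, deregisters on unload */
    MODULE_LICENSE("GPL");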
If I had my way you'd be able to define new kernel modules by downloading a few lines of PostScript. ;) Ok, JavaScript these days. But you know what I mean.
It's essentially RPN Lisp (with dynamic binding, via the dictionary stack). NeWS had an OOP package that used the dict stack to implement a very Smalltalk-like object system with multiple inheritance, which could dynamically define methods and properties in objects and classes, etc.
May I ask if there are experimental systems today that take inspiration from NeWS? You've blogged about OpenLaszlo, but this too is now quite old (and dead).
Every line of JavaScript and JSON that I write takes inspiration from NeWS! But that's just me.
NeWS differs from the current technology stack in that it was all coherently designed at once by James Gosling and David Rosenthal, by taking several steps back and thinking deeply about all the different problems it was trying to solve together. So it's focused and expressed in one single language, instead of the incoherent fragmented Tower of Babel of many other ad-hoc languages that we're stuck with today.
I summarized the relationship of NeWS with modern technology in the wikipedia article:
>SimCity, Cellular Automata, and Happy Tool for HyperLook (nee HyperNeWS (nee GoodNeWS))
>HyperLook was like HyperCard for NeWS, with PostScript graphics and scripting plus networking. Here are three unique and wacky examples that plug together to show what HyperNeWS was all about, and where we could go in the future!
Another thing that REALLY inspires me, which goes a hell of a lot further than NeWS ever did, and is one of the best uses of JavaScript I've ever seen, is the Snap! visual programming language!
It's the culmination of years of work by Brian Harvey and Jens Mönig and other Smalltalk and education experts. It benefits from their experience and expert understanding about constructionist education, Smalltalk, Scratch, E-Toys, Lisp, Logo, Star Logo, and many other excellent systems.
Snap! takes the best ideas, then freshly and coherently synthesizes them into a visual programming language that kids can use, but is also satisfying to professional programmers, with all the power of Scheme (lexical closures, special forms, macros, continuations, user defined functions and control structures), but deeply integrating and leveraging the web browser and the internet (JavaScript primitives, everything is a first class object, dynamically loaded extensions, etc).
Here's an excellent mind-blowing example by Ken Kahn of what's possible: teaching kids AI programming by integrating Snap! with existing JavaScript libraries and cloud services like AI, machine learning, speech synthesis and recognition, Arduino programming, etc:
AI extensions of Snap! for the eCraft2Learn project
>The eCraft2Learn project is developing a set of extensions to the Snap! programming language to enable children (and non-expert programmers) to build AI programs. You can use all the AI blocks after importing this file into Snap! or Snap4Arduino. Or you can see examples of using these blocks inside this Snap! project.
That's a reasonable point at an abstract level. But in reality, while you use `open`, `read`, `write` and `close` for files, the equivalent for directories is `opendir`, `readdir`, `closedir`.
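Concretely, the directory side of the API gets its own handle type and a record-at-a-time read call:

    #include <dirent.h>
    #include <stdio.h>

    /* Directories get their own handle (DIR *) and their own record-oriented
       read call -- not a byte stream you could hand to read(2). */
    int list_dir(const char *path) {
        DIR *d = opendir(path);
        if (!d) return -1;
        struct dirent *e;
        while ((e = readdir(d)) != NULL)
            printf("%s\n", e->d_name);
        return closedir(d);
    }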
I don't see how providing specialized open/read/close calls for directories breaks the file abstraction.
Imagine byte-level access to a directory. You'd have to have some library in userspace that would correctly be able to manipulate that directory's metadata. Now imagine doing that for various filesystems.
Plan 9's namespaces and file-servers were pretty awesome and flexible. It really opened my eyes to the generality of the file abstraction.
I do see how having a specialized set of calls for directories breaks the abstraction that directories are files. Isn't that the very definition of breaking the abstraction?
I mean, as far as I see it, the directory is still a file, of a specific format.
How is that any different than an image file, say, of a particular format? You still need specialized programs in order to manipulate them in a meaningful way. But they're just bytes. Just like the directory is just bytes.
The difference, of course, is that allowing the user to arbitrarily manipulate a directory entry at the byte-level could lead to filesystem corruption.
I'm aware I may be missing something really obvious here. Heck, even contradicting myself :)
Well, you could unify the interface to files and directories by removing all the system calls to deal with opening and closing and reading and writing files, and then removing all the system calls to deal with opening and closing and reading and writing directories, and then simply using ioctl() for everything!
The same is true with signals, timers, filesystem notifications, etc.
There are a few (redundant/duplicated with minor feature differences in triplicate, 'cause it's Linux) file-ish abstractions over parts of those concepts, but those abstractions are neither the only, recommended, or most widely used means of accessing those functionalities.
Abstraction presumes some common operations, otherwise it's not abstraction, it's just some opaque bytes. You can open/read/write/close files. Maybe seek. What common operations can you do with some random Windows HANDLEs? AFAIK you can't even close that handle, you need to call a specific function.
Windows handles can be handles to a number of different types of objects. Generic operations which can be performed for any type of handle:
* Closing the handle: Release any kernel resources associated with the handle
* Comparing to other handles
* Duplicating, e.g. duplicating a handle to be used by another process. This allows a process to create a less privileged process and only pass handles with certain privileges to the process.
* Get/set handle flags, e.g. a flag which indicates whether the handle will be inherited by child processes.
For each object type a number of object type specific operations can be used in addition to the above, e.g. file read/write for files or send/receive for sockets.
Regardless of whether a handle is for a file, a socket, window, mutex or something else, the generic functions such as close will always work.
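A quick sketch of those generic operations in use; share_with() is an invented helper, and `h` can be a handle to any object type:

    /* Generic handle operations that work regardless of object type. */
    #include <windows.h>

    BOOL share_with(HANDLE h, HANDLE child_process, HANDLE *out_child_handle) {
        /* Duplicate our handle into the child's handle table... */
        if (!DuplicateHandle(GetCurrentProcess(), h,
                             child_process, out_child_handle,
                             0, FALSE, DUPLICATE_SAME_ACCESS))
            return FALSE;
        /* ...adjust inheritability on our copy, then hand it off. */
        SetHandleInformation(h, HANDLE_FLAG_INHERIT, 0);
        CloseHandle(h);  /* works for files, sockets, events, processes, ... */
        return TRUE;
    }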
Typically, I've found, the common operation is not so much the file/HANDLE itself, as that the file/HANDLE can generate operations that you need to wait on. But select() set the stage by waiting on the file descriptor, and poll/epoll mostly follow suit. You register interest in events on an FD. (Instead of, say, starting an operation and saying "tell me when this operation finishes"; e.g., "futures" as exposed by a number of languages.)
Aside from waiting, I'd say passing/referring to the objects being operated on is also a commonality. Passing a FD to a child process, for example, or giving it (the object, be that a file/pipe/socket/timer/process/etc.) a name in the FS s.t. other things can refer to them by name. (I personally wish though that fork()/exec() had forced you to specify exactly the set of objects (files/etc.) for the child to avoid the entire multithreading/close-on-exec/atomic flag setting hell that exists today.)
I don't know that read/write are actually abstract at the level of a "file descriptor" though. Eventfds and timerfds both support them, but it feels forced. Pipes are one-way, so one of read/write doesn't make sense, and seek on a pipe doesn't either. I think some file descriptors (child classes of a generic "object"/FD/HANDLE, such as files) support read/write, but not necessarily all FDs support read/write/seek.
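For example, a timerfd technically supports read(), but the read feels bolted on; the real primitive is the waiting. A minimal Linux sketch, error handling elided:

    #include <stdint.h>
    #include <sys/epoll.h>
    #include <sys/timerfd.h>
    #include <unistd.h>

    void wait_for_tick(void) {
        int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
        struct itimerspec ts = { .it_value = { .tv_sec = 1 } };  /* fire in 1s */
        timerfd_settime(tfd, 0, &ts, NULL);

        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = tfd };
        epoll_ctl(ep, EPOLL_CTL_ADD, tfd, &ev);
        epoll_wait(ep, &ev, 1, -1);  /* the real primitive: waiting */

        uint64_t expirations;
        read(tfd, &expirations, sizeof(expirations));  /* the forced "read" */
    }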
My point is that HANDLE is too general an abstraction. Can you wait on a HANDLE that's the result of a HeapCreate call? Can you wait on a HANDLE that's the result of a CreateWindow call (HWND is defined as a HANDLE)? What's common between a file, a window, and a heap object?
Closing/destroying and granting (to another process) are generally applicable to about any resource held by a process. UNIXes, by contrast, expose multiple "close" system calls for non-file things, and have a largely incomplete mishmash of ways to share non-file things.
The point isn't that everything is a file; it's that you can treat everything like a file.
Of course the abstraction maps varying degrees of well, but it also simplifies the hell out of the lion's share of the ways we interact with computers, and serves as a foundation for the rest.
Than what, an entry in some table in the kernel that you can use to read bytes from the thing it represents and/or write bytes to it? That's what a "file" is in *nix.
The on-disk notion of a file sits atop layers of that abstraction. It doesn't, for example, go away when you `rm` it. An entry in a table does, but if some other process had that (disk) file open at the time, its data is still there, readable — and recoverable — through the former file's entry under that pid's /proc/$pid/fd/.
I liked the idea of filerefs for everything precisely because I hate the socket() semantics. But most of this is about the awful pain of the ioctl() stuff you either have to call as setup magic on the FD, or pass as setsockopt(), because you can't coerce enough into the limited modality of file-like operations when opening the (pseudo) file.
Really it feels more like 'what is the hierarchical structure of my namespace', because once you nail that down, the file semantics become clearer. If it's async IO under the class of IO, it's open("/io/async/my-thing", ...).
So it's one of those "yes... but it's so hard to nail it down" moments.
Sure, but that doesn't mean he doesn't occasionally make a mistake.
> In that message, he's talking about...
Yes, I know. But the larger context is Linus debunking (correctly) the unix philosophy that "everything is a file". Linus is correct: it is not the case, never has been the case, nor should it be the case, that everything is a file. But then he undermines his own argument with the (mistaken) claim that "if you can read it, it's a file." No. There's a salient difference between files on the one hand, and pipes and sockets on the other, and it has nothing to do with whether or not they have names. A named pipe is still a pipe, not a file. "Files" in /dev are not files, neither are "files" in /proc, despite the fact that they have names.
It's a detail, but an important one in the larger context IMHO.
This is quibbling semantics. That's like saying a ball isn't really a ball because it's not curved at a molecular level. It's true, and it's interesting in its own sense, but not at all relevant to the real world discussion at hand, and not a refutation of anything in context ("had me until the end...").
Yes, the wording around "file" is a bit ambiguous, because Unix has a major design philosophy that says you can treat non-files as files and get a lot of mileage, but didn't call those things "fauxles" or something instead.
> Yes, the wording around "file" is a bit ambiguous
Exactly. And this can be a real problem when people are first learning about unix.
In fact, it can be a real problem even after that. Just the other day (true story) I was trying to debug some server latency issues and I asked one of our sysadmins for help. He suggested that I run lsof to help debug the problem. I told him I was pretty sure that wouldn't help because all the evidence indicated that the problem was with a rogue process, not a file, and he said, "This is unix. Everything is a file."
Everything is a file... except processes (you can't just delete a process), network interfaces (try setting an IP address with a write() or ioctl()), and plenty of other things. Unix never had "everything is a file" as its mantra, and Torvalds never introduced it.
In Unix, you can't just delete a file either! You can remove a reference to an inode from a directory, but even if no other directories have references to that inode, a process might be holding it opened, and it won't get deleted until that process dies.
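A minimal demonstration (POSIX, error handling elided):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("scratch.txt", O_RDWR | O_CREAT | O_TRUNC, 0600);
        write(fd, "still here\n", 11);
        unlink("scratch.txt");  /* removes the directory entry ("deletes" it) */
        /* The inode lives on: our fd still reads the data, and on Linux the
           file remains visible as /proc/<pid>/fd/<fd> until we close it. */
        char buf[32];
        pread(fd, buf, sizeof(buf), 0);
        printf("%.11s", buf);
        close(fd);  /* now the inode is actually freed */
        return 0;
    }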
My understanding of the mini-slogan "everything is a file" is that it was never meant to refer to the entire API of the object, but just reading/writing data.
And then you hit ioctls, which are necessary even by the very file-like way of working with terminals. The slogan in the unix world never matched anything well, or at least not since the '90s.
> (try setting an IP address with a write() or ioctl())
ioctl(sock_fd, SIOCSIFADDR, &ifreq_pointer);
is a thing, although it targets a socket on the interface, not the interface itself.
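Fleshed out, the old pre-netlink ifreq dance looks roughly like this (sketch only; needs privileges to actually succeed):

    #include <arpa/inet.h>
    #include <net/if.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>

    /* Set an IPv4 address on an interface, old-school (pre-netlink) style. */
    int set_addr(const char *ifname, const char *ip) {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);  /* any socket will do */
        struct ifreq ifr = {0};
        strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
        struct sockaddr_in *sin = (struct sockaddr_in *)&ifr.ifr_addr;
        sin->sin_family = AF_INET;
        inet_pton(AF_INET, ip, &sin->sin_addr);
        return ioctl(fd, SIOCSIFADDR, &ifr);  /* targets the socket, not a file */
    }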
Generally I think there are some cases where they try to push too much through limited interfaces, e.g. ptrace feels like "ok, we have an API already, let's push this unrelated functionality through it too so we don't have to design and add a new one".
> [SIOCSIFADDR] is a thing, although it targets a socket on the interface, not the interface itself.
If we're cheating by using file descriptors instead of files, then you could go all the way and use AF_NETLINK sockets; they even allow you to set up routing or the firewall. That's kind of not the point.
And then we go back to processes, System V-like IPC (shmget()/semget()/msgget()), and the network configuration subsystem (routing, firewall, bringing interfaces up or down, and the like; netlink is only a recent development, after 2000).
Well, the mantra is of course an aspiration. In practice things are not always done that way, especially in the past. SysV ipc is hopefully a thing of the past though.
The power management system should be integrated with the file system so you can turn the power off by going "rm /dev/power". And also with the desktop so you can drag the power into the garbage can.
PS: jokes aside, I do believe that if the filesystem-as-machine-representation were enforced and designed with good sense and the average Joe in mind, you'd probably not even need man (pardon the hyperbole)
No, you can't say the same thing. You open it from a shell script, you literally do get something from it.
In the email you quote, Linus gives two reasons why futex and sockets should not be files:
> there's absolutely no point in opening /dev/futex from a shell script
or similar, because you don't get anything from it.
> Perhaps because you cannot enumerate sockets and pipes?
It stands to reason that the converse of those _are_ reasons to make things a file:
I can open /dev/input/$X and get something from it. The normal thing to do is open a file under /dev/input/ and read data from it. No, we don't usually do that directly, because libinput does it for us; but that's what it's doing. As an exception, if code deals with joysticks, it seems to me that it's common to open /dev/input/js$X yourself.
I can list the entries in /dev/input/ and enumerate the input devices attached to the system.
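Concretely, via the evdev interface (the device path is illustrative, error handling elided):

    #include <fcntl.h>
    #include <linux/input.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Open an input device like any other file... */
        int fd = open("/dev/input/event0", O_RDONLY);
        struct input_event ev;
        /* ...and read fixed-size event records from it. */
        while (read(fd, &ev, sizeof(ev)) == sizeof(ev))
            printf("type=%u code=%u value=%d\n", ev.type, ev.code, ev.value);
        return 0;
    }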
I would be terrified to ever try and contribute anything to Linux, lest Linus yell at me and call me stupid. Seems like he does that in every excerpt that gets posted here.
> Seems like he does that in every excerpt that gets posted here.
That's because only messages in which he is yelling at someone get posted here. AFAIK, most messages he posts are neutral, but these are boring, so they don't get posted here. I just scrolled through LKML and found a random example of his normal tone (https://www.spinics.net/lists/kernel/msg2852881.html):
"So none of the patches looked scary to me, but then, neither did earlier versions.
It's the testing that worries me most. Pretty much no developers run 32-bit any more, and I'd be most worried about the odd interactions that might be hw-specific. Some crazy EFI mapping setup or the similar odd case that simply requires a particular configuration or setup.
But I guess those issues will never be found until we just spring this all on the unsuspecting public.
I don't know about you, but the organizations/people that I work with would not tolerate repeated yelling/anger at people trying to do their jobs, or in this case, contribute to a free community project, let alone to the point of becoming a stain on the entire project and a frequent topic of discussion around the world.
You're quite right that Mr. Torvalds is not in some permanent state of rage and does not respond inappropriately to all, maybe even most, messages on the lkml. But the fact remains that he and thus his project have a serious problem with hostility. I could understand flipping out on someone who was paid to do a job and didn't even try, someone who was deliberately introducing bugs, etc. But a simple disagreement between highly intelligent and dedicated FOSS developers never calls for this sophomoric behavior.
The few cases I've seen posted seem to be exactly about someone deliberately inserting bad code and MOST work is done by people who ARE in fact being paid for this mediocre work AND know better.
There is nothing fundamentally wrong with violent disagreement about technical issues. As someone who has been on the pointy end of the stick, you can't take it personally; it's debate with people who have very strong feelings about highly technical issues.
There's a form of survivorship bias at play there..."Linus responds sensibly" doesn't normally get posted to HN or anywhere beyond the actual forum where he responded, for that matter.
Note that he only yells at people who have a long track record of submitting patches, so he already has an established relationship with them. From what I hear repeatedly, he's polite and helpful towards newcomers.
Linus is running one of the largest open source projects, he has probably communicated millions of times, everything is out in the open, but it's only the handful of times he loses his cool that get posted here, stripped of context: without all his previous attempts to fix the issue, without the people being difficult or obtuse.
This kind of out of context discussion that gives rise to comments like yours seems far worse than anything he has said.
How many CEOs have managed companies out in the open for 15 years plus without ever losing their cool? There is no way to know as there is no transparency.
This is the open source equivalent of "We need a distributed system with sharded data replication to three regions in case we hit a billion users." If I ever get yelled at by Linus about anything, I'll consider it a great success.
Having been subscribed to lkml for years and generally followed the discussions at least at a casual level, this really isn't an accurate characterization of how Linus treats people in general.
However, your attitude is kind of what Linus intends to instill in those easily deterred by such behavior. He's not interested in your contributions if you're going to be a thin-skinned child about the development process and ongoing maintenance. Your contributions won't be perfect, and he's not interested in having to waste time beating around the bush in telling you so.
Yeah, and when Linus gets put to pasture and betrization [] becomes mandatory, we can all stop laughing. Probably laughing won't be allowed either, anyway, too aggressive.
I appreciate the work Linus has done a great deal. Clearly he's a gifted individual, but he does come across a bit aggressive in his communication at times! It can be funny, but also somewhat offensive.
As a BDFL, it's the only way to survive, unless you're on sedatives your whole life. You very frequently come across ideas that are so unexplainably stupid that the only way to fix them is to stop them in their tracks. Opening a friendly debate about the advantages and disadvantages is just a huge waste of multiple hours, because you'll never end up convincing people that their pet ideas/projects should go back to square one (or square zero in programming). His aggression has turned contributors away from the project and made negative relations, but the end goal is to maintain the Linux operating system, and his aggression, strictness, and blows to self esteem have proven 10x more effective than soft discussions.
Think about it this way: A dictator has to be extremely assertive because if he said "No, I don't want that in Linux" with no basis, he'd eventually lose his dictatorship power as people would form their own "unions" to try to change things, making several directionless forks of Linux. If he instead said, "This is a completely idiotic approach and anyone that thinks it's okay is braindead," nobody would question it and most people would just take his word for it. Most heated debates have an equal number of supporters for each side, and it takes someone with a strong voice to remind them that one decision must be chosen regardless.
You skipped the middle ground between "no basis" and "ad hominem is the only basis".
Lying about things and people is a bad way to force people to accept decisions. Calling someone or something stupid isn't a technical argument, it's a poorly worded conclusion.
Rusty Russell and Linus had a long working relationship at that point and Rusty was quite amused by that line. He quoted it in the talk he gave on futexes.
Hypothesis: the reason Linus Torvalds is still BDFL for Linux, but Guido van Rossum recently announced he is stepping down as BDFL of Python, is that GvR is just a lot nicer as a human being. Being a BDFL either requires getting a lot of abuse and not responding in kind, or being such an irascible boor that it doesn't bother you.
Just a hypothesis; I've never met either one and never been a BDFL either.
I've always wondered what sad life Linus leads to make him communicate like every developer I've worked with that I don't want to work with. Clearly he has deeper issues that lead to his blooming sarcasm and condescension, that our community already struggles with. This behavior is not normal, nor sustainable.
Yikes, I'm disturbed that occasional outbursts of righteous anger would make someone start assuming serious psychological problems. That's concerning to me - and I do not think I would be comfortable working with people who are so deeply offended by an angry email.
God help you if you'd ever grown up working on farm equipment.
We were all 26 once. Professionalism builds successful teams long term. Calling out toxic behavior that worsens things, like others' imposter syndrome, is part of how we help things get better.
Some of the other comments here have mentioned, but it bears repeating: this is not the "normal" Linus. He doesn't rant 100% of the time, but when he does, you had better pay attention because it must be something very important.
I don't understand this justification. If someone I worked with had a habit of "occasionally" insulting people, they should be fired. Just because we can't fire Linus doesn't mean what he does is okay.
Some people are in the 99.99th percentile of being irritating and proposing bad ideas, such that virtually nobody on earth has enough diplomacy to stifle their temper when having to deal with them for extended periods.
Most people never have to deal with enough people to regularly be forced into extended contact with this particular kind of person. The kind of person who is forced to deal with enough people that they will inevitably, eventually have to work with such people is what we usually refer to as "a politician." But that's not quite accurate.
See, elected politicians learn to hide their distaste for such people, and that's how we get our sense of what it means to "act like a politician." But founders, kings, and other such benevolent-dictator-for-life are still politicians—they have to hold court taking the full spectrum of opinions from the brilliant to the idiotic—but they don't usually bother to hide this distaste.
We have no equivalent word for someone forced to do what a politician does, but without the same incentives to put on a face and pretend they aren't hating some parts of their job.
No, just the opposite: we fire (i.e. not re-elect) people with power for being assholes whenever we can, even if they're perfect at their jobs otherwise and their attitude isn't actually hurting anybody (including the people they are assholes toward—usually those people got to where they were in life, proposing horrible ideas like they do over and over and over again, by being completely immune and oblivious to criticism).
Hypothesis: governments and other such bodies could be a lot better-run if we just had to choose the least charismatic/diplomatic qualified person possible, because then that lack of charisma would prevent them from swaying us with anything besides facts; and then, later, nothing they could do could make anyone like them any less than they already do, so they wouldn’t have to worry about maintaining their image getting in the way of doing their job.
His issues are pretty clear. A constant stream of (and let's be frank here) bullshit he has clearly outlawed in the project.
At the end of the day, most of the angry rants are about high level or influential people breaking kernel mantras. Don't break them, especially when you should know better. I'm impressed by his level of self control honestly, when I imagine the constant stream of horse shit he probably has to deal with on the daily.
One thing though is that the “nice” maintainers of popular open source projects burn out (see Guido’s recent retirement as BDFL). Linus seems to be going strong. I would rather have a mean, effective Linus than a nice, burned out Linus.
Yes, this is absolutely the reason. Open-source software maintenance burns out people daily, and most can only tolerate <4 years. 20+ years is godlike, and can only be achieved if you have the world's longest patience, or correctly direct your madness at exactly the people who caused it. It's the only way. All maintainers do it to some extent, depending on how far separated they are from their contributors. Linus works very closely with them, which is why he experiences so much BS firsthand.