Writing an OS in Rust: Async/Await (phil-opp.com)
410 points by phil-opp on March 30, 2020 | 103 comments



Is async/await a good idea for an OS kernel, even a toy one? Cooperative multitasking tends to break down at scale, because the probability that all the "threads" you're cooperating with are playing nice goes to zero as the number of threads increases. An OS tends to concentrate the pathological cases, since it deals with hardware and all the other hardest timing & concurrency problems.

It's a viable option for user-space programs because you can far more tightly characterize them and program them from top to bottom to work nicely with that paradigm. Embedding islands of cooperative multitasking in a sea of pre-emptive multitasking seems to make a lot more sense than the other way around.

However, this post is a question, not a statement. If, for example, a Linux kernel developer posts "nah, it's no biggie, the Linux kernel is effectively structured the same", I would not quibble.


There are a few different pieces to your question.

Async/await is a programming paradigm that moves the state management of an asynchronous program into the compiler rather than being explicitly managed by the user. The paradigm is almost universally a win: if you don't want to write asynchronous code, just don't use it; it doesn't impact your code at all.

A second question is how the language actually drives the execution of asynchronous tasks. Most of the languages that have added async/await have incorporated it into their execution model, so you are forced to use the particular semantics of the language. There are subtle, but very important, differences here: for example, when you actually provide the answer that resolves the await, do you execute the waiting code immediately, or schedule it for a future iteration? This is what can cause the most problems with async/await, but Rust does not define any executors [1], so it's not an issue here.

The final question is whether asynchronous code makes sense at all for an OS kernel. I'd argue that the answer is very much yes: communication with devices is almost invariably asynchronous (you set some memory lines to tell the device to do something, and get an interrupt telling you it's arrived). Particularly for the filesystem, where there are both synchronous and asynchronous system calls for the same code, a good executor for asynchronous code could well be beneficial for simplifying code paths. (Note that filesystems also generally have some degree of task dependency requirements--you need to ensure that the various requests happen in a particular partial order).

Now do note that this requires a far more complex executor model than most async/await toolkits (including this tutorial) provide.

[1] Logically, it would make sense to provide a utility function that synchronously runs an async function, but this isn't implemented yet.


"Now do note that this requires a far more complex executor model than most async/await toolkits (including this tutorial) provide."

I'm assuming the async/await language support in Rust is intrinsically tied to a cooperative approach, which is the part I'm questioning. Obviously a kernel needs to be able to generically work in an asynchronous fashion... the question is, is this asynchronous fashion appropriate?

If the Rust async/await support is indeed so flexible that with some work you could tie it together with a pre-emptive runtime of your own custom creation (as would make sense for an OS kernel), or some intermediate support that may not be "pre-emptive" but would be fully immune by-construction to the sort of issues I'm worried about, then, yes, by all means use it.

But a fundamentally cooperative kernel would really worry me. You'd be looking at freezing the whole kernel, and likely the whole machine, if anything, anywhere, goes wrong with the cooperation. It won't take a lot of scale before that's a problem.

And that's the worst case. Kernels strike me as very likely to also have to deal with the starvation situations that cooperative multitasking can have, where one "task" never gives up control. Even if this doesn't freeze the machine, this can cause significant, human-visible latency issues to emerge.


The Rust async/await amounts to the following:

* Magic keywords (i.e., async and await) to manage the state machine aspect of the function for you.

* A standard interface for describing an asynchronous task (core::future::Future). The magic keywords create an implementation of this interface.

* An interface for saying that task progress can be made (core::task::Waker).

That's it. There is no runtime provided, not even optionally, in the standard library (let alone core). A fair amount of this post is giving instructions for how to build the runtime to execute asynchronous tasks.

One important thing to note is that Rust uses a polling-based approach to implement tasks. That is, the core interface to a task is to run it and have it return the result or indicate that it is waiting on something else. In the latter case, there is a callback mechanism (the Waker object) that the task is responsible for calling when it has more information to make progress--and the caller is responsible for calling the task again when that happens. Many (most?) other implementations instead use a resolution model, where the caller provides a function that is called with the result, and when the function gets called (particularly in the case of tasks that can execute quickly) differs from implementation to implementation. Rust's model definitely maps much more easily to a kernel, since the dominant pattern will be to drain a buffer, poke a device, and then wait for an interrupt to happen (the interrupt handler calling the wake function).
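To make that last part concrete, here is a minimal sketch of a hand-written future driven by an interrupt. The names (InterruptFuture, on_interrupt) are made up, and std's Mutex stands in for whatever lock a kernel would actually use:

    use core::future::Future;
    use core::pin::Pin;
    use core::sync::atomic::{AtomicBool, Ordering};
    use core::task::{Context, Poll, Waker};
    use std::sync::Mutex;

    // Hypothetical state shared between the interrupt handler and the waiting task.
    static INTERRUPT_FIRED: AtomicBool = AtomicBool::new(false);
    static WAKER: Mutex<Option<Waker>> = Mutex::new(None);

    /// Resolves once the (hypothetical) device interrupt has fired.
    struct InterruptFuture;

    impl Future for InterruptFuture {
        type Output = ();

        fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
            // Register the waker first so a racing interrupt can't be missed,
            // then check whether the interrupt already fired.
            *WAKER.lock().unwrap() = Some(cx.waker().clone());
            if INTERRUPT_FIRED.swap(false, Ordering::AcqRel) {
                Poll::Ready(())
            } else {
                Poll::Pending
            }
        }
    }

    // Called from the interrupt handler: record completion and wake the task.
    // The executor then re-polls the future on its own schedule.
    fn on_interrupt() {
        INTERRUPT_FIRED.store(true, Ordering::Release);
        if let Some(waker) = WAKER.lock().unwrap().take() {
            waker.wake();
        }
    }

Inside an async fn, waiting for the device is then just InterruptFuture.await.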


> Rust's model definitely maps much more easily to a kernel, since the dominant pattern will be to drain a buffer, poke a device, and then wait for an interrupt to happen (the interrupt handler calling the wake function).

What is nice about Rust's model is that it prevents the continuation from running inline in the event handler which signaled the completion. That avoids a ton of reentrancy issues and questions around "where does my code actually run?". That also makes it interesting to use for interrupt handlers, since it is guaranteed that the code which waits for the interrupt runs purely in the executor's thread and will not accidentally take over the interrupt handler.


At a high level, your concern is that OSes should not use cooperative multitasking for OS Processes. i.e. using Rust's async-await as the only technology within the OS Scheduler would be a mistake.

The article isn't discussing schedulers, but async-await within the OS codebase in general. Setting the scheduler aside, interrupt handling and driver code, for example, is generally async in nature, and at a base level needs to be cooperative:

    while let Some(byte) = keyboard.interrupt().await {
        ...
    }
In general, certain drivers also need pre-emption; however, given that all of the code is written and curated by the kernel devs, they can ensure it cooperates well.


> I'm assuming the async/await language support in Rust is intrinsically tied to a cooperative approach, which is the part I'm questioning.

That assumption is incorrect. The blog post explains this in detail better than anyone could here - and if that does not suffice, the Rust book and the async book, as well as the Fuchsia kernel documentation, also all go into this.

> If the Rust async/await support is indeed so flexible

It is indeed even more flexible than that. You don't even need a run-time to drive progress. You can manually drive progress of async tasks by polling them throughout your code at specific points if you wanted - essentially interleaving the async processing of tasks with your main thread's task (and well, you can build any sort of executor you want, with different priority levels or whatever you feel like doing). I have an HPC app that does this to drive MPI non-blocking communication.
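A rough sketch of what that manual driving can look like -- not the article's executor, just a hand-rolled poll loop with a do-nothing waker:

    use core::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
    use std::future::Future;

    // A waker that does nothing: fine here, because we re-poll at points we
    // choose ourselves instead of waiting to be woken.
    fn noop_waker() -> Waker {
        fn clone(_: *const ()) -> RawWaker {
            RawWaker::new(core::ptr::null(), &VTABLE)
        }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        unsafe { Waker::from_raw(RawWaker::new(core::ptr::null(), &VTABLE)) }
    }

    fn main() {
        // Some async work to interleave with the main thread's own work.
        let mut task = Box::pin(async { 1 + 2 });

        let waker = noop_waker();
        let mut cx = Context::from_waker(&waker);

        loop {
            match task.as_mut().poll(&mut cx) {
                Poll::Ready(v) => {
                    println!("task finished with {}", v);
                    break;
                }
                // Not done yet: do other main-thread work (e.g. progress
                // non-blocking MPI requests), then poll again.
                Poll::Pending => {}
            }
        }
    }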

The blog post explains all of this. It's pointless for you to invest a lot of your and others' time into speculations built on top of incorrect assumptions, when the first 2 minutes of actually reading the blog post clarify this for you.


> But a fundamentally cooperative kernel would really worry me. You'd be looking at freezing the whole kernel, and likely the whole machine, if anything, anywhere, goes wrong with the cooperation. It won't take a lot of scale before that's a problem.

This is already the case, isn’t it? If you do a ‘while(!operation_complete) {}’ anywhere in the kernel and the operation never completes, then I’d expect the kernel is pretty much done.

Whereas with async/await it’d be more likely that the coroutine would simply never resume. It might wind up being a memory leak, yeah, but not freeze your system.


The analogous case with async functions would be a long-running computation that does not contain any yield point, so it cannot be suspended. I remember that it is possible to construct (contrived) examples of this in Erlang as well.


> If the Rust async/await support is indeed so flexible that with some work you could tie it together with a pre-emptive runtime of your own custom creation

I think that's exactly what it is. AIUI it's basically an API for async/await (and maybe a default runtime?) that different back-ends can be plugged into. At least that's what I've come away with over the months of reading random bits about it here.


As described in the blog post, the Rust compiler generates a state machine for every async function and generates the appropriate poll method for the state transitions. This is fundamentally at odds with preemption, which would then have to introduce new intermediate states into the state machine, which it won't be able to do at runtime.


But if Rust is the operating system/kernel, whenever it decides to schedule something is preemption for anything downstream, right?

I mean, you don't actually use preemption in the kernel right? Don't you have to handle all that yourself, since there's nothing higher level than you to handle it for you? In that respect, doesn't plugging in a Futures runtime that looks for device flags and stuff as appropriate and flags tasks to wake up/finish accomplish the same thing? (those are actual questions, not leading statements)


If you wrote a basic scheduler this way, at some point you'd have to await the userspace code, but you wouldn't have any way to force it to stop running. If the userspace code entered an infinite loop, it would hold the kernel thread forever. Within a constrained environment, e.g. the kernel itself (and even that's sufficiently complex with loadable drivers that you might end up with bad interactions), I could see some use for async/await, but you'd still need to be able to preempt untrusted code.


As far as I understand the scheduler and userspace processes are going to be completely orthogonal features to this. In general would it be complex[1] in this case to integrate both preemptive and cooperative multitasking?

[1] as in: http://journal.stuffwithstuff.com/2015/02/01/what-color-is-y...


> Is async/await a good idea for an OS kernel, even a toy one? Cooperative multitasking tends to break down at scale,

> Embedding islands of cooperative multitasking in a sea of pre-emptive multitasking seems to make a lot more sense than the other way around.

In my reading of the post, the proposal is to do cooperative multitasking within the kernel, and preemptive multitasking between the kernel and userspace. I think this is tenable: within the kernel, you pretty much need to play nice with the rest of the kernel anyway. Most kernels don't have effective protection against one part of the kernel causing problems for the rest, although some do have limited protection against drivers that crash or loop.


Well... there is CONFIG_PREEMPT and the even broader CONFIG_PREEMPT_RT in Linux [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...


> Is async/await a good idea for an OS kernel, even a toy one?

You might want to look at Joe Duffy's posts about Microsoft's Midori OS. In particular the post "Asynchronous Everything":

http://joeduffyblog.com/2015/11/03/blogging-about-midori/

Their findings sound pretty compelling to me. Personally I'm convinced that eventually we'll see much more system-level use of async/await style mechanisms. Perhaps with Rust, perhaps with C++ once coroutines land in C++20.


You can already use C++ coroutines with C++/WinRT, if you feel like having a go at them.


An OS has many aspects to it. Scheduling tasks is only one of the aspects. Async/await is just the interface/mechanism in dealing with the tasks, in this case cooperative tasks.

Interacting with hardware, dealing with interrupts, memory mapping/managing, task isolation, etc are all other aspects of an OS that are apart from task scheduling but still needed.

Cooperative multitasking works fine as long as the users/developers understand the limitation.


"Cooperative multitasking works fine as long as the users/developers understand the limitation."

The problem is, we basically already know that is not the case. We've got years of experience with what a "cooperatively multitasked" OS looks like, and it was not a "works fine" situation. You can't understand the limitations at scale, as they interact with each other far, far beyond the human mind's capacity to understand.


All I/O is asynchronous. Or rather, synchronous I/O is a subset of asynchronous. Often, in kernel I/O systems, you are communicating with, say, a disk controller that will take some time to process your request. You will want to do other stuff while the device finishes and process the results only when the device signals success or failure.


Async and sync I/O are not subsets of each other, although you can use one to implement the other.

The issue OP is discussing is cooperative vs. preemptive, not async vs. sync.


> Embedding islands of cooperative multitasking in a sea of pre-emptive multitasking seems to make a lot more sense than the other way around.

Pretty much my opinion. I think a preemptive threaded environment where some threads might run multiple async tasks (in order to boost efficiency) can make a lot of sense.

Pure cooperative environments seem too error-prone for bigger systems, because they lack the necessary isolation between tasks. Any bad task can degrade the performance of all other tasks by blocking the execution thread.


True for a general-purpose OS kernel. But in an embedded system or unikernel it could make a lot of sense.


Also in a microkernel architecture possibly; if the scheduling were cooperative within each process and preemptive between processes that would look like it could make everyone here happy.


Cooperatively scheduling across your other isolation boundaries is a pain point, but it's a great architecture within a single context (i.e. within the kernel, or within a single process).

Linux has a lot of cruft and isn't as async as some would like, but NT is pretty async underneath it all (although they still allow you to do synchronous work as well).


I'm not a Linux kernel developer, but it does have a few similar mechanisms AFAIU. If a driver needs to perform more work in response to an interrupt than would fit into the actual handler, it can use cooperatively scheduled "tasklets" or "softirqs."


Probably not at all related to the OP. But I found the thought entertaining, so offering it as such.

One way to address your concern would be to guarantee that tasks always terminate within a bounded number of instructions.

One way to do that would be to offer a limited buffer in which the program instructions for the task can be expressed, and a language to express them in that is strongly normalizing (think something like Morte).


The Linux kernel makes a somewhat similar distinction between code that's executing in an atomic context and code that isn't. Atomic code can get stuck in an infinite loop or whatever, but what it isn't allowed to do is to go to sleep (i.e. put itself into a sleeping state and invoke the scheduler).

Atomic tasks are generally expected to do a finite amount of work that can be finished in a short time without waiting for anything else (except maybe a spinlock, which should be held by some other similarly-atomic task that will finish in a short time).


The seL4 microkernel, written in C, has formal proofs of the maximum clock cycles for all of its syscalls. That's one way they manage to not need in-kernel preemption.


Yes, Active Oberon already had it in the mid 90's.

Symbian also had it, both called them active objects.


>Cooperative multitasking tends to break down at scale, because the probability that all the "threads" you're cooperating with are playing nice goes to zero as the number of threads increases.

That's completely wrong. Either cooperative multitasking works or it doesn't. There is no probability involved. What often happens is that foreign code that is not under your control doesn't yield. It only takes a single non-cooperating task to block an entire core. If you have properly written code, the probability of a malfunction is 0%, and this probability doesn't grow with the number of threads.


You can have a kernel thread pool dedicated to one kernel feature. This way, only tasks created by that feature will run on the pool and compete with each other.


This is also just a fantastic introduction to async/await in Rust, regardless of the OS bits. Another amazing article by Phil.

> The only requirement is that we use at least nightly 2020-03-25 of Rust because async/await was not no_std compatible before.

There's some fun stuff here that's omitted (which makes sense, of course). It was always a design constraint of async/await in Rust that you could use it without the standard library. However, the initial implementation required thread local storage. The Future trait's poll method took a &mut Context, and generators didn't support passing context into them when resuming. This meant that the context would be placed in TLS, and the generated code would pull it back out when needed. Generators are unstable, partially for this reason. However, this was fixed, and that means TLS is no longer a requirement here! See https://github.com/rust-lang/rust/pull/68524 for more details.


Is it possible to context switch between userspace processes directly, without going through the kernel, i.e. a kind of fast, inter-process cooperative multitasking? I know earlier operating systems used inter-process cooperative multitasking, but I'm guessing they still went through the scheduler?

I was trying to figure out if QNX does this with `MsgSend`, as QNX is renowned for being a fast microkernel, but it wasn't clear to me. According to Animats, "There's no scheduling delay; the control transfer happens immediately, almost like a coroutine. There's no CPU switch, so the data that's being sent is in the cache the service process will need." [1] According to wikipedia, "If the receiving process is waiting for the message, control of the CPU is transferred at the same time, without a pass through the CPU scheduler." [2]

It seems like `MsgSend` circumvents the scheduler, but does it circumvent context switching to the kernel entirely too?

1: https://news.ycombinator.com/item?id=9872640

2: https://en.wikipedia.org/wiki/QNX#Technology


It's not possible without specific hardware support.

Context-switching on most current platforms with virtual memory requires fiddling with things like page mappings, and that simply must be done from supervisor mode.

The ability to switch tasks in hardware has existed on some historical machines. For example the GEC 4000 series basically implemented a microkernel, including the scheduler, in hardware. Switching to another process was done with a single hardware instruction.

I don't think anything current has such features besides the long-obsolete and unused taskswitching feature on x86 processors.


No, it doesn't circumvent context switching to the kernel. With memory protection and separate address spaces, you generally need privileged execution to change page tables. However, the raw CPU cost of a syscall exception to the kernel followed by an exception return to userspace isn't actually all that high compared to all of the work that an OS usually does on top of it.


MsgSend() in QNX Neutrino does not circumvent the kernel, just the scheduler. If you're clever and lucky you can get an entire round trip to and from the service in a single timeslice. On the other hand, MsgSend() always blocks, so the call could be preempted depending on service priority or if there is an I/O wait or whatever, and that would mean the scheduler comes to play.


In the vaporware Mill CPU design, yes. On x86, afaik hard no.


I've shied away from async/await because I haven't seen a good writeup on how to make it deterministic. Come to think of it, all of the times I've encountered it, there was no way to analyze the codepaths and prove that every exceptional situation and failure mode was handled. Maybe I missed something along the way?

So my feeling about it is that it may turn out to be an evolutionary dead end, or an anti-pattern at the least. My gut feeling is that async/await is functionally equivalent to the synchronous/blocking/coroutine/channel system of other languages like Go. Could we write a thin wrapper that converts async/await to that, or vice versa?

This is the primary reason why I've stuck to synchronous/blocking PHP with all of its flaws instead of Node.js. I think this is a fundamental thing that shouldn't be glossed over and accepted so readily into other languages.


What do you mean by "deterministic" here?

> Come to think of it, all of the times I've encountered it, there was no way to analyze the codepaths and prove that every exceptional situation and failure mode was handled.

There is nothing special about async/await with regards to this in Rust, at least if I'm understanding you correctly. Async functions can return Results like any other function for recoverable errors, and can panic like any other function for un-recoverable errors.

> My gut feeling is that async/await is functionally equivalent to the the synchronous/blocking/coroutine/channel system of other languages like Go.

It depends on what level of abstraction you're talking about. For Rust, which cares a lot about details, they're not. I gave a talk comparing and contrasting all this stuff here: https://www.infoq.com/presentations/rust-2019/


Well, I'm coming from the Javascript side, where people use promises a lot for async, but it's almost impossible to trace execution in the debugger. It's usually immediately obvious to me that many exceptions and failure modes have been missed, but I find it difficult to reason about large promise chains and I usually have no idea where I would add more error handling. It tends towards a big ball of spaghetti.

Then if something does fail (wifi blips out without a retry, who knows) the page just hangs with no indication of what went wrong.

Contrast this with Unix-style synchronous blocking code piping data between threads with no shared state. Since every step of execution happens in a single thread, blocking until a pipe returns data, it's trivial to step through the code.

Async/await really becomes a problem in Javascript though because they made it a language keyword in Node.js. So you never know if you are dealing with a sync function or async function. Eventually the entire program has async in front of everything and ends up being written just exactly like sync blocking (but with superfluous syntactic sugar everywhere). So I question what was really gained there. IMHO it just doubles the workload in the mind since now we have to juggle both types of functions and explore those permutations. That's the last thing we need when we're trying to brainstorm a solution from a large possible search space.

I'm also looking at this from a one-shot functional programming perspective. I realize that sync blocking code blocks the main thread, which is usually the GUI rendering loop. There are ways around that though. The best solution I've seen so far is Apple's Grand Central Dispatch:

https://en.wikipedia.org/wiki/Grand_Central_Dispatch

https://www.raywenderlich.com/5370-grand-central-dispatch-tu...

https://www.swiftbysundell.com/articles/a-deep-dive-into-gra...

Basically it works by being sync blocking and providing simple ways to run closures on other threads. I find it much easier to debug than async/await though. Rust probably already has all of that functionality, so I don't understand the advantage of copying Javascript model with all of its caveats. I think what's happening is that people just want to run with the context they're already familiar with.


Ah, so, async/await, while conceptually similar, isn't the same across languages. Rust's semantics here are very different than JS's. I think Rust's implementation would address all of your issues, at least conceptually.

> I don't understand the advantage of copying Javascript model with all of its caveats. I think what's happening is that people just want to run with the context they're already familiar with.

With respect, I think it's because you don't understand it in Rust yet. It was not directly copied from JS, and was certainly not due to familiarity. Read the post! It's very good :)


There are two important things that are often conflated with async/await: the programming model, and the executor model.

The programming model of async/await is essentially syntactic sugar for a state machine, where each of the states is determined implicitly by the site of the call to await, and the ancillary storage of the state machine is managed implicitly by the compiler. It's basically no different from a generator pattern (and is usually implemented on top of the same infrastructure) [1].
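For instance, here is a rough hand-written version of what the compiler conceptually generates for an async fn with two awaits. The names are illustrative only; the real generated type is anonymous and also handles self-referential (non-Unpin) state via Pin, which this sketch sidesteps with Unpin bounds:

    use core::future::Future;
    use core::pin::Pin;
    use core::task::{Context, Poll};

    // Hand-desugared equivalent (simplified) of:
    //
    //     async fn read_two(a: A, b: B) -> (u8, u8) {
    //         let x = a.await;   // suspension point 1
    //         let y = b.await;   // suspension point 2
    //         (x, y)
    //     }
    //
    // Each `.await` becomes a state; locals that live across a suspension
    // point (here `x`) are stored in the state machine itself.
    struct ReadTwo<A, B> {
        a: A,
        b: B,
        state: State,
    }

    #[derive(Clone, Copy)]
    enum State {
        AwaitingA,
        AwaitingB { x: u8 },
        Done,
    }

    impl<A, B> Future for ReadTwo<A, B>
    where
        A: Future<Output = u8> + Unpin,
        B: Future<Output = u8> + Unpin,
    {
        type Output = (u8, u8);

        fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<(u8, u8)> {
            let this = self.get_mut();
            loop {
                match this.state {
                    State::AwaitingA => match Pin::new(&mut this.a).poll(cx) {
                        Poll::Ready(x) => this.state = State::AwaitingB { x },
                        Poll::Pending => return Poll::Pending,
                    },
                    State::AwaitingB { x } => match Pin::new(&mut this.b).poll(cx) {
                        Poll::Ready(y) => {
                            this.state = State::Done;
                            return Poll::Ready((x, y));
                        }
                        Poll::Pending => return Poll::Pending,
                    },
                    State::Done => panic!("polled after completion"),
                }
            }
        }
    }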

Since the result of calling an async function is a task object that doesn't do anything until you call methods on it (including implicitly via await), most languages integrate these task objects with their event model. This model is completely orthogonal to the feature, and often exists (at least in the library, if not the language) before async/await is added to the language. JS has a built-in event loop that predates async/await by several years, whether programming in client-side browsers or using something like Node.js. Rust is unusual in that it prescribes absolutely nothing for the executor model; all it defines in its standard library is the interface to drive the tasks manually.
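To make "drive the tasks manually" concrete, here is a deliberately naive, single-threaded executor sketch. It busy-polls tasks round-robin; a real executor (like the one the article builds) would use the Waker so it only re-polls tasks that have signaled readiness:

    use std::collections::VecDeque;
    use std::future::Future;
    use std::pin::Pin;
    use std::sync::Arc;
    use std::task::{Context, Poll, Wake, Waker};

    struct MiniExecutor {
        tasks: VecDeque<Pin<Box<dyn Future<Output = ()>>>>,
    }

    // With busy-polling, wake-ups carry no information, so the waker is a no-op.
    struct DummyWaker;
    impl Wake for DummyWaker {
        fn wake(self: Arc<Self>) {}
    }

    impl MiniExecutor {
        fn new() -> Self {
            MiniExecutor { tasks: VecDeque::new() }
        }

        fn spawn(&mut self, fut: impl Future<Output = ()> + 'static) {
            self.tasks.push_back(Box::pin(fut));
        }

        fn run(&mut self) {
            let waker: Waker = Arc::new(DummyWaker).into();
            let mut cx = Context::from_waker(&waker);
            while let Some(mut task) = self.tasks.pop_front() {
                match task.as_mut().poll(&mut cx) {
                    Poll::Ready(()) => {}                        // task is done
                    Poll::Pending => self.tasks.push_back(task), // retry later
                }
            }
        }
    }

    fn main() {
        let mut exec = MiniExecutor::new();
        exec.spawn(async { println!("hello from a task") });
        exec.run();
    }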

I doubt async/await is an evolutionary dead end. It's the third kind of coroutine to hit it big in mainstream languages, the first two being the zero-cost exception handling mechanism (i.e., try/catch) and the generator (i.e. yield). The biggest criticism of zero-cost exceptions essentially amounts to their types not being made explicit, but the design of async/await makes their effects on the type of coroutine quite apparent, so it's not going to run into the same problems.

[1] There is a difference in the pattern, which is that you're probably going to get back different kinds of objects implementing different interfaces, which is a bigger deal if you manually drive the interfaces. Structurally--that is, in terms of the actual machine code running--there is likely to be no difference.


Thank you, that was very insightful. I've felt that way about generators, but it never occurred to me that try/catch could be viewed similarly to a coroutine (because it's basically using goto to jump to alternate states too).

Edit: I forgot to mention that I came to your same conclusion a few years ago about async/await being equivalent to a state machine. It happened a different way for me, when I realized that a coroutine performs the same computation as a state machine, but is much easier to trace. I was working on a game, and made a rather large state machine with perhaps 20-50 states and some side effects where those states interacted with game state. But it simplified down to just a few lines of ordinary flow control code when I tried it as coroutines with message passing. So to me, async/await carries the burden of state machine complexity but loses the simplicity of coroutine (and sync/blocking obviously) traceability. So I just don't see a compelling reason to use it.


async/await-based code is deterministic until you deliberately introduce the ability for tasks to race. Look at Noether for a language design that explicitly makes those steps. Whereas if you have channels (and don't have some kind of single ownership) then you have nondeterministic races right from the start. So I rather doubt that any such thin wrapper could be formed.

Certainly as a human reader, Future-based code (which async/await is sugar for) is the easiest form of concurrency/parallelism to reason about: yes you have small tasks that might execute in parallel, but all your functions still return values, function composition still works the way you'd expect, there's no way for parallel tasks to silently interfere with each other in some hidden way. The control flow is still what it looks like (unlike with general coroutines). Of course if you can afford to avoid parallelism entirely you probably should, but async/await is the least intrusive/most understandable way to manage it if you do need it.


A state machine's control flow is even easier to follow, although more verbose.


I disagree, I find state machines almost impossible to reason about. Has it already been through state X? Who knows. Will it come back to this state in the future? Maybe. Is there a route from state Y to state Z? Shrug. State machines are effectively goto writ large, and there's a reason we switched to structured programming.


Well-designed deterministic state machines are incredibly easy to reason about. You simply render them as a graph and at a glance you can understand every possible transition and state.

> Will it come back to this state in the future?

If there's a path from the current state back to itself and it gets the correct inputs, yes. Otherwise no.

> Is there a route from state Y to state Z?

A simple glance at the graph will suffice to answer this.

Of course poorly written code, regardless of the constructs, will be hard to reason about.


> You simply render them as a graph and at a glance you can understand every possible transition and state.

But programming by editing graphs never caught on. So the graph is always going to be a secondary representation of your program; it won't be the natural view in your debugger or profiler (if it's even visible at all).

And more importantly, no-one's found a really compositional approach to programming with graphs yet. In conventional programming you can define an interface that exposes one function, and maybe you have an incredibly complex implementation behind that interface, but you can reason about the rest of your program without having to look inside that black box, even as the implementation changes drastically. I've never seen the same approach applied to state machines: maybe you know that this cluster of states is independent now, but anyone can add a new state transition that changes the global control flow willy-nilly.


I wish I could upvote you 100. State machines are fine up to some number of states, on the order of 10. Beyond that, they become incredibly difficult to reason about in real programs because each state may have side effects with program state.

What ends up happening is that exceptional situations happen and the state machine becomes muddled with special cases. So for example, say you're writing a networked game of Tetris or Pacman, something with a relatively small state machine. You get all done but realize that you forgot to handle people joining or leaving the game, or the game closing or jumping ahead when a remote player beats the level and the game needs to go back to the title screen. The state machine starts getting all of these checks interspersed with the game logic.

Typically this doesn't happen with sync/blocking code though. You write an ordinary main loop, adding checks for these exceptional situations just like any other event you watch for. In my experience, the sync/blocking code is roughly an order of magnitude smaller and easier to reason about than a state machine. But more importantly, it can be scaled infinitely, because you can always add tracing or step through it in a debugger as you normally would. But a large state machine is just exactly what you described: a large switch command of gotos. You can't just look at it and know where the code came from or what conditions got it there. You have to derive that history by hand in your head.

And since async/await is more conceptually equivalent to a state machine than a coroutine or sync/blocking code, that severely limits how well we can reason about large promise chains. See my comment to jcranmer on this thread for a bit more insight about the internals of this:

https://news.ycombinator.com/item?id=22731043


> And since async/await is more conceptually equivalent to a state machine than a coroutine or sync/blocking code, that severely limits how well we can reason about large promise chains.

I have to disagree with that. You can implement async/await using state machines, just as you can implement a loop using goto, but it's a more constrained model that's much easier to reason about than a fully general state machine or a fully general coroutine. The control flow still proceeds the way you expect - you still execute code from top to bottom, you still return once from every function call - you just yield at some points in it. You're not interleaving your control flow with some other function that you communicate with (like a generator), and you're certainly not treating suspended control flow as something to be copied or passed around (like a coroutine). It's a much simpler concept.


What do you think about hierarchical state machine [1] ?

>Has it already been through state X?

Like everything else, logging

>Will it come back to this state in the future?

Mentioned in the comment before, there is a transition graph that you can do static analysis on, note that you might run into a halting problem

>Is there a route from state Y to state Z?

There is a transition graph

>State machines are effectively goto writ large...

quoting Prof Carl Hewitt of Actor Model fame, "goto is harmless"[2]; a function/procedure call is a goto, much like sending a named event while in a particular state with/without parameters

[1] https://en.wikipedia.org/wiki/UML_state_machine

[2] https://www.youtube.com/watch?v=7erJ1DV_Tlo&t=34m54s

note: edited for formatting


> What do you think about hierarchical state machine [1] ?

Would need to see what the implementation actually looked like and how it was worked on. The graphs may look nice, but I've never seen people program effectively by editing graphs.

> Mentioned in the comment before, there is a transition graph that you can do static analysis on, note that you might run into a halting problem

Sure, but the graph seems to be very much secondary. You don't step through the graph in a debugger, for example.

> quoting Prof Carl Hewitt of Actor Model, "goto is harmless"[2]

Not at all convinced, and he fails to make any real case for it. I've worked on actor systems, and they're bad in the same way that unstructured code is bad.

> function/procedure call is a goto,

It's not, because you still have the call stack. You know where you came from and why (and can see it in the debugger), and you know you're going back there. You can reason compositionally, because in g(f(x)) f is a black box and always will be even if it gets refactored, whereas there's no equivalent "locality" in a state machine.

> much like sending an named event while in particular state with/out parameters

That also sounds like a bad way of doing things.


Interesting points, thanks

> Would need to see what the implementation actually looked like and how it was worked on

Examples include (but are not limited to) Simulink, Rhapsody, Unreal Blueprint, and Qt SCXML

>The graphs may look nice, but I've never seen people program effectively by editing graphs

The graph mentioned is a state transition graph, not necessarily a visual graph, although that's a nice eventuality

> You don't step through the graph in a debugger, for example.

Yes you do, if using the mentioned example

Combining a state machine with the actor model has worked wonders for me. I have to agree with you on locality in a pure state machine, but if a state machine is defined as an Actor you get locality by definition. It's more of a deterministic way of abstract modelling, testing and execution


> I have to agree with you on locality in pure state machine, but if a state machine is defined as an Actor you get locality by definition.

Not really. You have a kind of binary version of locality - whether something is in the same actor or not - but you can't reason about any region that's smaller or larger than that. Incoming messages might arrive from anywhere, and emitted messages might cause any kind of effect anywhere (including other messages back to the sending actor). So you can't really understand any region larger than an actor without having to understand the whole system - and that's before we even get to passing actor addresses around.


What do you mean by functionally equivalent?

Channels are just queues plus methods for passing values. Rust-style futures are closer to continuations. Both are powerful and can accomplish neat tasks, but they're not ABI compatible or even used the same way.


> Channels are just queues and passing methods.

My understanding is that channels in `go` are zero-sized by default, and thus act as "rendezvous channels" that can be used to synchronize progress between threads.

This article compares Go and Rust threads and includes an example of how Go uses channels to synchronize. [0]

[0] - https://medium.com/@deckarep/paradigms-of-rust-for-the-go-de...
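For comparison on the Rust side, std's bounded sync_channel with capacity 0 also behaves as a rendezvous channel; a small sketch:

    use std::sync::mpsc::sync_channel;
    use std::thread;

    fn main() {
        // Capacity 0 makes this a rendezvous channel: send blocks until a
        // receiver is ready to take the value.
        let (tx, rx) = sync_channel::<u32>(0);
        let handle = thread::spawn(move || {
            tx.send(42).unwrap(); // blocks until the main thread calls recv
        });
        println!("got {}", rx.recv().unwrap());
        handle.join().unwrap();
    }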


Async or parallel systems are deterministic when the multiple paths don't cross, and, when they do cross, the crossing points have well-defined, deterministic steps and interactions.


This is a huge issue when trying to fuzz async code also.


Dropbox does, essentially, fuzzing of Rust async code and it is deterministic.

I couldn't find a post about it, sadly, but I know the information is out there somewhere.


source? Can't find anything on that either


https://dropbox.tech/infrastructure/rewriting-the-heart-of-o...

> The Control thread is designed to be entirely deterministic when its inputs and scheduling decisions are fixed. We use this property to fuzz it with pseudorandom simulation testing. With a seed for our random number generator, we can generate random initial filesystem state, schedules, and system perturbations and let the engine run to completion. Then, if we fail any of our sync correctness checks, we can always reproduce the bug from the original seed. We run millions of scenarios every day in our testing infrastructure.


I am no expert, and I wish there were a start-from-zero guide with examples not involving external crates, but my sense is that you can write your own executor and do whatever you want. (Maybe I'm wrong?)


You're not wrong: that is basically what this article is, and it includes writing an executor!


The linked article finally helped me understand how Rust implements async. Give it a shot.


Aside from the cooperative multitasking discussion, I thoroughly enjoyed how you discussed the Rust async/await concepts in detail, and I learned a lot from it. I find your article provides a lot more value than the Rust documentation, which is unfortunate since that makes the learning curve somewhat difficult.


Great to hear that! I also found the official documentation a bit short on background information, so I decided to write my own explanation rather than link to something existing. Given that the async/await implementation is still quite young, I'm sure that the official docs will improve over time.


An OS with async/await sounds awfully similar to Windows 3.1 with its cooperative multitasking model.

What's old is new again ...


Well, there's a big difference:

> Using async/await, we now have basic support for cooperative multitasking in our kernel. While cooperative multitasking is very efficient, it leads to latency problems when individual tasks keep running for too long and thus prevent other tasks from running. For this reason, it makes sense to also add support for preemptive multitasking to our kernel.

> In the next post, we will introduce threads as the most common form of preemptive multitasking. In addition to resolving the problem of long running tasks, threads will also prepare us for utilizing multiple CPU cores and running untrusted user programs in the future.

It seems the plan here is to use this for internal work, and still give a preemptive multitasking API to userspace.


Isn't that going against the lesson learned from the past, though? Third-party drivers can't be trusted to behave, and with cooperative multitasking a misbehaving driver hangs the kernel.


I am not sure what Phil's overall kernel design is; you're assuming a monolithic kernel, but not every kernel has drivers in kernel space.


Fair point, looking forward to following along :)


Phil's series is meant for learning rather than production use. So sometimes he takes a "shortcut" in one post then removes the restrictions or workarounds later.


Toy OS, so he has less to worry about. But also maybe just be more careful around kernel modules, if that eventually exists?


GP is not wrong, then: with this implementation, async/await still amounts to the good old cooperative multitasking and does not (yet?) take advantage of threads. (In C#, for example, async/await is indeed different from this, being based on TAP - Task-based Asynchronous Pattern, where tasks are executed "asynchronously on a thread pool thread rather than synchronously on the main application thread.")


Async/await lets you build tasks, but is completely orthogonal to threads. You could set up tasks with a 1:1 or N:M or even single threaded model.

And yes, the GP is not wrong that if this were the only mechanism provided, it would be similar. It does not seem like the plan is to expose this externally whatsoever, though.


Didn't most old operating systems use cooperative multitasking? I remember, at least, that classic Mac OS (i.e. pre-OSX) didn't use preemptive multitasking, either.

Anyway, this SO answer[0] explains why early Linux, much like the hobby kernel in this article, used cooperative scheduling inside the kernel, and only preempted user-space stuff.

[0]: https://stackoverflow.com/a/16766562


Mac OS may have been a little late. Windows NT, OS/2, and Linux did preemptive multitasking since the beginning of the 90s; even AmigaOS had it in the 80s.


Mac OS badly needed preemptive multitasking (and memory protection!), and got it when the NeXT bits were transformed into Mac OS X.


Thanks for the link!

I don't understand why this would be different for a Rust based OS..


There is one significant difference: Windows did context-switches in the blocking calls and did not rely on the program code having "the right structure" needed for straight co-routines to work.

The difference between preemptive and cooperative multitasking is not whether you do full context switches, but whether there is a way to do a context switch at a point where the process does not expect it (i.e. by handling some kind of timer interrupt as a scheduling opportunity).


Does Rust allow a computational for-loop to be interrupted somehow?

Computation can also be viewed as a blocking operation.


The only yield points are .await points. If there's an .await in a loop, then sure, but otherwise no.


What is the cost of an .await point?

For this to work, perhaps Rust should cooperate too, inserting .await points at strategic places in the code, to keep cost low but still guaranteeing a certain responsiveness of the overall system.


That would explode the state machine size and make its performance characterization very hard.

And you would need to avoid having any kind of call to external code (or yield before and after every such call, and pray for the best).


If I understand the futures::pending!() macro correctly, one can use that to add a forced yield point in a loop, right?


Yes, it creates a future, and then immediately .awaits it.


That is an interesting point that I originally wanted to mention. There are systems that take a middle-ground approach: they use, as a scheduling opportunity, something that the program has to execute eventually if it is running indefinitely. Typically a backward jump, or something that allows you to do a backward jump (i.e. a function call). This is the case for Erlang, SOAR (which was recently mentioned on HN) and, interestingly enough, the Ericsson AXE, which is to a large extent an Erlang-like VM implemented in hardware (and which predates Erlang).


Well, I'm not well versed in the details, but I do wonder if cooperative multitasking isn't a better idea nowadays in the age of ubiquitous multicore.


As Steve’s sibling comment points out, this is only for the kernel. The article mentions preemptive threading for the user space as still being desirable.


I'm not convinced that's actually true though, is what I mean. The big drawback of cooperative is just that a single process can monopolize resources, which in the single-core days meant the system became unresponsive and you had no way to recover from it. That isn't really true now with multicore. As long as the OS has a core to work on, it can terminate or otherwise deal with misbehaving processes.

Again, this is based on my admittedly limited understanding, but I haven't yet seen good reasoning why pre-emptive is still preferred.


> Error: you cannot open another Chrome tab because all your cores are already used up by Slack and VSCode

You would probably still want to preempt, because you're not going to rewrite all the widely used software to actually yield.

Because that's the thing about cooperative multithreading, the participating parties need to cooperate.

And if you look at the number of threads that some software opens, it's just crazy. Looking at my machine, the top 5 are:

1. 260 threads System (ok, that's basically the OS that could be changed).

2. 145 threads Dropbox.exe (that's enough to lock all cores on any single consumer CPU).

3. 87 threads SearchIndexer.exe (also OS, could be rectified).

4. 74 threads EpicGamesLauncher (looks like an i9 is no longer enough)

5. 70 threads MemoryCompression (also OS)

I know not all of those threads are active at all times, and I guess all the blocking threads could be said to be cooperating, but even Dropbox alone regularly has 3-5 threads running.

There's still so many 2 and 4 core consumer machines out there that cooperative multi-tasking with the current software ecosystem would be a disaster.

Not only would applications keep each other from making regular steady progress but even single applications would regularly hang themselves from all the threads they themselves created if there was no pre-emptive yielding beyond using any OS-call as a yield-point


And what are those threads doing all the time?

On N-core, 2-way SMT hardware (which describes pretty much all consumer hardware), the maximum number of threads that can usefully do work in parallel is between N and 2 * N. Any more threads than that, and it must be the case that you expect those threads to spend a lot of time doing nothing to justify the cost of their creation.

There are two main categories that can justify that. The first is threads that are spending a lot of time stuck in blocking I/O--i.e., they're calling things that exist as yield-points in practical kernels anyways. The second is some sort of event processing loop, which looks kind of like this:

  while (!shutdown) {
    event_t *e = get_next_event(queue);
    if (!e) { sleep_until_work(queue); }
    else { execute(e); }
  }
The call in sleep_until_work already today uses a syscall to indicate that the thread shouldn't be scheduled. But even get_next_event can likely be trivially modified to use a syscall to add an event loop. Since multiple threads are able to enqueue events (else who would fill in work while you're sleeping?), you need some sort of threading library support to implement get_next_event correctly. Change the equivalent of pthread_cond_wait for your OS to introduce a yield point, and you'll have introduced yield points to the vast majority of applications.


Most programs hopefully do the sensible thing. But just the possibility of one browser tab spawning 8 web workers that don't do any IO but just busy-wait makes it clear that current implementations of common software don't play well with that model.

It might not lock-up your OS but it would lock up all of the userland and the kernel couldn't do anything about it, because it can't pre-empt. All it could do under that model is let the user kill the browser or just kill the browser itself automatically.

I think most people would prefer being able to use their PC for other things while transcoding a video or compiling a program at near the PCs full capability. Instead of sacrificing a whole core to the OS and then also having to wait longer for those tasks to finish, plus not being able to do anything else.

A model with less pre-emption is certainly possible, but current end-user software makes an approach without any pre-emption very inadvisable.


That does not follow. Long-running CPU-only tasks are a thing and they may not be aware of each other.

With cooperative multitasking, launching something like a parallel compilation would make your machine unusable.


It is true that it's possible to screw your machine over in that manner. I was pointing out that the large number of threads wasn't evidence in support of that fact, but rather the contrary; the threads have to already be using existing facilities that provide cooperative sleeping points.


The intention of the author's OS is to use cooperation in kernel space and preemption in user space so your entire comment makes absolutely no sense.


But it was a reply to someone suggesting to use cooperative multitasking for scheduling userspace processes. It does not apply to the article but it does apply to the comment it is replying to.


My understanding is that you've now moved the monopolized case from 1 to N, where N is the number of cores. N is usually pretty small.

The laptop I'm writing this on has four cores, and the Chrome instance has 17 threads running...


Just because a kernel is using something internally doesn't mean it is exposing it to user space applications.


Or MacOS up to and including OS9


great project for this lull.

i've been going through http://craftinginterpreters.com/ in rust (in order to learn rust) and it's a fantastic learning experience. the language is straightforward after you understand the basics (smart pointers and generics) and if you have a good ide with completion (CLion). lifetimes/borrow checker aren't as painful as people would have you believe if you use heap allocation (i.e. Box, Rc<RefCell>). now obviously heap allocation isn't optimal but for getting started it enables a smooth progression.



