Maybe Linux will get scheduler activations in the near future, another OS feature from the 90s that ended up in Solaris and more or less nowhere else. "Let my user space thread scheduler do its work!"
We talked about adding that to Linux in the 90s too. A simple, small scheduler-hook system call that would allow userspace to cover different asynchronous I/O scheduling cases efficiently.
The sort of thing Go and Rust runtimes try to approximate in a hackish way nowadays. They would both be improved by an appropriate scheduler-activation hook.
Back then the idea didn't gain support. It needed a champion, and nobody cared enough. It seemed unnecessary, complicated. What was done instead seemed to be driven by interests that focused on one kind of task or another, e.g. networking or databases.
It doesn't help that many people's understanding of performance around asynchronous I/O, stackless and stackful coroutines, userspace-kernel interactions, CPU-hardware interactions and so on is not particularly deep. For example, I've met a few people who argued that "async-await" is the modern and faster alternative to threads in every scenario, except for needing N threads to use N CPU cores. But that is far from correct. Stackful coroutines doing blocking I/O with complex logic (such as filesystems) are lighter than async-await coroutines doing the same thing, and "heavy" fair scheduling can improve throughput and latency statistics over naive queueing.
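To make the latency half of that last point concrete, here is a toy sketch, not a benchmark. It assumes a single CPU, made-up job sizes in abstract time units, and zero context-switch cost (which real schedulers do not have). The function names are hypothetical. It compares run-to-completion FIFO queueing with a simple round-robin "fair" scheduler.

```python
from collections import deque

def fifo_completion_times(sizes):
    """Naive queueing: run each job to completion in arrival order."""
    t = 0
    done = []
    for size in sizes:
        t += size
        done.append(t)
    return done

def round_robin_completion_times(sizes, quantum=1):
    """Fair scheduling: each job gets `quantum` units per turn.
    Assumption: switching between jobs costs nothing."""
    remaining = list(sizes)
    done = [0] * len(sizes)
    queue = deque(range(len(sizes)))
    t = 0
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])
        t += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)   # not finished: back of the line
        else:
            done[i] = t       # record completion time
    return done

def mean(xs):
    return sum(xs) / len(xs)

# One long job arrives just ahead of three short ones.
sizes = [100, 1, 1, 1]
print(mean(fifo_completion_times(sizes)))         # 101.5
print(mean(round_robin_completion_times(sizes)))  # 28.0
```

Total work finishes at the same time in both cases, so this only illustrates the latency statistics: the short jobs no longer wait behind the long one. Whether throughput improves too depends on switching costs and I/O overlap, which this sketch deliberately leaves out.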
It's exciting to see efficient userspace-kernel I/O scheduling getting attention, and getting better over the years. Kudos to the implementors.
But it's also kind of depressing that things that were on the table 20-25 years ago take this long to be evaluated. It's almost as if economics and personal situations govern progress much more than knowledge and ideas...
Actually, I think the biggest obstacle is that as cool as scheduler activations are, it turns out that not many applications are really in a position to benefit from them. The ones that can benefit found other ways ("workarounds") to address the fact that the kernel scheduler can't know which user space thread to run. They did so because it was important to them.
There are already plans for a new futex-based swap_to primitive, for improving userland thread scheduling capabilities. There was some work done on it last year, but it was rejected on LKML. At this rate, it looks like it will not move forward until the new futex2 syscall is in place, since the original API is showing its age.
So, it will probably happen Soon™, but you're probably still ~2 years out before you can reliably depend on it, I'd say.
The kernel wakes up the user space scheduler when it decides to put the process onto a CPU. The user space scheduler decides which user space thread executes in the kernel thread context that it runs in, and does a user space thread switch (not a full context switch) to it. It's a combination of kernel threads and user space (aka "green") threads.
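A rough way to picture that flow is the toy simulation below. It is only a sketch: Python generators stand in for user space threads, `fake_kernel` stands in for the kernel deciding to give the process CPU time, and `on_cpu_granted` is a hypothetical name for the upcall into the user space scheduler. None of this is a real kernel interface.

```python
from collections import deque

def worker(name, steps, log):
    """A user space thread: does `steps` slices of work, yielding after each."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the user space scheduler

class UserScheduler:
    """User space scheduler living inside one kernel thread context."""
    def __init__(self):
        self.ready = deque()

    def spawn(self, thread):
        self.ready.append(thread)

    def on_cpu_granted(self):
        """Hypothetical upcall: the 'kernel' has put the process on a CPU.
        Pick a user thread and switch to it. Returns False when idle."""
        if not self.ready:
            return False
        thread = self.ready.popleft()
        try:
            next(thread)               # user space switch, no kernel involved
        except StopIteration:
            return True                # thread finished, drop it
        self.ready.append(thread)      # still runnable, requeue it
        return True

def fake_kernel(scheduler, max_slices):
    """Stand-in for the kernel: keeps granting CPU time until the process idles."""
    slices = 0
    while slices < max_slices and scheduler.on_cpu_granted():
        slices += 1
    return slices

def demo():
    log = []
    sched = UserScheduler()
    sched.spawn(worker("A", 2, log))
    sched.spawn(worker("B", 1, log))
    fake_kernel(sched, max_slices=10)
    return log

print(demo())  # ['A:0', 'B:0', 'A:1']
```

The point of the split is that the policy (which user thread runs next) lives entirely in `UserScheduler`, while the "kernel" only decides when the process gets a CPU at all.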
It might! I'm not sure if it's an exact fit or not, but the User Managed Concurrency Groups work[1] that Google is trying to upstream along with their Fibers userland-scheduling library sounds like it could be a match, and perhaps it could get the upstreaming it's seeking.
I think some of the *BSDs have (or had) it. Linux almost got it at the turn of the millennium, with the Next Generation Posix Threading project, but then the much simpler and faster NPTL won.