I use it as the core cross-platform event loop layer for my terminal (https://mitchellh.com/ghostty). I still consider libxev an early, unstable project, but the terminal has been used daily by hundreds, and now over a thousand, beta testers for over a year, so at least for that use case it's very stable. :) I know of others using it in production shipped software, but use it at your own risk.
As background, my terminal previously used libuv (the Node.js core event loop library), and I think libuv is a great project! I still have those Zig bindings available (archived) if anyone is interested: https://github.com/mitchellh/zig-libuv
The main issue I had personally with libuv was that I was noticing performance jitter due to heap allocations. libxev's main design goal was to be allocation-free, and it is. The caller is responsible for allocating all the memory libxev needs (however it decides to do that!) and passing it to libxev. There were some additional things I wanted: more direct access to mach ports on macOS, io_uring on Linux (although I think libuv can use io_uring now), etc. But more carefully controlling memory allocation was the big one.
And it worked! Under heavy IO load in my terminal project, p90 performance roughly matched libuv but my p99 performance was much, much better. Like, 10x or more better. I don't have those numbers in front of me anymore to back that up and my terminal project hasn't built with libuv in a very long time. But I consider the project a success for my use case.
You're probably better off using libuv (i.e. the Node loop, not my project) for your own project. But, the main takeaway I'd give people is: don't be afraid to reimplement this kind of stuff for you. A purpose-built event loop isn't that complicated, and if your software isn't even cross-platform, it's really not complicated.
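The caller-owned memory model described in this comment can be sketched with an intrusive queue: the loop links caller-provided records together instead of creating its own containers. This is only an illustration of the ownership shape, not libxev's API; `Completion`, `Loop`, `add`, and `run` are hypothetical names, and Python itself still allocates behind the scenes.

```python
class Completion:
    """Caller-owned record for one pending operation (hypothetical name)."""
    __slots__ = ("callback", "next", "queued")

    def __init__(self, callback):
        self.callback = callback
        self.next = None      # intrusive link, used by the loop while queued
        self.queued = False


class Loop:
    """Keeps pending completions in an intrusive FIFO: no list, no dict."""

    def __init__(self):
        self.head = None
        self.tail = None

    def add(self, c):
        # The loop only links the record the caller handed in.
        if c.queued:
            raise ValueError("completion is already queued")
        c.queued = True
        c.next = None
        if self.tail is None:
            self.head = c
        else:
            self.tail.next = c
        self.tail = c

    def run(self):
        """Run until the queue is empty; return how many callbacks ran."""
        ran = 0
        while self.head is not None:
            c = self.head
            self.head = c.next
            if self.head is None:
                self.tail = None
            c.next = None
            c.queued = False
            c.callback(self, c)   # the callback may re-add the same record
            ran += 1
        return ran


log = []

def tick(loop, c):
    log.append(len(log))
    if len(log) < 3:
        loop.add(c)   # reuse the same caller-owned completion

loop = Loop()
c = Completion(tick)
loop.add(c)
count = loop.run()
```

The single `Completion` is reused for all three ticks, so the loop never needs storage of its own beyond the head and tail references.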
To be clear, I am discussing the text under "About" in the top right, labeled as "Description" when edited, which currently states:
> libxev is a cross-platform, high-performance event loop that provides abstractions for non-blocking IO, timers, events, and more and works on Linux (io_uring or epoll), macOS (kqueue), and Wasm + WASI. Available as both a Zig and C API.
... with no mention of zero-allocation though yes it is mentioned later as a feature in the README.
Very nice! TBH, libuv sometimes felt like it is popular because it's popular rather than sheer technical prowess. I was never comfortable with how much allocation is done by it, and I don't always find how it deals with platform primitives as useful as I'd like.
> don't be afraid to reimplement this kind of stuff for you. A purpose-built event loop isn't that complicated,
Amen. There's no need to view the event loop as mysterious. It's just a while loop that is constantly coordinating IO.
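To make the "just a while loop" point concrete, here is a minimal sketch using Python's standard `selectors` module. It is not how libxev or libuv are implemented; `run_loop` and `on_readable` are names made up for this example.

```python
import selectors
import socket


def run_loop(sel):
    """Wait for readiness, dispatch callbacks, repeat until nothing is registered."""
    while sel.get_map():
        for key, mask in sel.select():
            key.data(key.fileobj, mask)   # key.data holds the callback


received = []
sel = selectors.DefaultSelector()
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)


def on_readable(sock, mask):
    # One-shot handler: read once, then stop watching this socket.
    received.append(sock.recv(1024))
    sel.unregister(sock)
    sock.close()


sel.register(b, selectors.EVENT_READ, on_readable)
a.send(b"ping")
run_loop(sel)
a.close()
sel.close()
```

Everything else a production loop adds (timers, cross-thread wakeups, completion-based backends such as io_uring) hangs off that same wait-and-dispatch cycle.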
What do you think are the next steps for a next generation event loop?
I've been experimenting with barriers/phasers, LMAX Disruptors and my own lock free algorithms.
I think some form of multithreaded structured concurrency with coroutines and io_uring.
I've been experimenting with decoupling sending and receiving so that they run independently in parallel on multiple io_urings ("split parallel io"), so you can process incoming traffic separately from the stream that generates data to send. Generating sends is not blocked by receive parsing, and vice versa.
On 5.1.5 Summary of Benchmarking Results (page 44)
> Of the three different applications and frameworks, DPDK performs best in all aspects concerning throughput, packet loss, packet rate, and latency. The fastest throughput of DPDK was measured at about 25 Gbit/s and the highest packet rate was measured at about 9 million. The packet loss for DPDK stays under 10% most of the time, but for packet sizes 64 bytes and 128 bytes, and for transmission rates of 32% and over, the packet loss reaches a maximum of 60%. Latency stays at around 12 μs for all sizes and transmission rates under 32% and reaches a maximum latency of 1 ms for packets of size 1518 bytes with transmission rates of 64% and above.
> Based on these results, it was determined that DPDK can optimally handle transmission rates up to around 64 bytes, above rate 64% performance increases are non-existent while packet loss and latency increase.
> io_uring had a maximum throughput of 5.0 Gbit/s and was achieved at a transmission rate of 16% or higher when the packet size was 1518 bytes. The packet loss was significant, especially for transmission rates over 16%, and when packet size was below 1280 bytes. Generally, the packet loss decreased when packet sizes increased for all different transmission rates. The packet rate reached a maximum of approximately 460,000 packets per second. For higher transmission rates and for larger packet sizes, the packet rate decreased. This reached a minimum of around 40,000 packets per second for a transmission rate of 1%. The latency of io_uring is highest at size 1518 and transmission rate 100% with a latency of around 1.3 ms. For lower transmission rates under 64%, the latency decreases when packet size increase, reaching a minimum of around 20 to 30 μs.
> The results of running io_uring at different transmission rates show that io_uring reaches its best performance on our system at around transmission rate 16%. Above rate 16% there are no improvements in performance and latency and packet loss increase.
OK, 25 Gbps vs 5 Gbps seems like a huge difference, especially since io_uring had higher packet loss as well.
> Three, I wanted an event loop library that could build to WebAssembly (both WASI and freestanding) and that didn't really fit well into the goals or API style of existing libraries without bringing in something super heavy like Emscripten.
This is a cool motivation!
Could you drop this into Node to make Nodeex? A kind of experimental allocation-free Node that somehow carves out the allocations into another layer (admittedly still within the Node C code)?
I saw ghostty and thought, “isn’t that the terminal written by the guy who cofounded hashicorp?”. I really enjoy your ghostty blog posts and will be checking out libxev!
We copied libxev's code for the timer heap implementation in Bun for setTimeout & setInterval, and it was a ~6x throughput improvement[0] on Linux compared to our previous implementation.
The timer heap currently lives on a different thread instead of the main thread, which means each timer has to be allocated and scheduled separately. Scheduling things to other threads is expensive. The reason it works this way isn't good, and we will fix it, but we haven't prioritized it yet.
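For readers unfamiliar with the idea, a timer heap keeps pending timers ordered by deadline so the loop can cheaply find the next one to fire. The sketch below is a generic binary min-heap built on Python's `heapq`; it is not the libxev or Bun implementation, and `TimerHeap`, `schedule`, `next_deadline`, and `fire_due` are hypothetical names.

```python
import heapq
import itertools


class TimerHeap:
    """Min-heap of (deadline, sequence, callback) entries."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker keeps insertion order

    def schedule(self, deadline_ms, callback):
        heapq.heappush(self._heap, (deadline_ms, next(self._seq), callback))

    def next_deadline(self):
        """Earliest pending deadline, or None if no timers are pending."""
        return self._heap[0][0] if self._heap else None

    def fire_due(self, now_ms):
        """Run every callback whose deadline has passed; return the count."""
        fired = 0
        while self._heap and self._heap[0][0] <= now_ms:
            _, _, callback = heapq.heappop(self._heap)
            callback()
            fired += 1
        return fired


order = []
timers = TimerHeap()
timers.schedule(30, lambda: order.append("c"))
timers.schedule(10, lambda: order.append("a"))
timers.schedule(20, lambda: order.append("b"))
fired = timers.fire_due(now_ms=25)
```

An event loop would typically use `next_deadline()` to compute the timeout for its next blocking wait, then call `fire_due()` after waking up.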
As someone maintaining a project with its own event loop: don't do it in larger projects.
The problem is that you'll start having dependencies on external libraries. And when those then need event loop integration, things get messy. We've introduced bugs before, caused by subtle differences in semantics. (Like: does write imply read? Are events disarmed while running? What about errors?)
If the lib and event loop are reasonably popular, someone else probably has integrated them before. Or the lib supports the event loop natively (or uses libverto.) Either saves you some trouble.
The interface makes verto look like a Linux-first design, like git. But what's the point? Just implement epoll like Illumos did. It's also allocation-heavy and apparently can't be used with a Deno-style loop.
io_uring support is obviously great & excellent, fulfills the "high performance" part well. brought an immediate smile to my face.
i was not expecting "Wasm + WASI" support at all. that's very cool. implementation is wasi_poll.zig (https://github.com/mitchellh/libxev/blob/main/src/backend/wa...). not to be unkind, but this makes me wonder very much if WASI is already missing the mark, if polling is the solution offered.
gotta say, this is some very understandable, clean code. further enhancing my sense that i really ought to be playing with zig.
I was going to say, "I wonder if Bun.js would/could use this" but it looks like Jarred Sumner has been cherry-picking bits of libxev for at least six months.
Many signal/slot implementations are done synchronously without any event loop involved, the two are somewhat orthogonal. Even Qt will call the signals synchronously most of the time without the event loop involved, it's just an additional feature of it to queue the event in the event loop.
libdispatch/GCD is a task scheduler built on top of kqueue. It's meant for moving things away from the main UI thread without thinking about how often you do that.
The README only states that I use kqueue on macOS, but I don't claim it is specific or originated from macOS. I've read the README over a few times and can't find where you'd get the feeling that it's a macOS-only thing. If I can edit it in any way to make that clearer let me know.
libxev is not compatible with BSD currently because macOS's kqueue API is just different enough from the BSDs' to make it incompatible (i.e. I use mach ports a lot on macOS, but other parts of the syscall interface also vary slightly).
If you depend heavily on Mach ports I don't think "kqueue (macOS)" is an accurate description. That makes it sound like it has more of a chance to work on BSD than it does.
It is an accurate description. The mach ports are waited on through kqueue, and I use kqueue for all other waiters with "standard" fds (i.e. files). But my usage of mach ports (even for a partial use case) makes it incompatible with BSD, and even if I didn't use mach ports, the kqueue structures used by macOS are slightly different and incompatible anyway, and I don't claim BSD support anywhere.
It's splitting hairs and being a bit pedantic, but you also reordered my descriptions: in the README I always say "macOS (kqueue)" and not the reverse which you incorrectly quoted. I think that makes a small but tangible difference.
I did misread and misquote that. But when a remark is parenthetical I guess I consider them equivalent. macOS and kqueue are not equivalent. Maybe "macOS (using kqueue and Mach ports)" would make it clearer?