Articles SortedSet[0] is basically a pseudo-BTree with fixed depth = 2 and fixed max_size of leafs. This gives O(log n) search and O(sqrt n) inserts [1].
It would mean that insertion at beginning is worst case scenario (split + need to move all buckets), but timing of insert is actually dominated by adding size of the buckets to calculate final index.
You can also write Elixir/BEAM NIF extensions in Nim and Zig as well using Nimler or Zigler, resp [1,2]. Also I’ve found writing native extensions for the BEAM is simpler than many other dynamic languages since it relies on immutable data. It’s roughly the same as just a function to deserialize/serialize data with some extra references to GC’ed data. Especially with dirty “NIF”s (they allow the functions to run for as long as they want without messing up the BEAM scheduler).
In the case of BEAM NIFs, I find Rust's compiler guarantees especially compelling.
That's because a crashing NIF can take down the whole Erlang VM. If fault-tolerance is what brought you to Erlang/Elixir and OTP to begin with, as it is for many, having the server crash could be a major issue.
True, Rust has the strongest guarantees. Nim offers similar guarantees overall (checked arrays etc) and Nimler has exception checking enabled so the compilers checks if you have any possible checkable exceptions. Personally, I use Nim since it’s easier to use on embedded Linux devices by precompiling to C. Rust has some odd dependencies on the target’s C linker but doesn’t check the standard $CC or $LD variables. Zig is interesting but it offers the least memory protection.
The have some really disciplined engineers (some of them I used to work with). I have learned a lot from these people. Erlang is everybody's secret weapon for high concurrency (like Discord) and Rust has predictable performance that is easy to tune. GC is not an option for many use cases they have. One interesting problem they run into was the GC is Go and the lack of configuration options for it. I think they are the prime example of using the right tool for the job, whatever it might be, instead of trying to sell you a single programming language that solves everything.
I've been using Discord on a near-daily basis for 9 years, in servers from 10 to 25,000 people, and I think I can count on two hands the number of actual breakages I have had with the product. It's honestly pretty nuts.
I agree, I always found it strange that an app aimed for gaming really has better performance than slack/teams and their massive budgets behind them. But I never saw a company that uses discord for communication.
> I always found it strange that an app aimed for gaming really has better performance than slack/teams
Makes sense to me, its users would be running games alongside the software, and would be quite cross with the software shitting itself in the middle of a co-op game. Somewhat good performance and behaviour is (or at least was originally) absolutely necessary to getting traction, a bunch of checkboxes would not suffice.
> But I never saw a company that uses discord for communication.
Currently working at a place where at least one, if not two, teams use Discord for daily chatting. Official meetings are mostly handled over Zoom but I believe their intra-team stuff tends to happen over Discord.
My view is that they solved voice communication first - that was their original use case. Reliable multi-party voice is a harder problem than text chat and means that you can't/don't just rely on Electron for everything. Their product is a way to 'sell' some good non-trivial (but not groundbreaking) engineering.
Whereas Slack is successful because it understood it's usecase well (including how to sell to enterprises).
The closest I've found is the Spotify open-source project "Backstage", which uses Discord for communication (https://github.com/backstage/backstage) - I have no idea about the rest of Spotify, mind.
Their skiplist description is a little funny — typically a skiplist has one element per leaf node, and you should never need to move items between buckets. Also skiplists have logarithmic layers in the number of elements, so inserting and searching are always bounded by O(log n).
I've been using a bucketized version of skiplist for years now, I thought I invented it! Using most recently in https://github.com/infradig/trealla (a Prolog interpreter).
For binary trees, indexing can be done by saving the subtree size of each node and doing a sort of binary search. Not sure if this is fast for B-trees that have more than 2 children nodes, though.
> There’s a fantastic Elixir project called Rustler. It provides nice support on the Elixir and Rust side for making a safe NIF that is well behaved and using the guarantees of Rust is guaranteed not to crash the VM or leak memory
Hum, can someone talk more about these NIF and what special thing Rustler brings?
What's different from just any kind of native interop, like say Java JNI ?
So there are a few ways to do interop with non-BEAM languages.
Ports are the typical one for simpler things. You start an external command and talk to it via stdin/stdout.
NIFs are potentially much faster and are traditionally in C. But they carry the risk of crashing the VM. Rustler+Rust removes most, likely not all, but most of the risks for screwing up and killing the VM.
So NIFs are usually not recommended simply because the resilience and reliability of the BEAM VM relies on a bunch of cool strategies inside the VM and NIFs side-step all of that. The Rust ones should be far less dangerous.
NIFs run without most (all?) of the safety features the erlang VM provides, such as memory safety and scheduling. If the NIF crashes, the entire app (aka, VM) crashes.
This is anathema to some...i dunno..philosophies? expectations? of Erlang/Elixir.
NIFs written in Rust (via Rustler) should never be able to crash the VM. Unlike NIFs written in C.
Rustler brings other quality of life improvements, like easier interop (decoding/encoding values to and from) and managing the cleanup of objects that live in Rust but are referenced by Elixir.
> Note that this function may not catch all panics in Rust. A panic in Rust is not always implemented via unwinding, but can be implemented by aborting the process as well. This function only catches unwinding panics, not those that abort the process.
From the docs filmor linked to, if a library sets panic = unwind and a user sets that to abort, “If any of your users choose to abort, they'll get a compile-time failure.”
We use rustler only as a thin wrapper to existing crates (like, for hdrhistograms). I'm far from a Rust expert, but I believe it uses std::panic::catch_unwind
> Note that this function may not catch all panics in Rust. A panic in Rust is not always implemented via unwinding, but can be implemented by aborting the process as well. This function only catches unwinding panics, not those that abort the process.
I can’t speak to the Rustler, but the JNI bindings for Rust are quite good. There are a lot of areas where you can still shoot yourself in the foot, lots of unsafe. But there are patterns they encourage that are very safe, like associating lifetimes to Java objects that are instantiated in the Rust code appropriately. It also has good interfaces for working with arrays and avoid some of the foot guns in traditional C JNI libraries.
It’s not quite possible to have no unsafe code in JNI with Rust (in my experience) but it is very easy to isolate to very small areas of your code.
I'm not sure if the other answers were good enough.
NIF is native interop. There's a bunch of provided functions for interfacing with Erlang terms and other things like that. But, it's native interop, so you can also do whatever crazy stuff you want, even if it's a bad idea.
The BEAM VM cannot and does not protect you from bad NIFs; but if you follow the guidelines and are careful not to crash, you can have a good time. Often, the scope of the native work ends up so small, it's easyish to keep it safe. For larger scope, it's more common to use a Port or a C-node; a Port is just what Erlang calls running your other code in separate OS process communicating with stdin/stdout, and a C-node is your other code running separately, possibly on another machine, connected via network sockets, acting like a dist node. For this application though, it makes more sense to use a NIF to avoid copying the data to and from a separate OS process.
I haven't used Rustler, but my understanding is it's mostly just a framework of sorts for writing NIFs in Rust. Maybe making it easier to use Erlang terms, and setting up the compile settings etc.
My brief understanding of NIFs is they're similar to unsafe in Rust where a bunch of VM guarantees are only held so long as the NIF is well-behaved. I believe there was also some limits on how long a NIF could execute before starving the Erlang scheduler[1].
The other answers have covered the technical details well, but not so much the philosophical ones.
You may have heard about how Erlang with the BEAM is a fantastic implementation of the Actor programming model. This involves lots of processes (green threads) holding a little bit of state and communicating with messages. It's both a great concurrency model and a great programming model. The standard library (OTP) then also provides excellent tools for managing these processes in supervision trees.
Ports act like another actor in a supervision tree. This actor happens to be a C program and the messages are sent and received by stdout/stdin, but otherwise it behaves like a process. You can manage its runtime state with supervision trees and all the other excellent tools that BEAM/OTP give you.
NIFs on the other hand are best used to replace something that could be a pure function in Erlang, but is a lot faster in something compiled (or because you want to leverage some existing codebase that does that).
The little that I know about NIFs is that if they crash, they bring down the entire BEAM, so it's pretty risky if not done properly. But I've seen that Rust complements Elixir (via NIFs) very well.
damn, tried to pull this in as a dependency, doesnt look like it compiles on the most recent erlang versions. good reason to dig into the source this weekend.
You might be thinking complement. And yes those do complement each other perfectly. The other scaling article how and why they switched from Go to Rust is pretty mind blowing too.