Write Elixir NIFs in Rust (docs.rs)
95 points by 1letterunixname on Aug 6, 2023 | 40 comments



There's also Zigler, which makes writing NIFs in Zig a breeze: https://github.com/E-xyza/zigler


This is way less safe than Rust.

NIFs have a major downside: they can take down the entire BEAM if they misbehave.

And a huge reason people use Erlang is its resilience to outages.

Rust's memory safety gives people more confidence in using NIFs for exactly that reason.


I am confused by the examples, do you actually need to write your code as a text string?


Well it seems that those are two different approaches.

Rustler is a Rust library for writing Elixir NIFs in Rust, so I guess you have to compile the NIF first and then use it.

On the other hand, Zigler seems to be an Elixir library that you give a string of Zig code, and it probably offloads compilation to the Zig compiler. Since Elixir is compiled to bytecode, if you don't modify the code it won't get recompiled unnecessarily, I guess, so I think the approach has merit.

Btw, it's a sigil'ed string: the result of a function that takes a string and returns one, rather than a plain string literal.


Also, this[1] is a game changer if you're planning on distributing your extensions but are worried that compiling Rust (especially cross compiling it) could hinder the appeal of your library.

Used it for a few projects and had one of those “How could I have lived before this?” moments.

[1] https://dashbit.co/blog/rustler-precompiled

p.s. Explorer[2], the NIF version of Polars[3] is distributed using Rustler pre-compiled, making Python less and less necessary for the Elixir developer or enthusiast

[2] https://news.livebook.dev/data-wrangling-in-elixir-with-expl...

[3] https://www.pola.rs/


Has anyone experimented with any Rust framework that implements Erlang's actor model in pure Rust? How well do they scale in a distributed system? I know I've seen a few libraries such as:

Lunatic https://github.com/lunatic-solutions/lunatic

Bastion https://github.com/bastion-rs/bastion


I’ve liked using ractor.

https://github.com/slawlor/ractor


This project also appears interesting, but it seems that its clustering features have yet to be tested in large scale distributed systems.

https://github.com/slawlor/ractor/discussions/131


I wrote a blog post about using Rustler a while back: https://medium.com/multiverse-tech/html-validation-in-phoeni...

The particular use case is obsolete now that Heex templates have arrived in Phoenix, but the associated example code might be useful.


I remember playing around with this. Went and looked at my private repo and it was ~6 years ago.

Interesting idea, since I was enjoying Phoenix and Elixir. However, I think I jumped too quickly into making a Rust NIF, and it made the whole thing quite complicated. I found chasing down errors across multiple domains (Rust->Elixir->JS and back) challenging. I learned a ton on that project, however.

I wonder if it would go smoother now due to advancements with Phoenix and less javascript involved. I think some of the Rust libraries I was using have now advanced more as well.


What I was never able to wrap my head around with NIFs was lifecycles and ownership. The entire "resource" thing always confused me a little.

Say I wanted to implement my own ETS, maybe to add an atomic rename (which ETS really should have). Like ETS, I'd like to have a :named_table, so I don't need to pass a `ref` around everywhere. How/where do I create this global table lookup and how/who manages it?


For lifecycles, BEAM calls your NIF's load/upgrade/unload hooks as appropriate.

For global NIF module data, you're steered towards setting it up in the priv_data pointer passed to those hooks, retrievable with enif_priv_data, but you can also use static variables or whatever else (it is native code, BEAM can't force you to do anything).

The lifecycle of Erlang terms is bound to environments. NIFs often get Erlang terms from the process that called them and return Erlang terms to that process. You can't keep references to those in native variables beyond the scope of the function call, though; if you need to keep things around, you've got to either copy the values directly into native values, or use enif_make_copy to copy them to an environment that you manage. (The erl_nif manpage is probably clearer than me; look for ErlNifEnv.)

All that said, I think you're looking for a multi-table atomic rename like in MySQL[1]; so that you can replace a table in a single transaction? If that's all you want to change, you probably should just modify ets, rather than reimplement all the things; I'd be somewhat concerned about adding additional locking, but I don't know what the existing locking looks like.

[1] https://dev.mysql.com/doc/refman/8.0/en/rename-table.html


You can put your NIF ref into a persistent term (if you do not plan to update it). Or into an ETS table (which kinda defeats the purpose of making your own ETS table).

But you probably wouldn't want to use resources here; instead, pass the name of the table as an atom down to your C level.

The lifecycle is simple: it is the same as for large (refcounted) binaries. Each process that has seen the reference increments an internal ref count. Once a process has been GC'd and no longer holds references to the resource on its heap, the ref count is decremented. When nobody references the resource, it is removed.
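
A minimal sketch of that ref-counting lifecycle in plain Rust, as an analogy only. It uses std's `Arc` rather than the erl_nif resource API, and `TableResource` and `new_resource` are hypothetical names. Each `Arc` clone stands in for a process holding the reference, and `Drop` stands in for the resource destructor.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical native resource; `freed` records when the destructor ran.
struct TableResource {
    freed: Arc<AtomicBool>,
}

impl Drop for TableResource {
    fn drop(&mut self) {
        // Stands in for the resource destructor the VM would invoke
        // once the last reference is gone.
        self.freed.store(true, Ordering::SeqCst);
    }
}

/// Creates a resource plus a flag that lets the caller observe its destruction.
fn new_resource() -> (Arc<TableResource>, Arc<AtomicBool>) {
    let freed = Arc::new(AtomicBool::new(false));
    let resource = Arc::new(TableResource {
        freed: Arc::clone(&freed),
    });
    (resource, freed)
}
```

Cloning the returned `Arc` bumps the count, dropping a clone lowers it, and the destructor only runs when the last holder lets go.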


not the biggest expert, but off the top of my head I would say "static reference in your library"

you might be interested in looking at this

Neural: an ets-like interface to shared terms

https://github.com/soup-in-boots/neural/
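
For what a "static reference in your library" could look like, here is a rough sketch in plain Rust (std only, no NIF glue). All names (`registry`, `create_table`, `lookup_table`, `rename_table`) are made up for illustration, and the string-to-string table type is an assumption. The registry is a process-wide static guarded by a mutex, so a rename happens atomically under a single lock.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical table type: a mutex-protected string map.
type Table = Mutex<HashMap<String, String>>;

/// Global name -> table registry, created lazily on first use.
fn registry() -> &'static Mutex<HashMap<String, Arc<Table>>> {
    static REGISTRY: OnceLock<Mutex<HashMap<String, Arc<Table>>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Registers a new empty table; returns false if the name is already taken.
fn create_table(name: &str) -> bool {
    let mut reg = registry().lock().unwrap();
    if reg.contains_key(name) {
        return false;
    }
    reg.insert(name.to_string(), Arc::new(Mutex::new(HashMap::new())));
    true
}

/// Looks a table up by name, handing out a shared handle.
fn lookup_table(name: &str) -> Option<Arc<Table>> {
    registry().lock().unwrap().get(name).cloned()
}

/// Renames a table while holding the registry lock, so no caller can
/// observe a state where neither or both names exist.
fn rename_table(old: &str, new: &str) -> bool {
    let mut reg = registry().lock().unwrap();
    if reg.contains_key(new) {
        return false;
    }
    match reg.remove(old) {
        Some(table) => {
            reg.insert(new.to_string(), table);
            true
        }
        None => false,
    }
}
```

In a real NIF the names would presumably arrive as atoms and the registry would need to be torn down in the unload hook; both are left out here.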


I'm seeing no changes since 2013 and no issues either open or closed. Do you know what sort of state the project is in right now?


> Do you know what sort of state the project is in right now?

Sorry, but I don't.

I linked it as an example of handling your own tables by name vs by ref in a Nif.


I both love & hate the idea of rustler.

I love that there's a way to make apps on BEAM faster; I hate that you have to do it in a language other than Erlang/Elixir.

If only BeamAsm could be 5x faster than it is, Rustler (along with all the downsides of NIFs) wouldn't be needed.


Ehhh, a lot of high level languages support low-level interop. Python, Ruby, etc. It's not something you usually reach for as a user, but rather as a library author.

Of course everyone wants their language to be faster, but the interop story for Elixir/Erlang is very good, all things considered.


Can anyone explain why this is cool?


Rust is way faster than Elixir for CPU bound code, so it's nice to have the ability to sprinkle some Rust over your Elixir.

* What are NIFs: https://www.erlang.org/doc/tutorial/nif.html

* Using Rustler in Production: https://discord.com/blog/using-rust-to-scale-elixir-for-11-m...


Another reason this is cool, on top of the other comment, is that Rustler will stop errors from crashing the BEAM.

If you implement a NIF in C and it crashes, the Erlang VM crashes with it, bringing down everything.

If you do it in Rustler, a panic is caught and raised as an exception on the Elixir side, which can be handled (or not) like any other exception.

This gives you safe bindings to existing Rust libraries (such as Polars in the case of Explorer).
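
To illustrate the general mechanism (this is not Rustler's actual code, just a sketch using std's `catch_unwind`), a wrapper at the native boundary can turn a panic into an ordinary error value. `risky_divide` and `call_guarded` are hypothetical names.

```rust
use std::panic;

/// Hypothetical stand-in for a NIF body that may panic.
fn risky_divide(a: i64, b: i64) -> i64 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}

/// Runs `f` and converts a panic into `Err(message)` instead of letting
/// it unwind across the boundary into the host VM.
fn call_guarded<F>(f: F) -> Result<i64, String>
where
    F: FnOnce() -> i64 + panic::UnwindSafe,
{
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}
```

The `Err` side is what a binding layer could then raise as an exception in the host language. Note that this only covers unwinding panics; an abort or undefined behaviour in unsafe code is not caught this way.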


While Rust is definitely the safer option, if code fails inside unsafe blocks, or panics, I expect a result similar to C.


I presume from the text above that there's an explicit panic handler, so, no, a panic won't cause "a similar result like C", it'll unwind, potentially causing local issues (e.g. it may leak resources which were in the process of being properly dropped when we panicked) and then the Elixir gets control.

If the code does something unsound in an unsafe block then yes, you get to keep both halves like in C. For example if you've decided to unsafely index arbitrarily far into a small array, this will blow up just the same as x[n] in C would. But, why would you do that?


Usually most "oops, I crashed the server", or "created a CVE", aren't written on purpose.

Part of having SecDevOps among the many roles one has to perform is worrying about this boring stuff.


The whole point of explicit unsafe rust is making this type of thing have to be written on purpose. You can't accidentally write unsafe code. And in case you accidentally write unsound code in your explicit unsafe code, you know exactly where to look.


While I agree in principle, the fallacy, from a security point of view, is assuming it was the same person, or that the error happened in unsafe code block and not elsewhere, caused by wrong invariants.

Hence coding in a safer language by itself isn't a guarantee, care must still be taken.

Still better than C language family anyway.


> that the error happened in unsafe code block and not elsewhere

Remember Rust's Safety Culture, the big C word I mention over and over here and elsewhere that we run into each other? Although the Rust compiler doesn't get to have an opinion about this, Rust's culture does, and Rust says no, the error is in your unsafe block.

Suppose I write a function which takes six 8-bit unsigned integers A, B, C, D, E and F, adds them together and then indexes into an array I own which is 512 entries long. Clearly if you give me most possible values of A through F this is a bounds miss.

If I write this function in safe Rust, it panics when you do that. The bug is in my code, that's where the panic happens.

If I write it in C, it blows up when you do that, but, and here's the crucial part, maybe I say "Idiot, the documentation clearly says in paragraph six, see value normality subsection B4, and in B4 I wrote that none of A through F should have values such that when added they sum to 512 or more, thus it's your fault."

If I write it in unsafe Rust then culturally I have two clear choices. One of them is like the C. I write my very extensive documentation and I mark my function unsafe which indicates that callers need to read and understand the documentation to ensure they obey all pre-conditions. Their code will also be unsafe because they can't call my unsafe function without that, reminding them of their responsibility.

The other choice, which is more usual, is to safely encapsulate the feature. I can optionally write extensive documentation, but regardless I must ensure the function cannot cause unsafety even if used by a malicious idiot.

Because of these two options, the problem is always the unsafe code. Maybe it's my unsafe code (I did a bad job either obeying or specifying the preconditions) or maybe it's your unsafe code (you didn't obey my preconditions) but either way it's never the safe code.

Rust's compiler can't promise that but Rust's culture can.
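
A sketch of those two choices for the six-integer example, assuming a `u32` table and hypothetical function names. `lookup_unchecked` is the "unsafe fn plus documented precondition" route; `lookup` is the safe encapsulation that checks the bound itself.

```rust
const TABLE_LEN: usize = 512;

/// Sums the six 8-bit inputs; the result can be as large as 6 * 255 = 1530.
fn index_of(parts: [u8; 6]) -> usize {
    parts.iter().map(|&p| p as usize).sum()
}

/// Builds a demo table where entry i holds the value i.
fn identity_table() -> [u32; TABLE_LEN] {
    std::array::from_fn(|i| i as u32)
}

/// Choice 1: push the precondition onto the caller.
///
/// # Safety
/// The six values in `parts` must sum to less than 512.
unsafe fn lookup_unchecked(table: &[u32; TABLE_LEN], parts: [u8; 6]) -> u32 {
    let i = index_of(parts);
    // SAFETY: the caller promised that i < TABLE_LEN.
    unsafe { *table.get_unchecked(i) }
}

/// Choice 2: safe encapsulation; no input can cause an out-of-bounds read.
fn lookup(table: &[u32; TABLE_LEN], parts: [u8; 6]) -> Option<u32> {
    if index_of(parts) < TABLE_LEN {
        // SAFETY: the bound was checked on the line above.
        Some(unsafe { lookup_unchecked(table, parts) })
    } else {
        None
    }
}
```

If `lookup` ever reads out of bounds, the bug is in its bounds check or in the `unsafe` block, not in whoever called it.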


The culture part I agree on, certainly.

Something shared in Java and .NET land, when discussing JNI, P/Invoke, C++/CLI.


You can pull from the Rust ecosystem when you need something that Elixir might not have or might not be the best at.

For example, I use it to parse XML files. This can be done in Elixir, and there are libraries for it, but being able to use the Rust libraries serde and quick_xml makes life so much easier.

Another example would be the Elixir library Explorer which is built using the Polars library in Rust for fast dataframes.

Rustler is also pretty easy to use. I had never coded Rust before and I had working code very quickly, and building is as easy as running your normal mix commands.

You can also use the rustler-precompiled library so users of your library don't have to worry about building it.

It's essentially the PyO3 of the Elixir ecosystem, excluding the Python to Rust part.


Pet peeve of mine: whenever possible, avoid writing libraries in one language that reference something that has to be compiled in a different language. If you do, you end up with Perl/CPAN: installing anything becomes a pain. It's nice to have a quicker XML parser, but it would be really nice if it were a plug-in and there were a slower but easy-to-install version that required no external things to be compiled.


It's not a pain because of

https://github.com/philss/rustler_precompiled

The users of your library don't have to install anything. The library will seem like any other Elixir library when they use it.

It's safe because when you upload your project to Hex.pm you include a file with the checksums for the latest version of your library, and when a user installs the library those checksums are used to verify the binary downloaded from GitHub.


Not anymore. The Elixir community, as usual, delivers some great tools for its beloved developers and users.

https://dashbit.co/blog/rustler-precompiled

(it's similar to pre-compiled Python wheels, but better on the developer's side, in my opinion)


Unless one can do as in Java, .NET and Node.js: package the libraries alongside the bindings, for all major target platforms.


You can with Rustler using rustler precompiled. You write a workflow in GitHub that builds for all the supported targets and the precompiled library will download the correct one when a user downloads your library.


Aside from the actually good use cases other comments have mentioned, one idea I've played around with is using it for augmenting an Elixir app with code that can be run in both the browser and the server - use a NIF to run the Rust code server-side, compile it to WASM to run in the browser. The idea that inspired me was writing input validation code, running it both in the browser for greater responsiveness and on the server for making sure inputs get validated, using the same source code so those don't get out of sync. It's probably not worth the added complexity, but it's kind of neat.
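
As a rough idea of what that shared code could look like: a dependency-free validation function in plain Rust. The rules (minimum length 8, at least one digit) and the names are invented for illustration; the NIF and WASM wrappers that would call it on each side are left out.

```rust
/// Hypothetical validation failures, shared by client and server.
#[derive(Debug, PartialEq)]
enum PasswordError {
    TooShort,
    MissingDigit,
}

/// Validates a password using only core string operations, so the same
/// source could in principle be compiled both natively and to WASM.
fn validate_password(input: &str) -> Result<(), Vec<PasswordError>> {
    // Strip surrounding whitespace before checking, on both sides alike.
    let trimmed = input.trim();
    let mut errors = Vec::new();

    if trimmed.chars().count() < 8 {
        errors.push(PasswordError::TooShort);
    }
    if !trimmed.chars().any(|c| c.is_ascii_digit()) {
        errors.push(PasswordError::MissingDigit);
    }

    if errors.is_empty() {
        Ok(())
    } else {
        Err(errors)
    }
}
```

Because there is a single implementation, the browser and the server cannot disagree about whether a given input is valid.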


That is really neat!

I suppose the only downside of using the same code is it has the same bugs (e.g. I forgot to strip whitespace before doing a length check). So you don't get a belt and braces of having two lots of independently implemented validation

But then the upside is that it has the same bugs, so you only have to fix it once


Even if it's buggy, I figure it'd at least be consistently buggy. Part of what spurred me to make this was an issue I had with account setup for a local utility's website; their client-side validity check for creating a new password was apparently using different logic than the backend code. I kept trying to create a password, the web page said it was valid, then I'd hit submit and get rejected for having an invalid password.


The title should include the library name.


Is it possible to do something like NIFs in Java?


If you're asking if it's possible to use C (or any language that can export functions conforming to the C ABI), then yes: https://en.wikipedia.org/wiki/Java_Native_Interface



