That said, this is the general operating strategy in the Rust ecosystem: got a cool feature you think would be useful? Try implementing it via a crate or some macro hackery to see if it would indeed be useful.
It's one of the reasons I'm a big fan of Rust - between the macros and the type system, you can generally prototype language-level changes. No, the prototype won't be as nice to use as the real thing, but I think that's for the best.
As an example, you can write a crate to introduce named argument support for calling and defining functions. Lots of fun.
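To give a rough idea of what that kind of prototype can look like, here is a minimal sketch using a declarative macro rather than a full crate. The names `Rect`, `make_rect` and `rect!` are made up for illustration and are not taken from any existing crate.

```rust
// Hypothetical sketch: emulating named arguments with `macro_rules!`.
// `Rect`, `make_rect` and `rect!` are illustrative names only.

#[derive(Debug, PartialEq)]
pub struct Rect {
    pub width: u32,
    pub height: u32,
}

// An ordinary function with positional arguments.
pub fn make_rect(width: u32, height: u32) -> Rect {
    Rect { width, height }
}

// Accept the two arguments by name, in either order, and forward them
// positionally to `make_rect`. A misspelled or missing name fails to match
// any arm, so it is rejected at compile time.
macro_rules! rect {
    (width = $w:expr, height = $h:expr $(,)?) => {
        make_rect($w, $h)
    };
    (height = $h:expr, width = $w:expr $(,)?) => {
        make_rect($w, $h)
    };
}
```

With this, `rect!(height = 2, width = 3)` and `rect!(width = 3, height = 2)` both build the same `Rect`. It also shows why the prototype is clunkier than real language support would be: every ordering needs its own arm.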
An aside: the author is one of the macro experts in the Rust ecosystem.
This reminds me of genrules in Bazel. Bazel is a sandboxed build system where code-generating tools need to declare all inputs and outputs. Before running a code generating tool, Bazel needs to compile it for the platform that's running Bazel. The code-generation binary can be cached, though.
Proc macros in Rust also imply a build system that runs arbitrary code, so it's interesting to see them exploring a similar direction.
So the initial implementation doesn't allow any inputs/outputs besides the token streams, but some proc macros do access the filesystem. There's definitely more exploration to be done both with proc macros and with the general build system (which allows execution of arbitrary Rust code as well, but which tends to touch the filesystem a lot more).
This definitely seems like a good starting point for adding security to the build system.
A "procedural macro" in Rust takes in some Rust code, does some sort of transformation on it, and then produces some Rust code. (Technically it's a stream of tokens, but same thing.)
These are written in Rust, and so can do arbitrary things. That's what makes them so powerful. But it also can be dangerous. Additionally, they're written in Rust, so they need to be compiled, and they need to be compiled before other code that uses them can be compiled; otherwise, you don't know how to perform the transformation. They're kinda almost like plugins to the compiler.
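To make the "code in, code out" idea concrete, here is a toy model of a derive-style macro. It is only a sketch: a real procedural macro has the signature `fn(proc_macro::TokenStream) -> proc_macro::TokenStream` and lives in its own `proc-macro` crate, whereas this version treats tokens as whitespace-separated strings. The names `derive_describe`, `struct_name` and the trait `Describe` are invented for the example.

```rust
// Toy model of a derive-style procedural macro, NOT the real proc_macro API.
// Tokens are approximated as whitespace-separated strings.

/// Hypothetical helper: find the identifier that follows the `struct` keyword.
fn struct_name(tokens: &[&str]) -> Option<String> {
    let pos = tokens.iter().position(|t| *t == "struct")?;
    let raw = tokens.get(pos + 1)?;
    // Drop punctuation such as `;` or `{` that may be glued to the name.
    let name: String = raw
        .chars()
        .take_while(|c| c.is_alphanumeric() || *c == '_')
        .collect();
    if name.is_empty() {
        None
    } else {
        Some(name)
    }
}

/// "Tokens in, tokens out": given the source of a struct, emit an impl block.
/// `Describe` is a made-up trait used only for illustration.
pub fn derive_describe(input: &str) -> Option<String> {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    let name = struct_name(&tokens)?;
    Some(format!(
        "impl Describe for {0} {{ fn describe() -> &'static str {{ \"{0}\" }} }}",
        name
    ))
}
```

Calling `derive_describe("struct Point { x: i32 }")` yields the text of an `impl Describe for Point` block. The function touches nothing but its input, so the same input always gives the same output. That is the property a sandbox could enforce for real macros.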
So, how should these plugins execute? We could just compile them and run them, and that's indeed what Rust does today. However, since they're used as part of the compilation process, you want them to be fast. But when you're doing a debug build, they'll also be built in debug, which is slow. We could solve this particular issue by compiling dependencies in release mode, and your code in debug mode, but that has other problems and doesn't exist yet. Another concern is that they're arbitrary Rust programs: what happens if your plugin doesn't just implement some trait for your type, but also emails /etc/passwd to some bad guy? (you get what I mean, this is just an example)
So, ideally you want a sandbox to run your code in. This would let you limit the capabilities of plugins. You want it to be fast. You want it to be lightweight. You want it to be cross-platform.
All of these problems are solved by WebAssembly. You can compile the Rust code to wasm one time, for every platform, and ship it as a precompiled binary. Wasm will sandbox everything, so you can prevent a number of possible security issues. Additionally, Rust already supports being compiled to wasm, and there are a number of wasm runtimes to choose from.
I think a good way to think about this problem is to substitute "wasm" with "Lua". When is Lua a good idea? Wasm is also probably a good idea for those things. Only probably, of course...
Good response. This usage in Rust procedural macros boils down to providing flexibility while improving speed and security.
One thing I'd add is that WASM has a few unique advantages compared to other options, which in this case would be shipping a Rust interpreter or using another VM like the JVM. A Rust interpreter would take a lot of effort to build and maintain, while WASM already has toolchain support due to its adoption in browsers. Something like JVM bytecode works differently enough that it's not a standard "ISA" style target in the LLVM or GCC toolchains. WASM is also significantly lighter than many other language-specific VMs, provides a decent default sandbox, and is performant enough.
Another potential usage which would be interesting is using WASM in interpreted languages to provide more performant modules/plugins while providing better security/ease of use. Elixir, Ruby, and Python all use native (C) modules for performance, but these are often difficult to compile. Being able to precompile modules to WASM would provide better memory safety and be easier to use.
Yes, that’s already possible today, they can do anything.
We haven’t decided for sure if we’re going to do this upstream yet, so those kinds of policy questions have yet to be answered. I’d imagine so, but there’s a lot of details there!
I would say precompiled proc macros via wasm narrow that hole - since the macro can't access any outside source, it becomes deterministic, which greatly simplifies testing it for malicious output.
Yes, this particular implementation does. That doesn’t mean the final one will follow suit. This could be considered breaking backwards compatibility, for example.
True. I'd hope future implementations are very explicit about what operations are allowable (for example, globs for files in allowed inputs/outputs), such that you still have an easier time testing.
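As a rough sketch of what such an explicit policy check might look like (purely hypothetical, nothing here comes from the actual implementation), a sandbox host could compare each requested path against directories the crate declared up front. It uses plain directory prefixes instead of globs to stay dependency-free, and `may_read` and `allowed_dirs` are invented names.

```rust
use std::path::{Component, Path};

/// Hypothetical policy check: may a sandboxed macro read `requested`?
/// `allowed_dirs` stands in for whatever a crate would declare up front.
pub fn may_read(requested: &Path, allowed_dirs: &[&Path]) -> bool {
    // Only plain relative components are accepted. Absolute paths, `..`
    // and `.` are rejected so a path cannot escape an allowed directory.
    let only_normal = requested
        .components()
        .all(|c| matches!(c, Component::Normal(_)));
    if !only_normal {
        return false;
    }
    // `Path::starts_with` compares whole components, so `templates2/x`
    // does not count as being inside `templates`.
    allowed_dirs.iter().any(|dir| requested.starts_with(*dir))
}
```

A real design would also have to deal with symlinks and with outputs, which this sketch ignores. The point is only that an explicit allowlist is small enough to audit and test.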
For companies that might be worried about the third-party code ecosystem (did you know that npm packages can effectively run code as you on your system?), this drastically increases the confidence in the security of random Rust cargo crates.
Using a bytecode execution engine to accelerate compile times is really only unexpected if one doesn't regularly spend time reading compiler-related papers, sigh.