This feels like a really roundabout way to implement something that the compiler should be responsible for. All the pain and effort of C -> WASM -> C could be avoided if GCC or clang had some option to add bounds checking instrumentation for each memory access in the compiled output.
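For a rough idea of what that per-access instrumentation amounts to, here is a minimal sketch in C. The names (sandbox_mem, in_bounds, load_u32, store_u32) are made up for illustration and are not taken from GCC, clang, or any WASM-to-C translator. It only assumes the sandboxed memory is one flat buffer, similar in spirit to a WASM linear memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sandbox memory: one flat buffer with a known size. */
typedef struct {
    uint8_t *data;
    size_t   size;
} sandbox_mem;

/* True when [offset, offset + len) lies entirely inside the buffer.
 * Written as a subtraction so that offset + len can never overflow. */
static bool in_bounds(const sandbox_mem *m, size_t offset, size_t len) {
    assert(m != NULL && m->data != NULL);
    return len <= m->size && offset <= m->size - len;
}

/* Checked 32-bit load: reports failure instead of reading out of bounds. */
static bool load_u32(const sandbox_mem *m, size_t offset, uint32_t *out) {
    if (!in_bounds(m, offset, sizeof *out)) return false;
    memcpy(out, m->data + offset, sizeof *out);
    return true;
}

/* Checked 32-bit store: reports failure instead of writing out of bounds. */
static bool store_u32(sandbox_mem *m, size_t offset, uint32_t value) {
    if (!in_bounds(m, offset, sizeof value)) return false;
    memcpy(m->data + offset, &value, sizeof value);
    return true;
}
```

Every load and store in the sandboxed code would go through helpers like these, which is where the overhead comes from. A real implementation would more likely trap than return false.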
Maybe, but don't underestimate network effects. What's important about wasm is its universality - both where it can run and what can target it - which is already making for a powerful ecosystem of tools and compatibility.
GCC and clang could implement their own bounds checking rules, but C -> WASM -> C is actually <C | anything> -> WASM -> <C | anything>
The "universality" exists only for now because wasm is at the toy-language level. The more it evolves toward being useful in production, the more opinionated and complex it'll become, which will reduce the number of languages it supports and platforms it runs on.
Source: every language vm under the sun (CLR, JVM, Neko, etc.)
Totally agree. This is such an obvious thing to do, I'm amazed that I had never considered it a possibility until now. I guess I thought of sandboxing as something only VM-based programming languages were capable of. For decades we've been dealing with buffer overflow exploits in C apps - you really have to wonder why this hasn't already been an option in GCC and other compilers, or simply another pass in OS build steps.
I'm sure it's not a cure-all, adds overhead, and isn't applicable in all cases, but every small addition to security wouldn't be a bad thing. I don't see any reason why every command line utility in Unix-based OSes, for example, couldn't be sandboxed. Think of wget or curl, for example.
Sure and containers give you syscall restrictions and OS protections, but you don’t see people embedding containers inside of other applications. People generally don't like sidecars, so embedding wasm in-process makes a lot of sense.
On Linux, seccomp is what provides syscall restrictions, and seccomp was originally added to Linux to support untrusted app sandboxing--CPUShare and later Chrome Native Client (NaCl). See https://lwn.net/Articles/332974/. This is why classic seccomp, as opposed to BPF seccomp, only permitted a small, fixed set of syscalls--read, write, _exit, and sigreturn.
A seccomp classic sandboxed process will be at least as secure as any WASM runtime, no matter the engine. Even though the former is running untrusted native code, the attack surface is much more narrow and transparent.
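To make the classic (strict) mode concrete, here is a hedged, Linux-only sketch. The helper name run_in_strict_seccomp and the strict_result struct are invented for this example. It forks a child, switches it into strict mode with prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT), lets it write one byte to a pipe (allowed), and optionally has it call getpid (not allowed), which should get it killed with SIGKILL. It assumes the process is not already running under a seccomp filter, as it may be inside some containers, in which case the prctl call can fail and the child exits with code 2.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    int entered;     /* 1 if the child wrote to the pipe from strict mode */
    int term_signal; /* signal that killed the child, 0 if none */
    int exit_code;   /* exit status, -1 if the child did not exit normally */
} strict_result;

static strict_result run_in_strict_seccomp(int attempt_forbidden_syscall) {
    strict_result r = {0, 0, -1};
    int fds[2];
    if (pipe(fds) != 0) return r;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return r;
    }
    if (pid == 0) {
        close(fds[0]);
        /* Not sandboxed yet, so a normal _exit is still fine here. */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0) _exit(2);
        /* write(2) is one of the few syscalls strict mode allows. */
        if (write(fds[1], "k", 1) != 1) syscall(SYS_exit, 3);
        /* getpid(2) is not allowed: the kernel kills us with SIGKILL. */
        if (attempt_forbidden_syscall) syscall(SYS_getpid);
        /* Raw exit(2); glibc's _exit uses exit_group, which is not allowed. */
        syscall(SYS_exit, 0);
    }

    /* Parent from here on. */
    assert(pid > 0);
    close(fds[1]);
    char c = 0;
    r.entered = (read(fds[0], &c, 1) == 1 && c == 'k');
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) == pid) {
        if (WIFSIGNALED(status)) r.term_signal = WTERMSIG(status);
        else if (WIFEXITED(status)) r.exit_code = WEXITSTATUS(status);
    }
    return r;
}
```

The point of the sketch is how small the allowed surface is: once in strict mode, the child cannot even exit through the usual libc path.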
> Sure and containers give you syscall restrictions and OS protections, but you don’t see people embedding containers inside of other applications.
It sounds like you're implying that C coders will opt out of the sandboxing provided by the OS, but that's not possible without coding kernel level code. For userland processes, the sandboxing isn't optional, and your process will be sent a SEGV signal if it tries to access memory it's not allowed to access.
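A small sketch of that behaviour, assuming Linux (other systems may deliver a different signal, such as SIGBUS). The function name signal_from_forbidden_write is made up for the example. A child process writes to a page mapped with PROT_NONE, and the parent reports which signal killed it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that writes to a page it has no permission to access.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 if the setup itself failed. */
static int signal_from_forbidden_write(void) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    /* Map one page with no read, write, or execute permission. */
    void *p = mmap(NULL, (size_t)page, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);   /* make sure no handler is inherited */
        *(volatile char *)p = 1;    /* the MMU faults on this write */
        _exit(0);                   /* only reached if the write succeeded */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(p, (size_t)page);
    if (waited != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Note that this protection works at page granularity: it stops access to unmapped or forbidden pages, not an overflow from one object into a neighbouring one inside the same mapping.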
If those utilities cared about that sort of thing they'd already have been rewritten in OCaml 20 years ago. The reasons unix utilities are written in C and C compilers don't do bounds checking are political, not technical.
NaCl was CPU specific, and the way they solved this problem with PNaCl (by using a subset of the LLVM IR) was more or less a hack which most likely involved at least as much machinery in the browser as a WASM runtime. It also compiled slower, and performance wasn't any better than even the first WASM runtimes. The only thing PNaCl had going for it was straightforward multithreading support, which took far too long to materialize in WASM. On the other hand, Spectre/Meltdown would also have hit PNaCl hard.
Having worked with asm.js, NaCl, PNaCl (plus a couple other long forgotten competitors like Adobe's Alchemy/Flascc) and finally WASM: WASM is the natural result of an evolutionary process, and considering what all could have gone wrong if business decisions had overruled technology decisions, what we got out of the whole endeavour is really pretty damn good. It's really not surprising that Google abandoned PNaCl and went all in on WASM.
Yeah, WASM was the result of taking a look at NaCl, realizing it could never be specified and independently implemented, and writing a spec for something better that could.
WASM is superior to NaCl in pretty much all ways, other than taking a bit more time to go through the whole specification and implementation process. NaCl was a great prototype, and WASM is a much more polished result because of it.
(P)NaCl was just too close to the past. Too many memories of ActiveX, Flash and other crap still loomed heavy in people's hearts, and they wanted a clean cut with the past: an open web, free from the diktat of corporations, etc.
I do understand why people wanted something different and rejected (P)NaCl. But the reason is not technical, it's political (in the broad sense). Any technical issues could have been solved by fixing and extending (P)NaCl, but everybody involved understood it was not going to work.
I used to work in the Native Client team at Google, both on regular NaCl and pNaCl. It failed for technical reasons too. And maybe, mostly for technical reasons.
pNaCl's biggest mistake was using LLVM bitcode as a wire format (see, for example, [1]). Another problem was trying to use glibc instead of newlib. That resulted in a lot of low-level incompatibilities, as glibc itself is too target-specific. And implementing shared objects between the portable and native worlds was really messy.
asm.js appeared as a simpler alternative to NaCl, and then it was quickly replaced by wasm, developed jointly by Mozilla, Google and others.