> Landlock Make can build code five times faster than Bazel, while offering the same advantages in terms of safety. In other words, you get all the benefits of a big corporation build system, in a tiny lightweight binary that any indie developer can love.
In terms of safety, maybe almost (bazel can check if the source files changed
during the build, this (afaict) can not). But bazel also provides a lot more
(caching, remote builds, ...). So, while cool, read more on it and evaluate it
in depth before deciding to replace bazel with this.
One of the nice things about Bazel that the article didn't get a chance to go into is it uses SHA hashes of files, rather than file timestamps, to determine when an artifact has changed and therefore needs to be updated. It's slightly more costly to compute hashes, but it's necessary if you want to have something like a global cache of build artifacts, since synchronizing time across machines is hard.
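To make the hash-versus-timestamp point concrete, here is a minimal sketch of content-based change detection. It is not how Bazel implements it; the function names and the JSON manifest file are made up for illustration.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_inputs(inputs, manifest_path: Path):
    """Return the inputs whose digest differs from the saved manifest.

    The manifest (a hypothetical JSON file) is rewritten on every call.
    """
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {str(p): file_digest(Path(p)) for p in inputs}
    manifest_path.write_text(json.dumps(new, indent=2))
    return [p for p in new if old.get(p) != new[p]]
```

Touching a file without changing its bytes leaves the digest alone, so nothing is reported as changed. That also means the answer does not depend on any machine's clock.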
What I'd recommend for anyone really, is to just do what Google did. For the first six years of Google's lifecycle, they got along just fine with GNU Make. Then they switched to the huge scalable thing once they actually reached that inflection point. I'm obviously not there since I'm just a scrappy open source coder. So for me I'm quite happy to be working with GNU Make and I can foresee myself getting many additional years of use out of it.
> For the first decade and a half of Google's company lifecycle, they got along just fine with GNU Make
??? Google was started in 1998, and Bazel was created ~2006 as a replacement for Python + GNU Make ("gconfig"). I was on that team, though I only worked on Blaze a tiny bit. The "google3" build migration was sometime around 2003 or 2004.
So at most there were 6 years of using Make only, i.e. "google2".
Importantly, pre-Blaze google3 wasn't just GNU make -- Python was a huge part of it, which is why the Bazel build language Starlark looks like Python. It used to literally be Python, and now it's a Python-like language with parallel evaluation.
---
If you want to do what "scrappy Google" did these days, then you should use Python + Ninja. Ninja is meant to be generated, just like GNU Make was generated by Python. (A big difference is that GNU make has a big database of built-in rules that basically do nothing but slow down incremental rebuilds.)
This work with Landlock looks very cool, and it would make a lot of sense for Ninja to have optional support for it. Some of the caveats are a bit scary but hopefully that can be worked out over time.
The way I was thinking of doing it was just to have a ./NINJA_config.py --slow-sandbox mode. So you can use any sandbox to warn you about missing dependencies, including something container-based like bubblewrap, symlink farms, or Landlock. I think that would work, though I haven't tried it. The shared library issue is tricky, etc.
It's very useful to have the build config / generator split for this reason, and many others (e.g. build variants go only in the first stage, not the second).
I wrote 3 substantial GNU makefiles from scratch and regretted it largely because it lacks this split -- it has a very tortured way of doing build "metaprogramming". IIRC one dimension of variants was OK, but 2 got you into the "write a Lisp in Make" territory. Might as well use Python (or Lua, etc.)
I wouldn't say it's scary. There's always been full transparency with the caveats and they're being lifted incrementally. https://twitter.com/l0kod/status/1556378406983458818 The workarounds are perfectly reasonable. Also take into consideration that Landlock is so simple as a security tool that it really opens itself up to people like us being able to focus in on the opportunities for improvement. A lot of security stuff, like containers, is so byzantine that no one would enjoy having open discussions about their design. Landlock has felt different to me, simply because we are having these conversations, and it's been so engaging that it really makes me want to believe in Torvalds's whole mythology about the many eyeballs.
Email me if there's anything I can do to help Ninja adopt Landlock. The Cosmopolitan Libc pledge() and unveil() implementations are written to be relatively easy to transplant into other codebases. I'd love to see a broader audience benefiting from our work.
Yeah maybe I shouldn't say "scary", but I think the shared libraries and hard-coded paths would have to be worked out / generalized for something like this to land in upstream build tools.
And obviously having a working demo like this is huge progress in that direction, so I think it's great work.
But I'm also wondering if there's a way to do it without OS-specific support patched in? That is, with a wrapper analogous to bubblewrap, sandbox-exec, etc. I know pledge() should make use of app-specific knowledge, so it can't be a CLI wrapper, but I think that isn't true for the build tool case?
It would be more portable, and reduce an O(M x N) code explosion, if there was a standard access-dropping CLI interface that every OS could implement, and that every build tool could use without custom OS-specific code. I've been meaning to experiment with that for a long time, since I agree this problem is worth solving! (for correctness, caching, distribution, etc.) And I think Ninja is a good start and you can do some creative things with it, like
I also think having the generator split is nice because you could do ./NINJA_config.py --with-sandbox on Linux, and get all the dependencies correct. And then on OS X if there is no good sandboxing tool, you can just do it without the sandbox, assuming that the deps were tested on Linux. (And I agree the most practical answer on Windows is to "use WSL" :) )
There's an element here you may have overlooked that's important to understand. The purpose of a sandbox is to control things. The purpose of dynamic shared objects is to delegate control to other developers so you can leverage their labor at minimal cost. That means giving up control. These two concepts can't be reconciled. You can't properly sandbox dynamic shared objects existing outside your dominion because they're not yours to control.
I solve this problem for myself by using static binaries. I don't need dynamic shared objects. I ship support for it in my project releases, since I know other people do. That weakens the safety my tools can offer dso users, but it doesn't concern me, since I'm offering them incremental value; a weakened sandbox is better than no sandbox at all.
Cosmopolitan Libc is what I have for anyone wanting the stronger model. You can't use it to leverage an ecosystem of third party packages. What it can offer you is a statically linked hermetic environment with less complexity that's more conducive to control, provided you're ok with some assembly being required. Worth considering for your next greenfield project.
I understand that and don't really disagree with any of it, but what I'd say is that there's a third possibility of doing what containers already do -- use dynamic linking against fixed versions you control.
That is, the way I think of it is more along the lines of whether an executable is a "value" (in the Hickey sense), not whether it's statically linked. So the whole container can be a value, but it's not statically linked.
That has bearing on both incremental builds and distribution. This relates to my comments on the article about size optimization:
i.e. I think it makes sense for executables to retain structure for differential compression (they can be a tree, not a flat file.)
---
Also I'm wondering if the sandboxing can be done in a child process, without the cooperation of the build tool. I don't see why not, but I'll have to try it (and especially with the container model, which changes things a bit). That could actually make the dynamic library issue easier to deploy because you've punted some policy (hard-coded paths) into a separate tool, rather than having it in the build tool.
> I wrote 3 substantial GNU makefiles from scratch and regretted it largely because it lacks this split -- it has a very tortured way of doing build "metaprogramming".
GNU make definitely still has a lot of rough edges, but parts of your comment seem as though you're making claims about it that haven't been true in a decade or more.
There are quite a few small things about modern GNU make that make it easier to use for meta-programming than it was even just a decade ago. It still has a lot of rough edges that I wish it didn't have though.
Can you point me to a widely used Makefile that uses Guile in GNU make? AFAIK it's not available on most distros, so people don't use it. It's an optional extension.
I don't disagree with anything you said, but I think it's important to point out that your experience is unique in that you're talking about build infrastructure for a massive codebase: most of Google's internal stuff, right?
In my experience plain GNU Make works great until you start working with massive projects. Similarly, the ninja speed improvement: you just don't see it unless you have a truly massive project, or you're using low-quality Makefiles, like what CMake produces. I've measured; I can't see it. I have many medium-sized projects, all with simple Makefiles, and it works really really well. I'm pretty convinced at this point that this should be the way to go for most projects out there.
The 3 makefiles I wrote were actually for a medium-size open source project https://www.oilshell.org/ (medium measured by lines of code)
I could probably write a blog post about this, but for two of the Makefiles I needed to enumerate the rules dynamically (inputs weren't fixed, they were "globbed"). And my memory is that this interacted very poorly with other GNU make features like build variants.
Whereas that pattern is extremely simple with Python/Ninja. Just write a loop or a nested loop and generate a build rule on each iteration. It's done in 5 minutes, whereas GNU make was a constant struggle.
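For anyone who hasn't seen the pattern, a rough sketch of the nested loop looks like this. It writes Ninja syntax as plain strings rather than using ninja_syntax.py, and the `_build/<variant>/` layout, the `cc` rule, and the function name are made up for the example. It also assumes paths contain no spaces or other characters Ninja would need escaped.

```python
import os


def ninja_text(sources, variants):
    """Return build.ninja text with one build edge per (variant, source) pair."""
    lines = [
        "rule cc",
        "  command = cc $cflags -c $in -o $out",
        "",
    ]
    for variant, cflags in sorted(variants.items()):
        for src in sorted(sources):
            obj = f"_build/{variant}/{os.path.splitext(src)[0]}.o"
            lines.append(f"build {obj}: cc {src}")
            # Per-edge variable binding: this is where the variant lives.
            lines.append(f"  cflags = {cflags}")
    return "\n".join(lines) + "\n"
```

The globbing goes in the generator too: pass `glob.glob("*.c")` as `sources`, something like `{"dbg": "-O0 -g", "opt": "-O2"}` as `variants`, and write the result to `build.ninja`. Adding a second dimension of variants is just one more loop.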
I will chime in in agreement with you, since the replies here are mostly to the contrary. Writing Makefiles by hand sucks eggs.
It's really worse than this because this probably assumed a specific Linux setup at least, but everything goes out the window once Windows is in the mix. If having to deal with potentially multiple shells was a problem with Make, having to deal with multiple shells on Windows (where it could be CMD, PowerShell, or a UNIX like shell under MSys or Cygwin...) is untenable.
Today the only obviously valid reason to use Make is because you have to or already are. Most Linux distros ship Ninja. Hell, MSVC ships Ninja.
There are many examples of Makefiles for open source projects that validate how ugly things can get without needing to be Google scale. One of the most elegant Makefile setups I can think of is Near's setup for bsnes/higan, and honest to goodness, it's still a pretty horrific mess.
I don't want to be hyperbolic, but also it's irresponsible for people to suggest you should just use Make. You shouldn't just use Make. Not even as a beginner. If you wanted a nice turnkey solution for a beginner, I'd suggest CMake or Meson, both of which happily generate Ninja files and also have the bonus of supporting Windows well. CMake has broad ecosystem support, despite its many shortcomings. It's far from elegant, but definitely practical.
That's Windows' problem and it's not even a real problem anymore because make runs fine in WSL. Microsoft has pretty much gotten their act together in the last four years in supporting open developer tools. They've got bash and ANSI support. It's great. Give them credit where credit is due. It's time to say goodbye to shoehorning unix devops into win32. Doing that gets sillier each year. Especially since, as Microsoft has been generous in supporting tools, they've certainly been boiling the frog with their win32 environment. The last time I extracted a zip file containing dev work on Windows, it extracted at 64 kilobytes per second to the local hard drive, because their virus technology was scanning and uploading everything I was doing. How can you build software in that kind of environment? And why is it that it's always the voices coming from our own community who insist that we should? People shouldn't fall that far in love with an operating system because even with perfect support, builds are still going to go slow on Windows. Use WSL.
This only makes the problem much worse. (Note that I literally reference MSys and Cygwin in my original reply. However, these options are neither ideal for making good Windows binaries nor are they great facsimiles for UNIX. Almost any Makefile needs to be carefully constructed so that MinGW works, especially if you want to support more than just very specific MinGW setups. And there's other downsides I didn't get into, such as licensing problems with winpthreads, and silent dubious behavior like pseudo relocs. I don't recommend using MinGW.)
I like the speed of Ninja, and the flexibility of having a stage that generates the ninja (or make) files. But it's a bit unclear to me what the best practice is for when to generate the ninja files. It feels like the programmer is expected to do this stage manually and there's no great automatic way to generate the ninja files when they need to be generated. How do good build systems solve this?
Yeah unfortunately there probably isn't a "best practice". I think there are just many types of projects and their needs vary. I think a big issue is what your dependencies are.
This person mentioned Ruby + Ninja, and I similarly used Python + Ninja (borrowing the ninja_syntax.py from the Ninja repo itself).
I'd say if you can get away with it, using a real language like that is straightforward and good. If you have a low level project that doesn't depend on anything (like a shell or VM :) ), then it will probably work.
I admitted I probably had to rewrite a small portion of the "120,000 lines of CMake" that ships with CMake, in Python. And that took some time, but it ended up working well for my needs.
If you have a big project with lots of dependencies (especially optional / detected ones), I think that is probably where CMake or Meson is better. They both use Ninja, but I haven't used them myself.
---
But I would also say that there aren't many best practices around GNU make either. The usage spans a very wide range... And lots of people generate it too, like Ninja -- kconfig for the Linux kernel generates GNU make, etc.
autotools and CMake both generate GNU make too. It's a big mess, but I think Ninja is a good foundation for at least not "starting with a mess" :)
> What I'd recommend for anyone really, is to just do what Google did. For the first decade and a half of Google's company lifecycle, they got along just fine with GNU Make. Then they switched to the huge scalable thing once they actually reached that inflection point.
Hopefully as a community we can build things that are scalable and as-simple-as Make. I think please.build is a step in the right direction but still too complicated.
GNU make is great for many things, used correctly. The problem is that POSIX make is extremely impoverished, so sticking to just the POSIX subset is often a bad idea. In many cases using "make" should really mean using "GNU make" so you can use conditionals, automated dependency generation (via reloads of dependency info), etc.
However I also cringe when people say to write GNU makefiles from scratch -- because those ALSO don't work for everybody, and are generated more often than not (e.g. by autotools, CMake, kconfig, etc.)
So I'd say that GNU make gives you the illusion that you can just use that one tool, but you often end up needing to generate it anyway. And then you should have used Ninja with whatever generator you ended up with :)
1. Getting started is difficult: There is a lot of impedance mismatch across various language toolchains. This can be fixed over time by improving the tooling/libraries/docs. If JetBrains had a "New Bazel Project" and had a way to 1-click-add Java/Python/Ruby/golang/Rust/etc source into //third_party with BUILD files and a version of build_cleaner Bazel would win. Just making it easy to vendor libraries and build software from the IDE is I think all it would take to get popular.
2. The JVM: I am a huge fan of Java (I've built multiple companies' backends in it) but it is not a great choice for CLI tooling (even with a daemon). A no-dependency, small, build system executable would go a long way to making it easy to get people started.
3. The cruft: A lot of things in Bazel are the way they are because someone had to get a feature out ASAP to support $BIG_USER to do $BIG_THING and it would be too difficult to migrate away/to something at Google. If we drop the cruft and redesign things from the ground up we can get a nice abstraction for the world. For example, please.build's proto_library is VERY easy to use (way easier than even the Google internal versions imo).
4. The tooling: You can get massive CI improvements, free mutation testing, frameworks for build (integration tests, e2e tests, etc), and much more by building things against the Bazel API. Unfortunately not much outside of Google supports this API. Example of things you can do with this API: https://kythe.io/examples/#extracting-compilations-using-baz...
We could live in a world where people can build language-agnostic tooling that automatically works so long as you pass it a `*_binary` target that transparently builds for all major platforms (zig's or APE's as crosstool) which would allow platform vendors to define macros to completely automate deployment and development for their systems. For example we could have:
```
from aws import lambda
lambda.package(
    name = "my_api",
    deps = [":api_handler"],
)
```
And just by running `build :my_api` you could have a zip file which packages your code (from any language) into a lambda.
> I spend a lot of time advocating for Bazel for this reason.
Bazel works pretty well for large scale projects. There are definitely things I don't like about it, but I would agree that for large projects it is better than make.
But for small to medium size projects, bazel adds a lot of complexity and has the problems you mentioned, for not that much benefit. Especially if you are using a language that doesn't have built in support in bazel, or do something that doesn't match up with bazel's way of doing things.
SCons [0] has used something similar (MD5 instead of SHA) since the end of the '90s, so at least that aspect is not a Google invention. The cache there is local though. We never had flaky builds with it and could extend it very nicely; unfortunately it was not very fast.
(GNU) Make by default uses the file change timestamp to trigger actions. But this is definitely not the only way, and you can code your Makefile so that rebuilds happen when a file's checksum changes.
IIRC, the GNU Make Book has the code ready for you to study...
Or, you might get more clever and say "when only a comment is changed, I don't want to rebuild"; file checksums are not the correct solution for this, so you can code another trigger.
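As a rough illustration of such a trigger (this is not the GNU Make Book's code, and the function names are invented), you can hash a comment-stripped copy of the source instead of the raw bytes. The sketch below assumes C-style comments and uses a regex, so it ignores corner cases like backslash-continued `//` comments, raw strings, and trigraphs.

```python
import hashlib
import re

# Match comments, plus string and char literals so that comment markers
# inside literals are left alone.
_TOKEN = re.compile(
    r'//[^\n]*|/\*.*?\*/|"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'',
    re.DOTALL,
)


def _blank_comment(match):
    text = match.group(0)
    if text.startswith("/"):
        # Keep the newlines so line numbers of the remaining code are unchanged.
        return " " + "\n" * text.count("\n")
    return text  # string or char literal: keep verbatim


def comment_insensitive_digest(source: str) -> str:
    """SHA-256 of the source with comments blanked and trailing spaces trimmed."""
    stripped = _TOKEN.sub(_blank_comment, source)
    normalized = "\n".join(line.rstrip() for line in stripped.split("\n"))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

A recipe could store this digest in a side file and only touch the real prerequisite when the digest changes. Editing the text of an existing comment keeps the digest stable; editing code or a string literal does not.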
Justine is one of my fav contemporary hackers. Whenever I see her stuff posted on here, not only am I amazed that anyone did what was done, but even more amazed that one person did it all on their own.
Justine, if you're reading this, keep up the amazing work!
Agreed. The word "impressed" doesn't accurately capture the way I feel about her work. Her content is exactly the kind of stuff that got me fascinated by CS at a young age, and it's nice feeling that again as an adult, as I almost invariably do whenever I come across something she's written or made.
Others have said similar things, so my point is mostly one of emphasis. Bazel goes deep on finding edges in the build graph.
Example: C/C++ has this preprocessor step where ‘#include’ and stuff happen. It’s very cheap compared to what comes next.
If you change white space or comments or whatever, the preprocessor runs, hashes the same, now the .o is good, now the link is good.
OP’s stuff is cool, but just wanted to make sure everyone is clear that industrial strength build systems like Bazel take on that constant factor intentionally. They are designed for building big, complex, polyglot code bases.
The only thing I’m really missing for my usecase with make is the ability to specify goals/targets other than files. E.g., I want a docker image to be present in the daemon. Basically, calling out to a function. (Probably somebody is going to correct me and say there is something there in the form of conditionals or something..)
Docker images can be stored as a file. And maybe are indeed stored on the file system as individual files as well. But building against a specific cr implementation for this seems too brittle.
That's easy to do, but if you want to make it compatible with incremental builds the best way is to create a "I did it already" empty file as a dependency marker. E.g. "build" a ".build/called/apt-install-deps" or whatever.
This is exactly how I do it now but it is brittle. If either my packages are changed, or the definition of what needs to be installed/pulled changes, I need to either remove the file or something.
I could maybe put all these definitions, packages etc in separate files and then have make watch the time stamps. That saves the issue of changing definitions but doesn’t help when things get deleted from under your feet.
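One way to make the marker file less brittle, sketched under the assumption that the "definition" (package list, image tag, etc.) can be written down as a string: store a hash of the definition in the stamp instead of leaving it empty. The helper names here are hypothetical.

```python
import hashlib
from pathlib import Path


def _digest(definition: str) -> str:
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()


def needs_run(stamp: Path, definition: str) -> bool:
    """True if the stamp is missing or was written for a different definition."""
    return not stamp.exists() or stamp.read_text().strip() != _digest(definition)


def mark_done(stamp: Path, definition: str) -> None:
    """Record that the side effect was performed for this exact definition."""
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(_digest(definition) + "\n")
```

This handles the "definition changed" case without manually deleting anything. It does not handle things being deleted from under your feet; for that the check would have to probe the real state (e.g. ask the daemon whether the image exists) rather than trust a file.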
I want to like this (and I do). It just feels like yet another perfectly wonderful tool that only 1000 other people in the world (besides me) will use.
A while back I started an experiment/prototype called "make-audit"; this is a (draft) tool to report when an execution of GNU make reads or changes files in ways that are inconsistent with its Makefile: https://github.com/david-a-wheeler/make-audit It's nowhere ready for serious use, but it can detect the following:
* Error: Target TARGET : unreported prerequisites: SET : The make recipe for creating TARGET is reading from the prerequisites in SET, but the makefile fails to report them as dependencies. You may want to add SET to the prerequisites of TARGET.
* Error: Target TARGET : claimed but unused prerequisites: SET : The make recipe for creating TARGET claims that it depends on SET, but the items in SET were never read. You may want to remove SET from the prerequisites of TARGET.
* Error: Target TARGET : unreported target: SET The make recipe for updating TARGET also modifies the files in SET but this is not reported.
* Error: Target TARGET : unmodified reported target: SET
Very nice, this is almost exactly the kind of thing I wanted to have to create a higher level build tool (make files are IMO pretty terrible as a declarative language for building, but perhaps a good low level primitive to use to build systems like Bazel on top of). The only problem is the dependency on a fresh Linux feature (ported from OpenBSD if I understand correctly)... my ultimate goal was to have a Landlock thing that works on all OSs, but that may be really hard as not many languages abstract the file system away so that this can be implemented at the application level (Dart has support for abstracting away the file system, and I was trying to use that, but it doesn't seem to support that kind of thing when running processes).
I suspect this should also make things like Nix and Guix easier... or maybe a lighter version of them easier, as you don't need to implement the build sandbox anymore when building software if you use Landlock Make.
One of the downsides of Blaze/Bazel is that the symlink tree adds indirection that makes debugging broken builds that much harder. This approach, while not very portable, seems like a win from an ease of use standpoint.
(Also, at least internally, Blaze pretty much required Linux for the longest time anyway.)
This is pretty cool! I have a habit of using Makefiles to automate a lot of things that aren't really build systems (it just maps nicely to how I think of a lot of problems - the whole target + dependency graph concept is powerful). While I'm positive that this work is great for build system stuff too, this will certainly also help solve some problems I've created for myself in the past - namely accidentally acting on files I didn't intend to.
That 200 lines doesn't include our pledge() implementation, which Make assumes is provided by the C library. Right now only Cosmopolitan Libc and OpenBSD have an unveil() implementation. It would take some thought to decide what the best approach would be for incorporating something like that into GNU Make. So I'm waiting on more feedback from the community and the GNU developers. Because upstreaming is totally something I'm open to considering.
Also please consider that GNU Make supports so many ancient platforms, like DOS, QDOS, Amiga, Windows 3.1. When I forked GNU Make, the first thing I did was delete all that support for ancient defunct platforms to give me enough room to think about how to approach implementing this feature. I don't think I could have done this if I had to wade through all that code from the start. I'm doing this work just for fun and to help out, and coding is only fun when the code is clean.
I'd like to see this upstream. It's okay if the functionality only works on some platforms... it would force fixing of makefiles, and all platforms would benefit from that result.
I've tried many alternatives to make over the years, and eventually, doit (pydoit.org/) was the one I stuck with.
It's declarative yet dynamic, it's simple yet powerful, you get to use shell commands or a custom function, and it deals with all the things you need like deps and targets, creating a dag, caching, etc.
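For a taste, a minimal `dodo.py` might look like the sketch below. The file names and the `cc` commands are placeholders; the `actions` / `file_dep` / `targets` keys and the `task_` prefix are doit's conventions.

```python
# dodo.py -- picked up automatically when you run `doit` in this directory.
import os

SOURCES = ["hello.c", "util.c"]  # placeholder inputs for the example


def _obj(src):
    return os.path.splitext(src)[0] + ".o"


def task_compile():
    """One sub-task per source file, generated with ordinary Python."""
    for src in SOURCES:
        yield {
            "name": src,
            "actions": [f"cc -c {src} -o {_obj(src)}"],
            "file_dep": [src],
            "targets": [_obj(src)],
        }


def task_link():
    """Link all objects; doit orders this after compile via file_dep/targets."""
    objs = [_obj(src) for src in SOURCES]
    return {
        "actions": ["cc " + " ".join(objs) + " -o hello"],
        "file_dep": objs,
        "targets": ["hello"],
    }
```

The "declarative yet dynamic" part is that each task is just a dict, but the dicts are produced by normal Python code, so loops and helper functions work the way you'd expect.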
Good work, but I'm afraid that this will not really catch on because it's not that portable. I also think that Bazel has the same problem; most of the sandboxing works only on Linux.
Until something like this works on Windows, it won't get wide adoption. And it is possible to do something like this on Windows.
And personally, I would want to avoid the use of GNU make.
Nevertheless, this is a step forward for those who have to use GNU make.
Bazel uses "sandbox-exec" on macOS, and has a more generic POSIX compatible sandbox if you can't use the Linux sandboxing tool too.
> And it _is_ possible to do something like this on Windows.
Are you able to point to any open source examples of this? Bazel supports Windows, and I'm sure they'd love to be able to support sandboxing on Windows as well.
Chromium's sandboxing code, design documents, etc. are a good read for one of the most widely deployed and battle-tested Windows sandboxes, and are presumably BSD 3-Clause "New"/"Revised" licensed like the rest of Chromium:
1. Have an "unsandboxed" broker/parent process that implements unveil-like logic for whitelisting files.
2. Have the sandboxed child process run under a heavily restricted access token that blocks "all" file I/O (except, null security FAT32 mounts are sadly still accessible).
3. Intercept/patch Win32 API calls to request whitelisted things via IPC with the broker process that would otherwise be blocked by the restricted access token.
Bazel/Make are in slightly trickier situations, in that they run third party binaries - which might require shenanigans involving injecting DLLs, or creating patched EXEs, to do the intercepting/patching of `CreateFile` etc.
What interpreter? Why do you assert that it's "not how Bazel would want to do it"?
No, it's not chroot -- it uses a tree of symlinks, created specifically for each process being run, that assumes the process isn't trying to escape the 'sandbox' too much.
At the end of the day, a _strong_ sandbox is going to require some OS-specific support, and this could be seen as a kind of graceful degradation when there's not a better OS-provided sandboxing tool.
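To show what the symlink-tree idea amounts to, here is a toy version. It is not Bazel's implementation and the function name is invented: it builds a scratch directory containing only symlinks to the declared inputs and runs the command there. A process that uses absolute paths or `..` walks right out of it, which is the "assumes the process isn't trying to escape" caveat.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def run_in_symlink_tree(command, inputs, source_root):
    """Run `command` in a fresh directory that holds only the declared inputs.

    `inputs` are paths relative to `source_root`. Anything the command writes
    into the scratch directory is discarded when this function returns.
    """
    source_root = Path(source_root).resolve()
    with tempfile.TemporaryDirectory(prefix="symtree-") as tmp:
        for rel in inputs:
            link = Path(tmp) / rel
            link.parent.mkdir(parents=True, exist_ok=True)
            os.symlink(source_root / rel, link)
        return subprocess.run(command, cwd=tmp, capture_output=True, text=True)
```

A step that forgot to declare a header simply fails with "file not found", as long as it refers to the header by a relative path. A real tool would also copy declared outputs back out before cleaning up.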
(As an aside, your comments seem rather combative and I don't really understand why; we can discuss the merits of this tool without declaring it'll never catch on and won't _do enough_ for your standards.)
Why do my comments seem combative? All I said was basically good job, but it will unfortunately not be used much. Why is that combative?
With regards to chroot, I stand corrected. I knew it was a tree of symlinks, but I thought it was also more than that because symlinks alone don't seem like a sandbox. Honestly, Cosmopolitan's system appears to be more of a sandbox than that.
I agree that a strong sandbox is going to require OS-specific support as of right now. I do have ideas for implementing sandboxes without it, but it requires putting the sandbox into the interpreter. And I could be completely wrong that the interpreter would do a good enough job.
And this is what I mean by interpreter: Bazel has a language. I think it's called Starlark. To make that language useful, it needs some interpreter. That interpreter might just be reading the language and building a dependency tree, but my point is that it could do more, including checks.
Perhaps my assertion that Bazel would not want to do it that way is not fair, but I said that because Bazel's method of sandboxing is different, and I suspect that they would not want to refactor their Starlark interpreter. That's all. They certainly could, and I would encourage them to. So I could be wrong, and I would eat my words in that case.
> but my point is that it could do more, including checks.
Could you explain more how you see this working?
For example: the build system is running a build step. It has determined the inputs and the outputs for that build step. It is going to execute a subprocess for that build step (say, GCC). It wants to ensure that GCC doesn't accidentally depend on files other than those that the build system knows about. How can that functionality be implemented with checks in the build system interpreter?
I suppose it could run the process with something strace-like and monitor which files it accesses but isn't that just a way of implementing a sandbox? I'm not sure what you mean exactly.
The best way to do this is best described in the thesis that Eelco Dolstra wrote describing Nix. I suggest you read that.
tl;dr: Clear the environment, know where all of the system headers are, control the build environment of the dependencies. Basically, knowing dependencies means controlling them.
But to expand on that, an interpreter could do some basic checking like:
* Does the command reference a path that the build system doesn't know about?
* Does the build system know where the executable is for the command, and is it well-known?
Things like that.
It won't be perfect, but it would be better. And it can get better with time.
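A very rough sketch of what those two checks could look like, with made-up names and no connection to how any real build tool does it. It treats every non-flag argument as a path, so it will miss paths glued onto flags (like `-Iinclude`) and anything the tool opens implicitly.

```python
import shlex


def check_command(command, known_paths, known_tools):
    """Return a list of warnings for a build command; empty means no findings.

    known_paths: set of paths the build system has declared for this step.
    known_tools: mapping of tool name -> absolute path the build system trusts.
    """
    argv = shlex.split(command)
    if not argv:
        return ["empty command"]
    warnings = []
    if argv[0] not in known_tools:
        warnings.append(f"unknown executable: {argv[0]}")
    for arg in argv[1:]:
        if arg.startswith("-"):
            continue  # flags are not inspected in this sketch
        if arg not in known_paths:
            warnings.append(f"undeclared path: {arg}")
    return warnings
```

With `known_tools = {"gcc": "/usr/bin/gcc"}` and `known_paths = {"foo.c", "foo.o"}`, the command `gcc -c foo.c -o foo.o` produces no warnings, while `gcc -c foo.c /etc/passwd -o foo.o` gets flagged for the undeclared path.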
> With regards to chroot, I stand corrected. I knew it was a tree of symlinks, but I thought it was also more than that because symlinks alone don't seem like a sandbox. Honestly, Cosmopolitan's system appears to be more of a sandbox than that.
To be totally clear: the tree of symlinks thing is a fallback, used only when lacking platform support or when sandboxing is explicitly turned off [0]. On Linux, the normal sandboxing strategy is to use namespaces, like most container runtimes. On Mac it apparently uses sandbox-exec (some opaque Apple tool), as was mentioned above. Chroot, being non-POSIX, requiring root access on many systems, and not providing the necessary facilities, is not really a great fit -- which I assume is why it's not used.
There was experimental Windows sandbox support at one point [1] based on how MS does it for BuildXL (their own build tool for giant monorepos) [2]. Unfortunately it doesn't seem to be maintained, and under the hood it's kinda ugly -- it actively rewrites code in-memory to intercept calls to the Win32 APIs [3], which was apparently the cleanest/best way MS could come up with. However, from Bazel's POV it works in a roughly similar way -- you spawn subprocesses under a supervisor, which is in charge of spinning up whatever the target process is with restrictions on time/memory usage/file access.
On the "sandbox in the interpreter" thing: what kind of checks are you envisioning? It seems like putting checks at that level would end up leaving a lot out -- the goal of any build system is to eventually spawn an arbitrary process (Python, gcc, javac, some shell script, etc.) and so even with extensive checks in starlark you'd end up with accidental sandbox breaks all over the place. For pure starlark rules you could e.g. check that there are no inputs from /usr, but even then if gcc does it implicitly, you're SOL. Or am I thinking of the wrong kind of checks?
EDIT: somehow missed your sibling comment. Nix is definitely cool, and is pretty similar to how Bazel does things with regards to explicit build graphs. The check for "well-known commands" would also be cool IMO. That said, Nix also has a chroot-y sandbox-y thing it uses to spawn processes -- so they're not all that different [4].
Yes, I agree that Bazel and Nix are not much different. Nix seems to be even more sandbox-like than Bazel, and that's good in my opinion.
Beyond what they do, I'd like checks that are even more invasive, more cautious about letting the build script do anything.
For example, if you're on Linux, a bad actor build script could technically mount the root directory `/` underneath the sandbox area in `/<sandbox>/rootdir/` using Linux's bind mounts feature and then `rm -rf /<sandbox>/rootdir/`. Whatever it has permission to delete will be deleted (unless I'm unaware of some safety feature in bind mounts that prevents this besides needing root).
I would like checks that restrict a build to just performing those actions necessary to the build. You could, for example, have a permission policy, say for a particular package that you don't trust, that only allows that package to spawn GCC and the linker. If that package goes rogue in its build script, it would be stopped dead the first time it tried to either use `rm` or use a bind mount.
That's the sort of checks I'm referring to: checks for fine-grained permissions on what a build can do.
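As a rough illustration of such a policy, here is a small Python sketch. The names `POLICIES`, `PolicyError`, and `spawn` are hypothetical, and matching on the executable's basename is a simplification. Since it only guards the spawns that go through this one function, real enforcement would have to sit below the build script (in the kernel or a supervisor) to catch bind mounts or anything a permitted tool spawns itself.

```python
import os
import subprocess


class PolicyError(Exception):
    """Raised when a package tries to run a tool its policy does not allow."""


# Hypothetical per-package policy: the untrusted package may only run
# the compiler and the linker. Packages without an entry may run nothing.
POLICIES = {"untrusted-pkg": {"gcc", "ld"}}


def spawn(package, argv, run=subprocess.run):
    """Run argv on behalf of package, but only if its policy permits the tool."""
    allowed = POLICIES.get(package, set())
    tool = os.path.basename(argv[0])
    if tool not in allowed:
        raise PolicyError(f"{package} may not run {tool}")
    return run(argv, check=True)


if __name__ == "__main__":
    try:
        spawn("untrusted-pkg", ["rm", "-rf", "/tmp/example"])
    except PolicyError as err:
        print("blocked:", err)
```

The `run` parameter exists only so the policy logic can be exercised without actually launching a compiler.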
My idea is to take that even further and make it possible to have those checks in software that you compile from source so that you can stop the software from going rogue too. How I am going to do that, I'll leave unsaid for now, but I'm working on it.
> a bad actor build script could technically mount the root directory `/` underneath the sandbox area in `/<sandbox>/rootdir/` using Linux's bind mounts feature
How could you do this without already being outside of the sandbox?
> Also please consider that GNU Make supports so many ancient platforms, like DOS, QDOS, Amiga, Windows 3.1. When I forked GNU Make, the first thing I did was delete all that support for ancient defunct platforms to give me enough room to think about how to approach implementing this feature.
From https://github.com/jart/cosmopolitan#support-vector , supported platforms are Windows, Linux, Mac OS X, FreeBSD, OpenBSD, and NetBSD, on AMD and Intel x86 processors. That's a large list, maybe large enough to become widely accepted, but GNU Make supports ... pretty much everything, ever. Darwin on ARM (probably the most important), NetBSD on MIPS, Solaris on SPARC, FreeBSD on POWER, Haiku, FreeDOS, HURD.
Cosmopolitan is probably portable enough and Landlock is probably portable enough to catch on, but compared to make(1), it's not that portable.
The Landlock Make binary runs on OpenBSD too, where it also supports sandboxing. But those are the only two platforms right now where it's fully-featured. As I mentioned in the article, the next logical step is to add FreeBSD support using jails. The other platforms, I'm not so sure about. Might have to do something like run the binary in a blinkenlights emulator to intercept the system calls of programs we didn't build ourselves. But it's kind of a moot point, since every platform supports spinning up a Linux thing that lets you build code. On Windows you have WSL. On Mac you have Docker. So I don't know why people are egging on with the portability angle. I mean, just having unveil() on Linux in addition to OpenBSD is HUGE! We've been waiting decades for this, and now it's finally here, thanks to Landlock!
After having used unveil() in OpenBSD, I don't really agree that it's huge. Not until it's everywhere. In fact, that's the reason I'm egging on portability: until it's everywhere, it doesn't really matter, and all the world is not a Linux or an OpenBSD.
And while WSL and Docker might exist on those platforms, they are still hobbled. They're both VMs, not native.
Oh, and the other reason that unveil() is not huge is because it's voluntary for programs that use it. That's why using pledge() and unveil() for regular programs is not huge and never was.
Sure, it can help you identify missing dependencies. That's great, sure. Your work here is good because any GNU Makefile can be sandboxed. That's good. That is actually not voluntary for build scripts because they won't usually unveil() things. That is a step forward, which is why I said that in my first comment.
But the real problem in build systems is making sure that build scripts and the software they build cannot do anything bad to machines. Look at all the protest malware with the Russia situation, as well as other instances of supply chain attacks. Using pledge() and unveil() can help with the build scripts problem, but not the actual software. Unless you patch yourself, which you can do, but that's manual, and people aren't going to bother.
And then there are the limits: on OpenBSD, they tell you not to unveil() just the paths passed on the command-line, at least last I checked. That's kind of dumb. I hope Cosmopolitan on Linux does not have that issue because that's a big limitation. It means, for example, that my `bc` cannot use unveil() until after it processes the files given to it by the user, i.e., only when it starts reading from stdin. If that were not the case, I could have called unveil() as soon as all command-line arguments were processed.
Anyway, the point is that yes, this is a step forward, for Linux, but we need an automated way of checking for supply chain attacks in software, not just builds.
I said devops not distribution. APE is build-once run-anywhere, i.e. you build once on Linux and run your binary on seven platforms. APE will stand to gain if a stronger consensus emerges about only doing coding on Linux/BSD. Then we can let Windows and Mac be more like modern teletypes that are great for ssh, browsing, and consumption. It wouldn't have been possible for me to achieve what I achieved had I focused on the traditional build-everywhere run-everywhere model. It feels liberating using one and only one toolchain, on one platform, using zero configure scripts. No operating system is beyond the reach of x86_64-pc-linux-gnu. This is the one true way for compiling code.