> But then I tried to run it on a build server I was using (Netlify), and I got this extremely strange error message: “File not found”. I straced it, and sure enough execve was returning the error code ENOENT, which means “File not found”. This was kind of maddening because the file was DEFINITELY there and it had the correct permissions and everything.
This is an infuriating property of the runtime linkers on both Linux and Windows: if you're trying to load file A, and dependency B of file A does not exist, you just get "file not found" with no indication of which file is actually missing, and it's extremely hard to debug. At least on Linux the "show dependencies" tool is built in.
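For what it's worth, here's roughly how I track this down on Linux (binary and library names made up; assumes the thing is dynamically linked):

```
# List the shared libraries a binary needs; anything missing shows up
# as "not found" instead of a resolved path.
$ ldd ./main
        libfoo.so.1 => not found
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f...)

# The dynamic loader itself can also be the missing "file" -- check
# which interpreter the binary was linked against.
$ readelf -l ./main | grep interpreter
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
```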
I ran into this last week with golang. I’m used to building statically-linked executables, and forgot to set `CGO_ENABLED=0` to force this when compiling. I was doing a multi-stage docker build from `golang:1.13` and copying the final executable to `alpine:3.10`, only to see `shell: main: file not found`. `ls` and just about everything else showed it there. It wasn’t until I rabbit-holed into dynamically linked Go executables that it started to make any sense at all. A simple message like `main: linked '/lib64/...' not found` would have saved me hours.
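For anyone who hits the same thing, the fix (assuming you don't actually need cgo) is roughly:

```
# Build a fully static Go binary so it runs on musl-based images like
# alpine, which don't have glibc's dynamic loader.
CGO_ENABLED=0 go build -o main .

# Sanity check: "not a dynamic executable" means no runtime linker is needed.
ldd ./main
```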
> All the code is in one file (sqlite.c), and there are no weird dependencies! It’s amazing.
This is because the author of SQLite publishes it this way as a convenience to integrators; the actual day-to-day coding is not done in a single C file.
In fact, the README[1] calls out twelve "key files", and explicitly warns that SQLite "will not be the easiest library in the world to hack."
There's quite a bit of build management going on with SQLite. Apart from that script, Dr. Hipp wrote his own source code management system[1] to fit his needs, and if I'm not mistaken they don't write headers by hand[2].
You just don't depend on weird libraries that mess up your preprocessor state. Then you make sure that even your static (file-local) variables have unique names. And then you concatenate all the header and source files.
It shouldn't be very hard, but most projects already fail at the dependency stage.
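The mechanical part is tiny (file names here are hypothetical); the hard part is the discipline above, not the concatenation itself:

```
# Naive amalgamation: headers first, then sources. This only works if the
# files don't fight over preprocessor state and file-local names are
# already unique across the whole project.
cat src/*.h src/*.c > lib-amalgamation.c
cc -O2 -c lib-amalgamation.c
```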
And also as a gift to optimizing compilers, from discussions I’ve had with him. The idea is that a single compilation unit (the “amalgamation”) is easy for the compiler to reason about.
For one, I think this would be cumbersome from a debugging perspective. All of the address-to-source mappings in your DWARF section would point to things like "lib.c:237652" instead of "lib-module.c:347".
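(Though I suppose a generator could emit `#line` directives while concatenating, so the debug info keeps pointing at the original files. A rough sketch, paths hypothetical:

```
# Prefix each source file with a #line directive so __FILE__/__LINE__ and
# the DWARF line table refer to lib-module.c, not the giant combined file.
for f in src/*.c; do
  printf '#line 1 "%s"\n' "$f"
  cat "$f"
done > lib-amalgamation.c
```
)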
It also means that you're effectively linking all your dependencies statically. In some cases this can be a plus, in others a minus. I think having the choice is important, though.
I imagine there are lots of other parts of the toolchain that would have to be updated to support this workflow. Seems like a lot of hassle for something that, though at times complicated, mostly works.
For larger projects there is a danger of blowing out the memory on older machines when you combine everything into one monstrous C/C++ file.
It's already impossible to compile Firefox on a machine with 1GB of RAM because some of the files are just too big. Even with swapping enabled the build process crapped out for me last time I tried it. While 1GB machines are pretty rare these days outside of SBCs, you could easily blow out a 4GB machine if you tried to compile the whole damn thing at once, and 4GB machines are still commonly found.
Is that much of a problem, though? For SBCs like the Raspberry Pi you can do builds on the SBC itself. While this is neat, it is far from optimal because builds are very slow. It's much more typical for embedded systems to do your build on a host PC with a cross-compiler.
Sometimes getting that cross-compiler environment set up properly, especially with all of the dependencies and build tools, is more hassle than just letting the build run overnight on the SBC itself, especially when it has a configure script that wants to check various properties of the target architecture as part of the build.
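For the autotools case, at least, cross-compiling is mostly a matter of telling configure about the target triplet, assuming the toolchain is already installed (the triplet below is just an example):

```
# Cross-compile an autotools project for a 32-bit ARM target.
# --host names the system the binaries will run on; configure then probes
# with the cross toolchain instead of trying to run test programs.
./configure --host=arm-linux-gnueabihf CC=arm-linux-gnueabihf-gcc
make -j"$(nproc)"
```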
In a vacuum, reading that statement, that was the assumption I made. Certainly strikes me as a bit odd, but then, SQLite is a bit of an odd project. And sometimes odd projects do things differently in a way that works out well because they have the discipline to pull it off.
But it does make much more sense that it is worked on in pieces and then put back together.
Combining all the code for SQLite into one big file makes SQLite easier to deploy — there is just one file to keep track of. And because all code is in a single translation unit, compilers can do better inter-procedure optimization resulting in machine code that is between 5% and 10% faster.
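And building the command-line shell from the amalgamation really is just one compiler invocation, roughly this (flags vary by platform, as a comment below notes):

```
# Build the sqlite3 shell from the two amalgamation files. -lpthread and
# -ldl are needed on most Linux systems; other platforms may not want them.
gcc shell.c sqlite3.c -lpthread -ldl -o sqlite3
```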
One of the benefits of this is Alon Zakai's wonderful compilation of the SQLite C code with Emscripten to generate sql.js, which I have found very useful for teaching SQL.
Yeah, so, this just blew my mind. I compiled it with an older version of gcc that I had on my Windows machine and it worked just as easily (without threading, or -ldl).
But there it is, a fully featured SQL engine in an EXE file that I can use with any application I want.
In a world that requires a million SDKs, DLLs, dependencies, etc., this is the most refreshing thing in the world.
Being easy to compile is a very nice thing to have in any software. Redis, for example, is also very easy to compile, and compiles quickly too.
I think a lot of software using autotools is easy to compile (at least on POSIX-compliant systems).
Even PostgreSQL is easy to compile, although not very quickly.
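Roughly what "easy" looks like in both cases (assuming you've already downloaded the source trees):

```
# Redis: no configure step, just make in the source tree.
cd redis && make

# PostgreSQL (and most autotools projects): the classic three-step.
cd postgresql && ./configure && make && make install
```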
Sure, but like... `package.json` as a source of truth for dependencies (package names plus semver criteria to match), with fetching and building abstracted away behind one command, `yarn install`.
Those two concepts will always remain the same despite the underlying workings, no?
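Something like this, I mean (package names and versions made up), where the file is the whole spec and one command does the rest:

```
# Declare dependencies once, then let the package manager fetch and build
# everything that matches the semver ranges.
cat > package.json <<'EOF'
{
  "name": "example-app",
  "version": "1.0.0",
  "dependencies": {
    "left-pad": "^1.3.0",
    "express": "~4.17.0"
  }
}
EOF
yarn install
```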
In some ways the entire concept of a "Linux distribution" developed to address this. Certainly if you stay within a distro ecosystem it will have tools to do this for you, in both Debian and Red Hat.
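Concretely, both families have a one-liner to pull in everything needed to build a given package from source (package name here just as an example):

```
# Debian/Ubuntu: install the build dependencies declared by the distro's
# source package (requires deb-src entries in your sources list).
sudo apt-get build-dep sqlite3

# Fedora/RHEL: same idea via the dnf builddep plugin.
sudo dnf builddep sqlite
```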
But doesn't this make compiling programs on systems without these tools hard? For example, it is not easy to compile those programs on Windows without WSL or Cygwin.
Well, it doesn't make it easier. I suppose you could try building a set of RPM packages for the Windows platform, which is basically what Cygwin is.
If you're saying "wouldn't it be great if C the language came with an integrated package manager that worked the same on all platforms and was available everywhere", then yes, that would be great, and I too would like a pony. I just think the ecosystem is way too fragmented to ever achieve that. Doubly so if you have to start thinking about cross-compilers and the long tail of little embedded targets.
It's a whole lot more complex to run a build script like this than a simple call to a C compiler, and it has many more dependencies: you even need bash installed, and it looks like the script would fail if the kernel isn't compiled with SELinux support... The script itself isn't that complicated, but complicated enough that I didn't want to spend too much time reading it.
Moreover, you cannot compare cloning a git repository with downloading a single C source file...
This seems equivalent in complexity to the good old "git clone; ./configure; make"
The amalgamation file is a really interesting idea -- is this common in the world of C applications? The documentation is quite clear, however, that the amalgamation file and the source files are not the same thing. The source code (1,848 files in 40 folders) can be pulled down here -- https://www.sqlite.org/cgi/src/doc/trunk/README.md -- but more assembly will be required if you're planning to build the project.
UPDATE: maybe not so much assembly is required... just running "make" built the project without any drama (I'm on macOS with Xcode and tooling for Xamarin already installed - YMMV in terms of whether you might need to install something to compile from source).
It's not common, but there are other projects like this for easy integration, especially on embedded systems. You see some C++ libraries that are a single header-file include. And here is a collection of single-file headers for graphics tasks, like loading images, resizing images, rendering font glyphs, etc:
I haven't checked yet, but I was curious whether it post-processes the sources into a single file or whether that's how it's developed.
The former sounds useful for builds (as her blog suggests), whereas the latter sounds frightening to audit.
(Unless it's written in a literate style?)
It's far easier. I had an sqlite2 database on an old system but nothing to convert it or even dump it. Had to download the sqlite2 source and delete all the tcl.c files. Then ran