SQLite is easy to compile (jvns.ca)
192 points by nikbackm on Oct 28, 2019 | 54 comments


> But then I tried to run it on a build server I was using (Netlify), and I got this extremely strange error message: “File not found”. I straced it, and sure enough execve was returning the error code ENOENT, which means “File not found”. This was kind of maddening because the file was DEFINITELY there and it had the correct permissions and everything.

This is an infuriating property of the runtime linkers on both Linux and Windows: if you're trying to load file A, and dependency B of file A does not exist, you just get "file not found" with no indication of which file is missing, and it's extremely hard to debug. At least on Linux the "show dependencies" tool (ldd) is built in.
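
For example (a sketch; the binary name is made up):

    ldd ./myprog    # prints "libfoo.so.1 => not found" for each missing dependency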


There is also LD_DEBUG, which is quite helpful.
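
For example (program name assumed):

    LD_DEBUG=libs ./myprog    # traces every library lookup as the loader resolves it
    LD_DEBUG=help ./myprog    # lists all the available debug categories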


I ran into this last week with Go. I’m used to building statically-linked executables, and forgot to set `CGO_ENABLED=0` to force this when compiling. I was doing a multi-stage Docker build from `golang:1.13` and copying the final executable to `alpine:3.10`, only to see `shell: main: file not found`. `ls` and just about everything else showed it there. It wasn’t until I rabbit-holed into dynamically-linked Go executables that it started to make any sense at all. A simple message of `main: linked '/lib64/...' not found` would have saved me hours.
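
The fix, for anyone else who hits this (a sketch; the output name and module layout are assumed):

    CGO_ENABLED=0 go build -o main .
    file main    # should now report "statically linked"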


ld.so on Linux does show you which library is missing.

In this case, it was ld.so itself that was missing, so it didn't have a chance to tell you what's wrong.
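
You can check which interpreter a binary expects without running it (a sketch; binary name assumed):

    readelf -l ./main | grep interpreter
    #   [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

If that path doesn't exist on the target system, execve fails with ENOENT before anything gets a chance to print a useful error.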


On Windows, you get a graphical dialog in those cases, but no CLI feedback...


vcdepends used to be part of the SDK as well.


> All the code is in one file (sqlite.c), and there are no weird dependencies! It’s amazing.

This is because the author of SQLite publishes it this way as a convenience to integrators; the actual day-to-day coding is not done in a single C file.

In fact, the README[1] calls out twelve "key files", and explicitly warns that SQLite "will not be the easiest library in the world to hack."

https://sqlite.org/src/doc/trunk/README.md


There's quite a bit of build management going on with SQLite. Apart from that script, Dr. Hipp wrote his own source code management system[1] to fit his needs, and if I'm not mistaken they don't write headers by hand[2].

[1]: https://www.fossil-scm.org/

[2]: https://www.hwaci.com/sw/mkhdr/


Fossil intrigues me every time I see it mentioned. I've never found a use for it, but it's so neat that I want to find a use for it.


The main features could be emulated with a small wrapper around git, no?


A full web ui with issue tracker is a bit more than a small wrapper.


> the actual day-to-day coding is not done in a single C file.

That's true - but in order for it to be reducible to a single C file, a lot of thought had to go into the process.


You just don't depend on weird libraries that mess up your preprocessor state. Then you make sure that even your static (file local) variables have unique names. And then you concatenate all the header and source files.

It shouldn't be very hard, but most projects fail already at the dependency stage.
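
The naive version of the idea (source layout assumed):

    # headers first, then sources, into one translation unit
    cat src/*.h src/*.c > amalgam.c
    gcc -O2 -c amalgam.c

In practice you also need to strip or deduplicate the #include lines, which is why SQLite uses a script for it.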


And also as a gift to optimizing compilers, from discussions I’ve had with him. The idea is that a single compilation unit (the “amalgamation”) is easy for the compiler to reason about.


Sure, but now we have -flto, somewhat negating that advantage?
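
For comparison, the two approaches look roughly like this (file names made up):

    gcc -O2 -flto a.c b.c -o prog    # LTO: optimize across units at link time
    gcc -O2 amalgam.c -o prog        # amalgamation: one big unit, no LTO needed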


Perhaps. That assumes all compilers support it, but otherwise it's a testable case.


Except LTO is slow to compile and a single compilation unit (SCU) is fast.


I'm not a C/C++ developer, but why can't more C/C++ projects be distributed as source amalgamations? What prevents this from happening?

As in, source to source amalgamation rather than a more complicated build system to generate a binary.


> why can't more C/C++ projects be distributed as source amalgamations

* Build time and memory increases.

* Debugger locations are less clear. MSVC debugging doesn't even work past 64K lines.

* You still need to link it anyway when using as a library.

* Linker optimization has improved (especially clang).

* It's more difficult to modify.

SQLite has some decent reasons for it especially historically; I'm not sure those reasons are as strong for other projects.

For applications, distributing binaries is even more convenient than source code (assuming they exist for your platform).

And for libraries, "header-only" can be more of a convenience than "single-file".


Interesting, thank you!

Are there any ways of doing that sort of distribution header-only then, rather than single file?


For one, I think this would be cumbersome from a debugging perspective. All of the address-to-source-mappings in your DWARF section would point to things like "lib.c:237652" instead of "lib-module.c:347".

It also means that you're effectively linking all your dependencies statically. In some cases this can be a plus, in some cases a minus. I think having the choice is important though.

I imagine there are lots of other parts of the toolchain that would have to be updated to support this workflow. Seems like a lot of hassle for something that, though at times complicated, mostly works.


There's the #line directive that can keep the original file names and line numbers.
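
A sketch of how an amalgamation script can emit those (paths assumed):

    out=amalgam.c
    : > "$out"
    for f in src/*.c; do
        printf '#line 1 "%s"\n' "$f" >> "$out"    # debugger maps back to the original file
        cat "$f" >> "$out"
    done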


For larger projects there is a danger of blowing out the memory on older machines when you combine everything into one monstrous C/CPP file.

It's already impossible to compile Firefox on a machine with 1GB of RAM because some of the files are just too big. Even with swapping enabled the build process crapped out for me last time I tried it. While 1GB machines are pretty rare these days outside of SBCs, you could easily blow out a 4GB machine if you tried to compile the whole damn thing at once, and 4GB machines are still commonly found.


Is that much of a problem though? For SBCs like the Raspberry Pi you can do builds on the SBC itself. While this is neat, it is far from optimal because builds are very slow. It's much more typical for embedded systems to do your build on a host PC with a cross-compiler.


Sometimes getting that cross compiler environment set up properly, especially with all of the dependencies and build tools, is more hassle than just letting the build run overnight on the SBC itself. Especially when it has a configure script that wants to check on various properties of the target architecture as part of the build.


I'm not sure anyone thought that it was made that way; this is very explicitly for distribution, and it works very well.


In a vacuum, reading that statement, that was the assumption I made. Certainly strikes me as a bit odd, but then, SQLite is a bit of an odd project. And sometimes odd projects do things differently in a way that works out well because they have the discipline to pull it off.

But it does make much more sense that it is worked on in pieces and then put back together.


Short Tcl script that combines everything into one file:

https://www.sqlite.org/src/artifact/5fed3d75069d8f66

Some other details / rationale:

https://www.sqlite.org/amalgamation.html

Combining all the code for SQLite into one big file makes SQLite easier to deploy — there is just one file to keep track of. And because all code is in a single translation unit, compilers can do better inter-procedure optimization resulting in machine code that is between 5% and 10% faster.


One of the benefits of this is Alon Zakai's wonderful compilation of the SQLite C code with Emscripten to generate sql.js, which I have found very useful for teaching SQL.

https://github.com/kripken/sql.js


Awesome! "Web SQL Database" rides again!

https://en.wikipedia.org/wiki/Web_SQL_Database


Yeah, so, this just blew my mind. I compiled it with an older version of gcc that I had on my Windows machine and it worked just as easily (without threading or -ldl).

But there it is, a fully featured SQL engine in an EXE file that I can use with any application I want.

In a world that requires a million SDK's, DLL's, dependencies, etc, this is the most refreshing thing in the world.
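
For reference, the compile lines look something like this (the Linux flags are as given in the SQLite docs; the Windows one is my assumption for MinGW):

    gcc shell.c sqlite3.c -lpthread -ldl -o sqlite3    # Linux
    gcc shell.c sqlite3.c -o sqlite3.exe               # MinGW on Windows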


Being easy to compile is a very nice property for any software to have. Redis, for example, is also very easy to compile, and quick too. I think a lot of software using autotools is easy to compile (at least on POSIX-compliant systems). Even PostgreSQL is easy to compile, although not very quickly.


> run ./configure
> realize i’m missing a dependency

If only there was like, a tool, to help you manage compile-time dependencies.

I wish there was a `yarn` equivalent for C projects.


This problem is 1000% harder than you think it is, and everyone has a different idea on how to solve it.


Sure, but like... `package.json` as a source of truth for dependencies (package names + semver criteria to match) followed by abstracting fetch + build away through one command `yarn install`

Those two concepts will always remain the same despite the underlying workings, no?


That's what distributions do. For example in Debian, the lists of build and run-time dependencies are in ./debian/control
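
So on Debian, fetching everything a package needs to build is one command (assuming deb-src sources are enabled):

    sudo apt build-dep sqlite3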


In some ways the entire concept of "Linux distribution" developed to address this. Certainly if you stay within a distro ecosystem it will have tools to do this for you in both Debian and Redhat.


But doesn't this make compiling programs on systems without these tools hard? For example, it is not easy to compile those programs on Windows without WSL or Cygwin.


Well, it doesn't make it easier. I suppose you could try building a set of RPM packages for the Windows platform, which is basically what Cygwin is.

If you're saying "wouldn't it be great if C the language came with an integrated package manager that worked the same on all platforms and was available everywhere", then yes, that would be great, and I too would like a pony. I just think the ecosystem is way too fragmented to ever achieve that. Doubly so if you have to start thinking about cross-compilers and the long tail of little embedded targets.


Working with other languages was how I realized how brilliant the combo of Maven and Java was ;-)

(PS: today Yarn and NuGet do some of this for the Node/JS and .NET ecosystems, but back when Maven arrived those weren't even planned.)


I wish we all just stuck to a single language-agnostic build system like Buck, Pants, or Bazel and distributed everything with BUILD files.


I find pkgconfig + dnf install pkgconfig(foo) useful
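
For example (package name assumed):

    sudo dnf install 'pkgconfig(sqlite3)'    # resolves to whichever package provides the .pc file
    pkg-config --cflags --libs sqlite3       # emits the flags for your compile line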


Haha, I actually feel exactly the opposite! I wish JS, Ruby, Python, etc. deferred to OS-specific package management. Mandatory xkcd reference, etc.


Another good example is the Go compiler. Assuming you have any version of Go in your PATH, it's:

    git clone https://github.com/golang/go
    cd go/src
    ./make.bash     # or make.bat for Windows
And that's it. You can then use "bin/go" to compile your projects.


It's a whole lot more complex to run a build script like this than to make a simple call to a C compiler, and it has many more dependencies: you even need bash installed, and it looks like the script would fail if the kernel isn't compiled with SELinux support... The script itself isn't that complicated, but complicated enough that I didn't want to spend too much time reading it.

Moreover, you cannot compare cloning a git repository with downloading a single c source file... This seems equivalent in complexity to the good old "git clone; ./configure; make"


You just reminded me of GOPATH :(


The amalgamation file is a really interesting idea -- is this common in the world of C applications? The documentation is quite clear, however, that the amalgamation file and the source files are not the same thing. The source code (1,848 files in 40 folders) can be pulled down here -- https://www.sqlite.org/cgi/src/doc/trunk/README.md -- but more assembly will be required if you're planning to build the project.

UPDATE: maybe not so much assembly is required... just running "make" built the project without any drama (I'm on macOS with Xcode and tooling for Xamarin already installed - YMMV in terms of whether you might need to install something to compile from source).
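
For anyone trying this, the sequence is roughly (target names per the SQLite README; details may vary by platform):

    ./configure
    make sqlite3.c    # generates the amalgamation
    make              # builds the library and shell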


It's not common, but there are other projects like this for easy integration, especially on embedded systems. You see some C++ libraries that are a single header file include. And here is a collection of single file headers for graphics tasks, like loading images, resizing images, rendering font glyphs, etc:

https://github.com/nothings/stb


Every time an article about SQLite hits the front page here, I find myself stunned at how well-designed it is.


I haven't checked yet, but I was curious whether it post-processes the source into a single file or whether that's how it's developed. The former sounds useful for builds (as her blog suggests) whereas the latter sounds frightening to audit. (Unless it's written in a literate style?)


I believe they use Tcl scripts to generate the amalgamated .c file. Oddly enough there is a post about Tcl on the front page as I type this.


If there is a way to run the preprocessor so that it expands #includes without any recursion, that seems like it would be an easy way to do this without an extra tool.


It's even easier than that. I had an sqlite2 database on an old system but nothing to convert it or even dump it. I had to download the sqlite2 source and delete all the Tcl .c files. Then I ran

    gcc *.c -o sqlite2

and it was done. It's that simple.


I love a happy ending!

(Especially when many of my compiling-from-source experiences resemble what the author was anticipating)



