I find I need to quote myself from 2013, as I can't really add anything further to it:
> Wow. I honestly wasn't expecting to ever see this on the front page of HN again, given the current ubiquity of 64-bit Linux. (And yes, before anyone asks, I've played around with minimizing 64-bit executables. Unfortunately they are both larger and less forgiving of tomfoolery. The smallest 64-bit ELF I've created is 84 bytes.)
> Since it is here, though, I want to take the opportunity to say thanks to everyone who's expressed their appreciation of my essay. And I should note here that writing that essay, so many years ago now, is one of the better things I've done for my career. Share what you have to learn the hard way; the effort won't be wasted.
Oh cool. I stumbled across this a while back, got inspired, and wrote some ELFs in gas (GNU assembler). It's surprisingly straightforward, and there's even a manpage, elf(5), which is pretty much exactly the 32- and 64-bit specifications rolled into one.
If this kind of thing sounds fun to you then you'll probably want a System V ABI reference [0] to look up all the different relocations etc. Also, /usr/include/elf.h can be helpful.
The part I started having trouble with was getting dynamically linked binaries working. At the time I didn't quite understand what the PLT and GOT were, so I set things down. Maybe it's time to pick this back up.
As always, the OS Dev wiki [1] is also a really good source for these kinds of low level implementation details.
I've recommended this to many people as a practical way to learn more about ELF, system calls, assembly, and various other things, quite aside from size optimization.
>"The linker is still adding an interface to the OS for us, and it is that interface that actually calls main(). So how do we get around that if we don't need it?"
Is the author stating that _start is an interface to the OS then? It's interesting I never thought of this as being an interface. Is this an interface because _start actually sets up the stack used to pass argc, argv and the environment variables? I believe _start is found in the C runtime crt.o, correct?
The author then states:
>"The actual entry point that the linker uses by default is the symbol with the name _start. When we link with gcc, it automatically includes a _start routine, one that sets up argc and argv, among other things, and then calls main()."
Is this just a long-winded way of saying _start sets up the stack for the new process? Without a stack you can't "set up" argc, argv etc. I would also be curious what "among other things" might include.
> Is the author stating that _start is an interface to the OS then? It's interesting I never thought of this as being an interface.
Perhaps it's not the best term, but it's not really incorrect. I probably wouldn't phrase it that way today (20 years later).
> Is this just a long-winded way of saying _start sets up the stack for the new process? Without a stack you can't "set up" argc, argv etc.
Well -- to be precise, the stack is created by the kernel during the exec system call. The kernel also copies the argument and environment strings into the process's memory. However, the kernel does not create the argv array of pointers, or the envp array. Building those arrays is the job of _start(), and that is what I meant by "setting up argc and argv".
Please note: That was the case back in 1999. Things are different now -- I think it changed with the 64-bit kernel? In any case, nowadays the ELF loader in the kernel also builds the arrays.
> I would also be curious what "among other things" might include.
As before, it depends on the C runtime. The _start() function might ensure proper initialization of libc, for example. (Again, with current binaries that's usually done via an `.init` section, but in 1999 that was not the norm.)
See also https://github.com/pts/pts-xtiny for creating similarly small i386 ELF executables from C source, without the usual libc overhead. hello_world.c compiles and links to ~200 bytes.