Technically the executable only runs on 6 operating systems as the 7th option basically turns the application into an operating system running on bare metal.
Justine's impressive work on this subject makes me wonder if it's possible to build truly native applications using more complete toolsets and more modern languages. Multiplatform toy programs have existed for ages, but this SDK of sorts makes it possible to write executables that don't require manually tricking the compiler or adjusting your code for portability with inline assembly. As long as the program is statically compiled and contains the necessary drivers for things like syscalls, this should be perfectly possible, though the current implementation isn't exactly user friendly, relying on giant headers like https://justine.lol/cosmopolitan/cosmopolitan.h, statically linked components like https://justine.lol/cosmopolitan/crt.o and https://justine.lol/cosmopolitan/ape.o, and other such tweaks.
I'd love a Go/Rust compile target that just works regardless of operating system. As an added bonus, the bare metal nature would be excellent for platforms like Firecracker to run services without even needing to boot a kernel!
It would be nice if you didn't need to start out with tons of ccflags and ldflags, but I'll point out that you don't actually need to use the amalgamation header: alternatively, you can pass the libc/isystem directory with -isystem to add the libc headers to the include path.
Especially for porting existing projects, I recommend against using the amalgamation as sometimes including everything all the time causes problems.
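For a rough idea of what building against those pieces looks like in practice, here's a minimal sketch; the exact flags and object files vary between Cosmopolitan releases, so treat the commands in the comment as illustrative rather than canonical:

    /* hello.c -- minimal Cosmopolitan "hello world" sketch.
     * A build looked roughly like this at the time:
     *
     *   gcc -g -Os -static -nostdlib -nostdinc -fno-pie -no-pie -mno-red-zone \
     *     -o hello.com.dbg hello.c -fuse-ld=bfd -Wl,-T,ape.lds \
     *     -include cosmopolitan.h crt.o ape.o cosmopolitan.a
     *   objcopy -S -O binary hello.com.dbg hello.com
     *
     * If you go the -include cosmopolitan.h route, drop the #include below
     * (the amalgamation already declares everything); with
     * -isystem path/to/libc/isystem it works as written. */
    #include <stdio.h>

    int main(void) {
      printf("hello from one binary everywhere\n");
      return 0;
    }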
Can you build a binary on NetBSD with gcc or whatever, linked to the system libc (or whatever usually happens by default), such that you can copy it over to a FreeBSD or OpenBSD system and it'll just run, without having to install any kind of special compat or emulation layer?
If so, ok, sure, then you have a point. Otherwise, no, those are three distinct OSes, and it's entirely correct to list them as such.
I know you can't mix macOS into that, so the "or 4 if I'm generous" is pretty unfair.
> The "OS" in BIOS doesn't stand for Operating System! It's Basic Input/Output System.
I don't think the author is claiming it means "Operating System", just that "BIOS" is a simple shorthand for "bare metal". Which no, isn't an "operating system" in the classical sense, but it is a standardized set of very low-level system services that sorta vaguely kinda acts like one. And if we consider UEFI rather than old-school PC-BIOS, it's even more like an OS.
Your comment just feels like pedantry for pedantry's sake (and not particularly correct pedantry, at that). Not sure what you're trying to accomplish here by minimizing the achievement.
I'm going to strongly disagree; BSD hasn't been a single operating system since 1995 (4.4BSD-Lite Release 2, the last BSD version from Berkeley). The extant BSDs are closely related, derive from a common original source, and semi-frequently share code back and forth, but they're separate systems with different code and different ABIs.
Hmm, Linux also uses ELF, as do the BSDs, but a normal executable is still not compatible across any of the 4 OSes. There isn't any good reason to lump them together.
macOS (it’s not called OS X any more) is listed as XNU.
And of course it would list every BSD instance. Supporting them isn't automatic when you're talking about pre-compiled binaries, because they can have subtly different ABIs. POSIX helps here, but mostly at the source level; an ELF compiled for FreeBSD wouldn't automatically run on OpenBSD or NetBSD. The fact that this does is unusual and deserves calling out.
Keep in mind that, although this is a cool concept, in practice everything needs an installer (i.e., a .msi file or a Setup.exe for Windows, a .dmg or a .pkg for Mac, and whatever Linux requires).
And, if you're shipping into an app store, there's really no point in using something like this.
Replacing a running executable isn't possible under Windows (and probably some other systems); you'd need to spawn a second process that waits for your process to die, replace the executable, and then relaunch the application.
Possible, but a lot harder than doing it the Linux way.
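For concreteness, here's a rough sketch of that second-process dance, assuming a helper that's handed the old process's PID plus the new and installed paths on the command line (the names and argument layout are made up for illustration):

    /* updater.c -- hypothetical Windows update helper.
     * argv[1] = PID of the running app, argv[2] = path of the new exe,
     * argv[3] = path of the installed exe to replace. */
    #include <windows.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
      if (argc < 4) return 1;
      DWORD pid = (DWORD)strtoul(argv[1], NULL, 10);
      HANDLE h = OpenProcess(SYNCHRONIZE, FALSE, pid);
      if (h) {
        WaitForSingleObject(h, INFINITE);  /* wait for the old instance to exit */
        CloseHandle(h);
      }
      /* The file is no longer locked, so it can be swapped out now. */
      if (!MoveFileExA(argv[2], argv[3], MOVEFILE_REPLACE_EXISTING)) return 2;
      /* Relaunch the updated application. */
      STARTUPINFOA si = { sizeof(si) };
      PROCESS_INFORMATION pi;
      if (!CreateProcessA(argv[3], NULL, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        return 3;
      CloseHandle(pi.hProcess);
      CloseHandle(pi.hThread);
      return 0;
    }

The main application would launch this helper with its own PID right before exiting.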
As a practical matter, many applications span multiple directories, so there is not just one file to delete. If the app is simple enough, though, sure -- just replacing the executable can work.
Some examples of the many files and directories an app can span are configuration files, cached data, and plugins. For uninstallation, much of that stuff should definitely be deleted; for upgrades, there are occasionally conflicts which installer systems can help to manage (though, often enough, they can only do part of what needs to be done).
You can, but personally I think having to seek out some website when there's an update gets old pretty fast. And updates are important for security reasons.
The comparison is fatuous. If your mates had names and there was a button on your phone, one for each acquaintance by name, you wouldn't need a way of looking them up.
BTW I also worked on a small in-house system that was just a run-it-from-the-desktop install (aka xcopy install) and it worked perfectly. Boss insisted we needed a "proper" installer so I tried the installer-maker provided by MS. Jesus, what a fucked-up mess that made of the whole process: a day of trying to get that working (while it sprayed guids and subdirectories everywhere and couldn't even uninstall what it installed) and I gave up.
I also have a fun story of an MSSQL uninstall that went wrong on a live server.
executive summary is that xcopy installs work very well IME.
> If your mates had names and there was a button on your phone, one for each acquaintance by name, you wouldn't need a way of looking them up.
Which you realistically don't, hence the point of the comparison.
> executive summary is that xcopy installs work very well IME.
The point isn't the initial install, it's the subsequent lookups (including update, uninstallation, etc.) when you have a bunch of programs to deal with. When you're dealing with 1 program, you can treat it like your pet. But when you're dealing with 50, you can't do that anymore. You need to realize you have cattle and adjust accordingly.
Phone numbers are nothing like programs. The comparison is false.
> The point isn't the initial install, it's the subsequent lookups (including update, uninstallation, etc.).
Well perhaps, but if you can delete a program by literally deleting the file, and update it literally by replacing it then you are onto a good thing - unix-like simplicity. Dependencies may be a problem, or not.
> Phone numbers are nothing like programs. The comparison is false.
That wasn't the comparison in the first place.
Finding the location of the program and its uninstaller is very much like an address or phone number lookup. Updating it is like updating an address book. And you need a database to maintain it all and find what you need, and something has to update that database. That's one major thing an installer does.
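Concretely, on Windows that database is (mostly) the Uninstall key in the registry, which is what "Add/Remove Programs" reads. A small read-only sketch that just enumerates it (link against advapi32; HKCU and WOW6432Node entries are ignored here for brevity):

    /* list_uninstall.c -- list registered programs from the Uninstall key. */
    #include <windows.h>
    #include <stdio.h>

    int main(void) {
      HKEY key;
      if (RegOpenKeyExA(HKEY_LOCAL_MACHINE,
                        "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Uninstall",
                        0, KEY_READ, &key) != ERROR_SUCCESS) {
        return 1;
      }
      char name[256];
      for (DWORD i = 0;; ++i) {
        DWORD len = sizeof(name);
        if (RegEnumKeyExA(key, i, name, &len, NULL, NULL, NULL, NULL)
            != ERROR_SUCCESS) {
          break;  /* no more entries */
        }
        printf("%s\n", name);  /* each subkey describes one installed program */
      }
      RegCloseKey(key);
      return 0;
    }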
Er. Windows has Portable Apps (especially of https://portableapps.com/ fame, but also in general), Mac happily lets you run most things directly from the DMG without "installing", GNU/Linux tends to favor going through the package manager but lots of people just build and run from source and there's absolutely nothing preventing you running a static binary in place. Everything most certainly does not need an installer, though I grant there are sometimes benefits to having one.
It would be an interesting exercise to use executable compression on the fat binary to make it less fat while still having it transparently run on the same 7 operating systems.
This is really cool! I have a question for people who know this stuff: what's the "f.▼ä" pattern that appears at the end of these binaries? It looks intriguing...
Why is so much padding needed? Once I remove support for Windows and Bare Metal it seems that it might be possible to halve the binary size if you can remove the legal section (or at least change it for a link + checksum?) and the padding.
BTW, once I enable windows I think there's padding at the end of the binary section that's hard to notice, it might be good to replace it with something visible. I think it's coming from `ape_idata_iatend`.
Justine's work on all this has been hugely inspiring and a big breath of fresh air in an area that always kicks the can of reducing final binary sizes down the road.
That's one of my very few gripes with Rust: compared to what Zig / C / C++ can produce, Rust's release binaries are quite huge. There's likely a lot of possible work to be done in the area of tree shaking and generally only leaving in what's actually used in the final binary. I hope this becomes a priority soon-ish.
Why wouldn't it help? Surely this (and others) method do static analysis on what's actually used in the final program, no?
Example: you use a crate with 13 functions. You only use 1 but it depends on 2 others in the same crate that don't depend on anything else. Why shouldn't the end result only compile 3 of the 13 functions?
Or maybe I am misunderstanding or I am badly informed (not a compiler / linker engineer so I'm likely with naive assumptions) -- apologies if so.
I understand it's not easy, I am just a bit surprised that Rust's team is not chasing this a bit more aggressively, let's say. It puts a slight stain on an otherwise excellent project.
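For what it's worth, the function-level pruning being asked about does exist at the toolchain level; here's a tiny C illustration of the general mechanism (one section per function plus linker garbage collection), assuming a GNU-style toolchain:

    /* deadcode.c -- illustration of function-level pruning.
     * Built with something like:
     *
     *   cc -Os -ffunction-sections -fdata-sections -c deadcode.c
     *   cc -Wl,--gc-sections deadcode.o -o deadcode
     *
     * the linker drops unreferenced() because nothing reachable from the
     * entry point calls it; only used() survives in the final binary. */
    #include <stdio.h>

    void used(void)         { puts("kept: called from main"); }
    void unreferenced(void) { puts("dropped by --gc-sections"); }

    int main(void) {
      used();
      return 0;
    }

As the reply below points out, the harder part isn't individual unused functions but whole duplicated dependency versions.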
Well it helps, but it'd be like spooning water out of a bucket when a faucet is pouring more into it. The NPM dependency model solves the diamond dependency problem by schlepping in multiple versions of the same package. C / C++ / Java (and I assume Zig too) require a single version of any given package. The way Rust and NPM do things adds a whole dimension of complexity to what needs to be shaken. It's sort of like a devil's bargain because it helps their ecosystems grow really fast, but it creates a lot of bloat over the long run. Rust is still relatively new so binary sizes are still pretty good compared to what we might witness in the future. So enjoy the golden age of Rust while it's golden!
No I don't think so, but I'm not certain. I think with Go the problem always had more to do with the way the language is designed, making it difficult for the linker to prune dependencies.
Wow, really? I've always thought that Cargo by default forces a package into one particular version. And that a particular package in multiple versions is something people opt into only in rare circumstances (similarly as in Java/Maven/shadowing case).
I still wish binary sizes were taken seriously at some point, though. But I am aware that it's likely (a) not at all a priority currently, and (b) a gigantic effort.
I’m not sure what “taken seriously” would mean to you, but I work in embedded. We get the Rust compiler to spit out programs in the hundreds or thousands of bytes regularly. Binary sizes are more about the people doing the programming than the compiler missing some sort of crucial technology.
I get that this is possible with `no_std` but that's not an option for me.
Guess I'll dig out the guides for reducing binary sizes but last time I needed tokio + opentelemetry + a Prometheus adapter my release binary was always at least 5MB.
Also sorry, didn't mean to come off as dismissive. It's just that to me 5MB is quite a lot and should be shrinkable. I keep wondering if the compiler/linker tech can help Rust there.
It’s all good! I don’t think you were being dismissive at all. I don’t even contribute to the Rust project anymore, and in fact wish they’d prioritize various toolchain improvements. Just in this specific case I don’t really think there’s anything to be done. All the standard stuff is already in there. But maybe I’m wrong!
Is that 5MB with or without symbols? Note that on Linux, binaries aren't stripped by default.
Having said that, if your binary ended up including a hyper-based HTTP server and a TLS implementation, and was built with the default release profile (optimizing for speed rather than size), I can believe it would reach 5MB stripped.
Take out that last line if you need to recover from panics at runtime. So far, in my limited use of Rust, I haven't had to. Anyway, those are all the easy tweaks to reduce binary size.
Had the first two already. I always prefer to optimize for speed and not size but just for the heck of it I enabled it. Wasn't super sure about the panic thing so I never used it but your comment is reassuring so I will just leave it in.
From 5.0M to 3.0M, not bad!
Turned the optimization for speed back on -- 4.4M! Wow.
This doesn't sound too bad, and functions* that are not changing could, in theory, also be unified, although a single change in one of the transitive functions called might force you to keep multiple versions.
I'm not really familiar with packaging problems and obviously quadratic is worse than linear, but is having no duplicates a real alternative?
(Disclaimer: I'm mostly guessing here, but am I missing something important?)
AFAIU the choice here also considers how easily things will build and link, and whether your package manager/build tool will need a SAT-solver to figure out which version of each library to use, and even then you can still run into unsatisfiable restrictions (`Could not resolve dependencies`) if libraries are not adequately updated/maintained.
It seems that by allowing duplication you "only" pay for the libraries that are not updated (including all their deps), which means that you are trading computer resources (disk, cpu?) for human time spent updating and debugging, which might be a really good deal, especially if you only end up with duplicates in the cases where you lack the human time to keep all deps updated.
One could argue that the time cost of maintaining the libraries can only be deferred so there's no benefit deferring it, but the time people using the libraries save because they don't need the libraries to get updated if they have enough disk is probably what made Rust and NPM just duplicate dependencies.
Essentially it comes with a garbage collector which gets embedded into your binary as well as any other standard library calls you may need for fundamental features.
& from what I gather, APE is tightly-bound to x86_64, & depends on an embedded emulator for other architectures:
> All we have to do is embed an ARM build of the emulator above within our x86 executables, and have them morph and re-exec appropriately, similar to how Cosmopolitan is already doing with qemu-x86_64, except that this wouldn't need to be installed beforehand. The tradeoff is that, if we do this, binaries will only be 10x smaller than Go's Hello World, instead of 100x smaller.
could still be useful as an optional build target though... (i.e. GOOS=cosmopolitan)
This thing of yours is quite an accomplishment. Even so, I hope it goes no further. If it does (& it probably will, because it is convenient), most of the computing world will have stacked yet another sub-optimal low-level mono-culture on top of the one that was already there.
From the robustness/survivability perspective, the mono-cultures we have are bad enough.
In the world where a windows binary is not expected to run on a mac, and vice versa, they still have the option to change things on an as-needed basis (like the golang breakage on osx). In a world filled with APEs, you will have canonized all of the present OS idiosyncrasies that made APE possible in the first place, & we will be stuck with them forever.
Nah eventually every operating system is just going to support x86_64-linux-gnu and there won't be a need for APE in our glorious future. In that case, Cosmopolitan will just be a non-GPL libc that goes as fast as Glibc. But until all operating systems become fully developed, we have an outstanding hack to hold us over.
> eventually every operating system is just going to support x86_64-linux-gnu
Whether it's batteries included, or batteries sold separately, we'll still be stuck with batteries. If I want to run Linux on my Mac, that's what UTM is for.
Despite my reservations, nothing but respect, & enjoyed your Feross presentation.
The other ARM license is an ISA license. You get no chip design. All you get is the instruction set architecture but you have to design the chip yourself. That's what Apple has been doing.
I know that I have a general bias against Apple, but it has generally annoyed me that Apple has decided to take a generic industry term like "$FOO Silicon" and turn it into a marketing title.
This highlights the ongoing problem of article titles needing to be eye-grabbing but usually inaccurate/misleading or more complex than a simple title allows for.
It's a title, it cannot be a paragraph long to be pedantically clear and unambiguous in what to expect.
"Website that lets you download a 1KB long executable that can be run on 7 _kernels_ and x86_64 architectures, but actually it's only been tested on Intel and will probably not work on Android even if technically it's still a Linux kernel, also Linux < 2.6 is unsupported so YMMV"
I don't understand what the issue is with cross-compiling software, especially for different architectures. This seems more geared for running a binary on the same architecture across multiple different operating systems. This is great and all until you use some API that isn't the same on those operating systems. I imagine once you see something like this attempting to use comparable features and APIs as Go or something else, that you'll see the bloat increase to be equal in size if not more.
> I don't understand what the issue is with cross-compiling software, especially for different architectures.
Agreed. The cosmopolitan project is incredible hacking and I appreciate the author's ingenuity, but this has always felt a little bit to me like a solution in search of a problem. How is cosmopolitan software intended to be distributed? If the answer is on the internet, I don't see how a single fat binary helps the end user dramatically more than separate download links. For the developer, cross compilation can be automated as part of deployment. The platform specific binaries should both be smaller than the universal ape binaries and won't resort to hackery to work properly.
Because end users use multiple operating systems these days.
Also, no. The operating system integration code is tiny in the grand scheme of things. Consider Fabrice Bellard's JavaScript interpreter. If I build it for six operating systems, then it's 782kB:
master jart@nightmare:~/cosmo2$ ls -hal o/tiny/third_party/quickjs/qjs.com
-rwxr-xr-x 1 jart jart 782K Aug 30 10:53 o/tiny/third_party/quickjs/qjs.com
If I build it for just Linux, then it's 710kB:
master jart@nightmare:~/cosmo2$ ls -hal o/tinylinux/third_party/quickjs/qjs.com
-rwxr-xr-x 1 jart jart 710K Aug 30 10:52 o/tinylinux/third_party/quickjs/qjs.com
That's a 72kB difference, which includes all those Windows NT hacks and bare metal support. Why not just stuff it in there and be done with it? That way you only have to list one file on your download site.
> Because end users use multiple operating systems these days.
Do end users commonly want to copy programs between OSes, or will they usually just revisit the project's website and download it again? My feeling is it would be the latter.
And if we're talking about running multiple OSes on the same computer, is it common to share filesystems between those OSes? It's been a while since I ran more than one OS, but trying to do that wasn't remotely pleasant.
I think this is a huge technical achievement, and I do think there are some use cases for it here and there, but I personally don't see a big benefit over just using a CI system to automatically create a build for each OS/arch combination at release time.
I also wonder about the trade offs involved in using a different libc. As an example, when I was looking into musl, I realized that it didn't use NSS (unsurprising, since that would defeat the whole zero-dependency static linking thing), and therefore had no support for multicast DNS (well, not that NSS is required, but they also didn't, and had no plans to, integrate mDNS support of their own). I've pretty much abandoned using musl (and Alpine for containers) due to random quirks and gotchas like this that end up biting me somewhere down the line.
Are you telling me that you don't see the benefit to clicking a build button on Linux, scp'ing to your website, and saying done -- versus -- setting up Jenkins, buying a Windows license, buying an Apple license, setting up six VMs, and then once your devops industry is in place, waiting fifteen minutes for the thing to build across all N platforms, and assuming there's no error with all your #ifdef soup needed to support building on six different operating systems with six different sets of tools, then extracting your binaries from all the Jenkins instances, scp'ing them to your website and setting up a download link for each one, and finally hoping that the end user clicks the correct one -- I mean are you being serious? Why use APE when you can just hire a devops team to do your releases.
Tools like go-releaser already do everything you said automatically, don't need 6 different operating systems, and can be run on GitHub. Also, why would you not just use the built-in package manager to install an app? I don't think anyone is denying that it is an achievement with a limited purpose, but in my opinion you're overly enthusiastic about it, and in the real world it isn't that practical. I hate to say it, but fanatics sometimes ruin cool stuff because they want everyone to be as fanatical as them.
I am not an expert on the topic, but the Red language achieves that with a 1MB self-contained compiler, so I guess it should be possible with some additional bloat (and a lot of work).
> Can it make a binary that's both x86 and x64 for Windows?
Seems to be a PE32+, so it won't run on 32-bit Windows.
> Is that even possible?
Yes, it should be. A 32-bit process on 64-bit Windows can run 64-bit code by changing the segment selector - make a far call or jump to 0x33:<address>. See [0] for more details.
There are some caveats though. kernel32.dll refuses to be loaded as both the 32-bit and 64-bit versions in the same executable, so the executable can probably only import ntdll. One thing one might do is compile both a 32-bit and a 64-bit version of the actual program as DLLs, embed them inside the launcher, and then extract and load them using only APIs in ntdll. In the case of APE, it would probably just embed an x64-on-x86 emulator and run the executable in that when it's being run on x86.
Another solution might be to create a .NET AnyCPU executable which will run as 64-bit on 64-bit versions of Windows.
If you just turn the PE part into a .NET executable and keep the rest as it is, yes. Windows will see a .NET executable and the other OSes will see a shell script just like it works now (and a BIOS will still see the boot sector)
The PE format is used, and the cosmopolitan libc example was compiled for x64.
PE requires a single machine type to be specified, but these days there is support for hybrid binaries with the ARM64EC ABI (which follows x64 conventions and can include both x64 and ARM64 code) and fat binaries with the ARM64X ABI (which can include both ARM64EC and ARM64 code).
If you really wanted to, you could hack your own "fat binary" format by packing the 64-bit EXE as a resource within an x86 binary that detects the current architecture, but that's not very exciting.
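A hypothetical sketch of that hack: a 32-bit stub that, when it notices it's running under WOW64, drops an embedded 64-bit EXE (stored here as RCDATA resource 1, which is purely an assumption) into %TEMP% and runs it instead. Compile this stub as x86:

    /* launcher.c -- poor man's fat binary for Windows. */
    #include <windows.h>
    #include <stdio.h>

    int main(void) {
      BOOL wow64 = FALSE;
      IsWow64Process(GetCurrentProcess(), &wow64);
      if (!wow64) {
        puts("32-bit Windows: run the 32-bit code path here");
        return 0;
      }
      /* 64-bit Windows: extract the embedded 64-bit EXE and exec it. */
      HRSRC res = FindResourceA(NULL, MAKEINTRESOURCEA(1),
                                MAKEINTRESOURCEA(10) /* RT_RCDATA */);
      if (!res) return 1;
      void *data = LockResource(LoadResource(NULL, res));
      DWORD size = SizeofResource(NULL, res);

      char path[MAX_PATH];
      GetTempPathA(MAX_PATH, path);
      lstrcatA(path, "payload64.exe");

      HANDLE f = CreateFileA(path, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                             FILE_ATTRIBUTE_NORMAL, NULL);
      DWORD written;
      WriteFile(f, data, size, &written, NULL);
      CloseHandle(f);

      STARTUPINFOA si = { sizeof(si) };
      PROCESS_INFORMATION pi;
      if (!CreateProcessA(path, NULL, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        return 2;
      WaitForSingleObject(pi.hProcess, INFINITE);
      CloseHandle(pi.hProcess);
      CloseHandle(pi.hThread);
      return 0;
    }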
I actually like the idea a lot, and it's impressive; however, nowadays there is also the challenge of many consumer architectures on commodity desktops, not just x86 like it used to be.
I guess that, and shared library ABIs, is the biggest challenge now.
If I remember correctly from the DOS times, the limit for .com files was 64KB. I haven't tried to compile one for ages; is this any different on Windows NT+? (I would try, but I don't have a single Windows machine at home.)
Ready to be corrected, as I haven't worked with DOS or Windows for decades, but I seem to recall ".com" is merely used for familiarity/compatibility and hasn't meant anything about the underlying format for a very long time now.
The Wikipedia page for .com suggests that 64-bit Windows NT+ systems can't run the original-style .com files anymore due to lacking the MS-DOS emulation subsystem, so the point may be moot.
Already thirty years ago there were DOS executables with a .COM extension that weren't actually in the COM format. COMMAND.COM in later versions of DOS is an example.
Total cheating would be to integrate the DOS application's functionality into the Windows DOS stub. Instead of printing "This application requires Microsoft Windows" it could even use overlays to circumvent the 64kB COM restriction.
The limit for a .COM's useful payload is 64KiB, but that wouldn't necessarily prevent the actual file from being bigger, with either truncate or wrapover semantics.
Don't think so, although they are complementary. The submission you link is more like a reference/article, while the submission we're commenting on is more like an application/demonstration of said technique.
Title seems wrong. I think this was intended to be "12 Kb executable runs natively on 7 operating systems". The HN suggestion of copying the article title would be: "How Fat Does a Fat Binary Need To Be?"
Filtering "listicles", articles with titles like "6 Secrets to Achieving Work-Life Balance", "23 Amazing Moments From The Harry Potter Movies", "5 Steps to Booking a Cheap Flight Online" and similar.
Yes, but the original title shouldn't have been rewritten in the first place since it was neither misleading nor linkbait (from https://news.ycombinator.com/newsguidelines.html: "Please use the original title, unless it is misleading or linkbait; don't editorialize."). I've fixed it now.
Minor nitpick: I'd prefer to see the size denoted as 12kB (or 12 KB) rather than 12kb, where I have to wonder whether it means kilobit or kilobyte. Note that Justine has also worked on languages whose programs consist of bits [1].
As pointed out in https://news.ycombinator.com/item?id=32649569 , this is probably HN autoremoving numbers from the start of titles to combat blog spam titles like "12 reasons to switch to XYZ".