> Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig, ran it in Wine, using it to cross compile for Linux, and then ran the binary natively. Computers are fun!
> Compare this to downloading Clang, which has 380 MiB Linux-distribution-specific tarballs. Zig's Linux tarballs are fully statically linked, and therefore work correctly on all Linux distributions. The size difference here comes because the Clang tarball ships with more utilities than a C compiler, as well as pre-compiled static libraries for both LLVM and Clang. Zig does not ship with any pre-compiled libraries; instead it ships with source code, and builds what it needs on-the-fly.
> Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig, ran it in Wine, using it to cross compile for Linux, and then ran the binary natively. Computers are fun!
Even though it probably doesn't qualify, this is pretty close to a Canadian Cross, which for some reason is one of my favorite pieces of CS trivia: it's when you cross compile a cross compiler.
And on that point, your correspondent is right. The two bear no real resemblance to each other. The cross compilation approach described in the article is not something to be held in high regard. It's the result of poor design. It's a lot of work involving esoteric implementation details to solve a problem that the person using the compiler should never have encountered in the first place. It's exactly the problem that the Zig project leader is highlighting in the sticker when he contrasts Zig with Clang, etc.
The way compilers like Go and Zig work is the only reasonable way to approach cross compilation: every compiler should already be able to cross compile.
Thanks for putting it this way. I always wondered why cross compilation was a big deal. For me it sounds like saying "look, I can write a text in English without having to be in an English-speaking country!".
The problem with cross compilation isn't the compiler, it's the build system. See e.g. autoconf, which builds test binaries and then executes them to test for the availability of strcmp(3).
Go doesn't cross compile: it only supports the Go operating system on a very limited number of processor variants.
If zig were to truly cross compile for every combination of CPU variant and supported version of every operating system, it would require terabytes of storage and already be out of date.
It doesn't require nearly as much storage as you think. For zig, there are only 3 standard libraries (glibc, musl, mingw) and 7 unique architectures (ARM, x86, MIPS, PowerPC, SPARC, RISC-V, WASM). LLVM can support all of these with a pretty small footprint, and since the standard libraries can be recompiled by Zig, it really only needs to ship source code - no binaries necessary.
People really do appreciate such convenience. I am not familiar with Zig, but Go provides a similar experience for cross-compilation.
Being able to build for FreeBSD/amd64, Linux/arm64, and other commonly used OS/ARCH combinations in a few minutes used to sound like a dream, but it is reality for users of modern languages.
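For anyone who hasn't tried it, this is roughly all it takes in Go (the target names are just examples; CGO_ENABLED=0 avoids needing a C cross toolchain):

$ CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o app-linux-arm64 .
$ CGO_ENABLED=0 GOOS=freebsd GOARCH=amd64 go build -o app-freebsd-amd64 .
$ CGO_ENABLED=0 GOOS=windows GOARCH=amd64 go build -o app-windows-amd64.exe .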
I'm all for cross compilation, but in reality you still need running copies of those other operating systems in order to be able to test what you've built.
Setting up 3 or 4 VM images for different OSes takes a few minutes. Configuring 3 or 4 different build environments across as many OSes on the other hand ...
Sure, but building typically takes more resources than executing, so it may not be feasible to use a Raspberry Pi to build, while it can be fine for testing.
You can do that with clang/gcc, but you need to pass -static and -static-plt (? I can't find what it's called). The second option is to ensure it's loader-independent, otherwise you get problems when compiling and running across musl/glibc platforms.
In brief, most programs these days are position-independent, which means you need a runtime loader to load the sections(?) and symbols of the code into memory and tell the other parts of the code where they've been put. Because of differences between musl libc and GNU libc, in practice this means that a program compiled against GNU libc can be marked as executable, yet when the user tries to run it they're told it is "not executable", because the binary is looking in the wrong place for the dynamic loader, which is named differently by the two libraries. GNU libc also defines some archaic, non-standard symbols that musl libc has a problem with, which can cause further trouble for the end user.
e: I didn't realise it was 5am, so I'm sorry if it's not very coherent.
I would also appreciate it if you could be even more specific once more "coherency" is possible. I'm also interested in what more you can say about "The second option is to ensure it's loader-independent, otherwise you get problems when compiling and running across musl/glibc platforms".
Ok so, it's been a year or so since I was buggering around with the ELF internals (I wrote a simpler header in assembly so I could make a ridiculously small binary...). Let's take a look at an ELF program. If you run `readelf -l $(which gcc)` you get a bunch of output, among that is:
alx@foo:~$ readelf -l $(which gcc)
Elf file type is EXEC (Executable file)
Entry point 0x467de0
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x0000000000000230 0x0000000000000230 R 0x8
INTERP 0x0000000000000270 0x0000000000400270 0x0000000000400270
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000fa8f4 0x00000000000fa8f4 R E 0x200000
You can see that among the program headers is an entry called INTERP that requests the loader. This is because the program has been compiled with the -fPIE flag, which requests a "Position Independent Executable". This means that each section of the code has been compiled so that it doesn't expect a fixed position in memory for the other sections. In other words, you can't just run it on a UNIX computer and expect it to work; it relies on another program, the dynamic loader, to load each section and tell the other sections where everything has been put.
The problem with this is that the musl loader (I don't have my x200 available right now to copy some output from it to illustrate the difference) usually lives at a different path. What this means is that when the program is run, the ELF loader tries to find the requested program interpreter in order to execute the program; because musl libc's program interpreter has a different name and location in the filesystem hierarchy, it fails to execute the program and you get "not a valid executable".
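To make that concrete, the two ecosystems request different interpreters. The paths below are the common defaults on x86_64 (they can vary by distro), and the binary names are just placeholders:

$ readelf -l ./glibc-built-app | grep interpreter
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
$ readelf -l ./musl-built-app | grep interpreter
      [Requesting program interpreter: /lib/ld-musl-x86_64.so.1]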
Now you would think a naive solution would be to symlink the musl libc loader to the expected position in the filesystem hierarchy. The problem with this becomes clear when you look at the program's other dependencies and the symbols it imports. Let's have a look:
alx@foo:~$ readelf -s $(which gcc)
Symbol table '.dynsym' contains 153 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __strcat_chk@GLIBC_2.3.4 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __uflow@GLIBC_2.2.5 (3)
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND mkstemps@GLIBC_2.11 (4)
4: 0000000000000000 0 FUNC GLOBAL DEFAULT UND getenv@GLIBC_2.2.5 (3)
5: 0000000000000000 0 FUNC GLOBAL DEFAULT UND dl_iterate_phdr@GLIBC_2.2.5 (3)
6: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __snprintf_chk@GLIBC_2.3.4 (2)
7: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __pthread_key_create
8: 0000000000000000 0 FUNC GLOBAL DEFAULT UND putchar@GLIBC_2.2.5 (3)
9: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strcasecmp@GLIBC_2.2.5 (3)
As you can see, the program not only expects a GNU program interpreter, but the symbols it has been linked against also carry GLIBC_2.2.5-style version tags (although I cannot recall if this causes a problem or not; memory says it does, but you'd be better off reading the ELF specification at this point, which you can find here: https://refspecs.linuxfoundation.org/LSB_2.1.0/LSB-Core-gene...). So the ultimate result of trying to run this program on a musl libc system is that it fails to run, because the symbols are 'missing'. On top of this, you can see with `readelf -d` that it relies on the libc library:
alx@foo:~$ readelf -d $(which gcc)
Dynamic section at offset 0xfddd8 contains 25 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [ld-linux-x86-64.so.2]
0x000000000000000c (INIT) 0x4026a8
Unfortunately for us, the libc.so.6 binary produced by the GNU system is also symbol-incompatible with the one produced by musl, and GNU libc defines some functions and symbols that are not in the C standard. The upshot is that you need to link statically against libc, and against the program loader, for this binary to have a chance of running on a musl system.
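For what it's worth, one way to sidestep all of the above is to build fully static against musl in the first place, e.g. with the zig cc from the article (hello.c is a placeholder; musl-gcc -static does much the same):

$ zig cc -target x86_64-linux-musl -static -o hello hello.c
$ readelf -l hello | grep -i interp    # expect no INTERP header
$ readelf -d hello | grep NEEDED       # expect no shared library deps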
Dlang is a better C. DMD, the reference compiler for Dlang, can also compile and link with C programs. It can even compile and link with C++03 programs.
It has manual memory management as well as garbage collection. You could call it hybrid memory management. You can manually delete GC objects, as well as allocate GC objects into manually allocated memory.
The Zig website says "The reference implementation uses LLVM as a backend for state of the art optimizations." However, LLVM is consistently 5% worse than the GCC toolchain at performance across multiple benchmarks. In contrast, GCC 9 and 10 officially support Dlang.
Help us update the GCC D compiler frontend to the latest DMD.
D's whole premise of "being a better C++" has always made them look like argumentative jerks. Why build a language on top of a controversy? Their main argument from the early 2000s: C++ requires a stdlib but a compiler toolchain is not required to provide one. Wtf D? I mean, I understand that C++ provides a lot of abstractions on top of C to call itself "more" than C, but what does D provide other than a few conveniences? If you even consider garbage collection, better-looking syntax, or a more consistent, less orthogonal syntax a convenience. It didn't even have most of its current features when it was first out back in the early 2000s. Trying to gain adoption by creating some sort of counterculture; what are they, 14? /oneparagraphrant
It is probably the case that D has a brilliant engineering team that doesn't really focus on the PR side of things. D definitely provides value over C/C++ beyond a bit of syntax sugar. It is just not communicated that well.
Really incredible work and it's been very fun to follow along. The streams where Andrew did the last part of this work can be seen here: [1], [2].
I am really happy that someone is making the effort to steadily simplify systems programming rather than make it more complicated. Linux goes to such incredible lengths to be bug-for-bug backwards compatible, but then the complexities of all of our layers of libcs, shared libraries, libsystemd, dbus, etc cause unnecessary pain and breakage at every level. Furthermore, cross-compiling C code across different architectures on Linux is far harder than it needs to be. I have a feeling that there wouldn't be as much interest in the steady stream of sandboxes and virtual machines (JVM, NaCl, PNaCl, flatpak, docker, WebAssembly) if we could just simplify the layers and layers of cruft and abstractions in compiler toolchains, libc implementations, and shared libraries. Practically every laptop and server processor uses the exact same amd64 architecture, but we have squandered this opportunity by adding leaky abstractions at so many levels. I can't wait until installing a program on Linux is as simple as downloading a static executable and just running it, and I hope zig brings this future.
Because when there's a security update to (say) OpenSSL, it's better for the maintainers of just that library to push an update, as opposed to forcing every single dependent to rebuild & push a new release.
My main issue with this rationale is that, in the vast majority of production environments (at least the ones I've seen in the wild, and indeed the ones I've built), updating dependencies for dynamically-linked dependents is part of the "release" process just like doing so for a statically-linked dependent, so this ends up being a distinction without a difference; in either circumstance, there's a "rebuild" as the development and/or operations teams test and deploy the new application and runtime environment.
This is only slightly more relevant for pure system administration scenarios where the machine is exclusively running software prebuilt by some third-party vendor (e.g. your average Linux distro package repo). Even then, unless you're doing blind automatic upgrades (which some shops do, but it carries its own set of risks), you're still hopefully at least testing new versions and employing some sort of well-defined deployment workflow.
Also, if that "security update" introduces a breaking change (which Shouldn't Happen™, but something something Murphy's Law something something), then - again - retesting and rebuilding a runtime environment for a dynamically-linked dependent v. rebuilding a statically-linked dependent is a distinction without a difference.
I would have agreed with this statement about five years ago. (Even though you would have had to restart all the dependent binaries after updating the shared libs.)
Today, with containers becoming increasingly the de facto means of deploying software, it's not so important anymore. The upgrade process is now: (1) build an updated image; (2) upgrade your deployment manifest; (3) upload your manifest to your control plane. The control plane manages the rest.
The other reason to use shared libs is for memory conservation, but except on the smallest devices, I'm not sure the average person cares about conserving a few MB of memory on 4GB+ machines anymore.
In addition to pjmlp's list, Steam is pushing toward this for Linux games (and one could argue that Steam has been this for as long as it's been available on Linux, given that it maintains its own runtime specifically so that games don't have to take distro-specific quirks into account).
Beyond containers / isolated runtime environments, the parent comment is correct about games (specifically of the console variety) being historically nearly-always statically-linked never-updated monoliths (which is how I interpreted that comment). "Patching" a game after-the-fact was effectively unheard of until around the time of the PS3 / Xbox 360 / Wii (when Internet connectivity became more of a norm for game consoles), with the sole exception of perhaps releasing a new edition of it entirely (which would have little to no impact on the copies already sold).
> Today, with containers becoming increasingly the de facto means
This assertion makes no sense at all and entirely misses the whole point of shared/dynamic libraries. It's like a buzzword is a magic spell that makes some people forget the entire history and design requirements up to that very moment.
Sometimes buzzwords make sense, in the right context. This was the right context.
Assuming you use containers, you're likely to not log into them and keep them up to date and secure by running apt-get upgrade.
The most common workflow is indeed: build your software in your CI system, in the last step create a container with your software and its dependencies. Then update your deployment with a new version of the whole image.
A container image is for all intents and purposes the new "static binary".
Yes, technically you can look inside it, yes technically you can (and you do) use dynamic linking inside the container itself.
But as long as the workflow is the one depicted above, the environment no longer has the requirements that led to the design of dynamic linking.
It's possible to have alternative workflows for building containers: you could fiddle with layers and swap an updated base OS under a layer containing your compiled application. I don't know how common that is, but I'm sure somebody will want/have to do it.
It all boils down to whether developers still maintain control over the full deployment pipeline as containers penetrate the enterprise (i.e. whether we retain the "shift to the left", another buzzword for you).
Containers are not just a technical solution, they are the embodiment of the desire of developers to free themselves from the tyranny of filing tickets and waiting days to deploy their apps. But that leaves the security departments in enterprises understandably worried as most of those developers are focused on shipping features and often neglecting (or ignoring) security concerns around things that live one layer below the application they write.
Shared libraries have largely proven that they aren't a good idea, which is why containers are so popular. Between conflicts and broken compatibility between updates, shared libraries have become more trouble than they are worth.
I think they still make sense for base-system libraries, but unfortunately there is no agreed upon definition of 'base-system' in the wild west of Linux.
And the reason we're using containers in the first place is precisely because we've messed up and traded shared libs for having a proven-interworking set of them, something that can trivially be achieved using static linking.
Actually the main selling point of containers has nothing to do with "proven interworking", but the ability to deploy and run entire applications in a fully controlled and fully configurable environment.
Static libraries do nothing of the sort. In fact, they make it practically impossible to pull it off.
There's far more to deploying software than mindlessly binding libraries.
On Windows, I don't need to use Docker in order to run a program in a reproducible way. I just download a program, and in 90% of cases it "just works" whether I'm running Windows 10, Windows 8, or the decade-old Windows 7.
Furthermore, installing that program will (again, in 90% of cases at least) not affect my overall system configuration in any way. I can be confident that all of my other programs will continue to work as they have.
Why? Because any libraries which aren't included in the least-common-denominator version of Windows are included with the download, and are used only for that download. The libraries may be shipped as DLLs next to the executable, which are technically dynamic, but it's the same concept—those DLLs are program-specific.
This ability is what I really miss when I try to switch to desktop Linux. I don't want to set up Docker containers for random desktop apps, and I don't want a given app to affect the state of my overall system. I want to download and run stuff.
---
I realize there's a couple of big caveats here. Since Windows programs aren't sandboxed, misbehaving programs absolutely can hose a system—but at least that's not the intended way things are supposed to work. I'm also skipping over runtimes such as Visual C++, but as I see it, those can almost be considered part of the OS at this point. And I can have a ridiculous number of versions of MSVC installed simultaneously without issue.
> On Windows, I don't need to use Docker in order to run a program in a reproducible way. I just download a program, and in 90% of cases it "just works" whether I'm running Windows 10, Windows 8, or the decade-old Windows 7.
One program? How nice. How about 10 or 20 programs running at the same time, and communicating between themselves over a network? And is your program configured? Can you roll back changes not only in which versions of the programs are currently running but also in how they are configured?
> This ability is what I really miss when I try to switch to desktop Linux. I don't want to set up Docker containers for random desktop apps,
You're showing some ignorance and confusion. You're somehow confusing application packages, and the natural consequence of backward compatibility, with containers. On Linux, deploying an application is a solved problem, unlike on Windows. Moreover, Docker is not used to run desktop applications at all. At most, tools like Canonical's Snappy are used, which enable you to run containerized applications in a completely transparent way, from installation to running.
> the ability to deploy and run entire applications in a fully controlled and fully configurable environment
But isn't the reason to have this fully controlled and fully configurable environment to have a proof of interworking? Because when the environment differs in any way, you can (and people already do) say that it's not supported.
> Actually the main selling point of containers has nothing to do with "proven interworking", but the ability to deploy and run entire applications in a fully controlled and fully configurable environment.
Which is exactly the same selling point as for static linking.
As a user of the Linux desktop, I really love it when library updates break compatibility with the software I use too. Or can't be installed because of dependency conflicts.
Containers are popular because shared libraries cause more trouble than they are worth.
Containers most likely wouldn't have existed if we had a proper ecosystem around static linking and resolution of dependencies. Containers solve the problem of the lack of executable state control, mostly caused by dynamic linking.
More broadly, containers solve the problem of reproducibility. No longer does software get to vomit crap all over your file system in ways that make reproducing a functioning environment frustrating and troublesome. They have the effect of side-stepping the dependencies problem, but that isn’t the core benefit.
True—but that's far less of a problem, because it rarely occurs unexpectedly and under a time crunch.
Diffing two docker images to determine the differences between builds would be far less onerous than attempting to diff a new deployment against a long-lived production server.
Dynamic linking isn't the issue. Shared libraries are the issue. You could bundle a bunch of .so files with your executable & stick it in a directory, and have the executable link using those. That's basically how Windows does it, and it's why there's no "dependency hell" there despite having .dlls (dynamically linked libraries) all over the place.
Shared libraries are shared (obviously) and get updated, so they're mutable. Linux systems depend on a substantial amount of shared mutable state being kept consistent. This causes lots of headaches, just as it does in concurrent programming.
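Incidentally, that Windows-style "libraries next to the executable" layout is doable on Linux too, e.g. with an $ORIGIN-relative rpath (the library and directory names here are illustrative):

$ gcc -o app main.c -L./libs -lfoo '-Wl,-rpath,$ORIGIN/libs'

At run time the dynamic loader then looks for libfoo.so in the libs directory next to the executable before falling back to the system paths.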
> Based on my experience this is very rarely the case
You must have close to zero experience then, because that's the norm for any software that depends on, say, third-party libraries that ship with an OS/distro.
These are not diametrically opposed. Your company can have a service that uses OpenSSL in production that runs on Debian to automatically take advantage of Debian patches and updates if it's linked dynamically to the system provided OpenSSL.
You can either employ an extremely disciplined SecOps team to carefully track updates and CVEs (you'd need this whether you're linking statically or dynamically) or you can use e.g. Debian to take advantage of their work to that end.
Every single company that I used to work for had an internal version of Linux that they approved for production. Internal release cycles are disconnected from external release cycles. On top of that, some of these companies were not using system-wide packages at all; you had to reference a version of packages (like OpenSSL) during your build process. We had to do emergency patching for CVEs and bump the versions in every service. This way you can have 100% confidence that a particular service is running with a particular version of OpenSSL. This process does not depend on Debian's (or another FOSS vendor's) release cycles, and the dependencies are explicit, therefore the vulnerability assessment is simpler (as opposed to going to every server and checking which version is installed). Don't you think?
If you need that level of confidence - sure. But it's going to cost a lot more resources and when you're out of business your customers are fully out of updates. I wouldn't want to depend on that (then again a business customer will want to maintain a support contract anyway).
Isn't a containerized solution a good compromise here? You could use Debian on a fixed major release, be pretty sure what runs and still profit from their maintenance.
What I'm saying is that the only way you can get away with not having an "extremely disciplined SecOps team" is to depend on someone else's extremely disciplined SecOps team. Whether you link statically or dynamically is orthogonal.
> Every single company that I used to work for had an internal version of Linux that they approved for production.
I can't deny your experience, but meanwhile I've been seeing plenty of production systems running Debian and RHEL, and admins asking us to please use the system libraries for the software we deployed there.
> Internal release cycles are disconnected from external release cycles.
That seems to me like the opposite of what you'd want if you want to keep up with CVEs. If you dynamically link system libraries you can however split the process into two: the process of installing system security updates doesn't affect your software development process for as long as they don't introduce breaking changes. Linking statically, your release cycles are instead inherently tied to security updates.
> We had to do emergency patching for CVEs and bump the versions in every service.
What is that if not tying your internal release cycles to external release cycles? The only way it isn't is if you skip updates.
> This process do not depend on Debian's (or other FOSS vendor's) release cycles and the dependencies are explicit, therefore the vulnerability assessment is simpler (as opposed to go to every server and check which version is installed). Don't you think?
I don't know, going to every server to query which versions of all your software they are running seems similarly cumbersome. Of course, if you aren't entirely cowboying it you'll have automated the deployment process whether you're updating Debian packages or using some other means of deploying your service. Using Debian also doesn't make you dependent on their release cycles. If you feel like Debian isn't responding to a vulnerability in a timely manner, you can package your own version and install that.
I'm talking about the operating system that's pretty much a major component of the backbone of the world's entire IT infrastructure, whether it's directly or indirectly through downstream distros that extend Debian, such as Ubuntu. Collectively they are reported to serve over 20% of the world's websites, and consequently they are the providers and maintainers of the OpenSSL that's used by them.
If we look at containers, docker hub lists that Debian container images have been downloaded over 100M times, and ubuntu container images have been downloaded over 1B times. These statistics don't track how many times derived images are downloaded.
~15 years ago my "daily driver" had 256MB of RAM and it was perfectly usable for development (native, none of this new bloated web stuff) as well as lots of multitasking. There was rarely a time when I ran out of RAM or had the CPU at full usage for extended periods.
Now it seems even the most trivial of apps needs more than that just to start running, and on a workstation, less than a year old with 4 cores of i7 and 32GB of RAM, I still experience lots of lag and swapping (a fast SSD helps, although not much) doing simple things like reading an email.
Now, I'm not running anything particularly intensive at the moment, and I make a point of avoiding Electron apps. I also rebooted just a few hours ago for an unrelated reason.
But the fact is that I've monitored this before—I very rarely manage to use all my RAM. The OS mostly just uses it to cache files, which I suppose is as good a use as any.
Oh my god, just try running a modern OS on a spinning rust drive. It's ridiculous how slow it is. It's obvious that modern developers assume everything is running on SSD.
Are you sure? I've been running Linux for a long time with no page file. From 4GB to 32GB (the amount of RAM I have now) and I have literally only run out of RAM once (and that was because of a bug in an ML program I was developing). I find it very hard to believe that you experience any swapping at all with 32GB, much less "lots".
There's a screenshot in there showing it taking 22GB of RAM. I've personally never seen it go that high, but the 10-12GB of RAM that I have seen is absolutely ludicrous for a chat app. Even when it's initially started it takes over 600MB. Combine that with a few VMs that also need a few GB of RAM each, as well as another equally-bloated Electron app or two, and you can quickly get into the swapping zone.
I also experience the same thing with Mattermost (the client also being an Electron app). The memory bloat usually comes from switching back and forth from so many channels, scrolling up to load more chat history, and lots and lots of image attachments (and of course, the emoticons).
> scrolling up to load more chat history, and lots and lots of image attachments (and of course, the emoticons).
I remember comfortably browsing webpages with lots of large images and animated GIFs in the early 2000s, with a fraction of the computing power I have today. Something has become seriously inefficient with browser-based apps.
You said yourself you managed to find a case where you ran out of memory. Why do you find it "very hard to believe", knowing nothing about his use cases, that his job involves exactly the sort of situations that consume vast amounts of RAM? Why do people insist with such conviction that "it doesn't happen to me, therefore it's inconceivable that it happens to someone else, doing something totally different than what I'm doing, into which I have no insight"? Baffling.
> Why do you find it "very hard to believe", knowing nothing about his use cases, that his job involves exactly the sort of situations that consume vast amounts of RAM?
Probably because the GGP said they experience lag while "doing simple things like reading an email." Now, maybe GGP meant to add "while I'm sequencing genes in the background", but since that was left out I can see how it would be confusing! :)
My dynamically-linked executable is 8296 bytes on disc. My statically-linked executable is 844,704 bytes on disc.
So if I had a "goodbye world" program as well, that's a saving of about 800KB on disc.
Now one can argue about the economics of saving a bit under a megabyte in a time when an 8GB microSD card costs under USD 5 in single quantities, but you can't deny that, in relative terms, it's a big saving.
At runtime, the dynamic version uses (according to top) 10540 KB virtual, 540 KB resident, and 436 KB shared. The static version uses 9092 KB virtual, 256 KB resident, and 188 KB shared.
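For anyone who wants to reproduce the on-disc comparison, it's just the usual hello world built twice and compared:

$ cc hello.c -o hello-dynamic
$ cc -static hello.c -o hello-static
$ ls -l hello-dynamic hello-static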
256MB of RAM is a fairly large amount; that's how much the iPhone 3GS had, for instance. It relied heavily on dynamic linking to system libraries and ran multiple processes.
I've also linked this Zig post into that list (and happy to add further languages if you can provide a link that shows that they have good out-of-the-box static linking support).
Rust has some really good static linking support. If you compile with the `musl` targets (e.g. x86_64-unknown-linux-musl), it will give you a fully statically linked binary.
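Concretely, assuming rustup is installed, the crate has no C dependencies, and the binary is called myapp (a placeholder), it's just:

$ rustup target add x86_64-unknown-linux-musl
$ cargo build --release --target x86_64-unknown-linux-musl
$ file target/x86_64-unknown-linux-musl/release/myapp    # should report it as statically linked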
Whether they did or not, they spoke truth. Linux's (userland, not kernel) backward compatibility is ridiculously bad unless you're compiling from source. This is not the case on Windows.
From the late 90s to the late aughts it’s unlikely anyone could have used a desktop for any work. 256MB of memory was a lot of memory not so long ago.
Looking at how a browser, an IDE, and a few compilation processes will gladly chew through 8GB of memory... it’s not necessarily horrible, but this is a modern contrivance.
I know the ability exists, but I'm pretty sure that it's not exactly easy to get it working. Last time I tried, it immediately failed because my distribution wasn't shipping .a files (IIRC) for my installed libraries. There's a lot of little things that don't quite work because nobody's using them so they're harder to use so nobody uses them...
It's easy to get working provided that you compile _everything_ from source. You can either omit glibc from this, or accept that it will still dynamically load some stuff at runtime even when "statically" linked, or switch to musl. A nice benefit is that LTO can then be applied to the entire program as a whole.
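A sketch of what that looks like with a musl toolchain (musl-gcc is the wrapper script musl ships; the source files are placeholders):

$ musl-gcc -O2 -flto -static -o app main.c util.c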
I don't see how zig cc would help with that. Your distribution probably also doesn't ship all the source files for your packages either, and there's no other way to statically link.
I've used Gentoo for almost a decade now, and no, that's not true. emerge doesn't just randomly keep source files on disk, and certainly not in a form easy to link to. In fact, Gentoo is worse than Debian for static linking, because not all packages have IUSE=static-libs. If it doesn't, you need to patch that ebuild and potentially many dependencies to support it. On the other hand, on Debian, the standard is for -dev packages to come with both headers and static libraries.
One reason I think people forget about is ASLR. What symbols are you going to shuffle? At least with dynamically linked dependencies the dynamic linker can just shove different shared objects into different regions without much hassle.
Others have mentioned the other points: runtime loading (plugins), and CoW deduplication and thus less memory and storage use.
In addition to the other excellent reasons mentioned here, there's also the fact that some libraries deliberately choose to use runtime dynamic linkage (dlopen) to load optional/runtime-dependent functionality.
If you want to make a program that supports plugins, you have only two real options: non-native runtimes or dynamic linking. And the latter gets you into a lot of trouble quickly. The former trades performance and memory usage for ease of use and a zoo of dependencies.
If you're the kind of person who wants static linking then you really don't want these features.
The real problem is that statically linked programs under Linux don't (didn't?) support VDSO, which means that syscalls like gettimeofday() are suddenly orders of magnitude slower.
In the end, we had to do a kind of pseudo-static linking - link everything static except glibc.
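In case it helps anyone, "everything static except glibc" looks roughly like this on the link line (libfoo/libbar stand in for the third-party dependencies):

$ gcc main.o -Wl,-Bstatic -lfoo -lbar -Wl,-Bdynamic -o app

gcc appends -lc after the user libraries, so glibc itself (and with it the vDSO fast paths) stays dynamically linked.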
This is false. The license for glibc is the LGPL, not the GPL, and the LGPL has an exception to allow static linking without the whole code having to be under the LGPL, as long as the .o files are also distributed to allow linking with a modified glibc ("As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice [...] and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library.")
> I can't wait until installing a program on linux is as simple as downloading a static executable and just running it and I hope zig brings this future.
For the record: This is pretty close to what AppImage is today. It's not quite 100% because userland fragmentation is so ridiculously bad that it doesn't work out of the box on a few of them, but I personally really wish all Linux software was distributed that way (or static like Zig).
And it pretty much proves that it's not such a great thing to aspire to. Images are large, dependencies aren't updatable, and locations don't match the distribution defaults.
On the other hand: no conflicts, no missing dependencies, can have multiple versions of the same thing installed at the same time, can store them anywhere including removable media...
The cross compiling features of Zig look fantastic! Installation is so easy, just downloading and extracting a single file.
Should every compiler stack have prioritized cross compilation over other features? (I vote: YES). Cross compiling programs has always been a PITA for most languages.
It would be great if Zig cc could be paired with vcpkg [1] for a nice cross-compiling development environment. Looks like vcpkg requires a C++ compiler though.
Which reasonably-popular modern languages can be reasonably said to have ignored cross compilation? Interpreted languages like JavaScript and Python obviously don't have any problem, JIT languages like .NET and Java explicitly have a cross-platform layer, and modern compiled languages like Go and Rust specifically have cross-compilation as a goal. Rust still needs a libc though, but that's not Rust's fault, that's the result of trying to work together with the system instead of DIYing everything. (see: problems with Go doing system calls on BSDs, Solaris, etc)
You can't look at C which started in the 1970s and C++ which started in the 1980s and have expected them to even consider cross-compilation, when Autoconf wasn't even released until 1991.
I think the benchmark should be distribution. For instance, you mention "JavaScript and Python obviously don't have any problem", but say you want to create a distributable program for the 3 major operating systems, based in one of those languages, and you can't assume the user will have an installed interpreter. I don't think you will find any _standard_ solutions.
Most other languages make it _possible_ to generate some sort of artifact usable from different operating systems but not necessarily easy. I think Java only relatively recently included a standard way to create a bundle including a minimal JVM distribution with an app to make it usable when the user doesn't have an installed JVM (and again, there were a bunch of different non standard solutions of varying quality before that). Even now I wouldn't say the Java solution is easy to use.
I could continue in this fashion with different languages, but you get the idea.
I heard Go is pretty good in terms of ease of cross compilation, and, well, it looks like Zig is doing great in this area too. Ah! .NET Core is apparently pretty good in this area these days as well.
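For reference, the .NET Core flavour of this is a self-contained publish for a runtime identifier (the RID below is just an example):

$ dotnet publish -c Release -r linux-arm64 --self-contained true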
That's moving the bar though. Only in the past decade or so has it become reasonable ("reasonable") to include entire runtimes together with an application. Java started in 1995, Python started in 1991. This was an era when one of Java's main targets was SIM cards and other highly minimal devices, so not only would the target already have a JVM, but it would be wholly impractical to ship your own. Even on desktops, downloading just a fraction of Java for each program would be a massive waste of limited bandwidth and disk space.
For that reason, Java and Python didn't start out with fully self-contained bundles as a design goal. It just wasn't practical in the 90s. Obviously, yes, if they had managed to correctly predict and plan for three decades of technological improvement, then sure, we'd be working in a very different technological landscape. But they couldn't possibly have, and solutions built on the old languages are always fraught with disagreement. So, we use new languages, like Go and Rust, which are developed with modern needs in mind.
> I think the benchmark should be distribution. For instance, you mention "JavaScript and Python obviously don't have any problem", but say you want to create a distributable program for the 3 major operating systems, based in one of those languages, and you can't assume the user will have an installed interpreter. I don't think you will find any _standard_ solutions.
Well, there are web browsers and servers. They distribute javascript programs and run them.
Interpreted environments like Node.js, Python, and Ruby absolutely have this problem since many popular packages utilize native extensions. Distribution is still a challenge.
Note that on Linux hosts at least, for most target platforms, being able to cross compile with clang is just a single install of the right packages away.
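For example, on Debian/Ubuntu targeting 64-bit ARM, something along these lines is usually enough (hello.c is a placeholder, and the package name is the Debian/Ubuntu one):

$ sudo apt install crossbuild-essential-arm64
$ clang --target=aarch64-linux-gnu -o hello hello.c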
Portability depends on a great deal more than just object code formats. The list of OS environment functions to call to achieve anything useful is radically different from one target to another.
This is what makes porting hard work. Cross-compiling is only the first step of a long trip.
I think a problem comes when you want to distribute your compiler potentially independent from your OS and/or linker and/or C library.
But it's also fair to say that if we had always considered those things as inseparable parts of the "compiler suite" that might have made everyone better off.
I'd love for a hotlist or even prize for programmers doing awesome stuff like this. Off the top of my head, Andrew, Andreas Kling of Serenity OS, whoever ffwff is (https://github.com/ffwff/), Niko Matsakis of Rust, and so on. It'd be very awesome to have a roundtable with some of these people. They could discuss different approaches to design, their own histories and plans for the future. I loved reading Coders at Work, but I felt that there was a missed opportunity by not having the subjects interact with each other. So many of them had similar and very dissimilar ideas/approaches, which would have made for a wonderful discussion.
If someone could do the same but for a younger generation, I think it'd be very valuable.
Aye, and it lives up to that claim as well in my opinion, despite being still relatively young and pre-1.0. My favorite thing about Zig is that it has managed to stay simple and solve many of the problems of C without resorting to greatly increased complexity like Rust (which is much more of a C++ replacement than a C replacement in my opinion).
IMO the marketing of Rust as a C/C++ replacement is a bit misplaced. I think it's more accurate to consider it an alternative systems language. The tools/languages used in this space are a bit broader than the C-family languages.
That's true, though systems programming is in practice dominated by the C ABI (great post on that here by the way https://drewdevault.com/2020/03/03/Abiopause.html). Zig does something quite special that puts it ahead of the crowd in this space; It can import and use C libraries as easily as C does (no bindings required), and it can itself be built into a C library, auto-generating the required C headers.
You are absolutely correct on the point of the C ABI. It's definitely the systems lingua franca.
Just finished reading the Zig cc article and I must say I'm also quite impressed. I'll be keeping an eye on the next Zig release--being able to eventually use it as a `cc` or `msvc` replacement would be a big game changer. Having recently gone through the exercise of trying to cross compile a few C++ and GTK apps, I can really see the appeal.
Rust is not a C++ replacement, nor a C replacement. It targets its own niche (embedded, realtime, correctness-oriented). It's a totally different development culture. Zig, OTOH, is absolutely a C replacement.
My day job is maintaining the toolchains for a popular safety-certified realtime embedded operating system. I have never once been asked to provide a Rust toolchain. Fortran, yes. Ada, yes. Python and Go, yes. By and large it's just C (and more and more C++).
Rust seems to be mostly something "full stack developers" and "back end developers" embrace for server-side toys and hobby projects.
Python, on a realtime embedded operating system? Rust isn't really there for embedded development yet. It's great for low-level code on full-blown processors/OSs though
Top-end embedded processors have GPUs and hypervisors these days and run AI algorithms to, say, detect lane changes and do parallel-parking maneuvers. These days AI means code written in Python and Fortran.
Rust does not target correctness-oriented code at all. Neither do standard C or C++, for that matter, but there are derivatives of C that can (and are) used for it.
> It targets its own niche (embedded, realtime, correctness-oriented).
I don't know of anyone who actually uses Rust in that niche.
Everyone who uses Rust is using it because of "C++ is hard, let's go shopping instead" syndrome.
I.e., at this point it's a language for beginners to ease themselves into programming without training wheels and eventually graduate to real big boy programming languages.
It's pretty much irrelevant what people feel emotionally about Rust.
The real-world fact is that Rust is, as of March 2020 at least, an entry-level systems programming language. It's used as a stepping stone by former PHP/Python/Go programmers, who are very intimidated by C++, to get into performance-oriented coding.
Nobody actually writing embedded or sensitive code (airplanes, nuclear power stations, etc.) is doing it in Rust.
The language is young, and you don't certify a software solution every two days or rewrite your nuclear power station code every day.
Very experienced programmers have switched to Rust because it makes it possible to build large-scale industrial programs that are both efficient and reliable. They won't switch to C++ just because they think they're good enough to live dangerously.
(btw I work on plant control and yes I write parts in Rust)
It's true that a lot of PHP/Python/Go programmers look to Rust rather than C++ when getting into performance-oriented code. But it's not really a stepping stone, because you never have to leave.
It's true that not a lot of people are using Rust for embedded software. That's a much harder nut to crack because so many of the toolchains are proprietary (and embedded support in Rust is still missing quite a few things).
Also: I've used Rust at work. In fact, I learnt Rust because I was processing a lot of data at work, and needed a fast language to do so in a reasonable amount of time.
I know it's not as complicated as C++, but it sure seems like they're in a rush to get there :) . Modern C++ (post-C++14) covers most of my criticisms of C++ for my daily driver. I still poke at Rust because I really love statically typed languages :)
I've been very surprised by Rust's complexity. Some of it seems to be being hidden over time (as well as adding new features) but its syntax at any given moment is generally more complex than C++.
I'm not being contrarian, but I have only been following Zig in passing. Can you give a few examples of the increase in complexity? I am genuinely curious. Zig seems, to my eyes at least, to have done an admirable job of remaining simple when compared to the behemoth that is C++, of any vintage (but especially post-C++03).
I wonder how much it would cost to sponsor compiles-via-c support. I'd love to use zig-the-language but I need to compile for platforms that LLVM does not support, so I would need to use the native C toolchain (assembler/linker at the least, but using the native C compiler seems easier).
When we talk about something being "C-like", the syntax is what we're talking about. Being "distracted by the syntax" doesn't make sense when the syntax is the entire point of the statement being made.
Admittedly it's been decades since I've done any C (literally since 1999, except for an LD_PRELOAD shim I wrote about 5 years ago) but being someone who prefers C and other ALGOL-syntax derivatives I find the syntax relatively easy to grok. One advantage over C is that the confusing pointer/array/function pointer thing has a well-defined order in Zig; there's no "spiral order" nonsense going on.
There is no spiral. C's pointer declaration syntax is arguably backwards ("declaration follows usage") but fundamentally follows normal precedence rules. The so-called spiral is total nonsense misinformation.
That's exactly my point. It's confusing enough that someone made a bullshit claim that was totally wrong and confused noobs for years. That won't happen with Zig.
The spiral rule works if "pointer to" and "function returning"/"array of" alternate but it breaks if they don't, i.e., if you have arrays of arrays or pointers to pointers.
On a first glance, it does look like a simpler version of Rust, and I say it without demeaning Zig. Looks very promising, I'll be keeping an eye for it.
It's been amazing following all the progress so far. I'm a proud $5/mo sponsor and look forward to writing something in Zig soon!
Are there any concurrency constructs provided by the language yet? I'm just starting to learn how to do concurrency in lower-level langauges (with mutexes and spinlocks and stuff). I'm coming from the world of Python where my experience with concurrent state is limited to simple row-level locks and `with transaction.atomic():`.
Thanks for all your effort on the project. By far my best experience with zig was writing an OpenGL renderer on windows which then “just worked” when I cloned it ran `zig build` on my Linux machine. Felt like magic.
It is amazing work, I'm so glad you invested your attention in this direction, kudos! I haven't used the language and the compiler yet, but just reading the title article I almost jumped for joy, knowing how unexpectedly painful it is to target different versions of system libraries on Linux.
Why does zig purposely fail on windows text files by default? Do you really expect a language to catch on when you are purposely alienating windows users by refusing to parse \r for some bizarre reason?
Every compiler handles it fine; I just can't wrap my head around why someone would intentionally cripple their software and alienate the vast majority of their potential users. I understand the problem and how to solve it, but it's such a giant red flag that there is no way I'll care about this language. Most people probably won't even understand why a simple hello world program fails.
I'm floored that so many people would ignore this barrier to entry and somehow rationalize such a ridiculous design choice. If you make things straight up break for all windows users by default in your new language, it's going nowhere. No rationalization or excuses will change that reality.
Among programmers, the distinction between CRLF and LF is well known. Among non-programmers, you may well see people who will try to write code in WordPad, and that doesn't mean compilers should parse WordPad documents as source files. Do you expect people to write code in Notepad? If not, any reasonable programmer's editor supports Unix EOLs. Guide them to use Notepad++ or VSCode with the proper options/plugins, case closed. Otherwise, you may as well complain about the handling of mixed tabs & spaces in Python, or the fact that C requires semicolons.
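And if you want the repository itself to guarantee that, a single .gitattributes rule keeps checkouts LF-only even on Windows:

*.zig text eol=lf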
Defaults cause zig to error out. This isn't that hard to understand or fix. It isn't even clear when it happens, because it's just not something that any other compiler would error on. It isn't about being able to fix it, it's about creating an asinine hurdle for anyone trying the language for the first time on Windows. No other compiler or scripting language makes someone jump through that hoop. It is completely ridiculous. I still cannot believe anyone would defend this decision. Do you want people to use it or not?
I have no issue with that decision at all and find it akin to Go forcing users to adhere to gofmt, and I don't use Zig. A good error message would be nice (EDIT: A zigfmt would be even better).
How many developers who try out new languages are even using Windows? I'd imagine most are on a UNIX-like.
Regardless, I don't think it's a big deal either way. You could always make a merge request with a fix if you feel so strongly about it.
Also tabs. I eventually discovered how to use parser combinators, but it would have made it more frustrating to get started, and I'm conscious that the cognitive burden of these sorts of things can get overwhelming, especially for a project I'm not getting paid for.
I'm so glad we are seeing a shift towards language simplicity in all aspects (control flows, keywords, feature-set, ...).
It's so important in ensuring reliable codebases in general.
i would love to see a shift towards small languages, which is subtly different from a shift towards simple languages.
there are plenty of things i feel are serious shortcomings of C (mixing error results with returned values is my big one), but the fact that the set of things the language can do is small and the ways you can do them are limited makes it much easier to write code that is easy to read. and that will always keep me coming back.
Perhaps I missed this from the blog post since it's unfamiliar to me. Can you compile Linux with this? Like, could you really straight up use this as a drop-in replacement for a whole Gentoo system?
I'm guessing the build system of Linux depends on more than just a C compiler, and that's why the answer to the question is "no". If the build system of Linux only depends on a C compiler then my answer would be:
That would be a nice stress test, which would undoubtedly lead to bugs discovered. After enough bugs fixed, the answer would be "yes".
Historically the Linux kernel used GCC extensions that Clang did not support, and this is just a thin shim around the ordinary Clang frontend, so to the extent that's still a problem: no.
Otherwise: yeah. It's just clang's main function with a different name slapped on. Linking semantics may differ slightly which could be problematic. But in theory, yes.
Zig is great and I can't wait to try cc! However, Andrew, if you're reading this, the std lib is very confusingly named. It's not a hash map, it's AutoHashMap; it's not a list, it's an ArrayList, etc. I had a lot of trouble finding idiomatic code without having to search through the std sources; an example in each struct/function doc would help a ton.
To be fair, 'list' is such a generic term it's not really useful. ArrayList and LinkedList and even a hash table are all examples of lists, but their performance characteristics vary so wildly that it doesn't make sense to call any of them simply 'list'.
I reckon the point of the naming is to force you to think about exactly which behavior you want/need in your program, given that arrays and linked lists (for example) are very different from one another with very different performance characteristics.
It's a bit unfortunate that (last I checked) there's no "I don't care how it's implemented as long as it's a list" option at the moment (e.g. for libraries that don't necessarily want to be opinionated about which list implementation to use). Should be possible to implement it as a common interface the same way the allocators in Zig's stdlib each implement a common interface (by generating a struct with pointers to the relevant interface functions).
Zig looks super cool. I've been wanting to experiment with it for some system software.
Are there any guides anywhere for calling libc functions from zig? I'm interested in fork/join, chmod, fcntl, and that kind of thing. Do I just import the C headers manually? Or is there some kind of built-in libc binding?
I think you can call libc by importing the C headers "automagically", but zig also gives you some of these things in its (admittedly still poorly documented) std lib, e.g. under std.os.
I read through the post but I'm still a bit confused as to what parts of this are Zig and what parts are coming from other dependencies. What exactly is zig cc doing, and what does it rely on already existing? Where are the savings coming from? Some people are mentioning that this is a clang frontend, so is the novelty here that zig cc 1. passes the correct options to clang and 2. ships with recompilable support libraries (written in Zig, with a C ABI) to statically link these correctly (or, in the case of glibc, it seems to have some reduced fileset that it compiles into a stub libc to link against)? Where is the clang that the options are being passed to coming from? Is this a libclang or something that Zig ships with? Does this rely on the existence of a "dumb" cross compiler in the back at all?
To compile C code, you need a bunch of different things: headers for the target system, a C compiler targetting that system, and a libc implementation to (statically or dynamically) link against. Different libc implementations are compatible with the C standard, but can be incompatible with each other, so it's important that you get the right one.
Cross-compiling with clang is complex because it's just a C compiler, and doesn't make assumptions about what headers the target system might use, or what libc it's using, so you have to set all those things up separately.
Zig is (apparently) a new language built on Clang/LLVM, so it can re-use that to provide a C compiler. It also makes cross-compilation easier in two other ways. First, it limits the number of supported targets - only Linux and Windows, and on Linux only glibc and musl, and all supported on a fixed list of the most common architectures. Second, building Zig involves pre-compiling every supported libc for every supported OS and architecture, and bundling them with the downloadable Zig package. That moves a lot of work from the end user to the Zig maintainers.
Like most magic tricks there's no actual magic involved, it's just somebody doing more and harder work than you can believe anyone would reasonably do.
IIUC, the libcs are not prebuilt and bundled ("for every supported target combination"); "just" the source code of musl and glibc is bundled, and it is compiled lazily on your machine when first needed (i.e. for whatever target you invoke zig with).
On the Zig main page, there is a claim that making unsigned overflow undefined can allow more optimization... but the example given is extremely dependent on the constants chosen. Try changing the 3/6 used in the C++ version to 2/4, 3/8 or 4/8, for example... It is very strange how the clang code generation changes for each pair of constants!
(Also, having undefined behaviour on unsigned overflow can make some bit-twiddling code harder to write. Then again, maybe Zig has a bit-twiddling unsigned-like type without an overflow check? Zig allows turning off checks, but then it turns off checks for everything...)
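If I'm not mistaken, Zig does offer per-operation wrapping arithmetic operators (`+%`, `-%`, `*%`), which cover much of the bit-twiddling case without globally disabling checks. A tiny sketch:

```zig
const std = @import("std");

pub fn main() void {
    var x: u8 = 250;
    // `+%` wraps on overflow instead of tripping the overflow check:
    x = x +% 10; // 260 mod 256 == 4
    std.debug.assert(x == 4);
}
```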
I really love Zig: it can apparently replace our C toolchain with brilliant static cross-compilation, including s390 Linux support (mainframe Linux!).
My only gripe is that the syntax and stdlib, although practical and to the point, seem to suffer from some strange choices that somewhat clash with its own, albeit early, "zen" of simplicity.
- '@' prefix for builtin functions: a little strange and macro-looking to my eyes. Why not just plain keywords? And clean up some of it: `@cos`, `@sin` also feel like too much when they are already in the stdlib, I believe.
- |x| bind var for for/while loops: why not just for (x in y)? Surrounding pipes are really annoying to type on some foreign keyboard layouts and feel totally needless in 99% of places.
- inconsistent parenthesis requirements for predicates in block statements: "test STR {}" vs. "if () {}". Either require parentheses or don't; I don't really care which.
- prefixed type signatures, `?[]u32` feels a little off / harder to read.
- the need to dig deep into "std" to get everyday lib functions out, e.g. `std.io.getStdOut().outStream().print()`, with `@import("std")` repeated many times.
- consider implementing destructuring syntax early on to deal with so much struct member depth, e.g. `const { x, y } = p` or `const { math: { add, mul } } = @import("std")`.
- anonymous list syntax with `.{}` is eye-catching since the dot implies "struct member" in Zig, but then the dot is everywhere, especially when you write anonymous structs `.{.x=123}`; maybe consider `[1,2,3]` and `[x=123]`, given brackets are already used for array length annotation anyway, i.e. `array[]`.
- the `.*` suffix for lvalue and rvalue pointer deref. Also `"str".*` is a byte array unroll, if I understood correctly. Here `f.* = Foo{ .float = 12.34 };` looks like it's doing something with `.` to get at the struct members, but it's actually just a pointer deref. It also looks like a file or import lib wildcard (`file.*`) to my eyes.
- field access by string is clunky: `@field(p, "x") = 123;`, with an odd-looking function call as an lvalue.
Sorry for the criticism - we're seriously checking out Zig for migrating a large C codebase and replacing future C projects. Although we can live with these quirks, they make the language look a little random and NIH, and that worries me and the team. For instance, Golang has great syntactic and semantic consistency, which is a boost to project steering quality and assured, life-long onboarding for newbies. Please consider widening the spec peer-review process, maybe in a separate GitHub repo with markdown proposal write-ups. Discussing syntax seems superficial given the many project and compiler feats under the hood, but it can become a sort of "genetic disease" and a deal-breaker for the project in the long run!
This is a pre-release version, I know, but it's just that my hopes are really up for Zig, as Golang, C++ and Rust never really did it for us as a multi-target software toolchain, for various reasons.
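For readers who haven't seen Zig, here is a purely illustrative snippet showing several of the constructs mentioned in that list - capture syntax, `.{...}` literals, `?T` optionals, `.*` dereference, and `@field` (the `Point` type and values are made up):

```zig
const std = @import("std");

pub fn main() void {
    // |x| capture in a for loop over an array.
    const xs = [_]u32{ 1, 2, 3 };
    var sum: u32 = 0;
    for (xs) |x| {
        sum += x;
    }
    std.debug.assert(sum == 6);

    const Point = struct { x: i32 = 0, y: i32 = 0 };

    // Anonymous literal `.{...}` coerced to Point, and a `?T` optional.
    var p: Point = .{ .x = 1 };
    var maybe: ?Point = null;
    maybe = p;
    std.debug.assert(maybe != null);

    // `.*` pointer dereference used as an lvalue.
    const ptr = &p;
    ptr.* = Point{ .x = 10, .y = 20 };

    // Field access by string, also usable as an lvalue.
    @field(p, "y") = 123;
    std.debug.assert(p.y == 123);
}
```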
Feel free to raise these as issues on Zig's issue tracker on GitHub, or comment on those which have previously been raised. If you have a good reason for something being a certain way, write up why and it may be considered as a proposal.
While I agree it's jarring to adjust to Zig's purpose for the for loop (iterating over collections), the syntax here is consciously pushing you to adopt while loops instead. The only snag is that the invariant lives outside the scope of the while, which feels bad.
~~Does `zig cc` cross compile libgcc/compiler-rt on the fly? Does it compile libc on the fly?~~ Never mind, I did not scroll far enough - it does compile them on the fly. Whew!
As someone who also cares greatly about cross compilation: compilers definitely should step up their game, but `-target x86_64-windows-gnu` elides many details unless you are confined to a fixed set of platforms.
Looks very cool! Did not see 32-bit RISC-V on the list though, so wondering about that. I would have liked to use Zig cc to build 32-bit RISC-V binaries fast, if that is possible. Doesn't matter if they are freestanding.
Afaik the only drawback is that this functionality is very new and still has some open issues (linked at the end of the post). As stated, Zig has no dependencies and ships in relatively small tarballs, which can be downloaded from the Zig website: https://ziglang.org/download/
The limitation is maturity/stability I'd say. Zig is still pre-1.0
Speed? Using Zig should be faster than using Clang directly in many cases. You get the caching system, and I think you can do more complex builds without having to resort to multiple Clang commands from a makefile.
Can Zig compile to C? So many languages would be very useful if they could compile to C for embedded systems, as native compilers are very unlikely for new (or even old) languages.
Compiling to C source isn't planned for the reference Zig compiler, as far as I know. It's more interested in helping move people off of C (see `zig translate-c`).
But for supporting more esoteric targets you might be interested in the goals of this ultra-early-stage assembler. ("Planned targets: All of them.")
As a (part-time) C programmer, I wouldn't really consider a "C replacement" that can't compile to C. Part of the appeal of C is that it's easy to integrate into other projects, regardless of the build system or obscure hardware it might be targeting. If you require a compiler for a language few people have heard of (even a very cool language), it seriously limits your potential user-base.
If you tell me that I can write better, safer code by using Zig, but I can also compile it into a .c artifact that anybody can use, now that is a tempting proposition!
> Part of the appeal of C is that it's easy to integrate into other projects, regardless of the build system or obscure hardware it might be targeting.
Zig should be just as easy to integrate. Sure, it's one more thing to install, but it'll spit out .o files just like a C compiler would if you tell it to (which means you can shove it in your Makefile or what have you), and will spit out .h files for linking. You miss out on Zig's build system niceties that way, though (including the cross-compilation demonstrated in the article).
Regardless, being a "C replacement" kinda implies (if not outright explies) that it's replacing C; compiling to C kinda defeats that purpose. It'd still be useful, though, and is probably possible (might even be relatively trivial if LLVM and/or Clang provide some mechanism to generate C from LLVM IR or some other intermediate representation).
I think the intention here is not the other way round. You can add C files to your Zig project to make the transition, integration with existing code, and collaboration with C-only programmers easier. After all, Zig wants to replace C, not the other way round.
Unless you really really need obscure platforms, just forget about it. Make the jump, don't look back.
After you get a taste of a modern toolchain (with cross-compilation and dependency management, but without not-quite-portable build files to endlessly fiddle with, and without outdated compilers to work around), you will not want to have to compile a C file again.
Languages like Zig and Rust are easy to install. Mostly it's just a tarball, so it's less of an inconvenience than getting the right version of autotools.
It's not obscure platforms, it's a ton of embedded platforms (you may consider them obscure, but the reality is there are millions and millions of devices out there running software on these "obscure" platforms), and most of them don't have many options. At one stage I was writing my own language that compiled to C, based around state machines and actors, since so many devices tend to be some version (often badly implemented) of those two things. But I ended up moving out of the embedded world and kind of lost my prime motivation for doing it.
When you compile to object files, Zig will generate C header files that you can use when linking. Granted, this won't help with embedded targets that Zig can't compile for.
I’m not arguing that certain people using particular standards could consider LLVM bloated and I’m certainly not going to argue that by certain standards C++ could be considered bloated. But for users of LLVM, be it via clang or Zig cc or GHC, it seems to work just fine. Are your complaints from the perspective of a compiler dev (or a general dev who wants to be able to more easily open up and tune a compiler) or are they just as a user? Also, for native binary compilation of performance sensitive applications, how many options are there in common use for the major languages? Your opinion seems pretty severe, so I’m just trying to see why that is.
> This has been my conviction for the past 10 years, and I'm glad I never had to touch llvm with a ten foot pole.
Out of curiosity: what do you touch with a ten foot pole? I'd be hard-pressed to call GCC or MSVC much better in that regard, and I can think of very few others that are in use anymore.
I mean, I've definitely dreamt about using SBCL or Clozure for things other than Lisp (seeing as they both include their own compilers not dependent on GCC/LLVM), but I've seen effectively zero effort in that direction.
I don't want to be negative, as there's too much of that about, but gcc and similar can do some pretty hefty optimisations, and for any real work I suspect those count for a great deal. Just because zig cc can compile C, neat as it is, doesn't make it a drop-in replacement for gcc.
Does yours do loop unrolling, code hoisting, optimising array accesses into pointer increments, common subexpression elimination, etc.?
This is just a new frontend for clang so it should use all the optimization passes of clang. The main new features are convenient cross compilation and better caching for partial compilation results.
> Compare this to downloading Clang, which has 380 MiB Linux-distribution-specific tarballs. Zig's Linux tarballs are fully statically linked, and therefore work correctly on all Linux distributions. The size difference here comes because the Clang tarball ships with more utilities than a C compiler, as well as pre-compiled static libraries for both LLVM and Clang. Zig does not ship with any pre-compiled libraries; instead it ships with source code, and builds what it needs on-the-fly.
Hot damn! You had me at Hello, World!