Zig cc: A drop-in replacement for GCC/Clang (andrewkelley.me)
831 points by hazebooth on March 24, 2020 | 288 comments



> Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig, ran it in Wine, using it to cross compile for Linux, and then ran the binary natively. Computers are fun!

> Compare this to downloading Clang, which has 380 MiB Linux-distribution-specific tarballs. Zig's Linux tarballs are fully statically linked, and therefore work correctly on all Linux distributions. The size difference here comes because the Clang tarball ships with more utilities than a C compiler, as well as pre-compiled static libraries for both LLVM and Clang. Zig does not ship with any pre-compiled libraries; instead it ships with source code, and builds what it needs on-the-fly.

Hot damn! You had me at Hello, World!
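
For reference, the whole trick the article demonstrates boils down to a single -target flag (commands paraphrased from the post; exact target triples can vary by Zig version):

    $ zig cc -o hello hello.c -target x86_64-linux-gnu        # cross compile for Linux
    $ zig cc -o hello.exe hello.c -target x86_64-windows-gnu  # or for Windows, same compiler download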


> Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig, ran it in Wine, using it to cross compile for Linux, and then ran the binary natively. Computers are fun!

Even though it probably doesn't qualify, this is pretty close to a Canadian Cross, which for some reason is one of my favorite pieces of CS trivia. It's when you cross compile a cross compiler.

https://en.wikipedia.org/wiki/Cross_compiler#Canadian_Cross

> The term Canadian Cross came about because at the time that these issues were under discussion, Canada had three national political parties.


In what way is this even close?

What are the three targets in this case? It simply isn’t relevant at all.


It is tangentially relevant CS trivia. I found it to be interesting and fun. The only thing completely useless here is unfortunately your comment.


You wrote:

> this is pretty close to a Canadian Cross

And on that point, your correspondent is right. The two bear no real resemblance to each other. The cross compilation approach described in the article is not something to be held in high regard. It's the result of poor design. It's a lot of work involving esoteric implementation details to solve a problem that the person using the compiler should never have encountered in the first place. It's exactly the problem that the Zig project leader is highlighting in the article when he contrasts Zig with Clang, etc.

The way compilers like Go and Zig work is the only reasonable way to approach cross compilation: every compiler should already be able to cross compile.


Thanks for putting it this way. I always wondered why cross compilation was a big deal. For me it sounds like saying "look, I can write a text in English without having to be in an English-speaking country!".


The problem with cross compilation isn't the compiler, it's the build system. See e.g. autoconf, which builds test binaries and then executes them to test for the availability of strcmp(3).
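
For anyone who hasn't peeked inside a configure script: the generated check is roughly a throwaway C file like the sketch below (names approximate, not copied from autoconf), which configure compiles and links; run-time checks (AC_RUN_IFELSE) additionally execute the result, and that execution step is exactly what breaks when the build machine can't run target binaries.

    $ cat conftest.c
    /* rough sketch of an autoconf-style link check for strcmp(3) */
    char strcmp ();                     /* deliberately vague declaration; we only care whether it links */
    int main () { return strcmp != 0; }
    $ cc -o conftest conftest.c         # link check: still fine when cross compiling
    $ ./conftest                        # run-time checks also execute the result - impossible on a cross build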


I feel like there should be a more sane way to test for the availability of strcmp than to build and run a whole executable and see if it works.

The sheer number of things autoconf does even for trivial programs has always been baffling to me.


Go doesn't cross compile: it only supports the Go operating system on a very limited number of processor variants.

If zig were to truly cross compile for every combination of CPU variant and supported version of every operating system, it would require terabytes of storage and already be out of date.


It doesn't require nearly as much storage as you think. For zig, there's only 3 standard libraries (glibc, musl, mingw) and 7 unique architectures (ARM, x86, MIPS, PowerPC, SPARC, RISC-V, WASM). LLVM can support all of these with a pretty small footprint and since the standard libraries can be recompiled by Zig, it really only needs to ship source code - no binaries necessary.
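
(You can see the full matrix the compiler knows about with the targets subcommand:)

    $ zig targets    # lists every supported arch/OS/ABI combination, plus the libcs it can build from source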


If it's only supporting two OS runtimes and a small subset of hardware it's mostly just a curiosity.


People really do appreciate such convenience. I am not familiar with Zig, but Go provides me a similar experience for cross-compilation.

Being able to bootstrap FreeBSD/amd64, Linux/arm64, and other commonly-used OS/ARCH combinations in a few minutes sounds like a dream, but it is reality for modern language users.
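
For anyone who hasn't tried it, in Go that really is just two environment variables per target, e.g.:

    $ GOOS=freebsd GOARCH=amd64 go build ./...
    $ GOOS=linux GOARCH=arm64 go build ./...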


I'm all for cross compilation, but in reality you still need running copies of those other operating systems in order to be able to test what you've built.


Setting up 3 or 4 VM images for different OSes takes a few minutes. Configuring 3 or 4 different build environments across as many OSes on the other hand ...


And actually building on these potentially very low-power systems...


Sure, but building typically takes more resources than executing, so it's not really feasible to use a Raspberry Pi to build, but it can be for testing.


Yes but the dev setup is not really necessary for all of those OSes.


Only if not using OS APIs.


Yeah, sorry, I didn't think about that. Probably very important, as low-level code like this is mainly for talking with OS APIs.


You can do that in clang/gcc but you need to pass: -static and -static-plt(? I can't find what it's called). The second option is to ensure it's loader-independent, otherwise you get problems when compiling and running across musl/glibc platforms


Could you elaborate/link on the loader-independency topic?


In brief, most programs these days are position-independent, which means you need a runtime loader to load the sections and symbols of the code into memory and tell other parts of the code where they've been put. Because of differences between musl libc and GNU libc, in effect this means that a program compiled against GNU libc can be marked as executable, but when the user tries to run it they are told it is "not executable", because the binary is looking in the wrong place for the dynamic loader, which is named differently across the libraries. There are also some archaic, non-standard symbols that GNU libc defines which musl libc has a problem with, and that can cause problems for the end user.

e: I didn't realise it was 5am, so I'm sorry if it's not very coherent.


I would also appreciate it if you could be even more specific once more "coherency" is possible. I'm also interested in what more you can say specifically about "The second option is to ensure it's loader-independent, otherwise you get problems when compiling and running across musl/glibc platforms".


Ok so, it's been a year or so since I was buggering around with ELF internals (I wrote a simpler header in assembly so I could make a ridiculously small binary...). Let's take a look at an ELF program. If you run `readelf -l $(which gcc)` you get a bunch of output, among which is:

    alx@foo:~$ readelf -l $(which gcc)

    Elf file type is EXEC (Executable file)
    Entry point 0x467de0
    There are 10 program headers, starting at offset 64

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                     0x0000000000000230 0x0000000000000230  R      0x8
      INTERP         0x0000000000000270 0x0000000000400270 0x0000000000400270
                     0x000000000000001c 0x000000000000001c  R      0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                     0x00000000000fa8f4 0x00000000000fa8f4  R E    0x200000
You can see that among the program headers is an entry called "INTERP" that requests the loader. This is because the program has been compiled with the -fPIE flag, which requests a "Position Independent Executable". This means that each section of the code has been compiled so that it doesn't expect a set position in memory for the other sections. In other words, you can't just run it on a UNIX computer and expect it to work; it relies on another program, the loader, to load each section and tell the other sections where to find it.

The problem with this is that the musl loader (I don't have my x200 available right now to copy some output from it to illustrate the difference) usually lives at a different path. What this means is that when the program is run, the kernel tries to find the program interpreter named in the binary; because musl libc's program interpreter has a different name and location in the filesystem hierarchy, it fails to execute the program and you get something like "not a valid executable".

Now you would think a naive solution would be to symlink the musl libc loader to the expected position in the filesystem hierarchy. The problem with this is illustrated when you look at the other dependencies and symbols the program expects. Let's have a look:

    alx@foo:~$ readelf -s $(which gcc)

    Symbol table '.dynsym' contains 153 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __strcat_chk@GLIBC_2.3.4 (2)
         2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __uflow@GLIBC_2.2.5 (3)
         3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND mkstemps@GLIBC_2.11 (4)
         4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getenv@GLIBC_2.2.5 (3)
         5: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND dl_iterate_phdr@GLIBC_2.2.5 (3)
         6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __snprintf_chk@GLIBC_2.3.4 (2)
         7: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __pthread_key_create
         8: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND putchar@GLIBC_2.2.5 (3)
         9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strcasecmp@GLIBC_2.2.5 (3)
As you can see, the program not only expects a GNU program interpreter, but the symbols the program has been linked against expect GLIBC_2.2.5 version numbers as part of the exported symbols (Although I cannot recall if this causes a problem or not, memory says it does, but you'd be better off reading the ELF specification at this point, which you can find here: https://refspecs.linuxfoundation.org/LSB_2.1.0/LSB-Core-gene...). So the ultimate result of trying to run this program on a musl libc system is that it fails to run, because the symbols are 'missing'. On top of this, you can see with `readelf -d` that it relies on the libc library:

    alx@foo:~$ readelf -d $(which gcc)

    Dynamic section at offset 0xfddd8 contains 25 entries:
      Tag        Type                         Name/Value
     0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
     0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
     0x000000000000000c (INIT)               0x4026a8
Unfortunately for us, the libc.so.6 binary produced by the GNU system is also symbolically incompatible with the one produced by musl; GNU libc also defines some functions and symbols that are not in the C standard. The ultimate result of this is that you need to link statically against libc, and against the program loader, for this binary to have a chance of running on a musl system.
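
If you want to see the difference yourself: a fully static build drops the INTERP header entirely, so no program interpreter is involved at all. A quick sketch, assuming a trivial hello.c and that musl-gcc is installed:

    $ gcc -static -o hello hello.c          # fully static against glibc, with the usual caveats
    $ musl-gcc -static -o hello hello.c     # or fully static against musl
    $ readelf -l hello | grep INTERP        # no output: no runtime loader requested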


Wow. Your answer fits really well with the details provided by the author of the original article.

Many, many thanks for the answer! I've already done some experimenting myself and wanted to do more, so it really means a lot to me.


For further interest you might want to take a look at:

http://www.muppetlabs.com/~breadbox/software/tiny/somewhat.h...

I altered a version of that ELF64 header for 64 bit, and then modified it to work under grsec's kernel patches: https://gitlab.com/snippets/1749660


One example of a description of how the Linux linkloader works is here [0]. Other OSes are similar.

[0] https://lwn.net/Articles/631631/


Dlang is a better C. DMD, the reference compiler for Dlang, can also compile and link with C programs. It can even compile and link with C++03 programs.

It has manual memory management as well as garbage collection. You could call it hybrid memory management. You can manually delete GC objects, as well as allocate GC objects into manually allocated memory.

The Zig website says "The reference implementation uses LLVM as a backend for state of the art optimizations." However, LLVM is consistently 5% worse than the GCC toolchain at performance across multiple benchmarks. In contrast, GCC 9 and 10 officially support Dlang.

Help us update the GCC D compiler frontend to the latest DMD.

Help us merge the direct-interface-to-C++ into LLVM D Compiler main. https://github.com/Syniurge/Calypso

Help us port the standard library to WASM.


>However, LLVM is consistently 5% worse than the GCC toolchain at performance across multiple benchmarks

That is true, but it is ALSO true that LLVM is consistently 5% better than the GCC toolchain at performance across multiple benchmarks


D seems like its pitch is "a better C++," but "a better C" doesn't seem quite right.


D's whole premise of "being a better C++" has always made them look like argumentative jerks. Why build a language on top of a controversy? Their main argument from the early 2000s: C++ requires a stdlib and the compiler toolchain is not required to provide one. Wtf D? I mean I understand that C++ provides a lot of abstractions on top of C to call itself "more" than C, but what does D provide other than a few conveniences? If you even consider garbage collection, better-looking syntax, or a more consistent, less orthogonal syntax a convenience. It didn't even have most of its current features when it first came out back in the early 2000s. Trying to gain adoption by creating some sort of counterculture, what are they? 14? /oneparagraphrant


It is probably the case that D has a brilliant engineering team who don't really focus on the PR side of things. D definitely provides value over C/C++ beyond a few sugars for the syntax. It is just not communicated that well.


It has an official subset and associated compiler flag. (https://dlang.org/spec/betterc.html)


Can DMD compile C programs though? That's what "zig cc" does, and it's so much easier to get up-and-running than any crossdev setup I've used before.


Yes.


Not really, because unlike Zig, D doesn't allow for common C security exploits, unless one explicitly writes them as such.


Really incredible work and it's been very fun to follow along. The streams where Andrew did the last part of this work can be seen here: [1], [2].

I am really happy that someone is making the effort to steadily simplify systems programming rather than make it more complicated. Linux goes to such incredible lengths to be bug-for-bug backwards compatible, but then the complexities of all of our layers of libcs, shared libraries, libsystemd, dbus, etc. cause unnecessary pain and breakage at every level. Furthermore, cross-compiling C code across different architectures on Linux is far harder than it needs to be. I have a feeling that there wouldn't be as much interest in the steady stream of sandboxes and virtual machines (JVM, NaCl, PNaCl, flatpak, docker, WebAssembly) if we could just simplify the layers and layers of cruft and abstractions in compiler toolchains, libc implementations, and shared libraries. Practically every laptop and server processor uses the exact same amd64 architecture, but we have squandered this opportunity by adding leaky abstractions at so many levels. I can't wait until installing a program on Linux is as simple as downloading a static executable and just running it, and I hope Zig brings this future.

[1] https://www.youtube.com/watch?v=2u2lEJv7Ukw [2] https://www.youtube.com/watch?v=5S2YArCx6vU


Linux and GCC today have the ability to compile and run fully static executables; I don't understand why this isn't done...


> I don't understand why this isn't done

Because when there's a security update to (say) OpenSSL, it's better for the maintainers of just that library to push an update, as opposed to forcing every single dependent to rebuild & push a new release.


My main issue with this rationale is that, in the vast majority of production environments (at least the ones I've seen in the wild, and indeed the ones I've built), updating dependencies for dynamically-linked dependents is part of the "release" process just like doing so for a statically-linked dependent, so this ends up being a distinction without a difference; in either circumstance, there's a "rebuild" as the development and/or operations teams test and deploy the new application and runtime environment.

This is only slightly more relevant for pure system administration scenarios where the machine is exclusively running software prebuilt by some third-party vendor (e.g. your average Linux distro package repo). Even then, unless you're doing blind automatic upgrades (which some shops do, but it carries its own set of risks), you're still hopefully at least testing new versions and employing some sort of well-defined deployment workflow.

Also, if that "security update" introduces a breaking change (which Shouldn't Happen™, but something something Murphy's Law something something), then - again - retesting and rebuilding a runtime environment for a dynamically-linked dependent v. rebuilding a statically-linked dependent is a distinction without a difference.


I would have agreed with this statement about five years ago. (Even though you would have had to restart all the dependent binaries after updating the shared libs.)

Today, with containers becoming increasingly the de facto means of deploying software, it's not so important anymore. The upgrade process is now: (1) build an updated image; (2) upgrade your deployment manifest; (3) upload your manifest to your control plane. The control plane manages the rest.

The other reason to use shared libs is for memory conservation, but except on the smallest devices, I'm not sure the average person cares about conserving a few MB of memory on 4GB+ machines anymore.


> Today, with containers becoming increasingly the de facto means of deploying software

I think that's something of an exaggeration.

Yes, containers are popular for server software, but even then it's a huge stretch to claim they are becoming de facto.


App bundles on MacOS and iOS are basically big containers, though there is some limited external linking through Apple's frameworks scheme.

And obviously video game distribution has looked like this since basically forever as well.


> App bundles on MacOS and iOS are basically big containers, though there is some limited external linking through Apple's frameworks scheme.

There's a file hundreds of megabytes large containing all the dynamically-linked system libraries on iOS to make your apps work.


Video games do not run on/as containers. Quite the opposite, in fact.


In addition to pjmlp's list, Steam is pushing toward this for Linux games (and one could argue that Steam has been this for as long as it's been available on Linux, given that it maintains its own runtime specifically so that games don't have to take distro-specific quirks into account).

Beyond containers / isolated runtime environments, the parent comment is correct about games (specifically of the console variety) being historically nearly-always statically-linked never-updated monoliths (which is how I interpreted that comment). "Patching" a game after-the-fact was effectively unheard of until around the time of the PS3 / Xbox 360 / Wii (when Internet connectivity became more of a norm for game consoles), with the sole exception of perhaps releasing a new edition of it entirely (which would have little to no impact on the copies already sold).


Kind of.

They do on Xbox, Switch, iOS, and Android sandboxes.


> Today, with containers becoming increasingly the de facto means

This assertion makes no sense at all and entirely misses the whole point of shared/dynamic libraries. It's like a buzzword is a magic spell that makes some people forget the entire history and design requirements up to that very moment.


Sometimes buzzwords make sense, in the right context. This was the right context.

Assuming you use containers, you're likely to not log into them and keep them up to date and secure by running apt-get upgrade.

The most common workflow is indeed: build your software in your CI system, in the last step create a container with your software and its dependencies. Then update your deployment with a new version of the whole image.

A container image is for all intents and purposes the new "static binary".

Yes, technically you can look inside it, yes technically you can (and you do) use dynamic linking inside the container itself.

But as long as the workflow is the one depicted above, the environment no longer has the requirements that led to the design of dynamic linking.

It's possible to have alternative workflows for building containers: you could fiddle with layers and swap an updated base OS under a layer containing your compiled application. I don't know how common that is, but I'm sure somebody will want/have to do it.

It all boils down to whether developers still maintain control over the full deployment pipeline as containers penetrate the enterprises (i.e. whether we retain the "shift to the left", another buzzword for you).

Containers are not just a technical solution, they are the embodiment of the desire of developers to free themselves from the tyranny of filing tickets and waiting days to deploy their apps. But that leaves the security departments in enterprises understandably worried as most of those developers are focused on shipping features and often neglecting (or ignoring) security concerns around things that live one layer below the application they write.


Shared libraries have largely proven that they aren't a good idea, which is why containers are so popular. Between conflicts and broken compatibility between updates, shared libraries have become more trouble than they are worth.

I think they still make sense for base-system libraries, but unfortunately there is no agreed upon definition of 'base-system' in the wild west of Linux.


And the reason we're using containers in the first place is precisely because we've messed up and traded shared libs for having a proven-interworking set of them, something that can trivially be achieved using static linking.


Actually the main selling point of containers has nothing to do with "proven interworking", but the ability to deploy and run entire applications in a fully controlled and fully configurable environment.

Static libraries do nothing of the sort. In fact, they make it practically impossible to pull it off.

There's far more to deploying software than mindlessly binding libraries.


On Windows, I don't need to use Docker in order to run a program in a reproducible way. I just download a program, and in 90% of cases it "just works" whether I'm running Windows 10, Windows 8, or the decade-old Windows 7.

Furthermore, installing that program will (again, in 90% of cases at least) not affect my overall system configuration in any way. I can be confident that all of my other programs will continue to work as they have.

Why? Because any libraries which aren't included in the least-common-denominator version of Windows are included with the download, and are used only for that download. The libraries may be shipped as DLLs next to the executable, which are technically dynamic, but it's the same concept; those DLLs are program-specific.

This ability is what I really miss when I try to switch to desktop Linux. I don't want to set up Docker containers for random desktop apps, and I don't want a given app to affect the state of my overall system. I want to download and run stuff.

---

I realize there are a couple of big caveats here. Since Windows programs aren't sandboxed, misbehaving programs absolutely can hose a system, but at least that's not the intended way things are supposed to work. I'm also skipping over runtimes such as Visual C++, but as I see it, those can almost be considered part of the OS at this point. And I can have a ridiculous number of versions of MSVC installed simultaneously without issue.


> On Windows, I don't need to use Docker in order to run a program in a reproducible way. I just download a program, and in 90% of cases it "just works" whether I'm running Windows 10, Windows 8, or the decade-old Windows 7.

One program? How nice. How about 10 or 20 programs running at the same time, and communicating between themselves over a network? And how is your program configured? Can you roll back changes not only in which versions of the programs are currently running but also in how they are configured?

> This ability is what I really miss when I try to switch to desktop Linux. I don't want to set up Docker containers for random desktop apps,

You're showing some ignorance and confusion. You're somehow confusing application packages and the natural consequence of backward compatibility with containers. In Linux, deploying an application is a solved problem, unlike windows. Moreover, docker is not used to run desktop applications at all. At most, tools like Canonical's Snappy are used, which enable you to run containerized applications in a completely transparent way, from installation to running.


> the ability to deploy and run entire applications in a fully controlled and fully configurable environment

But isn't the reason to have this fully controlled and fully configurable environment to have a proof of interworking? Because when the environment is different in any way you can, and people already do, say that it's not supported.


> But isn't the reason to have this fully controlled and fully configurable environment to have a proof of interworking?

No, because there's far more to deploying apps than copying libraries somewhere.


> Actually the main selling point of containers has nothing to do with "proven interworking", but the ability to deploy and run entire applications in a fully controlled and fully configurable environment.

Which is exactly the same selling point as for static linking.


Some of us use linux as a desktop environment, and like having the security patches be applied as soon as the relevant package has updated.


As a user of the Linux desktop, I really love it when library updates break compatibility with the software I use too. Or can't be installed because of dependency conflicts.

Containers are popular because shared libraries cause more trouble than they are worth.


Containers most likely wouldn't have existed if we had a proper ecosystem around static linking and resolution of dependencies. Containers solve the problem of the lack of executable state control, mostly caused by dynamic linking.


More broadly, containers solve the problem of reproducibility. No longer does software get to vomit crap all over your file system in ways that make reproducing a functioning environment frustrating and troublesome. They have the effect of side-stepping the dependencies problem, but that isn’t the core benefit.


But the images themselves are not easily reproducible with standard build tooling.


True—but that's far less of a problem, because it rarely occurs unexpectedly and under a time crunch.

Diffing two docker images to determine the differences between builds would be far less onerous than attempting to diff a new deployment against a long-lived production server.


Dynamic linking isn't the issue. Shared libraries are the issue. You could bundle a bunch of .so files with your executable & stick it in a directory, and have the executable link using those. That's basically how Windows does it, and it's why there's no "dependency hell" there despite having .dlls (dynamically linked libraries) all over the place.

Shared libraries are shared (obviously) and get updated, so they're mutable. Linux systems depend on a substantial amount of shared mutable state being kept consistent. This causes lots of headaches, just as it does in concurrent programming.


Based on my experience this is very rarely the case unless you have an extremely disciplined SecOps team.


> Based on my experience this is very rarely the case

You must have close to zero experience then, because that's the norm for any software that depends on, say, third-party libraries that ship with an OS/distro.

Recommended reading: Debian's openssl package.

https://tracker.debian.org/pkg/openssl


You are talking about a FOSS project, I am talking about a company that has a service that uses OpenSSL in production.


These are not diametrically opposed. Your company can have a service that uses OpenSSL in production that runs on Debian to automatically take advantage of Debian patches and updates if it's linked dynamically to the system provided OpenSSL.

You can either employ an extremely disciplined SecOps team to carefully track updates and CVEs (you'd need this whether you're linking statically or dynamically) or you can use e.g. Debian to take advantage of their work to that end.


Every single company that I used to work for had an internal version of Linux that they approved for production. Internal release cycles are disconnected from external release cycles. On top of that, some of these companies were not using system-wide packages at all; you had to reference a version of packages (like OpenSSL) during your build process. We had to do emergency patching for CVEs and bump the versions in every service. This way you can have 100% confidence that a particular service is running with a particular version of OpenSSL. This process does not depend on Debian's (or another FOSS vendor's) release cycles, and the dependencies are explicit, therefore the vulnerability assessment is simpler (as opposed to going to every server and checking which version is installed). Don't you think?


If you need that level of confidence - sure. But it's going to cost a lot more resources and when you're out of business your customers are fully out of updates. I wouldn't want to depend on that (then again a business customer will want to maintain a support contract anyway).

Isn't a containerized solution a good compromise here? You could use Debian on a fixed major release, be pretty sure what runs and still profit from their maintenance.


What I'm saying is that the only way you can get away with not having an "extremely disciplined SecOps team" is to depend on someone else's extremely disciplined SecOps team. Whether you link statically or dynamically is orthogonal.

> Every single company that I used to work for had an internal version of Linux that they approved for production.

I can't deny your experience, but meanwhile I've been seeing plenty of production systems running Debian and RHEL, and admins asking us to please use the system libraries for the software we deployed there.

> Internal release cycles are disconnected from external release cycles.

That seems to me like the opposite of what you'd want if you want to keep up with CVEs. If you dynamically link system libraries you can however split the process into two: the process of installing system security updates doesn't affect your software development process for as long as they don't introduce breaking changes. Linking statically, your release cycles are instead inherently tied to security updates.

> We had to do emergency patching for CVEs and bump the versions in every service.

What is that if not tying your internal release cycles to external release cycles? The only way it isn't is if you skip updates.

> This process do not depend on Debian's (or other FOSS vendor's) release cycles and the dependencies are explicit, therefore the vulnerability assessment is simpler (as opposed to go to every server and check which version is installed). Don't you think?

I don't know, going to every server to query which versions of all your software they are running seems similarly cumbersome. Of course, if you aren't entirely cowboying it you'll have automated the deployment process whether you're updating Debian packages or using some other means of deploying your service. Using Debian also doesn't make you dependent on their release cycles. If you feel like Debian isn't responding to a vulnerability in a timely manner, you can package your own version and install that.


> You are talking about a FOSS project

I'm talking about the operating system that's pretty much a major component of the backbone of the world's entire IT infrastructure, whether it's directly or indirectly through downstream distros that extend Debian, such as Ubuntu. Collectively they are reported to serve over 20% of the world's websites, and consequently they are the providers and maintainers of the OpenSSL that's used by them.

If we look at containers, docker hub lists that Debian container images have been downloaded over 100M times, and ubuntu container images have been downloaded over 1B times. These statistics don't track how many times derived images are downloaded.


N°1 most harmful post on the internet : https://akkadia.org/drepper/no_static_linking.html

I am convinced that Drepper's insistence on dynamic linking has set Linux desktop usability and developer friendliness back by literal decades.


I work on an embedded linux system that has 256 MB of RAM. That can get eaten up really fast if every process has its own copy of everything.


~15 years ago my "daily driver" had 256MB of RAM and it was perfectly usable for development (native, none of this new bloated web stuff) as well as lots of multitasking. There was rarely a time when I ran out of RAM or had the CPU at full usage for extended periods.

Now it seems even the most trivial of apps needs more than that just to start running, and on a workstation less than a year old with 4 cores of i7 and 32GB of RAM, I still experience lots of lag and swapping (a fast SSD helps, although not much) doing simple things like reading an email.


I'm on a Mac with 32 GB of Ram.

According to Activity Monitor, right now:

• 4.26 GB are being used by apps

• 19.52 GB are cached files

• 8.22 GB are just sitting idle (!)

Now, I'm not running anything particularly intensive at the moment, and I make a point of avoiding Electron apps. I also rebooted just a few hours ago for an unrelated reason.

But the fact is that I've monitored this before—I very rarely manage to use all my RAM. The OS mostly just uses it to cache files, which I suppose is as good a use as any.


and I make a point of avoiding Electron apps

I do that personally too, but in a work environment that is unfortunately not always possible --- and also responsible for much of the RAM usage too.


Slack is the biggest culprit IME. If there was a native client, I'd take it like a shot



> ~15 years ago my "daily driver" had 256MB of RAM and it was perfectly usable for development

What I failed to mention was that the rootfs is also eating into that (ramdisk). In your case I'm guessing your rootfs was on disk.


Oh my god, just try running a modern OS on a spinning rust drive. It's ridiculous how slow it is. It's obvious that modern developers assume everything is running on SSD.


Are you sure? I've been running Linux for a long time with no page file. From 4GB to 32GB (the amount of RAM I have now), and I have literally only run out of RAM once (and that was because of a bug in an ML program I was developing). I find it very hard to believe that you experience any swapping at all with 32GB, much less "lots".


You've likely not experienced the amazing monstrosity that is Microsoft Teams:

https://answers.microsoft.com/en-us/msoffice/forum/all/teams...

There's a screenshot in there showing it taking 22GB of RAM. I've personally never seen it go that high, but the 10-12GB of RAM that I have seen is absolutely ludicrous for a chat app. Even when it's initially started it takes over 600MB. Combine that with a few VMs that also need a few GB of RAM each, as well as another equally-bloated Electron app or two, and you can quickly get into the swapping zone.


How is that possible, 22GB? Fucking electron. You would think, at least, that Microsoft would code a fucking real desktop app.. I hate web browsers.


I also experience the same thing with Mattermost (the client also being an Electron app). The memory bloat usually comes from switching back and forth from so many channels, scrolling up to load more chat history, and lots and lots of image attachments (and of course, the emoticons).


scrolling up to load more chat history, and lots and lots of image attachments (and of course, the emoticons).

I remember comfortably browsing webpages with lots of large images and animated GIFs in the early 2000s, with a fraction of the computing power I have today. Something has become seriously inefficient with browser-based apps.


You said yourself you managed to find a case where you ran out of memory. Why do you find it "very hard to believe", knowing nothing about his use cases, that his job involves exactly the sort of situations that consume vast amounts of RAM? Why do people insist with such conviction that "it doesn't happen to me, therefore it's inconceivable that it happens to someone else, doing something totally different than what I'm doing, into which I have no insight"? Baffling.


> Why do you find it "very hard to believe", knowing nothing about his use cases, that his job doesn't involve exactly the sort of situations that consume vast amounts of RAM.

Probably because the GGP said they experience lag while "doing simple things like reading an email." Now, maybe the GGP meant to add "while I'm sequencing genes in the background", but since that was left out I can see how it would be confusing! :)


That's fair. Good point.


Then don't statically link. "Embedded systems" requirements shouldn't dictate "Desktop" or "Server" requirements.


You should look into FDPIC as a format to store your binaries in. I think it might lessen your concerns.


So dynamic linking makes sure that each dependency is loaded into memory just once?

Could someone estimate how much software nowadays is bloated by duplicated modules?


Note that when using static linking, you don't get a copy of everything, just everything you actually use.

It doesn't alter the fundamental point: shared libraries save both persistent storage and runtime memory.


> Note that when using static linking, you don't get a copy of everything, just everything you actually use.

Which is a significant fraction of everything even if you only call something simple like printf.

> It doesn't alter the fundamental point: shared libraries save both persistent storage and runtime memory.

I fail to see the argument against this. Dynamic linking deduplicates dependencies and allows code to be mapped into multiple processes "for free".


Have you measured it? How much is dynamic linking saving you? How many processes are you running on embedded systems with 256MB of RAM?


Ok, so I just measured with a C "hello world".

My dynamically-linked executable is 8296 bytes on disc. My statically-linked executable is 844,704 bytes on disc.

So if I had a "goodbye world" program as well, that's a saving of about 800KB on disc.

Now one can argue the economics of saving a bit under a megabyte in a time when an 8GB microSD card costs under USD5 in single quantities, but you can't deny that, relatively speaking, it's a big saving.

At runtime, the dynamic version uses (according to top) 10540 KB virtual, 540 KB resident, and 436 KB shared. The static version uses 9092 KB virtual, 256 KB resident, and 188 KB shared.

I haven't investigated those numbers.


256MB of RAM is a fairly large amount–this is how much iPhone 3GS had, for instance. It relied on dynamic linking to system libraries heavily and ran multiple processes.


That proves the point: with multi-GiB memory nowadays you can fit many times over all the space that the iPhone saved using dynamic linking.


I would rather not be limited on my laptop to iPhone apps from ten years ago.


My second sentence was arguing for dynamic linkage (I called them "shared libraries", but I think that's a fairly common nomenclature).


Fear not, static linking is back on the rise!

https://stackoverflow.com/questions/3430400/linux-static-lin...

I've also linked this Zig post into that list (and happy to add further languages if you can provide a link that shows that they have good out-of-the-box static linking support).


Rust has some really good static linking support. If you compile with the `musl` targets (e.g. x86_64-unknown-linux-musl), it will give you a fully statically linked binary.
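
Roughly (the binary name below is a placeholder):

    $ rustup target add x86_64-unknown-linux-musl
    $ cargo build --release --target x86_64-unknown-linux-musl
    $ ldd target/x86_64-unknown-linux-musl/release/myapp    # "not a dynamic executable"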


LD_PRELOAD is really useful, especially if you want to change memory allocators without having to recompile.

I never realised people were moaning about shared libraries.
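
For example, swapping in jemalloc for a quick test (the library path varies by distro, and ./myprogram is a placeholder):

    $ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ./myprogram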


Also if you want to hook all the `open` style calls to make them work in a very tight sandbox :)

(e.g. I have Firefox running under Capsicum: https://bugzilla.mozilla.org/show_bug.cgi?id=1607980)


Dynamic linking is what has allowed Linux to avoid even longer backwards compatibility and even more cruft like Windows has.


Just for readers, do you mean avoiding incompatibility?


Whether they did or not, they spoke truth. Linux's (userland, not kernel) backward compatibility is ridiculously bad unless you're compiling from source. This is not the case on Windows.


No, I mean Linux has been able to avoid long-term backwards compatibility.


If 256MB weren't enough, then from the late 90s to the late aughts it's unlikely anyone could have used a desktop for any work. 256MB of memory was a lot of memory not so long ago.

Looking at how a browser, an IDE, and a few compilation processes will gladly chew through 8GB of memory... it’s not necessarily horrible, but this is a modern contrivance.


I know the ability exists, but I'm pretty sure that it's not exactly easy to get it working. Last time I tried, it immediately failed because my distribution wasn't shipping .a files (IIRC) for my installed libraries. There's a lot of little things that don't quite work because nobody's using them so they're harder to use so nobody uses them...


It's easy to get working provided that you compile _everything_ from source. You can either omit glibc from this, or accept that it will still dynamically load some stuff at runtime even when "statically" linked, or switch to musl. A nice benefit is that LTO can then be applied to the entire program as a whole.
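
A sketch of what that ends up looking like, assuming the static libraries themselves were also built with -flto (libfoo is a placeholder):

    $ cc -O2 -flto -static -o app app.c -lfoo    # whole-program LTO plus a static link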


Yep, exactly this. And quirks of glibc and friends make fully static compilation likely to produce odd failures, unfortunately.


Glibc does not really support static linking.


Kind of my point — so much software depends on it, it’s difficult to statically link more things.


Musl is a pretty good replacement, I have been using it for years without any troubles.


I like Musl, but it is a source of pain at times too: https://github.com/kubernetes/kubernetes/issues/64924 https://github.com/kubernetes/kubernetes/issues/33554

Admittedly you could put that on the Kubernetes folks, but the same problem doesn't exist with glibc.


I don't see how zig cc would help with that. Your distribution probably also doesn't ship all the source files for your packages either, and there's no other way to statically link.


Gentoo does.


I've used Gentoo for almost a decade now, and no, that's not true. emerge doesn't just randomly keep source files on disk, and certainly not in a form easy to link to. In fact, Gentoo is worse than Debian for static linking, because not all packages have IUSE=static-libs. If it doesn't, you need to patch that ebuild and potentially many dependencies to support it. On the other hand, on Debian, the standard is for -dev packages to come with both headers and static libraries.


One I think people forget about is ASLR. What symbols are you going to shuffle? At least with dynamically linked dependencies the linker can just shove different shared objects into different regions without much hassle.

Others have mentioned the other points: runtime loading (plugins), CoW deduplication and thus less memory and storage.


In addition to the other excellent reasons mentioned here, there's also the fact that some libraries deliberately choose to use runtime dynamic linkage (dlopen) to load optional/runtime-dependent functionality.


If you want to make a program that supports plugins, you have only two real options: non-native runtimes or dynamic linking. And the latter gets you into a lot of trouble quickly. The former trades performance and memory usage for ease of use and a zoo of dependencies.
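
A minimal sketch of the dynamic-linking route (plugin.so and plugin_init are hypothetical names):

    $ cat plugin_host.c
    /* minimal dlopen/dlsym plugin host */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        void *h = dlopen("./plugin.so", RTLD_NOW);       /* load the plugin at runtime */
        if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }
        void (*init)(void) = (void (*)(void))dlsym(h, "plugin_init");
        if (init) init();                                /* call into it if the symbol exists */
        dlclose(h);
        return 0;
    }
    $ cc -o plugin_host plugin_host.c -ldl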


> some libraries deliberately choose to use runtime dynamic linkage (dlopen) to load optional/runtime-dependent functionality.

Also known as plugins.

It's not a design flaw, it's a feature.


Ironically that is how they have been implemented since the dawn of time.

Dynamic linking was added around Slackware 2.0 timeframe.


You can't statically compile in glibc, right?


sure you can:

    $ cat hello.c
    #include <stdio.h>

    int main() {
        printf("hello world!\n");
    }
    $ gcc -o hello hello.c
    $ ldd hello
            linux-vdso.so.1 (0x00007ffff9da0000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fed449d0000)
            /lib64/ld-linux-x86-64.so.2 (0x00007fed45000000)
    $ gcc -o hello hello.c -static
    $ ldd hello
            not a dynamic executable


You can, but it is not supported. Certain features (NSS, iconv) will not work.


> Certain features (NSS, iconv) will not work.

If you're the kind of person who wants static linking then you really don't want these features.

The real problem is that statically linked programs under Linux don't (didn't?) support VDSO, which means that syscalls like gettimeofday() are suddenly orders of magnitude slower.

In the end, we had to do a kind of pseudo-static linking - link everything static except glibc.
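
For the curious, that "everything static except glibc" link line looks roughly like this (libfoo/libbar are placeholders):

    $ cc -o prog prog.o -Wl,-Bstatic -lfoo -lbar -Wl,-Bdynamic
      # -Bstatic/-Bdynamic toggle how the -l libraries that follow them are resolved,
      # so libfoo/libbar get linked statically while libc stays dynamic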


I think the vDSO page is mapped into every process regardless of how the program is linked, although you may have difficulty using it.


Vdso is supported with static linking in Musl libc at least.


I don't think the parent comment's point was whether this was technically possible.

glibc is GPL licensed, and the GPL explicitly forbids statically linking to it unless your code is GPL too.

Thus any non-GPL project has its license tainted by the GPL if you statically link it.

It's not a technical limitation, it's a legal one.


This is false. The license for glibc is the LGPL, not the GPL, and the LGPL has an exception to allow static linking without the whole code having to be under the LGPL, as long as the .o files are also distributed to allow linking with a modified glibc ("As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice [...] and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library.")


Sounds like I got conned by my (poor) memory. I should have re-googled this before posting, thanks for the correction.


Yes, but for many projects it means no static linking in practice.


This isn't a legal limitation.

It's FUD.

See here -

https://www.gnu.org/licenses/gpl-faq.html#LGPLStaticVsDynami...


Thanks for the link, that's very useful. I should have re-googled this stuff instead of commenting "from memory", sorry for spreading the FUD :/


Also, right above that: https://www.gnu.org/licenses/gpl-faq.html#GPLStaticVsDynamic

Dynamically linking a GPL library is the same as statically linking a GPL library; the resulting executable must be GPL-licensed.


That's also a good reason to avoid glibc and switch to a musl-based system.


I might be uneducated but wasn't there a "system library exception" or something like that in GPL to prevent these problems?


> I can't wait until installing a program on linux is as simple as downloading a static executable and just running it and I hope zig brings this future.

For the record: This is pretty close to what AppImage is today. It's not quite 100% because userland fragmentation is so ridiculously bad that it doesn't work out of the box on a few of them, but I personally really wish all Linux software was distributed that way (or static like Zig).


And it pretty much proves that it's not such a great thing to aspire to. Images are large, dependencies aren't updatable, and locations don't match distribution defaults.


On the other hand: no conflicts, no missing dependencies, can have multiple versions of the same thing installed at the same time, can store them anywhere including removable media...


The cross compiling features of Zig look fantastic! Installation is so easy, just downloading and extracting a single file.

Should every compiler stack have prioritized cross compilation over other features? (I vote: YES). Cross compiling programs has always been a PITA for most languages.

It would be great if Zig cc could be paired with vcpkg [1] for a nice cross-compiling development environment. Looks like vcpkg requires a C++ compiler though.

1: https://github.com/microsoft/vcpkg


Which reasonably-popular modern languages can be reasonably said to have ignored cross compilation? Interpreted languages like JavaScript and Python obviously don't have any problem, JIT languages like .NET and Java explicitly have a cross-platform layer, and modern compiled languages like Go and Rust specifically have cross-compilation as a goal. Rust still needs a libc though, but that's not Rust's fault, that's the result of trying to work together with the system instead of DIYing everything. (see: problems with Go doing system calls on BSDs, Solaris, etc)

You can't look at C which started in the 1970s and C++ which started in the 1980s and have expected them to even consider cross-compilation, when Autoconf wasn't even released until 1991.


I think the benchmark should be distribution. For instance, you mention "JavaScript and Python obviously don't have any problem", but say you want to create a distributable program for the 3 major operating systems, based in one of those languages, and you can't assume the user will have an installed interpreter. I don't think you will find any _standard_ solutions.

Most other languages make it _possible_ to generate some sort of artifact usable from different operating systems but not necessarily easy. I think Java only relatively recently included a standard way to create a bundle including a minimal JVM distribution with an app to make it usable when the user doesn't have an installed JVM (and again, there were a bunch of different non standard solutions of varying quality before that). Even now I wouldn't say the Java solution is easy to use.

I could continue in this fashion with different languages, but you get the idea.

I heard go is pretty good in ease of cross compilation, and well, looks like Zig is doing great in this area too. Ah! .net core is apparently pretty good in this area these days too.


That's moving the bar though. Only in the past decade or so has it become reasonable ("reasonable") to include entire runtimes together with an application. Java started in 1995, Python started in 1991. This was an era when one of Java's main targets was SIM cards and other highly minimal devices, so not only would the target already have a JVM, but it would be wholly impractical to ship your own. Even on desktops, downloading just a fraction of Java for each program would be a massive waste of limited bandwidth and disk space.

For that reason, Java and Python didn't start out with fully self-contained bundles as a design goal. It just wasn't practical in the 90s. Obviously, yes, if they had managed to correctly predict and plan for three decades of technological improvement, then sure, we'd be working in a very different technological landscape. But they couldn't possibly have, and solutions built on the old languages are always fraught with disagreement. So, we use new languages, like Go and Rust, which are developed with modern needs in mind.


> I think the benchmark should be distribution. For instance, you mention "JavaScript and Python obviously don't have any problem", but say you want to create a distributable program for the 3 major operating systems, based in one of those languages, and you can't assume the user will have an installed interpreter. I don't think you will find any _standard_ solutions.

Well, there are web browsers and servers. They distribute javascript programs and run them.


Interpreted environments like Node.js, Python, and Ruby absolutely have this problem since many popular packages utilize native extensions. Distribution is still a challenge.


Note that on linux hosts at least, for most target platforms, being able to cross compile with clang is only one single install of the right packages away.


Portability depends on a great deal more than just object code formats. The list of OS environment functions to call to achieve anything useful is radically different from one target to another.

This is what makes porting hard work. Cross-compiling is only the first step of a long trip.


I think a problem comes when you want to distribute your compiler potentially independent from your OS and/or linker and/or C library.

But it's also fair to say that if we had always considered those things as inseparable parts of the "compiler suite" that might have made everyone better off.


I'd love for a hotlist or even prize for programmers doing awesome stuff like this. Off the top of my head, Andrew, Andreas Kling of Serenity OS, whoever ffwff is (https://github.com/ffwff/), Niko Matsakis of Rust, and so on. It'd be very awesome to have a roundtable with some of these people. They could discuss different approaches to design, their own histories and plans for the future. I loved reading Coders at Work, but I felt that there was a missed opportunity by not having the subjects interact with each other. So many of them had similar and very dissimilar ideas/approaches, which would have made for a wonderful discussion.

If someone could do the same but for a younger generation, I think it'd be very valuable.


Zig (the language) is very appealing as a "better C than C". Check out https://ziglang.org (dis-disclaimer: I'm unaffiliated.)


Aye, and it lives up to that claim as well in my opinion, despite being still relatively young and pre-1.0. My favorite thing about Zig is that it has managed to stay simple and solve many of the problems of C without resorting to greatly increased complexity like Rust (which is much more of a C++ replacement than a C replacement in my opinion).


IMO the marketing of Rust as a C/C++ replacement is a bit misplaced. I think it's more accurate to consider it an alternative systems language. The tools/languages used in this space are a bit broader than the C-family languages.


That's true, though systems programming is in practice dominated by the C ABI (great post on that here by the way https://drewdevault.com/2020/03/03/Abiopause.html). Zig does something quite special that puts it ahead of the crowd in this space; it can import and use C libraries as easily as C does (no bindings required), and it can itself be built into a C library, auto-generating the required C headers.
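
A rough sketch of what the import side looks like (Zig syntax as of around 0.6; details may have shifted since):

    $ cat use_c.zig
    // pull in a C header directly; no hand-written bindings
    const c = @cImport({
        @cInclude("stdio.h");
    });

    pub fn main() void {
        _ = c.printf("hello from C, called from Zig\n");
    }
    $ zig build-exe use_c.zig -lc    # -lc links libc so printf is available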


You are absolutely correct on the point of the C ABI. It's definitely the systems lingua franca.

Just finished reading the Zig cc article and I must say I'm also quite impressed. I'll be keeping an eye on the next Zig release - being able to eventually use it as a `cc` or `msvc` replacement would be a big game changer. Having recently gone through the exercise of trying to cross compile a few C++ and GTK apps, I can really see the appeal.


Rust is not a C++ replacement, nor a C replacement. It targets its own niche (embedded, realtime, correctness-oriented). It's a totally different development culture. Zig, OTOH, is absolutely a C replacement.


My day job is maintaining the toolchains for a popular safety-certified realtime embedded operating system. I have never once been asked to provide a Rust toolchain. Fortran, yes. Ada, yes. Python and Go, yes. By and large it's just C (and more and more C++).

Rust seems to be mostly something "full stack developers" and "back end developers" embrace for server-side toys and hobby projects.


Python, on a realtime embedded operating system? Rust isn't really there for embedded development yet. It's great for low-level code on full-blown processors/OSs though


Python: customers ask, I deliver.

Top-end embedded processors have GPUs and hypervisors these days and run AI algorithms to, say, detect lane changes and do parallel-parking maneuvers. These days AI means code written in Python and Fortran.


Rust does not target correctness-oriented code at all. Neither do standard C or C++, for that matter, but there are derivatives of C that can (and are) used for it.


> It targets its own niche (embedded, realtime, correctness-oriented).

I don't know of anyone who actually uses Rust in that niche.

Everyone who uses Rust is using it because of "C++ is hard, let's go shopping instead" syndrome.

I.e., at this point it's a language for beginners to ease themselves into programming without training wheels and eventually graduate to real big boy programming languages.


This is a strange sentiment. I think 99/100 people who know rust would not categorize it as "for beginners"


It's pretty much irrelevant what people feel emotionally about Rust.

The real-world fact is that Rust is, as of March 2020 at least, an entry-level systems programming language. It's used as a stepping stone by former PHP/Python/Go programmers, who are very intimidated by C++, to get into performance-oriented coding.

Nobody actually writing embedded or sensitive code (airplanes, nuclear power stations, etc.) is doing it in Rust.


This makes zero sense and it's not about emotion.

The language is young, and you don't certify a software solution every two days or rewrite your nuclear power station code every day.

Very experienced programmers switched to Rust because it makes it possible to build large scale industrial programs both efficient and reliable. They won't switch to C++ just because they think they're good enough to live dangerously.

(btw I work on plant control and yes I write parts in Rust)


> Very experienced programmers switched to Rust because it makes it possible to build large scale industrial programs both efficient and reliable.

This never happens in the real world; not unless the 'very experienced' bit is experience only in languages like PHP or Python.


It's true that a lot of PHP/Python/Go programmers look to Rust rather than C++ when getting into performance-oriented code. But it's not really a stepping stone, because you never have to leave.

It's true that not a lot of people are using Rust for embedded software. That's a much harder nut to crack because so many of the toolchains are proprietary (and embedded support in Rust is still missing quite a few things).


> ...because you never have to leave.

You kind of do if you ever want to work on anything other than pet personal one-man projects.


Not for technical reasons though.

Also: I've used Rust at work. In fact, I learnt Rust because I was processing a lot of data at work, and needed a fast language to do so in a reasonable amount of time.


I know it's not as complicated as C++, but it sure seems like they're in a rush to get there :). Modern C++ (post-C++14) covers most of my criticisms of C++ as my daily driver. I still poke at Rust because I really love statically typed languages :)


I've been very surprised by Rust's complexity. Some of it seems to be being hidden over time (as well as adding new features) but its syntax at any given moment is generally more complex than C++.


Really? I would consider its syntax to be of similar complexity but of higher consistency. What parts of it do you find complex?


When you happen to have generics coupled with lifetime annotations for example.


I'm not being contrarian, but I have only been following Zig in passing. Can you give a few examples of the increase in complexity? I am genuinely curious. Zig seems, to my eyes at least, to have done an admirable job of remaining simple when compared to the behemoth that is C++, of any vintage (but especially post-C++03).


I wonder how much it would cost to sponsor compiles-via-c support. I'd love to use zig-the-language but I need to compile for platforms that LLVM does not support, so I would need to use the native C toolchain (assembler/linker at the least, but using the native C compiler seems easier).


There is a semi-maintained (perhaps more accurately “occasionally resurrected”) C backend for LLVM: https://github.com/JuliaComputing/llvm-cbe


Thanks; I was aware of this but last I checked it only supports a subset of LLVM (whatever Julia needed).


I am still not sold on its security story, the usage of @, and module imports.


Can you elaborate on the security story?


Manual memory management: we have already learned that isn't the way to go when security is part of the requirements in a connected world.

Still no way to catch use-after-free.

https://github.com/ziglang/zig/issues/3180


Clear, good point, thanks.


I don't know... To me, zig does not look like C at all. IMO, go and zig are as similar to each other as they are dissimilar to C.


I mean, is this C-like?

  fn add(a: i32, b: i32) i32 {
      return a + b;
  }



Simple stuff like this looks very similar to Rust: https://godbolt.org/z/spkKai

Rust quickly becomes harder though, while Zig is much easier to grasp.


The fact that they compile to the same assembly instructions does not make them similar source code.


The languages are similar enough that the Zig tool-chain can translate C source into Zig source. There's mostly a one-to-one correspondence.


It could hardly be more C-like. Are you getting distracted by the syntax?


When we talk about something being "C-like", the syntax is what we're talking about. Being "distracted by the syntax" doesn't make sense when the syntax is the entire point of the statement being made.


Admittedly it's been decades since I've done any C (literally since 1999, except for an LD_PRELOAD shim I wrote about 5 years ago), but being someone who prefers C and other ALGOL-syntax derivatives, I find the syntax relatively easy to grok. One advantage over C is that the confusing pointer/array/function pointer thing has a well-defined order in Zig; there's no "spiral order" nonsense going on.


There is no spiral. C's pointer declaration syntax is arguably backwards ("declaration follows usage") but fundamentally follows normal precedence rules. The so-called spiral is total nonsense misinformation.


That's exactly my point. It's confusing enough that someone made a bullshit claim that was totally wrong and confused noobs for years. That won't happen with zig.


That won't happen with any sane systems language all the way back to Algol dialects.

The only thing that C has taken from Algol was structured control and data.


What about this do you find disagreeable? http://c-faq.com/decl/spiral.anderson.html


The spiral rule works if "pointer to" and "function returning"/"array of" alternate but it breaks if they don't, i.e., if you have arrays of arrays or pointers to pointers.

    int** arr1_of_arr2_of_arr3_of_ptr_to_ptr_to_int[1][2][3]
In this case, spiraling between [1], *, [2], *, [3], int is obviously wrong; the correct reading order is [1], [2], [3], *, *, int.

The Right-Left Rule is quoted less frequently on HN but it's a correct algorithm for deciphering C types: http://cseweb.ucsd.edu/~ricko/rt_lt.rule.html


I would name that algorithm "inside-out" instead of "right-left" if my understanding of its behavior is correct.


Aren't you supposed to treat multiple arrays or pointers as a group?


The fact that it's wrong if you try to generalize it beyond the examples given?


Does it? Here's one that I grabbed from cdecl.org:

  char * const (*(* const bar)[5])(int )
And its translation:

  declare bar as const pointer to array 5 of pointer to function (int) returning const pointer to char
This seems like what I'd get from the spiral technique.


They are both close-to-the-metal, few-abstractions, procedural, imperative systems programming languages with explicit memory management and layout.


When "we" talk? I suggest that in the future if you mean syntactically similar you say syntactically similar.

But then, of course, it wouldn't have much to do with Zig's "appeal as better C than C", which prompted this whole discussion.


Exactly. I would appreciate an explanation of what makes zig C-like if its syntax is nothing like C's. What does "c-like" even mean then?


It's about 50% C-like, which is giving people a "That Dress" experience when answering yes or no to the question.


On a first glance, it does look like a simpler version of Rust, and I say it without demeaning Zig. Looks very promising, I'll be keeping an eye for it.


Thanks everyone for the kind words! It's been a lot of work to get this far, and the Zig project has further to go still.

If you have a few bucks per month to spare, consider chipping in. I'm hoping to have enough funds soon to hire a second full time developer.

https://github.com/users/andrewrk/sponsorship


It's been amazing following all the progress so far. I'm a proud $5/mo sponsor and look forward to writing something in Zig soon!

Are there any concurrency constructs provided by the language yet? I'm just starting to learn how to do concurrency in lower-level languages (with mutexes and spinlocks and stuff). I'm coming from the world of Python, where my experience with concurrent state is limited to simple row-level locks and `with transaction.atomic():`.

An equivalent article to this would be awesome for Zig: https://begriffs.com/posts/2020-03-23-concurrent-programming...

Edit: I just found this announcement for async function support: https://ziglang.org/download/0.5.0/release-notes.html#Async-...


Yes! Here's a repository where I show some comparisons with Go: https://github.com/andrewrk/zig-async-demo/

This area is still bleeding-edge experimental, but it's very promising.

I need to do a blog post on how async/await works in zig and event-based I/O. It's been a long time coming.


Thanks for all your effort on the project. By far my best experience with zig was writing an OpenGL renderer on Windows which then "just worked" when I cloned it and ran `zig build` on my Linux machine. Felt like magic.


It is amazing work, I'm so glad you invested your attention in this direction, kudos! I haven't used the language and the compiler yet, but just reading the title article I almost jumped for joy, knowing how unexpectedly painful it is to target different versions of system libraries on Linux.


I was happily supporting Zig/you at $5/month, but things got tight. I will be back on board next month! Keep up the great work Andy!


Hi Andy, thanks for your hard work on this. I am not a Zig user/sponsor yet but hopefully I will be soon. It's looking better and better every month.


Hijacking your comment to ask a question: how hard is it to add support for a new architecture and/or triple (if I already have support in llvm)?

The grand bootstrapping plan [1] sounds really impressive, but is it still WIP? Is there a commit or series of commits showing recent targets that got support?

[1] https://github.com/ziglang/zig/issues/853


For the 0.6.0 release, could you please provide a download for Raspberry Pi running Raspbian?

On my RPi 4, 'uname -m -o' returns: armv7l GNU/Linux

Thanks!


This project invites donations. It is a labor of love and solves problems that have been around a long time. Kudos!


Why does zig purposely fail on windows text files by default? Do you really expect a language to catch on when you are purposely alienating windows users by refusing to parse \r for some bizarre reason?


And here I was just amazed from the blog post at the Windows support in cross compilation in either direction.

I find it hard to believe that someone could be capable of writing a non-trivial program but not able to change their text editor settings to use \n.


Every other compiler handles it fine; I just can't wrap my head around why someone would intentionally cripple their software and alienate the vast majority of their potential users. I understand the problem and how to solve it, but it's such a giant red flag that there is no way I'll care about this language. Most people probably won't even understand why a simple hello world program fails.

I'm floored that so many people would ignore this barrier to entry and somehow rationalize such a ridiculous design choice. If you make things straight up break for all windows users by default in your new language, it's going nowhere. No rationalization or excuses will change that reality.


Among programmers, the distinction between CRLF and LF is well known. Among non-programmers, you may well see people who will try to write code in WordPad, and that doesn't mean compilers should parse WordPad documents as source files. Do you expect people to write code in Notepad? If not, any reasonable programmer's editor supports Unix EOLs. Guide them to use Notepad++ or VSCode with the proper options/plugins, case closed. Otherwise, you may as well complain about the handling of mixed tabs & spaces in Python, or the fact that C requires semicolons.


Defaults cause zig to error out. This isn't hard to understand or fix, but it isn't even clear what's happening when it occurs, because it's just not something any other compiler would error on. It isn't about being able to fix it, it's about creating an asinine hurdle for anyone trying the language for the first time on Windows. No other compiler or scripting language makes someone jump through that hoop. It is completely ridiculous. I still cannot believe anyone would defend this decision. Do you want people to use it or not?


I have no issue with that decision at all and find it akin to Go forcing users to adhere to gofmt, and I don't use Zig. A good error message would be nice (EDIT: A zigfmt would be even better).

How many developers who try out new languages are even using Windows? I'd imagine most are on a UNIX-like. Regardless, I don't think it's a big deal either way. You could always make a merge request with a fix if you feel so strongly about it.


Ok. So I'm currently writing something that parses zig code and explicitly not having to worry about \r is a godsend.


To be clear, not needing to ignore a single character is somehow a 'godsend'?


Also tabs. I eventually discovered how to use parser combinators, but having to handle those characters would have made it more frustrating to get started, and I'm conscious that the cognitive burden of these sorts of things can get overwhelming, especially for a project I'm not getting paid for.


Does zig error out on tabs too?



Why not just put this in the compiler like every other compiler in existence?


I'm so glad we are seeing a shift towards language simplicity in all aspects (control flow, keywords, feature set, ...). It's so important in ensuring reliable codebases in general.


i would love to see a shift towards small languages, which is subtly different from a shift towards simple languages.

there are plenty of things i feel are serious shortcomings of C (mixing error results with returned values is my big one), but the fact that the set of things the language can do is small and the ways you can do them are limited makes it much easier to write code that is easy to read. and that will always keep me coming back.


> uses a sophisticated caching system to avoid needlessly rebuilding artifacts

Reminded me of the Stanford Builder gg[1], which does highly parallel gcc compilation on aws lambda. make -j2000.

So with a zig cc drop-in, you might get highly-parallel cross-compilation?

Though the two caching systems might be a bit redundant.

[1] https://github.com/StanfordSNR/gg


"In order to provide libc on these targets, Zig ships with a subset of the source files for these projects:

musl v1.2.0

mingw-w64 v7.0.0

glibc 2.31"

These are super-important... (Otherwise, someone will be very limited in what they can compile -- the simplest of programs only...).

It's great that you included these, and as source, not as precompiled binaries!

(Also, a well-selected subset is probably the right balance of functionality vs. complexity...)

Anyway, very excited about the future of Zig as a drop-in Clang/gcc replacement!


Perhaps I missed this from the blog post since it's unfamiliar to me. Can you compile Linux with this? Like, could you really straight up use this as a drop-in replacement for a whole Gentoo system?


I'm guessing the build system of Linux depends on more than just a C compiler, and that's why the answer to the question is "no". If the build system of Linux only depends on a C compiler then my answer would be:

That would be a nice stress test, which would undoubtedly lead to bugs discovered. After enough bugs fixed, the answer would be "yes".

I'll try it!


Exciting! This is almost what I was thinking would be lovely about one month ago when I was last hacking on C.


Historically the Linux kernel used GCC extensions that Clang did not support, and this is just a thin shim around the ordinary Clang frontend, so to the extent that's still a problem: no.

Otherwise: yeah. It's just clang's main function with a different name slapped on. Linking semantics may differ slightly which could be problematic. But in theory, yes.


> so to the extent that's still a problem: no.

Clang can apparently build a vanilla Linux kernel now, no gotchas, actively used by Google for its Linux things (Android, chromeos).


Yep, check out clangbuiltlinux.github.io for more info!


Glad to hear it!


Wow, that's one of the easiest ways I've seen to get a C compiler for RISC-V. Going on my list for when I'm playing around with emulators again.


Zig is great and I can't wait to try cc! However, Andrew, if you're reading this, the std is very confusingly named. It's not a hash map, it's AutoHashMap; it's not a list, it's an ArrayList; etc. I had a lot of trouble finding idiomatic code without having to search through the std sources; an example in the docs for each struct/function would help a ton.


To be fair, 'list' is such a generic term it's not really useful. ArrayList and LinkedList and even a hash table are all examples of lists, but their performance characteristics vary so wildly that it doesn't make sense to call any of them simply 'list'.


I reckon the point of the naming is to force you to think about exactly which behavior you want/need in your program, given that arrays and linked lists (for example) are very different from one another with very different performance characteristics.

It's a bit unfortunate that (last I checked) there's no "I don't care how it's implemented as long as it's a list" option at the moment (e.g. for libraries that don't necessarily want to be opinionated about which list implementation to use). It should be possible to implement one as a common interface, the same way the allocators in Zig's stdlib each implement a common interface (via a struct with pointers to the relevant interface functions).
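Something like this rough sketch, mirroring how the stdlib allocators are structured (the `List` type and its function-pointer fields here are hypothetical, not actual stdlib):

  const List = struct {
      appendFn: fn (self: *List, item: i32) anyerror!void,
      lenFn: fn (self: *List) usize,

      pub fn append(self: *List, item: i32) anyerror!void {
          return self.appendFn(self, item);
      }

      pub fn len(self: *List) usize {
          return self.lenFn(self);
      }
  };

A concrete implementation would embed a `List` field, fill in the function pointers, and use @fieldParentPtr inside them to get back at itself, just like the stdlib allocators do.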


Zig looks super cool. I've been wanting to experiment with it for some system software.

Are there any guides anywhere for calling libc functions from zig? I'm interested in fork/join, chmod, fcntl, and that kind of thing. Do I just import the C headers manually? Or is there some kind of built-in libc binding?


I think you can call libc by importing the C headers "automagically" but zig does also give you some of these things in its (admittedly still poorly documented) std lib:

std.os.fork: https://github.com/ziglang/zig/blob/master/lib/std/os.zig#L2...

std.os.fcntl: https://github.com/ziglang/zig/blob/master/lib/std/os.zig#L3...
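The header route looks roughly like this (a minimal sketch, assuming you link libc, e.g. with `-lc`; the exact header you need may differ):

  const c = @cImport({
      @cInclude("sys/stat.h");
  });

  pub fn main() void {
      // chmod comes straight from the auto-translated C header; no hand-written bindings.
      _ = c.chmod("/tmp/example", 0o644);
  }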


For POSIX, you can use the standard library:

std.os.chmod

std.os.fcntl

Better yet, use the higher level cross platform abstractions. For example instead of fork/join,

std.Thread.create

std.Thread.wait

These will work in Windows as well as POSIX.


I read through the post but I'm still a bit confused as to what parts of this are Zig and what parts are coming from other dependencies. What exactly is zig cc doing, and what does it rely on already existing? Where are the savings coming from? Some people are mentioning that this is a clang frontend, so is the novelty here that zig cc 1. passes the correct options to clang and 2. ships with recompilable support libraries (written in Zig, with a C ABI) to statically link these correctly (or, in the case of glibc, it seems to have some reduced fileset that it compiles into a stub libc to link against)? Where is the clang that the options are being passed to coming from? Is this a libclang or something that Zig ships with? Does this rely on the existence of a "dumb" cross compiler in the back at all?


To compile C code, you need a bunch of different things: headers for the target system, a C compiler targetting that system, and a libc implementation to (statically or dynamically) link against. Different libc implementations are compatible with the C standard, but can be incompatible with each other, so it's important that you get the right one.

Cross-compiling with clang is complex because it's just a C compiler, and doesn't make assumptions about what headers the target system might use, or what libc it's using, so you have to set all those things up separately.

Zig is (apparently) a new language built on Clang/LLVM, so it can re-use that to provide a C compiler. It also makes cross-compilation easier in two other ways. First, it limits the number of supported targets - only Linux and Windows, and on Linux only glibc and musl, and all supported on a fixed list of the most common architectures. Second, building Zig involves pre-compiling every supported libc for every supported OS and architecture, and bundling them with the downloadable Zig package. That moves a lot of work from the end user to the Zig maintainers.

Like most magic tricks there's no actual magic involved, it's just somebody doing more and harder work than you can believe anyone would reasonably do.


IIUC, libc are not prebuilt and bundled ("for every supported target combination"), "just" the source code of musl and glibc is bundled, and compiled lazily on your machine when first needed (i.e. for any target for which you invoke zig).


Does the LLVM toolchain that comes with Zig include cross-compilation support? Is that what it is using?


Yes.


In the Zig main page, there is a claim that making overflow undefined on unsigned can allow more optimization... but the example given is extremely dependent on the constant chosen. Try to change the 3/6 used in C++ to 2/4, 3/8 and 4/8 for example... It is very strange how the clang code generation changes for each pair of constants!

(Also, having undefined behaviour on unsigned overflow can make some bit-twiddling code harder to write. Then again, Zig seems to have wrapping operators for exactly that kind of code, if I understand the docs correctly. Zig also allows turning off checks, but then it turns off checks for everything...)
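For reference, the wrapping operators look something like this, if I'm reading the docs right:

  const std = @import("std");

  test "wrapping add instead of checked add" {
      var x: u8 = 255;
      x +%= 1; // +% wraps around instead of tripping the overflow check
      std.debug.assert(x == 0);
  }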


I really love Zig, it can apparently replace our C toolchain with brilliant, static cross-compilation, including s390 Linux support (mainframe Linux!).

My only gripe is that the syntax and stdlib, although practical and to the point, seem to suffer from some strange choices that somewhat clash with its own, albeit early, "zen" of simplicity.

- '@' prefix for builtin functions: a little strange and macro-looking to my eyes. Why not just plain keywords? And clean up some of it: `@cos`, `@sin` also feel like too much when they are already in the stdlib, I believe.

- |x| for/while list bind var, why not just for(x in y)? Surrounding pipes are really annoying to type in some foreign keyboards and feel totally needless in 99% of the places.

- inconsistent required parenthesis predicates in block statements in "test STR {}" vs. "if() {}". Either require parenthesis or don't, I don't really care which one.

- prefixed type signatures, `?[]u32` feels a little off / harder to read.

- comment-looking, noisy prefixed multi-line slashes `\\`.

- the need to dig deep into "std" to get your everyday lib functions out "std.io.getStdOut().outStream().print()". `@import("std")` repeated many times.

- consider implementing destructuring syntax early-on to deal with so much struct member depth ie `const { x, y } = p` or `const { math: { add, mul } } = @import("std")`.

- anonymous list syntax with `.{}` is eye-catching as the dot implies "struct member" in Zig, but then the dot is everywhere, especially when you do anonymous structs `.{.x=123}`; maybe consider `[1,2,3]` and `[x=123]` given brackets are already being used for array length annotation anyway, i.e. `array[]`.

- `.` suffix for lvalue and rvalue pointer deref. Also `"str".` is a byte array unroll if I understood correctly. Here `f.* = Foo{ .float = 12.34 };` looks like it's doing something with `.` to get to the struct members but it's actually just a pointer deref. Also looks like a file or import lib wildcard (`file.*`) to my eyes.

- field access by string clunky `@field(p, "x") = 123;`, with an odd function as lvalue.
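For readers who haven't seen Zig, here's roughly what a few of these look like in practice (my own approximation, not canonical style):

  const std = @import("std");

  test "a few of the syntax points above" {
      // |x| capture variable in a for loop
      const items = [_]i32{ 1, 2, 3 };
      var sum: i32 = 0;
      for (items) |x| sum += x;
      std.debug.assert(sum == 6);

      // anonymous struct literal with the leading-dot syntax
      const p = .{ .x = 123, .y = 456 };

      // field access by string (here as an rvalue)
      std.debug.assert(@field(p, "x") == 123);

      // pointer dereference with the .* suffix
      var n: i32 = 0;
      const ptr = &n;
      ptr.* = 42;
      std.debug.assert(n == 42);
  }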

Sorry for the criticism, we're seriously checking out Zig for migrating a large C codebase and replacing future C projects. Although we can live with these quirks, they make the language look a little random and NIH, and that worries me and the team. For instance, Golang has great syntax and semantic consistency, which is a boost to project steering quality and assures life-long onboarding for newbies. Please consider widening the spec peer-review process, maybe in a separate Github repo with markdown proposal writeups. Discussing syntax seems superficial given the many project and compiler feats under the hood, but it can become a sort of "genetic disease" and a deal-breaker for the project in the long run!

This is a pre-release version I know, but it's just that my hopes are really up for Zig as Golang, C++ and Rust never really did it for us as a multi-target sw toochain for various reasons.


Feel free to raise these as issues on Zig's issue tracker on GitHub, or comment on those which have previously been raised. If you have a good reason for something being a certain way, write up why and it may be considered as a proposal.


Having no for loop with a constant count is also a very strange choice:

    for (([100]void)(undefined)) |_, verb| {
And I've been bitten multiple times with line endings having to be \n only.


While I agree it's jarring to adjust to Zig's notion of a for loop (iterating over collections), the syntax here is consciously pushing you to adopt while loops instead. The only snag is that the counter lives outside the scope of the while, which feels bad.
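i.e. the counted loop ends up being something like:

  var i: usize = 0;
  while (i < 100) : (i += 1) {
      // loop body; note that `i` has to be declared outside the while
  }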


Because you are using nixpkgs...

~~Does `zig cc` cross compile libgcc/compiler-rt on the fly? Does it compile libc on the fly?~~ Nevermind, I did not scroll enough; it does compile on the fly. Whew!

As someone who also cares greatly about cross compilation, compilers definitely should step up their game, but `-target x86_64-windows-gnu` elides many details unless you are confined to a fixed set of platforms.


This looks impressive. Cross-compilation is sorely underserved. I will definitely be checking this out.


I'm normally not an 'oh my god'-kind-of-guy, but... Oh my god!

Zig looks like a nice language for kernel development as well: https://github.com/jzck/kernel-zig


Looks very cool! Did not see 32-bit RISC-V on the list though, so wondering about that. I would have liked to use Zig cc to build 32-bit RISC-V binaries fast, if that is possible. Doesn't matter if they are freestanding.


You can indeed use zig to make riscv32-freestanding binaries (in both zig and C). What is not available is `-lc` for this target.


How does Zig compare with Nim and Rust? (putting aside the differences in adoption)


Manual memory management is the most important difference


Damn, what an awesome project! That's a lot of hard work, and now I want to try Zig. One more oh god save me from programming languages.


What are the limitations? Speed? External libraries?


Afaik the only drawback is that this functionality is very new and still has some open issues (linked at the end of the post). As stated there are no dependencies for Zig and it is shipped in relatively small tarballs which can be downloaded from the Zig website: https://ziglang.org/download/


The limitation is maturity/stability I'd say. Zig is still pre-1.0

Speed? Using Zig should be faster than using Clang directly in many cases. You get the caching system, and I think you can do more complex builds without having to resort to multiple Clang commands from a makefile.

Not sure what you mean with external libraries.


Why is there a sparc backend but no sparc64 backend? sparc64 is what Debian uses these days, so having support for it in Zig would be nice.


Can zig compile to C? So many languages would be very useful if they could compile to C for embedded systems, as native compilers are very unlikely for new (or even old) languages.


Compiling to C source isn't planned for the reference Zig compiler, as far as I know. It's more interested in helping people move off of C (see `zig translate-c`).

But for supporting more esoteric targets you might be interested in the goals of this ultra-early-stage assembler. ("Planned targets: All of them.")

https://github.com/andrewrk/zasm


As a (part-time) C programmer, I wouldn't really consider a "C replacement" that can't compile to C. Part of the appeal of C is that it's easy to integrate into other projects, regardless of the build system or obscure hardware it might be targeting. If you require a compiler for a language few people have heard of (even a very cool language), it seriously limits your potential user-base.

If you tell me that I can write better, safer code by using Zig, but I can also compile it into a .c artifact that anybody can use, now that is a tempting proposition!


> Part of the appeal of C is that it's easy to integrate into other projects, regardless of the build system or obscure hardware it might be targeting.

Zig should be just as easy to integrate. Sure, it's one more thing to install, but it'll spit out .o files just like a C compiler would if you tell it to (which means you can shove it in your Makefile or what have you), and will spit out .h files for linking. You miss out on Zig's build system niceties that way, though (including the cross-compilation demonstrated in the article).

Regardless, being a "C replacement" kinda implies (if not outright explies) that it's replacing C; compiling to C kinda defeats that purpose. It'd still be useful, though, and is probably possible (might even be relatively trivial if LLVM and/or Clang provide some mechanism to generate C from LLVM IR or some other intermediate representation).
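For example, something like this rough sketch (if I remember right, `zig build-lib` on a file like this emits a static library plus a generated header with the matching prototype):

  // square.zig: exported with the C calling convention so it's callable from C.
  export fn square(x: i32) i32 {
      return x * x;
  }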


I think the intention here is not the other way round. You can add C files to your Zig project to make the transition, integration with existing code, and collaboration with C-only programmers easier. After all, Zig wants to replace C, not the other way round.


Unless you really really need obscure platforms, just forget about it. Make the jump, don't look back.

After you get a taste of a modern toolchain (with cross-compilation and dependency management, without not-quite-portable build files to endlessly fiddle with, without outdated compilers to work around), you will not want to have to compile a C file again.

Languages like Zig and Rust are easy to install. Mostly it's just a tarball, so it's less of an inconvenience than getting the right version of autotools.


It's not obscure platforms, it's a ton of embedded platforms (you may consider them obscure, but the reality is there are millions and millions of devices out there running software on these "obscure" platforms), and most of them don't have many options. At one stage I was writing my own language to compile to C, based around state machines and actors, as so many devices tend to be some version (often badly implemented) of those two things. But I ended up moving out of the embedded world and kind of lost my prime motivation for doing it.


When you compile to object files, zig will generate C header files that you can use when linking. Granted, this won't help with embedded targets that zig can't compile to.


Can `zig cc` also compile to WebAssembly/WASI?


Zig itself does support cross-compiling to WASM/WASI (https://ziglang.org/documentation/master/#WebAssembly), so there's surely some way to coax 'zig cc' into doing the same (though I haven't tried it).


Apparently not :(

Zig is unable to provide a libc for the chosen target 'wasm32-wasi-musl'


llvm is a bloated hot-mess of a compilation framework, built on a bloated hot-mess of a language (C++).

Folks who have achieved great things with llvm (and C++) have done so 'despite' what they used, not 'because of' it.

This has been my conviction for the past 10 years, and I'm glad I never had to touch llvm with a ten foot pole.

I have no doubt it'll soon be surpassed by a common-sense, no-bullshit tool-chain.

Has Zig cc achieved that? Great. No? It will or someone (or I) will develop an alternative that will.


> This has been my conviction for the past 10 years

I would suggest holding your convictions more loosely.


I’m not arguing that certain people using particular standards could consider LLVM bloated and I’m certainly not going to argue that by certain standards C++ could be considered bloated. But for users of LLVM, be it via clang or Zig cc or GHC, it seems to work just fine. Are your complaints from the perspective of a compiler dev (or a general dev who wants to be able to more easily open up and tune a compiler) or are they just as a user? Also, for native binary compilation of performance sensitive applications, how many options are there in common use for the major languages? Your opinion seems pretty severe, so I’m just trying to see why that is.


> This has been my conviction for the past 10 years, and I'm glad I never had to touch llvm with a ten foot pole.

Out of curiosity: what do you touch with a ten foot pole? I'd be hard-pressed to call GCC or MSVC much better in that regard, and I can think of very few others that are in use anymore.

I mean, I've definitely dreamt about using SBCL or Clozure for things other than Lisp (seeing as they both include their own compilers not dependent on GCC/LLVM), but I've seen effectively zero effort in that direction.


Wonder whether one can include this in iOS ...


You know what you doing! Take off every zig!


I don't want to be negative, as there's too much of that about, but gcc and similar can do some pretty hefty optimisations, and for any real work I suspect those count for a great deal. Just because zig cc can compile C, neat as it is, doesn't make it a drop-in replacement for gcc.

Does yours do loop unrolling, code hoisting, optimising array accesses into pointer increments, common subexpression elimination, etc.?


Zig uses clang on the back-end, so while IANA compiler expert, I suspect it does all these things.


This is just a new frontend for clang so it should use all the optimization passes of clang. The main new features are convenient cross compilation and better caching for partial compilation results.


Yes, it's clang under the hood.



