Problems of C, and how Zig addresses them (avestura.dev)
236 points by synergy20 on July 3, 2023 | 284 comments



Tangentially... I'm grateful that these new batches of systems languages have chosen to use the u8, i32, etc, style integer types. Rather than the uint8_t and int32_t style. First thing I do in any of my low level embedded C projects is put in the various uNN/iNN types.


The C standard effectively can’t add new types without the “_t” suffix, so they’re stuck with limitations that a brand new language won’t have.


Can't they add a new header file?


What if a library API needs to include this new header because it wants to use the new typedefs in its API declarations but your own code (which needs to include the library header) also has its own 'u8' typedefs?

(PS: it's probably ok because C compilers seem to accept redundant typedefs)


Same reasoning applies to new keywords.


It sounds like you've got some refactoring to do, since you didn't namespace your own core types?


C doesn't have namespaces.

The only sane way to deal with name collisions is either to use C stdlib types in library APIs (e.g. uint8_t), or use a library specific prefix (e.g. mylib_u8) - which is of course even more awkward than just using the standard uint8_t.


A prefix and a namespace aren't really that different in practice.


They are different. Namespaces are syntactically enforced, and you can opt in or out of names, so you get syntactically validated qualified access.

Naming might go well and have the same utility, but you never know what evil things are going on in all those headers, redefining each other's symbols and yours.


You seem to be focusing on specific problems with C/C++. I don't really care about that; all I'm saying is that prefixing names is equivalent to placing them within a namespace, in practice, and theoretically too, come to think of it. You can easily define a bijection between namespaced symbols and prefixed symbols. Requiring the full prefixes is simply a tad more verbose, essentially equivalent to always using a fully qualified name for a symbol.


I think that "namespacing" basic types such as u32 is completely unnecessary.

In what world would a u32_MYLIB be different from a u32_SOMELIB?

Basic types are common enough.


The preprocessor doesn’t know that u32 defined in foo.h is the same type as u32 defined in bar.h. So you’ll get an error when importing both.


If C made this change, it could also make it legal to redefine u32 etc as appropriate integer types any number of times without erring.


This would break legacy code. I had one employer with a bool type that was a 32-bit integer because of obscure alignment issues. This is why _Bool exists and the bool typedef has to be brought in explicitly with stdbool. You can't just go and usurp popular short type names. The standard reserved *_t and _[A-Z].+ for this purpose. People have been writing portable code with the expectation that future standards aren't going to violate that promise and break things.


This is just one of the reasons that new languages that dispense with outdated assumptions and instead make assumptions more suitable to our times are a good thing. C and C++ are really old and some things cannot be fixed with libraries.


foo.h defines i64 as “long” and bar.h defines it as “long long”. Now what?
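Concretely, a sketch of the conflict (hypothetical header contents):

    /* foo.h */
    typedef long i64;

    /* bar.h */
    typedef long long i64;  /* error: typedef redefinition with different types,
                               even if both happen to be 64 bits wide */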


They added new keywords just fine, even if it needs a transition phase to go through them.


What I'm really waiting for is zig support for custom range ints https://github.com/ziglang/zig/issues/3806

Also, a minor nitpick I have is that the interval/range syntax in zig is pretty confusing, as it can be both inclusive and exclusive depending on context. Swift does it better imo: ... is inclusive and ..< is exclusive.


I was surprised they didn’t copy Swift on this. It seems just obviously better than two dots.


Groovy has had that syntax for much longer than Swift has existed... Groovy normally copies features from Ruby/Python/SmallTalk so I would think one of those are the source for this syntax.


Most of the time you want to use something like size_t anyways.


do you mean just to save the _t and int substrings, over and over? (if so, i agree)


D uses byte, short, int, long for i8, i16, i32, i64, and ubyte, ushort, uint, and ulong for the unsigned versions. After 5 minutes with the language there is no longer any point to emphasizing the number of bits. Besides, they're just easier to touch type.


I think that trade off made more sense twenty years ago than it does today. So it's understandable that D did this, but doesn't make it a good choice for Zig.

Zig programmers don't need to keep talking about types, so the type (and thus in this case its size) is getting mentioned mostly when that matters, e.g. API boundaries.


Eh the `ix`/`ux` is just as short and more descriptive. D just clings to C-like design to a fault sometimes.

> Besides, they're just easier to touch type.

These two styles are equally fine for touch typing… as long as you know how to touch type (not just how to touch type on the alphabet rows).


But Zig lets you use an arbitrary number of bits... you can write `i4` for example, or `u120` or whatever, which is a pretty great advantage.


True, but can you create a pointer to a 4 bit type? I tried to make that work in D at one point, and wound up abandoning it.

D allows 4 bit types using conventional bit fields (but you can't take a pointer to them).


Yes:

    const std = @import("std");
    const expectEqual = std.testing.expectEqual;

    test "u4 is 1 byte" {
        try expectEqual(1, @sizeOf(u4));
    }

    test "u4 is 4 bits" {
        try expectEqual(4, @bitSizeOf(u4));
    }

    test "u4 is 1-byte-aligned" {
        try expectEqual(1, @alignOf(u4));
    }

    test "pointers to u4 work like any other pointer type" {
        var foo: u4 = 10;
        const foo_p = &foo;
        try expectEqual(@as(u4, 10), foo_p.*);
        foo_p.* = 7;
        try expectEqual(@as(u4, 7), foo);
    }
Packed structs are Zig's replacement for bit fields:

    const std = @import("std");
    const expectEqual = std.testing.expectEqual;
    
    test "bool has a bit size of 1 bit, a size of 1 byte, and an alignment of 1 byte" {
        try expectEqual(1, @bitSizeOf(bool));
        try expectEqual(1, @sizeOf(bool));
        try expectEqual(1, @alignOf(bool));
    }
    
    test "in a regular struct, fields are aligned to their natural alignment..." {
        const Natural = struct {
            read: bool,
            write: bool,
            exec: bool,
        };
        try expectEqual(3, @sizeOf(Natural));
        try expectEqual(1, @alignOf(Natural));
    }
    
    test "...unless otherwise specified" {
        const TwoByteAligned = struct {
            read: bool align(2),
            write: bool align(2),
            exec: bool align(2),
        };
        try expectEqual(6, @sizeOf(TwoByteAligned));
        try expectEqual(2, @alignOf(TwoByteAligned));
    }
    
    test "in a packed struct, fields occupy exactly their bit size" {
        const Packed = packed struct {
            read: bool,
            write: bool,
            exec: bool,
        };
        try expectEqual(3, @bitSizeOf(Packed));
        try expectEqual(1, @sizeOf(Packed));
        try expectEqual(1, @alignOf(Packed));
    }


Sure there is: when I’m parsing or serializing something it’s pretty important I know what size something is.


The size of `byte` is fixed at 8, `short` 16, `int` 32, `long` 64. There is no ambiguity or uncertainty about it.


At that point, why not just call them i/u8/16/32/64? If the sizes are fixed anyway, why come up with different names for them, especially when almost every time you want to select a different integer type it is specifically because of how many bits wide it is? (otherwise, surely you would just use the machine word size?)


Good question.

1. After 5 minutes, you know what sizes they are, and don't need reminding.

2. Easier to touch type.

3. They're just aesthetically more pleasing to the eye.

4. The names aren't really different, they follow the most-used (by far) sizes in C.

5. It's easier to say and hear them. I can say "int" when talking code with someone, instead of "eye-thirty-two".

6. I'm guessing it may be easier for a visually impaired coder with a screen reader.


Just be honest with yourself and say that you subjectively like it better that way, as it grew on you.

There is nothing wrong with that reasoning. Also, there will never be a language which is perfect in every conceivable way, this is such a minor difference that if someone chooses a language over this alone, they are not being reasonable.


A language is nearly all about subjective choices.


Not sure if I parse your argument right, but there are lots of cases when one would care how wide a specific integer type is. Especially in code that works with data in bulk (as in millions/billions of data objects), where not wasting bits really pays off.


It is a couple of lines of typedefs, what's the big deal.
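Roughly this (a common convention, not standard C):

    #include <stdint.h>

    typedef uint8_t  u8;
    typedef uint16_t u16;
    typedef uint32_t u32;
    typedef uint64_t u64;
    typedef int8_t   i8;
    typedef int16_t  i16;
    typedef int32_t  i32;
    typedef int64_t  i64;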


Having spent a lot of time porting 32bit system code to 64bit, I developed a dislike for these explicit types. It's a slippery slope to hard-coding your bitness, with people making assumptions about where size_t or pointers fit.

Now maybe if you're already 64bit that's fine (it's unlikely that we'll ever need 128bit, and code is unlikely to grow down), but for anything starting smaller it's a pain.


Huh?

Zig, like Rust, has two kinds of "primitive" numeric types, the kind which are an explicit size in bits (u8, i16, f64 and so on) and then the word size ones (isize, usize), which are whatever size is suitable for your target machine.

C gets this all muddled because it has named primitive numeric types but their meaning is imprecise, and then it uses a typedef to assign one of these (but you don't know which one) as the native word size. So maybe long is the same as your size_t, and C will just assume you know what you're doing when you write a long where you need a size_t - thus making your code non-portable.
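For example (a sketch; `process` is a made-up function): on LP64 Linux this is fine because long and size_t are both 64-bit, but on 64-bit Windows (LLP64) long is 32-bit while size_t is 64-bit, so the cast silently truncates.

    #include <stddef.h>
    #include <stdio.h>

    void process(const char *buf, size_t len) {
        long n = (long)len;  /* looks harmless, loses bits on LLP64 targets */
        printf("processing %ld bytes at %p\n", n, (const void *)buf);
    }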


Just because your code compiles on a new platform doesn't mean the prior assumptions about the behavior of the types remain valid.


But it should, and that's what stricter typing can help with.


People are already working on a 128bit Linux kernel.


The sizes of floating point are growing "apart" in the AI space. Who's to say what might happen with sizes of types. Maybe it's great to use mostly u8 if you're controlling a worldwide asynchronous networked megacomputer.


They’re better than implicit assumptions that int is 32 bits.


I'm super sold on Zig except that it doesn't make graphics/vector coding with operator overloading possible :( I've heard what Andrew Kelley has to say about it ("that ONE little feature from C++...") but it's just a very sad situation for what otherwise looks like a lovely basis for graphics coding.

Actually, I don't even want operator overloading in general (leading to stuff like the C++ stream API), it's JUST for vector stuff (ideally with swizzles); same for all the shading languages.

Perhaps it's doable to make a little domain-specific language (DSL) in Zig for this purpose?

With vector extensions (and as floh [hi!] points out, also matrices please) I'm 100% ready to seriously consider transitioning from C++, given the smooth C integration.


Last year I dabbled in making a DSL like solution for operator overloading: https://github.com/Laremere/alg

It ends up slightly more verbose in usage, but the statements themselves remain concise. Unfortunately I got a real job that isn't using Zig, so I've stopped working on this. Others can feel free to take up the torch, though.


Don’t even need operator overloading for vectors/matrices. It would be fine if we could simply define new operators (functions with infix notation) and use those. This would allow us to define an operator for matrix multiplication, matrix-vector product, dot product, and cross product.

Having to define these operators as functions with unique text names is extremely ugly and serves no purpose that I can see.


What was the language which used backticks to transform any function into an infix binary operator? Ruby? Perl?

Something like this could actually fulfill Andrew's goal of not having any hidden control flow, as the function would be plainly visible.

The other thing of course would be, what then the valid function names would be? Definitely not `*` but probably `×` or whatever.


> What was the language which used backticks to transform any function into an infix binary operator? Ruby? Perl?

Haskell.


Haskell does this.


Racket


I think the change could be as limited as a change to the parser. You could also enforce hygiene (limiting scope only to places where it is asked for) by requiring it to be

    const @"..." = @import("my_operator.zig").add;
For example. Then in the code the parser transforms

    (a ... b)
to

    @"..."(a, b)


+1 (don't want general operator overloading, but a more powerful @Vector builtin type, basically Clang's ext_vector_type: https://clang.llvm.org/docs/LanguageExtensions.html#vectors-...), plus maybe a similar @Matrix builtin up to 4x4.


I think operator overloading should be fine even for puritans, as long as the language requires the type to be numerical in a strict sense, so that the operators have well-defined semantics. So integers, floating point, complex numbers, vectors, matrices, etc.


How would the language require it to be numerical? For example, if you're defining complex numbers in a library, what would the compiler be checking about your ComplexNumber type before allowing you to define `+` for it?


A few ways. The compiler could enforce commutativity, associativity and other properties of those operators.

Another possible route is to require that all types contained within the exported types are also numerical, or have some specific set of operators defined.


Do you mean at the type level or also for the operational semantics? In the latter case it's undecidable.

Also, you mentioned matrices in the previous comment, but multiplication between matrices is not commutative.


It's undecidable if the language used to define operators is sufficiently expressive. Even simple syntactic constraints would work well enough for most scenarios, like "operators may only be defined by an expression containing other operators".

New types also don't have to inherit the properties of the operators they use.

In any case, my point was that there are multiple avenues to explore in providing operators in a way that don't compromise the compiler's ability to optimize numerical code.


The problem with operator overloading is not really numericity, IMO. The problems are:

- hidden control flow (function calls should look like function calls, an operation should never mask a function call)

- global weirdness. If a library changes the language, how does that affect some other code you're pulling in? Where do you effect those changes?


Some CPUs don't have a MUL or (like some 32-bit RISC architectures) DIV instruction, and on these, the C compiler has to fall back to a function call.


Are you sure the compiler doesn't inline it? That an operator might take more than one opcode is, I think, uncontroversial.


Multiplication takes more than a handful of instructions if the hardware doesn't do it.

https://godbolt.org/z/ccYPaz1Pj

For this example (AVR), this is the int multiply function __mulhi3:

    00: 00 24  eor  r0, r0
    02: 55 27  eor  r21, r21
    04: 00 c0  rjmp .+0
    06: 08 0e  add  r0, r24
    08: 59 1f  adc  r21, r25
    0a: 88 0f  add  r24, r24
    0c: 99 1f  adc  r25, r25
    0e: 00 97  sbiw r24, 0x00
    10: 01 f0  breq .+0
    12: 76 95  lsr  r23
    14: 67 95  ror  r22
    16: 00 f0  brcs .+0
    18: 71 05  cpc  r23, r1
    1a: 01 f4  brne .+0
    1c: 80 2d  mov  r24, r0
    1e: 95 2f  mov  r25, r21
    20: 08 95  ret


I mean, pedantically speaking the normal Zig operators have control flow: in checked mode they panic on overflow, etc. (you're not supposed to recover from that, but I suppose it's possible). There's probably (internal) control flow in the saturating operators, etc.


Not always. I’ve seen division and atomic operations stay as function calls to libgcc. However, semantics of these are well-defined either way, unlike user-defined operators, so existence of a hidden call is at most an inconvenience when linking code from different compilers.


> - hidden control flow (function calls should look like function calls, an operation should never mask a function call)

You can restrict operator definitions to expressions containing other operators. Operators are now guaranteed to expose only as much control flow as the underlying operators would already expose.

> - global weirdness. If a library changes the language, how does that affect some other code you're pulling in? Where do you effect those changes?

I'm not sure what you mean. Behaviour depends on language semantics. You don't specify why operators are uniquely weird on this compared to literally any other function call when language semantics or library behaviour changes.


Your atomic increment may turn into a function call if it’s targeting multiple microarchitectures.


> hidden control flow (function calls should look like function calls, an operation should never mask a function call)

An operator is either a function call or some simple built-in operation.

Thus not hidden.

That syntactic elements that consist of symbols rather than alnum and that are used as prefix or infix operators are necessarily “not a function call” is just a rule that can be changed.


> An operator is either a function call or some simple built-in operation.

In theory, yes. In practice, an operator is a simple built-in operation 99.999% of the time, which lures programmers into thinking that's how it always is.

It's like a self-driving car that's safe 99.999% of the time but still requires you to continuously pay attention to take over at a second's notice.

Humans just don't work like that.


> In theory, yes. In practice, an operator is a simple built-in operation 99.999% of the time, which lures programmers into thinking that's how it always is.

Noo it isn’t. Many languages use something like `+` for string concatenation. Maybe list concatenation.


Maybe I'm small brained but built in vector types align with heavily platform optimized libraries. Meaning you don't want this stuff implemented in the compilers front end. You want it handled in the compilers back end.


Performance is equivalent in both representations, but the important thing is being able to write

  vec4 c = a * 4 + b;
instead of

  vec4 c = vec4_add(vec4_mul(a, 4), b);
i.e. infix vs prefix (function call) order, with simple * and +/- operators etc. I would very much like to appeal to the Zig authors to see the beauty in the former expression compared to the latter, for a huge class of real-world mathematical applications.


If this wasn’t math code we’d refactor such code into helper functions, but somehow when it’s math we refuse to.

  vec4 c = a * 4 + b;

  vec4 c = vec4_add(vec4_mul(a, 4), b);

  vec4 c = vec4_fma(4, a, b);


Are Clang’s builtin types interoperable with __m128 intrinsics?


Mathematics is such an unprincipled mess of DSLs. I wonder why math domains always come up as the sticking point with the caveat of “but not operators for anything else, though”.

It would be nice, for sure, to be able to define some infix combinators.

Anyway, it’s bad enough that arithmetic has pressured almost all programming languages to adopt operator precedence. (No operator precedence other than either go-left or go-right is so much simpler. Which would also mitigate some of the complaining about custom operators since then they just become infix functions.)


I agree with you on a theoretical level, but arguably we should not be the ones to bear the complexity of that decision — the representation of given data is very important for our “limited” human brains.

I think the major problem here is shoehorning everything into a “symbol soup” on a 2D matrix. I do get that it has plenty of advantages (can be typed without a special program, easier version control, etc.), but I would like to see a resurrection of interest in visual languages. I don't mean something as visual as Scratch, but some special blocks could come in handy.


As a comptime macro, yes, you should be able to slot in a DSL for all your vector math and essentially "pass in a string, get the function calls back". It would not be that hard, if you're up on your parser-writing skills, and definitely much cleaner than any C-style equivalent.


Yep that seems comptime possible but a little heavy handed with string work, no? With everyone writing their own parsers and language tools...

In any case, if something practical is done here I'd be interested. Aesthetically I prefer the "vectors and matrices are first class types" (and maybe complex numbers could be too), but I guess Andrew isn't so sympathetic to these types being part of the standard language :/


There's a proposal for complex numbers: https://github.com/ziglang/zig/issues/16278

Vectors are already first-class types via @Vector, aren't they?


Whoa, I haven't been keeping up to date about this, there seems to be good related discussion here: https://github.com/ziglang/zig/issues/7295

I'll hope for some future news of Zig having vector and complex number support! It will be a great day for fast ray/path tracers :)


Unfortunately, vectors in Zig are the kind you use in SIMD and not in graphics programming. Perhaps there's some overlap, but I'm unfamiliar with both fields so I can't say to what degree.


It would be nicer still if vectors and matrices were first class citizens. That way you can hide all of that stuff and it opens up across the board optimization routes that are otherwise much harder.


Agreed, that's basically what I'm asking/begging; don't need operator overloading, but please do give me vectors and matrices.

Edit: I asked in the Zig Discord and was told "it's been denied many times", oh well :/ Fair enough, it's their language to control.


That's a weird answer. If something is denied many times that shouldn't count as a reason to deny it again but to question whether that denial is appropriate.


Odin does this afaik.


Not all that useful but fun: you can implement swizzling with comptime
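Something like this (a rough sketch with made-up `Vec4`/`swizzle` names, not a std API): the component pattern is a comptime string, so the lookup of each component index happens entirely at compile time.

    const std = @import("std");

    const Vec4 = @Vector(4, f32);

    fn swizzle(comptime pattern: []const u8, v: Vec4) [pattern.len]f32 {
        var out: [pattern.len]f32 = undefined;
        comptime var i = 0;
        inline while (i < pattern.len) : (i += 1) {
            // resolve the component letter to an index at compile time
            const idx = comptime switch (pattern[i]) {
                'x' => 0,
                'y' => 1,
                'z' => 2,
                'w' => 3,
                else => @compileError("invalid swizzle component"),
            };
            out[i] = v[idx];
        }
        return out;
    }

    test "zyx swizzle" {
        const v: Vec4 = .{ 1, 2, 3, 4 };
        try std.testing.expectEqual([3]f32{ 3, 2, 1 }, swizzle("zyx", v));
    }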


I'm no expert on either language, but I tried Zig for the first time properly the other day. I really liked it up until I hit hashmaps. One thing that I think goes under-appreciated about C (and other languages with a similar paradigm) is that it's pretty upfront and clear about what it can and can't do out of the box. If you want hashmaps in C, you need to create your own implementation, otherwise think of a way round them. In Zig meanwhile, they exist in the standard library, but at least for me, were more work than they were worth. The result was the same: when making my test project in both C and Zig I ended up not using hashmaps, the difference is that in C I came to this decision in 2 minutes rather than 2 hours.

I imagine if I had more experience in Zig, none of that would be a problem, and its not aiming to be a beginner-friendly language anyway, but that was just my experience of it so far


As counter point, the way that Zig implements hash maps is absolutely instrumental for TigerBeetle.

Zig hashmaps _are_ quite a bit more cumbersome than, eg, in Rust --- you need to decide whether you want the allocator to be bundled with the hashmap, and it's also up to the user of the hash map to provide equality and hash code (and, of course, there's manual defer instead of RAII).

However, this flexibility and verbosity comes with a super-power --- you can pass an allocator to a hash map at creation time, and then _not_ pass an allocator when you actually use the hash map. This is huge. This means that we get a compile-time guarantee that we follow rule 3 of NASA's 10 rules

    3. Do not use dynamic memory allocation after initialization.
And we _still_ can enjoy using a hashmap from the standard library! No other language I know has an equivalent tool.


You can achieve the same thing in Rust or any other language that offers some sort of field privacy (which Zig doesn't btw) by wrapping the hash map with your own type and exposing the interface you want.


Not quite std, but Rust's heapless lib lets you use a statically-allocated Hashmap. The distinction is its max size is declared on construction.


That's not quite it: with heapless, memory is fully static, the size of hash map is a compile-time parameter, which is a part of HashMap's type.

In Zig, the map could be initialized and sized at runtime, but you still can enforce, statically, that it doesn't do any allocations after that.

In both nightly Rust and Zig, heapless version can be expressed by passing a fixed-buffer-backed allocator to the standard hash map.


I don't quite understand how you can change the size of a Zig hashmap at runtime (by, for example, inserting 1M items at some point) while at the same time being able to enforce at compile time that such an operation will not result in dynamic allocation?


Compare these two methods:

https://github.com/ziglang/zig/blob/0dffab7356685c7643aa6e3c...

https://github.com/ziglang/zig/blob/0dffab7356685c7643aa6e3c...

One of them requires passing an allocator (and can allocate), the other doesn’t have an allocator argument (and thus can’t allocate). If you only pass an allocator to the `init` method of your application, and don’t store it anywhere, only init will be able to call allocating methods; the rest of the app will be allocation free, by construction.
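A minimal sketch of the pattern (not TigerBeetle's actual code; the names here are made up): capacity is reserved once in `init`, and everything after that only uses the non-allocating methods.

    const std = @import("std");

    const App = struct {
        map: std.AutoHashMapUnmanaged(u64, u64) = .{},

        // the only place an allocator is accepted; all capacity is reserved here
        fn init(allocator: std.mem.Allocator, max_entries: u32) !App {
            var app = App{};
            try app.map.ensureTotalCapacity(allocator, max_entries);
            return app;
        }

        // no allocator parameter, so this can never allocate
        fn record(self: *App, key: u64, value: u64) void {
            self.map.putAssumeCapacity(key, value);
        }
    };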


If the second one can't allocate then how does it handle the case where you don't have enough capacity to insert the new (k,v) pair?

I can see that the difference between the two is the self.growIfNeeded() call, https://github.com/ziglang/zig/blob/0dffab7356685c7643aa6e3c..., which the one that doesn't allocate doesn't have. Does it assume that the predefined capacity will not be reached?


> Does it assume that the predefined capacity will not be reached?

Yes, the function name says exactly that as well, though admittedly what this actually means and what consequences it might have if the assumption is false are probably fairly opaque to a novice user.

> how does it handle the case where you don't have enough capacity to insert the new (k,v) pair?

Breaking the invariant results in safety-checked undefined behavior, that's what the "asserts" in the doc comment signifies[0]. Basically, if something goes awry at runtime you'll get a crash along with a nice error trace in Debug/ReleaseSafe mode or with the appropriate @setRuntimeSafety call.

If instead you'd like to have errors that you can handle, you could use your real allocator where you expect to actually use it and pass a failing allocator[1] everywhere else (but that's sort of abusing the API IMO, I don't know if I'd actually recommend you do this).

[0] https://ziglang.org/documentation/master/#Doc-Comment-Guidan...

[1] https://github.com/ziglang/zig/blob/0dffab7356685c7643aa6e3c...


Right, that's what I thought is happening under the hood. Thanks for confirming.

This also means that there is nothing novel about this approach that cannot be achieved in other programming languages as one of the parent comments claimed.

This is simply a hashmap with pre-allocated pool of memory.


I don't know, but you could avoid resizing the hashmap and allow the collisions but then your buckets would still need resizing


The hash map I'm used to in C uses intrusive lists so it's pretty much allocation free. Would they fit the bill?


"Pretty much" is great in most contexts, but in this one I think GP is pretty darn strict about having absolutely zero dynamic allocations


By "pretty much", I meant that the bucket lists are already allocation free thanks to intrusive lists; of course the hash table itself can uses allocation when it's resized, or not, depending on user choice. It's trivial to preallocate it or use an arena or whatever, since it's just a flat array.


> No other language I know has an equivalent tool.

I completely agree: this has been my experience as well.


I can't remember having any issues with using hash maps in Zig when I tried it out during Advent of Code. They were almost as easy to use in Zig as in Rust. Only extra complication was returning error on failed allocation rather than panicking.


Would you mind elaborating on the complications you ran into? I ask because I think you used std.HashMap instead of std.AutoHashMap, the latter of which automatically chooses a hash function based on the types provided.


It was a little over a week ago now so my memory is a bit hazy, but I'll try to answer to the best of my knowledge.

Without getting into the weeds of why (happy to do so, just want to keep this readable), basically I needed to define and populate a hashmap in a new script and then import it into my main script, which to my mind left me with two options:

* Define and initialise it at the same time (my preferred method) as a constant. I don't have much to say on this as iirc, I had no luck with it at all, never even got close.

* Define it in a (public) function, add each entry with "put" and then return the hashmap. I tried with std.AutoHashMap and various other things, but as far as I could work out there was no hashmap type I could name, so it wouldn't accept my return type.


I think for the first point you wanted a block that evaluated to the map, unsure if that’s what you wanted though


I think so, if I write it out in pseudo code it might make more sense. What I was trying to do was pretty much:

    const mymap = {1: "hello", 2: "world"};
But the only thing I could find any answers for was something more like:

    const mymap; mymap.put(1, "hello"); mymap.put(2, "world");


Here's how you return a hashmap from a function:

  fn buildMap(allocator: std.mem.Allocator) !std.AutoHashMap(u64, u64) {
      var result = std.AutoHashMap(u64, u64).init(allocator);
      errdefer result.deinit();
      try result.put(10, 100);
      try result.put(20, 200);
      return result;
  }
Your #1 option should be possible once Zig has comptime allocators -- it's on the roadmap, but not possible yet iirc.


If I remember correctly I tried almost exactly that, I think it was just the bang operator I missed out (obviously essential in this case).

Cheers for letting me know though!


> Nothing allocates on the heap, without you knowing it and letting it happen. Zig utilizes the Allocator type to achieve this. Any function that allocates on heap receives an Allocator as parameter. Anything that doesn't do so won't allocate on heap, guaranteed.

That might be true for the standard library, but it is definitely possible for a function to use an allocator from a struct or a global. Not to mention calling a C function that allocates.

> Safety tools to avoid memory leaks e.g. std.heap.GeneralPurposeAllocator

Maybe I'm missing something, but AFAICT, that doesn't prevent memory leaks, it just has the ability to log if there were leaks. Like a built-in valgrind.

> [Zig] helps you remain safe and avoid leaks

So it talks a little about how to _identify_ memory leaks, at runtime. Which is also possible in C with tools like valgrind. But it doesn't mention defer, which is an advantage zig has over c (at least portable c) for memory management. And it doesn't talk about the "safe" part at all, which I would take to mean protections against use-after-free, double-free, uninitialized variables, invalid free, etc.


I wrote a similar article but focused on how Zig design enable better optimizations than C. https://zig.news/gwenzek/zig-great-design-for-great-optimiza...


Meh. In your article, you could always mark the function f() as pure or const and get the hoist
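For example (a sketch using GCC/Clang attributes; the loop and names are made up for illustration):

    /* result depends only on the arguments, so repeated calls can be hoisted */
    __attribute__((const)) int f(int x);

    int sum(int x, int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += f(x);  /* with the attribute, f(x) can move out of the loop */
        }
        return total;
    }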


You can repeat "C is a minimal abstraction over assembly" as many times as you want, it doesn't make it true.


I’m just old enough to remember when C and Pascal were called high-level languages.

“C is actually assembly” was quite a plot twist from there.


You can thank Rob Pike for that, about a decade ago he wrote a Google Plus post arguing that one of the things C brought to the table was that it essentially was portable assembly compared to most other high level language projects at the time. I guess the tiny sliver of nuance that the "essentially" added was quickly lost.


Technically that's true because of the "C virtual machine", but pragmatically, C is still the "lowest-level high-level programming language" at least among the popular programming languages (arguably only Forth is lower level, but Forth isn't exactly mainstream).

(and I'd argue that C is closer to assembly than assembly is to what's actually happening inside the CPU, e.g. assembly itself is a high-level abstraction layer that's still pretty close to C - which isn't all that surprising because both probably developed as a symbiosis over time - especially when you look at all the non-standard language extensions in various C compilers)


You're thinking of the C abstract machine, not a virtual machine. Abstract art is when this is a painted blue circle but it's about the feeling of sadness when losing somebody close to you - virtual art is when somebody persuades you a crappy GIF of a monkey is worth a million dollars.

And no, it's just not usefully true to model things this way. The C abstract machine is pretty weird even compared to a PDP-11, and your modern computer is nothing like a PDP-11.

C was intended to be efficiently implementable, so that's nice, but it has numerous defects in practice in this regard, because it pre-dates a lot of discoveries about how to implement programming languages.

The machine doesn't have types. At all. They're just not a thing. C has types. They're not very good types, and they're poorly implemented, but they are definitely types. Several other languages from that era don't bother, C does because it's a "high level language" and you'll do better embracing that understanding than trying to pretend it's assembler.


Assembler has types: bytes, words, floats, addresses, even strings and "functions". They are easily worked around by design though, in similar ways in assembler and C.


C is an abstraction over assembly. It is also minimal compared to prolog.


So is JOVIAL, FORTRAN 66, PL/S, BLISS...


May I suggest you give arguments why it is not (and why we should care about those arguments from a practical standpoint), and what language should better earn the title?


Others have already made a few in response, but for reference, this ACM queue article [1] is a good place to start. It has been widely discussed both on HN and elsewhere, and a simple search of the title can bring up several counter-arguments etc. if you're interested and want to know more.

> what language should better earn the title?

None.

1: https://queue.acm.org/detail.cfm?id=3212479


This article states that "C is not a Low-Level Language", not that it is not a thin abstraction over assembly. The arguments in the article could as well be used to make a point that "Assembly is not a Low-Level Language".


Could you specify the main ways it isn't? I have limited experience in both but I haven't seen much that suggests otherwise


Many features of C do not directly correspond with most modern assembly languages. You cannot predict the exact assembly it will generate without knowing a lot of details about your compiler and platform, and even then it's often iffy. It seems like a bit of a leap to call something "a minimal abstraction" if you can't even correctly describe how an operation in the abstraction corresponds to operations in the lower level.


That's mainly the result of optimizer passes, C itself doesn't have much to do with it.

Assembly languages actually haven't changed all that much since the 70's, but compilers have improved a lot. The output of early C compilers did indeed match the source code quite closely (and not just on the PDP-11), but even today that's true if you disable optimizations (and even with optimizations is usually pretty straightforward to map the C source to the assembly listing - if you're somewhat aware what optimizer passes in modern compilers are doing).

Of course CPU ISAs are already human-friendly abstractions over what's actually happening down in the hardware.


C has everything to do with optimizer passes, the language definition is what allows them to happen! The fact that you get about what you'd expect on -O0 is merely incidental, the specification does not afford you this.


It's true and false at the same time. The operations C gives you map 1-to-1 with assembly. Given some C code you can quite accurately predict which loads/stores will be elided by the compiler and what the resulting assembly will be.

I can't name another language for which this is true.

I get that you're hinting at the insane level of undefined behaviour enforcement by compilers, but I don't think it matters all that much once you understand how optimizing compilers work.


> The operations C gives you map 1-to-1

It does not. Implicit casting, inlining, volatile, args passed via registers vs stack etc can significantly change what you expect to be generated.


And conversely you don’t have access to a few things assembly can do: arbitrary stack access, PC register, flags. Some operations like bit rotation, zero bit counting, or fancier addressing modes have informal code patterns with a hope they’ll optimize right, but nothing guaranteed in the standard.
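For example, the informal rotate-left pattern (a sketch; most compilers recognize it and emit a single rotate instruction, but the standard doesn't guarantee it):

    #include <stdint.h>

    static inline uint32_t rotl32(uint32_t x, unsigned r) {
        r &= 31;
        return (x << r) | (x >> ((32 - r) & 31));
    }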


Pretty much a matter of optimization isn't it? Try disabling them. But I take it that this was never the point anyway. I think the point is that the representation of language objects and the runtime are rather straightforward compared to many other languages.


Only one of those (inlining) is an optimization. Two are language features (implicit casts and volatile) and the other is a calling convention (passing arguments on registers vs. stack).


Calling conventions aren't (mainly) a C feature either though, but are defined by the ABI specified for a specific OS/CPU combination (and all languages which want to talk to system APIs need to implement that ABI, not just C).


> It's true and false at the same time. The operations C gives you map 1-to-1 with assembly. Given some C code you can quite accurately predict which loads/stores will be elided by the compiler and what the resulting assembly will be.

I mean, can you though? After all you won't even have the same output depending on compiler and flags. And of course not all architectures have the same capabilities, so the same code can compile to a various number of instructions depending on the target architecture.

Not to mention things like bitfield access that can result in non-atomic load/stores for a simple `foo->bar = 1;`

I'm not sure in what sense you could say that C operations map 1-to-1 with assembly any more than Rust, C++ or basically any compiled language.


> The operations C gives you map 1-to-1 with assembly

    int test(int x, int y) {
        return x % y;
    }

    test:                                   // @test
        sdiv    w8, w0, w1
        msub    w0, w8, w1, w0
        ret
Surprisingly, assembly doesn't have a remainder instruction here, but instead it has a multiply-then-subtract instruction which doesn't correspond 1-to-1 to anything in C.


x86 doesn't have remainder. Other instruction sets do. But yeah I agree with your point. Some targets might not even have multiply.


x86 has remainder: IDIV calculates both the quotient and the remainder at the same time and places them in different registers just like many other ISAs before it did. In fact, DIV instruction of PDP-11 worked that way too:

    Description:    The 32-bit two's complement integer in R and Rv1 is divided
                    by the source operand. The quotient is left in R; the remain-
                    der in Rv1. Division will be performed so that the remainder
                    is of the same sign as the dividend. R must be even.

    Example:        CLR RO
                    MOV #20001,R1
                    DIV #2, RO

                       Before          After
                    (RO) = 000000   (RO) = 010000   Quotient
                    (R1) = 020001   (R1) = 000001   Remainder
PDP-11 also had ADC and SBC (add/subtract with carry) instructions which weren't (and still aren't) exposed in C either.


> x86 doesn't have remainder

The code in the comment you replied to is 64-bit ARM assembly.


Oops of course.


In C, unlike pretty much every other language, you can't even know how big an integer is.


Of course you can, but this is implementation defined. So you need to create code that does different things based on the underlying implementation. Every reasonably large C code base checks the current implementation to define what type of integers, pointers, etc. they're dealing with.

This is by design, because C was created to deal with disparate processors and OSs. When it was created it would be difficult and unwise to assume that an integer has a size of 2 bytes, for example, since each machine would define its own preferred length.


> When it was created it would be difficult and unwise to assume that an integer has a size of 2 bytes

It was still a bad design. Of course this is in hindsight, but history has taught us it would have been much better if the default was specific sized integers, with an option to use `uint_fast32_t` or whatever.


Then we would be stuck with 16 bit integers. We are lucky to be at a point in time where integer types have been stable for a while, but it wasn't always so.


We wouldn't. People would update their code eventually or opt in to variable sized integers.


History says otherwise: C was so successful it is still widely used.


Just because something is popular doesn't mean it's good. Tobacco smoking is also popular.


Some of the best software we have to date was written in C. If you like it or not, it is a magnificent tool.


It would still be good if it was written in a different language. Probably even better because the developers would have more time to improve the software instead of reinventing wheels, writing boilerplate code and chasing down segfaults.


> It would still be good if it was written in a different language.

But it wasn't, C made all these software tools possible.


No it didn't. C happened to be the most popular language for a long time but that doesn't mean people couldn't have written software without it, or that if it hasn't been so popular a better language wouldn't have arisen.

English is very popular but you wouldn't say "English is what made all those books possible" would you?


Yes it did and continues making it possible. Several of the largest codebases in the world are written in C.


To reiterate, C didn't make it possible to write those codebases just because they happen to be written in C.


Yes, it did. Just check how they created UNIX.


Sure you can, it's called sizeof(int). Now read the other comment to understand why.


Naturally it doesn't show how it tackles use-after-free cases.


Yup. It offers tests as a solution, which is nice I guess, but kind of reminds me of Valgrind.

Another thing that caught me by surprise is passing arrays as references. Ages ago we used to pass arrays (`int arr[100]`) as pointers (`int*`). I'm sure C still supports this?


Just wondering - how well does Valgrind work on Zig projects?


Exceptionally well. Zig by default outputs valgrind client requests to annotate undefined memory. This makes Valgrind an even more effective tool for Zig code than for C code.


Rather than offer drive-by criticism, perhaps you could illustrate 'the problem' you are suggesting is being hidden?


The problem I believe is being discussed is that dereferencing a pointer after the memory it references is freed does not produce a warning at compile time. This means the programmer has to manually keep track of pointers, making sure not to free the relevant memory until there are no more references to it lingering about. While there are programming techniques which to a lesser or greater extent prevent problems from occurring, e.g. by allocating all memory needed at the beginning of the program and freeing all of it at the end, doing no (de-)allocation while the program is running, such solutions still require some discipline on the part of the programmer and for some it is a tall ask.


So, Zig set out to solve all but the biggest issue with C?


The biggest issue with C is arguably buffer overflows, not use-after-free (Zig offers spatial but not temporal runtime memory safety).

If you want both, but don't want a Rust-style borrow checker (with all the restrictions this entails), use 'tagged-index-handles', those work just fine across all languages (and even make sense in Rust):

https://floooh.github.io/2018/06/17/handles-vs-pointers.html
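A rough sketch of the idea in Zig (the Pool/Handle names here are made up, not the article's API): objects live in a fixed array, a handle is an index plus a generation counter, and freeing a slot bumps its generation so stale handles are detected instead of dereferencing freed memory.

    pub fn Pool(comptime T: type, comptime capacity: usize) type {
        return struct {
            const Self = @This();
            pub const Handle = struct { index: u32, generation: u32 };

            items: [capacity]T = undefined,
            generations: [capacity]u32 = [_]u32{0} ** capacity,
            len: u32 = 0,

            // simplified: bump allocation only, no free list or capacity check
            pub fn alloc(self: *Self, value: T) Handle {
                const i = self.len;
                self.len += 1;
                self.items[i] = value;
                return .{ .index = i, .generation = self.generations[i] };
            }

            // invalidates all outstanding handles to this slot
            pub fn release(self: *Self, h: Handle) void {
                self.generations[h.index] +%= 1;
            }

            pub fn get(self: *Self, h: Handle) ?*T {
                if (self.generations[h.index] != h.generation) return null; // stale handle
                return &self.items[h.index];
            }
        };
    }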


That seems to be what OP is implying. I haven't used Zig, so I can't say. I suppose people who want to avoid memory management related bugs who can't accept a GC language can use Carp.


The problem is that Zig is basically Modula-2 for C syntax lovers, plus metaprogramming.

It is hardly safer than Modula-2 already was over C already in 1978.

We are in 2023 now.


You're right, but it doesn't matter: Modula-2 didn't catch on, Zig has a small chance to succeed thanks to its focus on embedded and on its good tooling for cross-compilation.


Only makes sense if they fix the remaining security flaws, otherwise it hardly changes the stance that S stands for security in IoT.


Isn’t that a pretty clear, if terse, description of the problem?

Use-after-free is a well known problem in the manual memory management space and one that Zig infamously does not tackle.

For those who aren’t familiar, it’s the use of a resource (a pointer) after it is no longer available for use (or has been freed). Which can in turn result in accidentally accessing unexpected data or crashing.


I never tried Zig, but couldn't you, instead of deferring the actual resource release, replace it with one that crashes at compile time? This is a naive suggestion based on loose inferences from features that seem highlighted in Zig: comptime and defer.


compile time may help for some stuff but if you’re passing pointers to a function, you don’t know what will happen there. You might get a use after free or a double free.

Like with all languages, if you're leaving default safety up to programmer convention, the programmer will let you down eventually. Rust, Swift, and any language where raw pointers are the exception, not the rule (any GC, ref-counted, or borrowing language really) all switch the defaults around so that you're not dealing with memory management yourself unless you absolutely want to.


I remain confused about how Zig makes it to the front page of Hacker News so often. As far as I know, no one is using it in production after 7 years of development.

Are people just really excited about this project? Are the people behind it just really good at marketing?

Honest question: why do we all keep talking about Zig?


> As far as I know, no one is using it in production after 7 years of development.

- Uber uses Zig to produce hermetic builds of their backends and was able to move their C/C++ codebases to arm64 thanks to Zig's C/C++ cross-compilation support.

https://www.uber.com/en-US/blog/bootstrapping-ubers-infrastr...

- Bun is written in Zig and its sudden success was big enough to cause Deno to have an identity crisis.

- TigerBeetle is written in Zig and it's probably the most promising upcoming database company out there.

- There are a few more notable examples, but I haven't been given permission to talk about them publicly yet.

> Honest question: why do we all keep talking about Zig?

> Are the people behind it just really good at marketing?

The project is simply inherently interesting. For one reason or another, nobody figured out a proper build & cross-compilation experience for C/C++ before Zig did it, just like nobody has figured out a good package manager yet for C/C++ and we're about to do it.

When Apple Silicon came out Zig was the first compiler able to cross-compile for it (way before LLVM btw) thanks to our custom linker, and thanks to it and the fact that we're working on our own custom backends, we're also going to achieve incremental compilation with in-place binary patching, ultimately reducing linking time to basically zero for incremental rebuilds.

Zig is also the only language that has async/await but that doesn't have an ugly & wasteful split between blocking and evented versions of the same networking library.

I'm stopping here with technical arguments, but there would be more to talk about.

Lastly and most importantly, the project has an interesting approach to finances and governance: we've had a non-profit foundation for longer than Rust and we have made a point to never, ever, let big tech companies influence the governance of Zig.

The Zig Software Foundation pays its developers (instead of hiring copywriters & marketing people) and 90+% of what we make through donations or support contracts goes to developers. The rest is infrastructure and administrative costs (eg CI, accountant).

I'm the guy who's most in charge of marketing and I'm currently working on the automated documentation system because Zig pretty much markets itself.


> Bun is written in Zig and its sudden success was big enough to cause Deno to have an identity crisis.

VP of Community strikes again.


You feel it's an unfair statement?


It’s got nothing to do with that.


Was apple silicon the thing that brought "caring about cross compilation" back from the dead in general? small-scale embedded use never really went away, but it was always a distinct niche...


Do you know of any in depth reviews of Zig’s async implementation? It would be great to see a walk through of the choices and decisions made along the road to the final version.


Apple shipped a LLVM cross-compiler for Apple silicon on day one? Not entirely sure what you mean here.


Apple shipped a compiler with the M1 Mac, yes, but if you wanted to compile for Apple Silicon from say a Linux x86_64 machine, there was no toolchain that was able to do it other than Zig.


Theos has supported Linux and Windows for many years.


Somehow no one managed to make cross-compilers work, even though it is the standard way of working in embedded and game consoles for decades.


Yeah idk. I don’t want to be snarky but “we had a cross compiler for ARM with a few tweaks” that had already had hardware shipping for half a decade doesn’t sound all that impressive.


Indeed.


Andrew Kelley's intro[0] is probably the best talk I've ever seen, and his continuing leadership seems really great to me and worth supporting. Besides my reservations about its use in graphics programming (see my other comment), it looks very attractive in many practical ways (build system!! fuck CMake!) that I could see moving to from how I currently write my hobby code, which is sometimes a prototype for future commercial code.

[0] https://www.youtube.com/watch?v=Gv2I7qTux7g


This is easily one of the best programming talks I have ever seen, I loved the "no compiler magic" compile-time-known string formatting, and generics literally becoming a thing without magic either.


Zig solves real problems no other language tackles in the same way.

We do embedded development for medical devices, and zig fits perfectly because we can use static allocators and have all the features of the language.

It works very well; we write firmware for the STM32 series of chips.

We can compile the same code to WASM and run it into a simulator, it works great for us.


I can't answer you why Zig keeps showing up on HN (except for the surface-level answer that people keep submitting and upvoting it) but in terms of not being used in production, the Bun project, a Node/Deno alternative, is seeing a good deal of momentum by the people who like JavaScript a bit too much. It's probably the most widely used Zig project so far, including in production.

https://bun.sh


Who uses bun in production?

Seems totally insane to me to use a pre-1.0 application written in a pre-1.0 language in production considering Node and Deno basically do the same thing.


When the company you work for is burning VC money with no hope or intention of profitability, the only rational technology choice you can make as an engineer is to pick a promising but unproven technology which could boost your resume and help you land your next gig more easily. Or you simply pick it because you like and want to work with it and there is no one to stop you.

I'd be very happy to hear I am wrong about this and see a profitable company with an established product using it.


> Or you simply pick it because you like and want to work with it and there is no one to stop you.

Is there a problem with this? If you have a problem to solve and nobody's telling you you have to use certain tools to solve it, why not try a new tool yourself and see if it solves it better than the old ones? I do this whenever I can on client projects and I've found some rather pleasant tools this way.


I'd argue that nowadays a pre-1.0 version number usually means "this doesn't have all the features I want it to have yet" more often than "this is buggy and shouldn't be used in production yet." I can't speak for Bun, but I know Zig has an extensive test and timing suite to help avoid bugs or significant slow-downs making it into new releases.

As for who's using Bun in production, I don't know. But it's been around for a while and seems to have decent buzz around it in the JS community when observed from a safe distance (the proper way to observe the JS community), so I assume someone is. Take that for what it's worth.


TigerBeetle is using Zig in production.[0] I imagine some of Zig's other corporate sponsors are using it as well.[1]

[0] https://tigerbeetle.com/

[1] https://ziglang.org/zsf/


Zig is kind of fun to write to be honest. I mean about as fun as a systems programming language can be.


One thing nobody has pointed out yet - and this is not a knock on any language whatsoever but more a commentary on HN itself, so take it with a grain of salt: if you've followed this site since the beginning, programming languages cycle through here with a hype phase every few years like clockwork.

The one most people jump to from the past few years is Rust and/or Go, but we've experienced this with other languages like Lisp (& variants), Haskell, Scala had a small window, etc. Zig will no doubt have its own moment like this (if it's not already), and then the hype cycle will settle into wherever Zig is used best.

Think of it like a magnifying lens on overall programmer sentiment/interest.


Go was mostly hated on when it came out iirc.


You may be right - I paid less attention to that era of HN.


HN needs a way to counteract all Rust hype.


Because this is Hacker News, not corporate blub news. JFC


Eh if you count “code in production” Java and C would win, each in different category of software.

But they are boring; nerds like shiny new objects.


One thing that was unclear to me based on my prior noodling with Zig was what the plans were for supporting interface or trait-based polymorphism.

I've been falling in love with C++'s std::experimental::is_detected and am looking forward to being allowed to use C++20 in my environment since it will bring Concepts, but in my cursory examination it seems Zig seems to prefer vtable abstractions like Allocator.VTable.


Zig does have some kind of interfaces, check [1]. I am interested to learn how Traits in Rust and Interfaces in Go behave differently from this concept.

[1] https://github.com/ratfactor/ziglings/blob/main/exercises/09...


This appears to be done by union-ing the types in question.

The thing about Go interfaces (and C++20 Concepts) is that you can name a type that contains certain methods or behaves a certain way, but you don't actually have to inherit from the interface or concept explicitly - anything shaped correctly that conforms to the interface or concept will work.

And at least with Go, if you try and pass something in that doesn't conform to the interface, it is very particular about telling you what you're missing. One downside of C++ templates is that if you have a problem with what you're passing in, you might get a horrendous error message somewhere deep in the implementation - or worse, your code might compile just fine, but have unexpected behavior - instead of a nice "Hey, you need to add a method named `foo` to this type."


That was a nice, clear explanation that even I could understand. One thing I wasn't sure about, though.

  var arr = [_]u32{ 1, 2, 3, 4, 5, 6 }; // 1, 2, 3, 4, 5, 6
  const slice1 = arr[1..5];             //    2, 3, 4, 5
  const slice2 = slice1[1..3];
If I were to use indices out of range, presumably that would give me a compile time error? Could I use variables as the indices and, if so, could I get a seg fault at runtime or are there some sort of guards on there?


> If I were to use indices out of range, presumably that would that give me a compile time error?

yup

    $ zig test test.zig
     test.zig:8:56: error: index 3 outside array of length 2
        std.debug.print("\ncompile error: {d}\n", .{slice2[3]});
> Could I use variables as the indices and, if so, could I get a seg fault at runtime or are there some sort of guards on there?

Bounds are checked by default:

    $ zig test test.zig
    test.zig:9:56: error: index 3 outside array of length 2
        std.debug.print("\ncompile error: {d}\n", .{slice2[invalid_index]});
But you can disable them by building in `ReleaseFast` mode[1]

Here's a test file you can use to play around with it if you like:

    const std = @import("std");

    test "bounds" {
        var arr = [_]u32{ 1, 2, 3, 4, 5, 6 }; // 1, 2, 3, 4, 5, 6
        const slice1 = arr[1..5]; //    2, 3, 4, 5
        const slice2 = slice1[1..3]; //    3, 4
        const invalid_index: usize = 3;
        std.debug.print("\nslice2: {d} {d}\n", .{ slice2[0], slice2[1] });
        std.debug.print("\ncompile error: {d}\n", .{slice2[invalid_index]});
        try std.testing.expect(slice2.len == 2);
    }

1: https://ziglang.org/documentation/master/#Build-Mode


Slices in Zig are fat pointers with a runtime-known length; they are bounds-checked at runtime and will panic if you access out of bounds.

Unless you build in ReleaseFast, which disables that check. There is ReleaseSafe for a release build that keeps the bounds-checking panics.


I think that's protected by runtime safety. It should panic if the program is built with "ReleaseSafe", the variable value is only known at runtime, and the variable overruns the array. There's a "ReleaseFast" mode which disables checks like this. Also, if that variable is somehow known at comptime, then I'm pretty sure it will fail while compiling, which is what you would want.


Zig has bounds checking built-in. You can decide to remove it if you compile for ReleaseSafe instead of ReleaseFast, but it's always in debug builds.


Also note that you can compile your entire program in safety-checked mode and compile functions with hot loops with ReleaseFast.


I just realized I swapped the two but it's too late. The bounds checks obviously stay in ReleaseSafe, not ReleaseFast.


> If I were to use indices out of range, presumably that would that give me a compile time error?

Yes, if all operands are comptime-known (or comptime-length-known), bounds checks happen eagerly at compile time.

> Could I use variables as the indices and, if so, could I get a seg fault at runtime or are there some sort of guards on there?

Yes you can use variables. If they're out of range, behavior depends on the build mode:

- In Debug and ReleaseSafe, you get a guaranteed panic (which aborts the app with a stack trace)

- In ReleaseSmall and ReleaseFast, you get undefined behavior

On their discord I suggested a version of slicing that bounds checks in all modes (something like `slice?[1..5]`, syntax doesn't matter) and returns an optional slice, which is null if the bounds are out of range. But they didn't seem too keen on it. So I've been using a little wrapper function instead.
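
Something in the spirit of that wrapper (a hypothetical sketch, not the exact code):

    const std = @import("std");

    /// Hypothetical helper: bounds-checked slicing in every build mode,
    /// returning null instead of panicking or invoking undefined behavior.
    fn checkedSlice(comptime T: type, s: []const T, start: usize, end: usize) ?[]const T {
        if (start > end or end > s.len) return null;
        return s[start..end];
    }

    test "checkedSlice" {
        const arr = [_]u32{ 1, 2, 3, 4, 5, 6 };
        try std.testing.expect(checkedSlice(u32, &arr, 1, 5) != null);
        try std.testing.expect(checkedSlice(u32, &arr, 4, 99) == null);
    }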


Why not just compile your whole program safe and then demarcate which functions you want to not be safety checked?


Careful, you're dangerously close to suggesting an effects system. If a function explicitly marked as doing unsafe math operations calls another that doesn't specify a preference, should its math operation be safe or unsafe? If the former, that makes safe likely a better default, but limits its usefulness. If the latter, then your compiler now has to keep track of this information for the entire call graph.


The form of safety we are talking about here is not like Rust's "safety": it does not color functions, and it is strictly scoped to the compiled unit (in this case, the function).

https://ziglang.org/documentation/master/#setRuntimeSafety


I am aware, my point is about how the following is dealt with:

  fn foo(x: i32) i32 {
      return x + 100;
  }
  fn bar(x: i32) i32 {
      @setRuntimeSafety(false);
      return foo(x + 100);
  }
Particularly when you have arbitrarily deep call graphs where some explicitly enable and disable runtime safety.


You mean overflow checking? It only applies to the `x + 100` in bar. Bar can still panic if the `x + 100` in foo overflows. Why should it be any other way?


I really want to like Zig. Honest question: why is Zig so resistant to some kind of RAII? RAII is the biggest improvement that C++ brings over C, and leaving it out of a systems programming language 50 years later just seems strange. And no, manual defer is not the solution.


The most frequently given reason is that a core principle for Zig is ‘No implicit control flow’, or just the more general ‘Explicit is always better’. RAII, for both constructors and destructors, is an inherently implicit control-flow construct; it also usually involves access to allocation primitives, which Zig also rejects when implicit.

One can of course disagree with those language principles. However, the ‘resistance’ to RAII is a clear effect of them.


I assume that Zig requires the programmer to also manually allocate and free stack frames? And that it eschews functions, as they can hide allocations and control flow?


I would assume "allocation primitives" is referring to the heap.


I felt that way at first too but I’m less concerned about it the more I use Zig. Defer and errdefer are already a huge improvement over C. I won’t deny there’s more risk of forgetting to clean up than in Rust or C++, but Zig’s approach forces me to think more clearly about what’s going on. For example, instead of doing a slow tree traversal to free little bits of memory all over the place (which RAII would do implicitly), in Zig I’d probably use an arena and free it all at once. A similar principle applies to other non-memory resources too (though maybe not mutexes for example). I also think it would be difficult to achieve the flexibility of Zig’s unmanaged containers (where deinit takes a parameter) in a RAII language.
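
Roughly the kind of thing I mean (a minimal sketch using the standard-library arena and ArrayList):

    const std = @import("std");

    test "arena frees everything at once" {
        var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
        // One deferred call frees every allocation made from the arena,
        // instead of walking a tree and freeing node by node.
        defer arena.deinit();
        const alloc = arena.allocator();

        var list = std.ArrayList(u32).init(alloc);
        try list.append(1);
        try list.append(2);
        // No per-container cleanup needed; arena.deinit() reclaims it all.
        try std.testing.expect(list.items.len == 2);
    }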


> I really want to like Zig.

Don't try to like Zig and instead make sure your cup is empty before entering a new tea shop.


From the article:

  const result = comptime square("hello"); // compile time error: type mismatch
ok, cool. But if the error occurs deep in some hierarchy of comptime calls, do you get the same kind of long errors that you do with C++ templates? Does zig have a way of achieving better ergonomics? One nice thing about generics in Rust and Swift is they are constrained by traits/interfaces so you get a concise error at the call site.


This isn't about Zig, but I found that I prefer something similar to concepts over traits/interfaces, as the latter are somewhat limited in what they can express.

    /**
     * @checked x & 1 "The input parameter must allow bit operations"
     */
    macro bool is_power_of_2(x)
    {
      return x != 0 && (x & (x - 1)) == 0;
    }
In the example above (disregard the fact that it's placed in the doc comments) we'd get an error if `x & 1` is invalid (doesn't pass semantic checking). This is the simplest possible example. Basically if you pass in a float or a struct you would then get "The input parameter must allow bit operations" rather than a dump of errors from the macro body.

We can imagine other constraints, such as checking the value (if constant) to conform with valid ranges and so on.

This is more of a "contract" style constraint that can be placed directly at the call location, rather than pushing down the check further down into macro body, for example something like this:

    macro bool is_power_of_2(x)
    {
      $assert($checks(x & 1), "The input parameter must allow bit operations");
      return x != 0 && (x & (x - 1)) == 0;
    }
In this example the error would be localized to the macro, which perhaps isn't what we want, even if the compiler is kind enough to tell us where the macro was included.


A good API (like std.debug.print, for example) can implement nice compile errors that fail early with readable messages manually at comptime. Still not as convenient as traits, though.
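
For example, something like this (a sketch, not actual standard-library code) fails with one readable message instead of a wall of errors from deep inside the implementation:

    const std = @import("std");

    // Reject unsupported types up front with a readable message,
    // instead of letting errors surface deep inside the implementation.
    fn square(x: anytype) @TypeOf(x) {
        switch (@typeInfo(@TypeOf(x))) {
            .Int, .ComptimeInt, .Float, .ComptimeFloat => {},
            else => @compileError("square() expects a numeric type, got " ++ @typeName(@TypeOf(x))),
        }
        return x * x;
    }

    test "square with a readable failure mode" {
        try std.testing.expect(square(@as(u32, 5)) == 25);
        // square("hello") would fail with the custom message above.
    }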


For me as a high level application developer, I liked C but felt that it needed a rational module system, generics and structural types. The addition of methods on structs would be nice too (no need for inheritance).

The only reason I thought about it is because I have been trying (and failing) to contribute to open source Linux projects - which are practically all written in C and the C++ projects are impossible to read.


C++ has all the things you "felt that C needed". So there must be something more that you dislike.


Due to my lack of experience, I definitely have no business criticising the language design of C or C++ haha but it's fun to talk about.

The issues I faced with C++ were that it has too many features. I feel that makes it difficult to carry experience in one project to another due to the variety in styles afforded by that design choice.

I once saw a C++ tutorial that rewrote a verbose for loop into a functional chain and it was practically unreadable for anyone lacking considerable C++ knowledge.

I'm also not a big fan of the way the module system in C++ works (similarly to C#, Java, Rust) where you import a namespace and then things are just available or extended.

Probably an unpopular opinion but, when evaluating the language semantics alone, I quite like the approach TypeScript takes to modules. You import a "thing" from a "relative filepath" explicitly and there is no ambiguity as to its origin (even when just grokking a file in a low-tech text editor). For me it makes the process of tracing and understanding circuits easier than sorta guessing which namespace a function or method comes from. This also makes it easier for compilers to optimise binaries as they can statically determine what code is used, excluding unused code from a build.

I guess I could probably say the build system for C is difficult to grasp. In the high level world, I'm used to simply saying "compiler build main.xyz" - where C has makefiles, configure scripts and I find it a bit much.

But what do I know, haha


The problem with C++ isn't too few features, it's too many.


Not just too many, but inconsistent ones, hard-to-use ones, and badly designed ones. std::unordered_map, anyone? The iterator-based APIs are just cumbersome and error-prone to use, etc... it's really easy to shoot yourself in the foot by forgetting to explicitly implement specific constructors, or by passing the wrong iterator (like begin() vs end()) to an STL template.

The C# specification isn't significantly smaller than the C++ one (can't check the exact length right now) but the language is still much easier to understand, even if it has really hairy corner-cases like the difference between readonly fields and normal fields' generated IL or the logic for defensive copies. Most programmers don't have to care, and even if you do, it's not a nightmare to test or experiment with.

In C++, I kind of feel I'm trying to squeeze water out of a stone. And even with the huge standard library, basic things are missing from it like a string startswith.


> basic things are missing from it like a string startswith.

It was added in C++20: https://en.cppreference.com/w/cpp/string/basic_string/starts...


Agreed. However, my point about GP stands.


About macros and comptime: why is the compiler unable to detect that a function is a pure function, and if it is called with constants, the result can be computed ahead of time? At least in the simple cases such as the square example, which are I believe quite common, that would work. Maybe still add an annotation to make sure the compiler will do what the user expects, because humans are terrible compilers after all.

Because the "footgun" here is really that when using macros we are switching languages and strategies - in one case it's eager evaluation, and in the other is sort of like lazy evaluation.


C/C++ compilers will do this when optimization is enabled (I guess the reason that's not done without optimization is to preserve "debuggability").

Notice how the add() function is completely 'dissolved' into its result '5' here:

https://www.godbolt.org/z/q98svvacW

(The main difference is that comptime guarantees the code is resolved at compile time, and if that's not possible you'll get an error.)

The big downside of comptime is that it isn't debuggable, apart from what's essentially 'printf debugging' via the @compileLog and @compileError builtins.


Graal's native-image tool also does a form of this. The compiler actually executes parts of your program at compile time, but it's real execution so you can theoretically do anything. The heap data that's generated is then persisted into the program. You can debug such code just like normal code by just running it on HotSpot instead.


This is not only what the C/C++ optimizer does, but the exact reason C++ constexpr was designed - to compute expressions at compile time iff all their dependencies can be resolved at compile time. If they can't, it's still fine: they behave as normal runtime functions.


Do you mean in Zig? It's possible to do arbitrary, bounded computation (the compiler quits if you exceed the evaluation branch quota) at comptime in Zig, unless there is strange statefulness.

This is sensible since you want changes in, e.g. your code tree to taint compiled resources. If your code can go read from the filesystem in an untracked way, you break the incremental compilation model.


This was sort of a general question, although Zig confused me a bit by placing the annotation at the call site rather than on the function declaration. I guess this choice was made to keep compilation times under control.


A function should not have to be labelled as comptime to be run at comptime. There are cases when you might want to run the same function at comptime as at runtime (for example, if you want to do a thing with algorithm X but cache the result for a series of trivial cases as a lookup table).

Putting it at the call site is actually more explicit. For example, generating precomputed arrays clearly ties the comptime-ness to the actual artifact created, instead of having it inferred from the signature of the function (which may be very far away in the code).
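
For instance, something along these lines (a hypothetical sketch) uses one ordinary function both to build a table during compilation and for runtime calls:

    const std = @import("std");

    // An ordinary function; nothing about its declaration says "comptime".
    fn square(x: usize) usize {
        return x * x;
    }

    // Whether it runs at compile time is decided at the use site: this table
    // is computed during compilation and baked into the binary.
    const squares: [16]usize = blk: {
        var table: [16]usize = undefined;
        var i: usize = 0;
        while (i < table.len) : (i += 1) {
            table[i] = square(i);
        }
        break :blk table;
    };

    test "comptime-built lookup table" {
        try std.testing.expect(squares[5] == 25);
        try std.testing.expect(square(9) == 81); // same function, callable at runtime too
    }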


One thing Zig is missing is Exception Handling. Now before you complain about exception handling grossly bloating the size of a program, know that there are ways to trigger exceptions that do not involve the "throw" keyword. You get processor exceptions when you access a bad pointer, divide by 0, etc. If you don't want the program to instantly terminate, you need to be able to handle those exceptions.

So far, it seems that the only way to handle exceptions is to use another programming language. Zig allows you to mix in C or C++ code.


I am curious why anyone still thinks that exceptions are superior/better/more desirable than simple typed return values (like Result<T,E> or Option<T>).

> You get processor exceptions when you access a bad pointer, divide by 0, etc. If you don't want the program to instantly terminate, you need to be able to handle those exceptions.

The processor does a lot of other things too; all of these operations can simply be wrapped with the appropriate Result<T,E> or Option<T>. Hell, I'd argue that for a systems programming language all mathematical operations should return Result<T, ArithmeticError>.

How is a try/catch better than a pattern match on the exact errors that you care about? inb4 checked exceptions, which are just an inferior version of algebraic data types.
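
For what it's worth, this is essentially what Zig's error unions give you; a minimal sketch (names invented for illustration):

    const std = @import("std");

    const MathError = error{DivisionByZero};

    // Division returns an error union instead of trapping or throwing.
    fn div(a: u32, b: u32) MathError!u32 {
        if (b == 0) return error.DivisionByZero;
        return a / b;
    }

    test "handle exactly the errors you care about" {
        // Pattern-match on the result rather than catching an exception.
        if (div(10, 0)) |_| {
            unreachable;
        } else |err| switch (err) {
            error.DivisionByZero => {}, // handled explicitly
        }
        try std.testing.expect((try div(10, 2)) == 5);
    }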


Everything mentioned in this post relates to API level, for code that you control. You can pick your favorite type for optional values with the possibility of an error.

Exceptions, on the other hand, can come from code you don't control. Some DLL that you call into doesn't care if you like Option<T> better; it's still going to throw an exception. It could be designed to "throw" using the C++ exception syntax, or it could simply be dereferencing a null pointer or dividing by zero, causing an exception.

So then the issue becomes support for catching the exceptions caused by code outside of your control.

One option is to do nothing and just let the program die.

The other option is to catch the exception, automatically save the user's work, then restart the program.


I see, but aren't exceptions implementation-specific? How does language X know how some random binary/DLL is going to throw an exception? What kind of object does it throw? What does it look like? How big is it? How many nested exceptions are there?


Windows has an ABI for exceptions, I don't know about Linux.


I know basically nothing about Zig, but you don't need language-level exceptions to handle platform exceptions. You can write the handler code in C. Why can't you write it in Zig? The usual problem is that the language runtime gets in the way, but I thought Zig had a very minimal runtime like C.


__try and __except are Win32-specific language extensions for C that do not exist in Zig. They become code which adds a linked-list item (an exception handler entry) somewhere into the TEB. Win32 requires that you call all the stack unwind entries within your exception handler. When your exception handler is done, you longjmp out, or return to the offending code. Using the language extensions for exception handling is far more friendly than doing it manually with low-level code.



Problems of C:

- memory safety

- concurrency

How Zig addresses them:

- it doesn't.


Zig has async and await and is safe from buffer overflows and related memory issues. What it indeed doesn't solve is temporal memory safety (use after free and leaks), though this is somewhat ameliorated by the encouraged programming style.


The ≠ prettification of != threw me for a second. I was like, "Does Zig use special characters for operators"?


Does Zig have a formal grammar (like C)?


There's a grammar that tries its best to be maintained alongside the language, but nothing is solidified yet.


I think it should be

    const x: i16 = -1 * 32768; // valid

In the text it's i32.


That explanation is bogus anyway. It's a unary operator applied to a positive integer literal. No multiplication involved. In this case the range issue can be addressed in C by using the "L" suffix to upgrade the literal to 32-bit and let the compiler truncate it back to 16-bit.



Is a single comment really a "discussion"?


OK, so Zig to C is like Typescript to Javascript.


I don't think I fully understand the analogy.

Typescript has almost identical semantics to Javascript but adds typing syntax to improve developer experience and make it easier to manage a large-scale JS codebase.

Zig is a fundamentally different language than C, with a lot of new features. Two great examples are comptime and allocators, which have counterparts in C but are really very different from what C provides.

If you're suggesting that zig is aiming to provide a C alternative with better ergonomics, then I agree; but zig and C have a lot more differences than Typescript and Javascript.


Zig has more things for C interop than you're suggesting though, namely the extern keyword, calling conventions, etc. Like TS/JS, Zig/C seems designed first-class to be used together in one codebase, compared to e.g. Rust, where it's obviously possible but more deliberately discouraged (since it breaks the safety guarantees, which Zig never claimed to begin with).


TypeScript also compiles to JS; Zig explicitly seeks to replace C.


TS is just modern JS with type annotations.

Zig OTOH is "just C" but with a modern syntax, much more correctness, comptime, reflection, generics, a rich standard library, an integrated build system, and cross-compiling that 'just works'.

(I'm sure I forgot a couple of things)


Technically there is no "compile time" in C. Any part of the program can be computed at compile time or at runtime; it's entirely up to the implementation. The as-if rule in the C standard lets implementations do whatever they want as long as the program outputs the same things. The pre-processor, constant expressions, constexpr (a feature that is, for this reason, completely broken in my opinion), even text parsing and tokenization can happen at run time.

Not having the standard define what is compile time and what is not is a feature, not a bug, and one of the reasons that C continues to enjoy healthy compiler development. (I'm not taking a stand for or against Zig, other than that I support the development of new languages.)


C23 includes constexpr for some constructs, by the way.


I know, and it's very broken.


In what way? I haven't had the opportunity to try it out myself yet.


It is a feature that sounds like it promises to make something compile time, but it doesn't. At the same time it has a bunch of limitations that in no way reflect what modern compilers can do. Plenty of times you can write an expression and, if you say it's a constexpr, the compiler is forced to tell you that it's not a valid constexpr; but if you just remove the constexpr qualifier, the compiler can solve the expression at compile time just fine. So using constexpr gives you:

- No guarantee that it's computed at compile time.

- Limits on what the compiler lets you do when using the constexpr keyword.

- No performance benefits or guarantees.

- Code that's not portable to most C compilers.

On top of this it adds implementation burden for compilers, and complicates the specification significantly.


Gotcha. To be honest, this description sounds like what constexpr is in C++ as well, and various `const` things in Rust. I thought you were saying that the feature doesn't work, but it sounds more like you don't like how the feature was designed. Thank you for letting me know!


Well, it's broken in the sense that one part of the standard describes some functionality, and then another part of the standard says "Oh, you can ignore that."


C is standardized and is one of the most (the most?) widely used programming languages on the planet. If you write code targeting a C standard and don't include many unstable dependencies, it has a good chance of running correctly for a very long time. If you move from C to Zig, you lose that stability. Are Zig's convenience features really enough to compensate for this loss?


One of the main reasons, in fact, that we picked Zig for TigerBeetle (over C, which was the alternative, given we needed to handle memory allocation failure) was Zig's excellent interoperability with the C ABI.

For example, we write the reference TigerBeetle client implementation completely in Zig, then wrap this with the C ABI, and then bind to this C ABI from all target languages, to increase our velocity in how quickly we can ship language clients.
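
A minimal sketch of the mechanism (illustrative name, not TigerBeetle's actual code): `export` gives a Zig function an unmangled symbol and the C calling convention, so other languages can bind to it through a C header.

    // Built as a library, e.g. `zig build-lib client.zig`.
    // A C header could then declare: uint64_t tb_add(uint64_t a, uint64_t b);
    export fn tb_add(a: u64, b: u64) u64 {
        return a + b;
    }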

More details, in general, around our clients here: https://tigerbeetle.com/blog/2023-02-21-writing-high-perform...


> given we needed to handle memory allocation failure

Isn't the OOM killer likely to be a much bigger problem? With overcommit and the OOM killer enabled, I don't think I can envision a case where you would run into a failed allocation - allocations on Unices practically never fail. It is the OOM killer that will likely kill your process once it detects that the pressure on your RAM+swap is high, so you won't even have a chance to see an allocation fail.

OTOH if you disable the OOM killer there's another, much bigger problem to solve - a kernel panic. I guess that's not the condition you want your system to run into.

So, I think that the only combination where you can deterministically detect and run into allocation failures is when both OOM-killer and overcommit are disabled. That's what I think Windows is doing by default.


Sounds cool but doesn’t address my question. Do you think the relative instability of Zig will hurt you? Or do the benefits outweigh the costs? Lots of languages make it easy to talk C ABI (including… C itself). So why Zig?

I guess the people downvoting me assume I'm hostile to Zig. I'm not. I'm saying that the stability of C is a selling point and I'm curious what features of Zig are so valuable that people are giving that up.


It is nice and innovative, no doubt. But reinventing the syntax from scratch instead of introducing just the minimal amount of changes to the original C is a barrier that will deter 90% of possible adopters.

The reason C# took off is that it's as close to C/C++ as possible. If there is a difference, it's due to a fundamental semantic change. E.g. it moved from painstakingly reinterpreting hundreds of small header files for each compiled source file to loading efficiently serialized hierarchies of class definitions. Hence #include got replaced with using. It changed the semantics of pointers vs. references, hence 'ref' instead of '*', and so on. But there is no "fn square(x: u32)" instead of "uint32_t square(uint32_t x)" just for the sake of it.

And that's the main reason why many people will never consider even looking into the advantages Zig offers. Keeping another syntax in your head is just not worth it.


If you just keep C's ideas you might almost as well give up altogether.

C# actually only resembles C rather superficially; idiomatically there's a large gap, and of course you should write idiomatic code. For example, it's true you can write a C-style for loop in C#, but you almost never should; C# has a (not great, but it's something) for-each loop that's idiomatic.

C#'s primitive types look superficially like C types, but behave more like the modern sized types from a language like Rust. They're technically structures, albeit with a more convenient alias keyword. For example "long" is a signed 64-bit integer type, like i64, not some arbitrarily "maybe bigger than int" type as it is in C.

123.ToString() is a reasonable thing to write in C#, it means "Call the ToString method on this integer 123" much like Rust's 123.to_string() -- you can't do anything similar in C or even in C++


> For example, it's true you can write a C-style for loop in C#, but you almost never should; C# has a (not great, but it's something) for-each loop that's idiomatic.

This makes the learning curve way easier. You can start meaningfully using C# while using C-style for loops, and eventually switch to 'foreach' once you realize it's better.

If I wanted to give Zig a try today by using it for some small low-priority task, I would keep stumbling on these minor syntax differences all the time, and would eventually give up and do it in C/C++ because the overhead outweighs my curiosity.

Most pragmatists don't have infinite time to learn a new programming language for fun. They have a very limited amount of attention and tight time constraints, and will move to the next pragmatic solution if it starts looking like the current one is not cutting it.


> I would keep stumbling on these minor syntax differences all the time, and would eventually give up

I can't necessarily fault Zig for the state of this so early in its lifetime, but that should not be a big problem - handling the transition is work for the diagnostics.

Suppose I write this in Rust: printf("%d", count);

Rust says it can't find a function named printf, but it suggests perhaps I want the print! macro instead?

OK, let's try again: print!("%d", count);

No, says Rust, % style format strings aren't a thing in Rust, use {curly brackets}

Sure enough: print!("{count}"); // compiles and works.


> And that's the main reason why many people will never consider even looking into the advantages Zig offers.

I think you got it upside down. Syntax is the least of its problems. Any experienced developer has already learned several languages and can pick up new syntax quickly. Zig syntax is straightforward, similar to other languages, and can be learned in a couple of hours. Easy stuff. The problems with Zig are more related to uncertainty around long term support, the network effect, and so on.


I disagree, I argue that you've got it backwards.

C has a fair amount of kludge in its syntax. C cannot change its syntax because it would massively break backwards compatibility, which is bad.

New languages do not have this problem - they don't have to worry about backwards compatibility. Because they have that freedom, they should always opt for what they believe is the best possible syntax. I'd say they're obligated to do so. Otherwise, we're stuck with another 10-20+ years of dealing with bad syntax, for no good reason!

If the opportunity for improvement is there, and it's nearly close to free to do so, it should absolutely be taken.


Counterpoint: C itself invented a lot of syntax compared to the contemporary ALGOLs and PL/I. Yet, it became immensely popular.


I would dare say it's a different combination of early adopters/pragmatists in your target audience. C/C++ these days is mostly legacy stuff, so if you want to cater to that audience, you deliver small incremental improvements.

Many people who could be bothered to learn Zig's syntax jumped ship to Python/Java/whatever already.


Zig and Python/Java are completely different beasts. One is a low-level systems language on the order of Rust or C. The other two are much higher level, easier to work with languages more attuned to desktop, mobile applications and enterprise work.

I don't think anyone is seriously "jumping ship" from Zig to those.


Low-level programming isn't the holy grail. If a person gets fed up with the limitations of C, they might as well move to a higher-level domain as well. Especially given the higher pay there.

Those who haven't will have a much lower tolerance for changes. Survivor bias of a kind.


It's not really a choice of low-level or high-level. Neither is better than the other.

It's about choosing the right language for the task at hand. Some work will really be suited (or only be feasible) with a low-level language and vice-versa.

If I'm writing a command-line tool on the order of ripgrep or working on a microcontroller embedded in a dishwasher, I'm not going to go to Java. That would be weird and awkward. And if I'm writing a new 3D AAA-level game, I'm going to jump to something maybe even higher level like UE5 - trying to do all that in Rust would be a PITA.


Legacy stuff like the compilers and runtimes used by the new hip languages, or the webserver, browser, database, and libraries needed to run those programs, or the OS, hypervisor, and device drivers needed by the machine those programs run on.


Thankfully most of those are C++ and not C, and some hip languages are bootstrapped.


IIRC even K&R Second Edition acknowledges that the C type syntax is awkward for non-trivial cases.

The "new" style used by Go, Rust, Zig, Typescript etc... is a lot easier to read because it always 'resolves' from left to right.


In practice it only takes a weekend or two to pick up the syntax changes, and the benefits easily outweigh that investment.

Heck, the simple fact that type signatures can be read literally from left to right (instead of using the spiral rule) is enough for me to switch. https://zig.news/toxi/typepointer-cheatsheet-3ne2
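
A couple of lines in that spirit (a small sketch; the C equivalents in the comments are approximate):

    const std = @import("std");

    test "types read left to right" {
        const a: [4]u8 = .{ 1, 2, 3, 4 }; // a: array of four u8               (C: uint8_t a[4])
        const p: *const [4]u8 = &a;       // p: pointer to an array of four u8 (C: const uint8_t (*p)[4])
        const s: []const u8 = &a;         // s: slice of const u8 - pointer plus length, no direct C analogue
        try std.testing.expect(p.len == 4 and s[2] == 3);
    }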


Minor differences in syntax are not what makes programming (in general, and when it comes to learning a new language) hard.

As an aside, I really like that I can grep source files for a keyword such as `fn ` and get a good idea of how many functions I am defining, and where they are. This gets especially powerful with multi-cursor editing.


What do you think about C3 then? https://c3-lang.org



