A compiler can’t know why you fucked up, it can’t even know that you fucked up, because UBs are just ways for it to infer and propagate constraints.
If an optimising C compiler can’t rely on UBs not happening, its potential is severely cut down due to the dearth of useful information provided by C’s type system.
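To make the "infer and propagate constraints" point concrete, here is a minimal sketch (function name is hypothetical) of the classic null-check case: dereferencing a null pointer is UB, so after a dereference the compiler may assume the pointer is non-null and delete a later check.

```c
#include <stddef.h>

/* Dereferencing p would be UB if p were NULL, so after the first
 * statement the compiler may infer p != NULL. That constraint
 * propagates forward, and the "if (p == NULL)" branch becomes
 * provably dead code, which an optimizer is free to remove. */
int first_element(int *p) {
    int v = *p;        /* inferred constraint: p is non-null here */
    if (p == NULL)     /* dead branch under that constraint */
        return -1;
    return v;
}
```

The compiler isn't detecting a mistake here; it's just using the rule "UB doesn't happen" as a source of facts about your program.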
> A compiler can’t know why you fucked up, it can’t even know that you fucked up, because UBs are just ways for it to infer and propagate constraints.
To be honest, that's just how compiler writers interpret UB these days.
It's perfectly possible (in principle) to use lots of more sophisticated static and dynamic analysis to recover much of what C compilers just assume. You don't have to restrict yourself to what C's type system provides.
(For an example of what's possible, have a look at all the great techniques employed to make JavaScript as fast as possible. They have basically no static types to work with at all.)
> For an example of what's possible, have a look at all the great techniques employed to make JavaScript as fast as possible. They have basically no static types to work with at all.
I’m sure people will be very happy with a C JIT. That’s definitely what they use C for.
JIT-ed code is full of runtime type and range assertions which bail if the compiler’s assumptions are incorrect.
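The guard-and-bail mechanism can be sketched in plain C (the tagged-value layout and function names here are illustrative, not any real engine's representation): a specialized fast path runs under a runtime type assertion, and when the guard fails, execution bails to a slow generic path instead of running wrong code.

```c
/* Illustrative tagged value, loosely in the style of a dynamic
 * language runtime: each value carries a type tag at runtime. */
typedef enum { TAG_INT, TAG_DOUBLE } tag_t;

typedef struct {
    tag_t tag;
    union { int i; double d; } as;
} value_t;

/* Generic fallback path: handles every type combination. */
static double slow_add(value_t a, value_t b) {
    double x = (a.tag == TAG_INT) ? a.as.i : a.as.d;
    double y = (b.tag == TAG_INT) ? b.as.i : b.as.d;
    return x + y;
}

/* "JIT-ed" path: a runtime guard checks the speculated types, the
 * specialized integer add runs on success, and we bail otherwise. */
double jit_add(value_t a, value_t b) {
    if (a.tag == TAG_INT && b.tag == TAG_INT)   /* type guard */
        return a.as.i + b.as.i;                 /* fast path */
    return slow_add(a, b);                      /* bail out */
}
```

The key contrast with C's UB rules: the assumption is checked at runtime and has a defined failure mode, rather than being taken as unconditionally true.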
Oh, I didn't mean to imply that it would be practical. Only that it's possible and that the type system isn't the only thing you can rely on.
Instead of just assuming that 'x + 1 > x' is always true (for signed integers), the compiler could also do the heavy lifting of static analysis (in cases where that's possible).
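For reference, this is the assumption in its simplest form. Because signed overflow is UB, GCC and Clang at -O2 typically fold this whole function to `return 1`; under defined wraparound semantics (e.g. `-fwrapv`) it would have to return 0 for `INT_MAX`.

```c
/* With signed overflow as UB, the compiler may treat x + 1 > x as a
 * tautology and fold this to "return 1" at -O2. With -fwrapv
 * (wraparound defined), always_greater(INT_MAX) must return 0. */
int always_greater(int x) {
    return x + 1 > x;
}
```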
But you talked about JavaScript, where the “heavy lifting” is in fact “assume it's always true, and switch back to a less optimized version if the assumption turns out to be false”. That's exactly the kind of thing you cannot do in C, because people use C in contexts where a JIT isn't an option.
The signed integer overflow rule is extremely important for common optimizations, mostly loop-related: knowing whether a loop is finite, or rewriting the direction of its index.
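A small sketch of the loop-finiteness point (function name is hypothetical): with a signed index, the only way the loop below could run forever is if `i` overflowed, which is UB, so the compiler may assume it terminates, compute a trip count, and vectorize or rewrite the induction variable.

```c
/* If n == INT_MAX, "i <= n" could only fail via signed overflow of i.
 * Since that overflow is UB, the compiler may assume the loop is
 * finite; under wraparound semantics it would have to treat the
 * n == INT_MAX case as an infinite loop, blocking such rewrites. */
long sum_to(int n) {
    long s = 0;
    for (int i = 0; i <= n; i++)
        s += i;
    return s;
}
```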
The way to start getting rid of it would be to add for...in... loops or something where the loop index can be a custom no-overflow type.
And "defining" it is a lame approach to safety. If you make it wraparound, you now have silent wraparounds that can't be found by static analysis. You want unintended overflows to trap, not just be defined.
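As one concrete trap-don't-wrap approach (the wrapper name here is made up; the builtin is a real GCC/Clang extension, not portable ISO C): `__builtin_add_overflow` reports overflow instead of silently wrapping, so an unintended overflow becomes a loud failure.

```c
#include <stdio.h>
#include <stdlib.h>

/* Checked addition: reports overflow instead of wrapping silently.
 * __builtin_add_overflow is a GCC/Clang builtin; compilers also offer
 * blanket options like -ftrapv or UBSan for the same goal. */
int checked_add(int a, int b) {
    int r;
    if (__builtin_add_overflow(a, b, &r)) {
        fprintf(stderr, "integer overflow in checked_add\n");
        abort();   /* trap: fail fast on unintended overflow */
    }
    return r;
}
```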
> And "defining" it is a lame approach to safety. If you make it wraparound, you now have silent wraparounds that can't be found by static analysis. You want unintended overflows to trap, not just be defined.
Yes. But even the lame approach is better than UB, because it doesn't bring the whole program down.
I've been wondering if I should mention that using int for an index is a bad idea, because the standard only guarantees it's at least 16 bits. You should use size_t instead. And in C, size_t is unsigned.
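That unsignedness has its own classic pitfall, sketched below (function name is hypothetical): a naive backwards loop `for (size_t i = n - 1; i >= 0; i--)` never terminates, because `i >= 0` is always true for an unsigned type.

```c
#include <stddef.h>

/* Iterating an array backwards with an unsigned size_t index.
 * The guard "i-- > 0" tests before decrementing, so the body sees
 * n-1 down to 0 and the loop exits cleanly, even when n == 0
 * (unsigned wraparound is defined, and the wrapped value of i is
 * never used after the test fails). */
long sum_backwards(const int *a, size_t n) {
    long s = 0;
    for (size_t i = n; i-- > 0; )
        s += a[i];
    return s;
}
```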
My take is that all of the low-hanging-fruit optimizations the standard enables were picked a long time ago. Everything left is problematic.