I'll give a concrete example. It's not the most compelling optimization in the world, but I think it illustrates the tradeoffs clearly. The following is pseudocode, not any particular language, but hopefully it will be clear.
    let i = mystruct.i;
    if (i < array.length) {
        let x = expensive_computation(i);
        array[i] = x;
    }
For the purpose of this example, let's say that the expensive computation is register-intensive but doesn't write any memory (so we don't need to get into alias analysis issues). Because it is register-intensive, our optimizer would like to free up the register occupied by i, replacing the second use of i by another load from mystruct.i. In C or unsafe Rust, this would be a perfectly fine optimization.
If another thread writes mystruct.i concurrently, we have a time-of-check to time-of-use (TOCTOU) error. In C or unsafe Rust, that's accounted for by the fact that a data race is undefined behavior. One of the behaviors that's allowed (because basically all behaviors are allowed) is for the two uses of i to differ, invalidating the bounds check.
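To make the interleaving concrete, here's a rough sketch in Python. It's a deterministic, single-threaded simulation, not a real data race: the `concurrent_write` hook is my own device for modeling "another thread wrote mystruct.i while the expensive computation was running," and all the names (`MyStruct`, `index_stored_cached`, `index_stored_reloaded`, `demo`) are hypothetical. Instead of actually storing to the array, each function returns the index it would store to, so the out-of-bounds case can be observed safely.

```python
class MyStruct:
    """Hypothetical stand-in for `mystruct`; `i` is the shared field."""

    def __init__(self, i):
        self.i = i


def expensive_computation(i, concurrent_write=None):
    # Stand-in for the register-intensive work. It writes no memory itself;
    # the optional hook models another thread running while it executes.
    if concurrent_write is not None:
        concurrent_write()
    return i * i


def index_stored_cached(mystruct, array, concurrent_write=None):
    """Source semantics: load i once and reuse it for the store."""
    i = mystruct.i                      # the only load
    if i < len(array):                  # bounds check
        expensive_computation(i, concurrent_write)
        return i                        # second use: the cached value
    return None


def index_stored_reloaded(mystruct, array, concurrent_write=None):
    """After the hypothetical optimization: the second use is a fresh load."""
    i = mystruct.i                      # first load, used only for the check
    if i < len(array):                  # bounds check
        expensive_computation(i, concurrent_write)
        return mystruct.i               # second use: reloaded from memory
    return None


def demo():
    array = [0] * 4
    s = MyStruct(2)

    def racing_write():
        s.i = 100                       # simulated write from another thread

    cached = index_stored_cached(s, array, racing_write)
    s.i = 2                             # reset before the second experiment
    reloaded = index_stored_reloaded(s, array, racing_write)
    return cached, reloaded
```

With the simulated write in place, `demo()` returns `(2, 100)`: the cached version stores to the index it checked, while the reloaded version would store to index 100 of a 4-element array. Without the hook, both versions agree, which is why the transformation looks harmless in single-threaded reasoning.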
Different languages deal with this in different ways. Java succeeds in its goal of avoiding UB, disallowing this optimization; the mechanism for doing so in LLVM is to consider most memory accesses to have "unordered" semantics. However, this comes with its own steep tradeoff. To avoid tearing (a reader observing a half-updated multi-word value), all pointers must be "thin," specifically disallowing slices. Go, by contrast, has slices among its fat pointer types, so it incurs UB when there's a data race. It's otherwise a fairly safe language, but this is one of the gaps in that promise.
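Here's a similarly rough Python sketch of why tearing matters for fat pointers. Again it's a deterministic simulation, not a real race, and `FatPointer` and `read_during_write` are hypothetical names. I'm assuming a slice is stored as two separate words (a base and a length) and that the writer updates them one at a time; whether and how real hardware or a real compiler splits the write will vary.

```python
from dataclasses import dataclass


@dataclass
class FatPointer:
    """Two-word slice: where the data starts and how many elements it has."""
    base: int
    length: int


def read_during_write(slot, new):
    """Overwrite `slot` with `new` one word at a time, with a reader
    copying both words between the writer's two stores."""
    slot.base = new.base                            # writer: first word
    observed = FatPointer(slot.base, slot.length)   # reader runs here
    slot.length = new.length                        # writer: second word
    return observed


def demo_tearing():
    long_buf = FatPointer(base=0, length=8)         # 8 elements at offset 0
    short_buf = FatPointer(base=8, length=2)        # 2 elements at offset 8
    slot = FatPointer(long_buf.base, long_buf.length)
    observed = read_during_write(slot, short_buf)
    index = 5
    passes_check = index < observed.length          # check uses the stale length
    in_bounds = index < short_buf.length            # the new buffer's real bound
    return observed, passes_check, in_bounds
```

The reader ends up with the new base paired with the old length, so index 5 passes the bounds check even though the buffer at that base only has 2 elements. A thin pointer is a single word, so under this model there is no intermediate state for a reader to see.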
Basically, my argument is this. If you're really rigorous about avoiding UB, you essentially have to define a memory model, then make sure your use of LLVM (or whatever code generation technique) is actually consistent with that memory model. That's potentially an enormous amount of work, very easy to get subtly wrong, and at the end of the day gives you fewer optimization opportunities than C or unsafe Rust. Thus, it's certainly not a tradeoff I personally would make.
Thanks. Currently Odin would cache `i` on the stack for later retrieval. That lets LLVM load it into a register if profitable, knowing that `i` is constant after the initial read, which bypasses the RAW hazard.
My view is that undefined behavior is a trash fire and serious effort should be undertaken to fix the situation before it gets even more out of hand.
> For the purpose of this example, let's say that the expensive computation is register-intensive but doesn't write any memory (so we don't need to get into alias analysis issues). Because it is register-intensive, our optimizer would like to free up the register occupied by i, replacing the second use of i by another load from mystruct.i. In C or unsafe Rust, this would be a perfectly fine optimization.
WAT?
This is obviously not "fine".
This kind of bullshit is why I said Odin's stance on UB is what swayed me to prefer it.