
I might be one of today's lucky ten thousand.

Suppose you're writing an OS for a modern computer. You have full access to the subset of the hardware actually exposed to you, and so far you have a single process (not having done anything with IPIs, the APIC, etc. yet). I think two things are true, and I'm much more confident about the first one:

1. If you write to a region of RAM, then at any point in the future (ignoring hardware failures) if you read that same region you'll read the value you wrote, unless you issue write instructions to that same region first.

2. If you read from a region of RAM, then at any point in the future (ignoring hardware failures) if you read that same region you'll read the value you previously read, unless you issue write instructions to that same region first.

Caches matter, but they're hidden from software, except where that hiding is too expensive (hence, atomics and whatnot for you to synchronize between processes).

Is my view of the world too simplistic? Is there some way in which those caches are even more poorly behaved than I imagined?



1. Yep, this is an abstraction HW works hard to preserve.

2. The specific scenario I had in mind was: write value X to address A from core 1, write value Y to A from core 2, read address A from core 1 (still X in core 1's cache), core 1's cache gets invalidated, read address A from core 1 again (now it fetches Y from memory).

See [1] for reference on how it applies to C specifically, especially "Absent any constraints on a multi-core system, … one thread can observe the values change in an order different from the order another thread wrote them".

[1]: https://en.cppreference.com/w/c/atomic/memory_order
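
To make that scenario concrete, here's a minimal sketch using C11 atomics and threads (<stdatomic.h>, <threads.h>); the thread name and the values are made up for illustration. With a plain int the two reads below would be a data race (UB); with an _Atomic int each read gets a single well-defined value, yet the second read can still legally return the other thread's write even though this thread issued no store in between:

    /* Minimal sketch, assuming a C11 implementation with threads and
     * atomics. "a" stands in for address A; 1 and 2 stand in for X and Y. */
    #include <stdatomic.h>
    #include <stdio.h>
    #include <threads.h>

    static _Atomic int a;                       /* the shared "address A" */

    static int core2(void *arg)                 /* plays the role of core 2 */
    {
        (void)arg;
        atomic_store_explicit(&a, 2, memory_order_release);   /* value Y */
        return 0;
    }

    int main(void)
    {
        atomic_store_explicit(&a, 1, memory_order_relaxed);   /* value X */

        thrd_t t;
        thrd_create(&t, core2, NULL);

        int first  = atomic_load_explicit(&a, memory_order_acquire);
        int second = atomic_load_explicit(&a, memory_order_acquire);

        /* first == 1 && second == 2 is a perfectly legal outcome: the
         * observed value changed between two reads even though this
         * thread wrote nothing in between. */
        printf("first=%d second=%d\n", first, second);

        thrd_join(t, NULL);
        return 0;
    }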


> Is my view of the world too simplistic?

It's not. No artifact of cache incoherence visible on modern devices is treated as part of the C language's undefined behavior rules. The language runtime can assume memory is just memory.

Obviously there are visible artifacts, but all the code that deals with them lies outside the realm of standard C, even though you write it in C and build it with a C compiler. This is one of the reasons I find arguments that start from the UB rules as postulates unpersuasive. At the end of the day we have to write code for real hardware.
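
For a flavour of what "memory is just memory" buys the compiler, here's a hedged sketch (the device address and register layout are invented): polling a plain int lets the compiler hoist the load out of the loop, because nothing in standard C can change the value behind its back; volatile is the usual escape hatch, and it already sits at the edge of what the standard promises about ordinary objects.

    /* Sketch only; 0x40001000 and the "ready" bit are hypothetical. */
    #include <stdint.h>

    #define STATUS_REG ((volatile uint32_t *)0x40001000u)

    static void wait_for_device_ready(void)
    {
        /* With a non-volatile pointer the compiler may read once and spin
         * on a register copy forever, because as far as the C abstract
         * machine is concerned nothing in this loop writes *STATUS_REG.
         * The volatile qualifier forces a real load each iteration. */
        while ((*STATUS_REG & 0x1u) == 0) {
            /* spin */
        }
    }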

The tl;dr version is that Rust, as currently understood, is just never going to be able to express stuff like incoherent DMA spaces. As witnessed here, it still struggles with representing read() in a way that doesn't require zero-filling the buffer.
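
For contrast, the C version of that read() call is trivial, precisely because C has no qualms about handing out a pointer to uninitialized memory that the callee will only write into. A minimal POSIX sketch (reading from stdin, purely for illustration):

    /* The buffer is deliberately not zeroed: read() only writes into it
     * and reports how many bytes are now valid. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];                            /* uninitialized */
        ssize_t n = read(0, buf, sizeof buf);      /* 0 = stdin */
        if (n > 0)
            fwrite(buf, 1, (size_t)n, stdout);     /* only the bytes read() filled */
        return n < 0 ? 1 : 0;
    }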


> At the end of the day we have to write code for real hardware.

We have to write code for real (often future) compilers, and compilers are generally unsympathetic to code whose behaviour is undefined under the standard (I think this is unreasonable behaviour by compiler maintainers, but that argument has been lost).

In my experience C compilers don't really reason about what happens to code that triggers UB, so it wouldn't surprise me at all if you could get a compiler to emit code that exposes cache incoherence: code that reads and writes memory in a pattern whose behaviour the underlying platform's memory model doesn't define (it would need extra barrier instructions etc.), so that on that hardware the result is a cache-incoherence effect. Probably not on x86 with its friendly memory model, but maybe on ARM or the like (Alpha used to be a notorious place to see such bugs).
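
The classic pattern where this bites is unsynchronized message passing. Here's a deliberately racy sketch with plain ints and C11 threads (no atomics, no barriers): the compiler emits ordinary loads and stores and no barrier instructions, which usually looks fine on x86 but lets ARM (and famously Alpha) reorder things so the consumer sees the flag before the data.

    /* Deliberately racy sketch, assuming C11 threads. This is UB as
     * written; the point is what real hardware may do with the code a
     * compiler naively emits for it. */
    #include <stdio.h>
    #include <threads.h>

    static int data;    /* plain ints: no atomics, no barriers */
    static int ready;

    static int producer(void *arg)
    {
        (void)arg;
        data  = 42;     /* store 1 */
        ready = 1;      /* store 2: may become visible before store 1 */
        return 0;
    }

    static int consumer(void *arg)
    {
        (void)arg;
        while (!ready)  /* also racy; the compiler may hoist this load */
            ;
        printf("data = %d\n", data);   /* 0 is a possible, surprising result */
        return 0;
    }

    int main(void)
    {
        thrd_t p, c;
        thrd_create(&c, consumer, NULL);
        thrd_create(&p, producer, NULL);
        thrd_join(p, NULL);
        thrd_join(c, NULL);
        return 0;
    }

The usual fix is to make ready an _Atomic int with release/acquire ordering (or to use a mutex), which is exactly what the memory_order page linked above is about.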



