
I think the hard part of it is that x86 only has one atomic ordering and none of the other modes do anything. As such, it’s really hard to build intuition about it unless you spend a lot of time writing such code on ARM, which wasn’t that common in the industry, and today most people use higher-level abstractions.

By databases, do you mean those running on DEC Alphas? Because that was a niche system that few would have had experience with. If you meant to compare in terms of consistency semantics, sure, but there are meaningful differences between the consistency semantics of concurrent database transactions and atomic ordering in a multithreaded context.

Java’s memory model “wrestling” was about defining it formally in an era of multithreading and it’s largely sequentially consistent - no weakly consistent ordering allowed.

The C++ memory model was definitely the first large-scale adoption of weaker consistency models I’m aware of, and it was done so that ARM CPUs could be properly optimized for, since this was C++11, when mobile CPUs were very much front of mind. Weak consistency remains really difficult to reason about and even harder to play around with if you primarily work with x86, and there’s very little tooling around that can help you validate your code and get confidence that it is correct. Of course, you can follow common “patterns” (e.g. loads are always acquire and stores are always release), but fully grokking correctness and being able to play with the model in interesting ways is no small task, no matter how many learning resources are out there.
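For anyone who hasn't seen the acquire/release “pattern” written out, here is a minimal sketch of message passing between two threads. The names `payload`, `ready`, and `run_message_passing` are made up for illustration and don't come from any codebase discussed here.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                  // ordinary, non-atomic data
std::atomic<bool> ready{false};   // flag used to publish the data

int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // plain write
        ready.store(true, std::memory_order_release);  // publish: earlier writes become visible
    });

    std::thread consumer([&seen] {
        // Spin until the flag is observed; the acquire load pairs with the release store.
        while (!ready.load(std::memory_order_acquire)) {
        }
        seen = payload;  // guaranteed to read 42 once the flag was seen as true
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

If both orderings were changed to `std::memory_order_relaxed`, the read of `payload` would be a data race under the C++ model, even though on x86 the generated loads and stores would most likely look the same, which is part of why intuition is hard to build there.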




Nit: x86 has acquire/release and seq_cst for loads/stores (it technically also has relaxed, but it is not useful to map it to C++11 relaxed). What x86 lacks is weaker ordering for RMW, but there are a lot of useful lock-free algorithms that are implementable with just, or mostly, loads and stores, and it can be a significant win to use non-seq_cst stores for these on x86.
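As one example of the kind of algorithm that needs only loads and stores, here is a rough sketch of a single-producer/single-consumer ring buffer. `SpscQueue` and `spsc_demo` are hypothetical names, and the design (one slot left empty to tell full from empty) is just one common choice, not anything specific from this thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical SPSC queue: exactly one thread may push, exactly one may pop.
// No read-modify-write instructions are used, only atomic loads and stores.
template <typename T, std::size_t N>
class SpscQueue {
    static_assert(N >= 2, "one slot is kept empty, so N must be at least 2");

public:
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);  // only the producer writes tail_
        const std::size_t next = (tail + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) {
            return false;  // full
        }
        slots_[tail] = value;
        tail_.store(next, std::memory_order_release);  // publish the slot to the consumer
        return true;
    }

    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);  // only the consumer writes head_
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = slots_[head];
        head_.store((head + 1) % N, std::memory_order_release);  // hand the slot back to the producer
        return true;
    }

private:
    std::array<T, N> slots_{};
    std::atomic<std::size_t> head_{0};
    std::atomic<std::size_t> tail_{0};
};

// Single-threaded usage check; capacity is N - 1.
inline void spsc_demo() {
    SpscQueue<int, 4> q;
    int v = 0;
    assert(q.try_push(1) && q.try_push(2) && q.try_push(3));
    assert(!q.try_push(4));           // full at three elements
    assert(q.try_pop(v) && v == 1);   // FIFO order
}
```

With release stores, mainstream compilers targeting x86 typically emit plain `mov` instructions for both index updates; making them `seq_cst` would add a fence or an `xchg` on every push and pop.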


I would have to imagine you mean x86-64, right? I would imagine 32-bit x86 doesn’t have those instructions?

I’m also kind of curious if a lot of modern code compiled to x86 would see consistency issues running on old CPUs before TSO was formalized (like a p2 multiprocessor server).


32-bit x86 has many of the same instructions, including cmpxchg8b (in models dating to the 90s).


Indeed, seq_cst generates different code for stores, though for loads it appears to be the same: https://godbolt.org/z/WbvEcM83q


Re: the godbolt example, note that release semantics are not meaningful for load operations.

> If order is one of std::memory_order_release and std::memory_order_acq_rel, the behavior is undefined.

https://en.cppreference.com/w/cpp/atomic/atomic/load


Yes, seq_cst loads map to plain loads on x86.
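For anyone who wants to reproduce the comparison, a small sketch like the following can be pasted into a compiler explorer. The function names and the global `flag` are made up here, and this is not necessarily what the linked example contains.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical global and function names, for illustration only.
std::atomic<int> flag{0};

// On x86-64, mainstream compilers typically emit a plain mov here.
void store_release(int v) { flag.store(v, std::memory_order_release); }

// Typically an xchg, or a mov followed by mfence, on x86-64.
void store_seq_cst(int v) { flag.store(v, std::memory_order_seq_cst); }

// Both loads typically compile to the same plain mov on x86-64.
int load_acquire() { return flag.load(std::memory_order_acquire); }
int load_seq_cst() { return flag.load(std::memory_order_seq_cst); }

// Quick sanity check that the wrappers behave like ordinary stores and loads.
inline void ordering_demo() {
    store_release(1);
    assert(load_acquire() == 1);
    store_seq_cst(2);
    assert(load_seq_cst() == 2);
}
```

Compiling the same functions for an ARM target should show the orderings producing visibly different instructions for loads as well, though the exact output depends on the compiler and architecture version.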


x86 might, but devices connected to it in the embedded world have had to be very aware of this stuff since the 90s.


Embedded devices did not necessarily use the C++ memory model, and definitely not in the 90s. They were also highly likely to be in-order CPUs with no crazy compilers, so atomics didn’t matter too much anyway (volatile was sufficient). They maybe had a weaker memory model, but at the same time multithreading on embedded did not really exist, as it was only being introduced into the industry with any real seriousness around that time (threading on Linux started to shake out around the mid 90s).


SMP systems were widely in use in the 1990s, but you’re correct that the dual-core MIPS was 2003ish in embedded.



