
This is a race triggered by a particular reordering of memory accesses as seen by different cores. It's the kind of thing that doesn't necessarily show up in a unit test anyway.



Exactly. This kind of bug requires stress testing to reveal. You might need to run it for minutes, hours, or days to reproduce it. And the bug might not show up at all on the hardware you are running.

Code dealing with memory barriers on SMP systems is nontrivial to write, review, and test. Everything is hardware-specific, timing-dependent, and non-deterministic. Simple unit tests are useless for this kind of task; it needs stress testing on different hardware and under a variety of workloads.
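
To make "stress testing" concrete, here is a minimal sketch of the classic store-buffering litmus test in C++. It is not the code under discussion; the names `count_both_zero` and `rounds` are made up for illustration. Each thread stores to its own flag and then loads the other's. With relaxed ordering, both loads may return 0 on hardware that lets a store sit in a store buffer past a later load; with sequentially consistent ordering, that outcome is forbidden.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical helper: runs `rounds` iterations of the store-buffering test
// and returns how many ended with BOTH threads reading 0.
// `order` is meant to be memory_order_relaxed or memory_order_seq_cst.
std::uint64_t count_both_zero(std::uint64_t rounds, std::memory_order order) {
    std::atomic<int> x{0}, y{0};
    std::atomic<std::uint64_t> go{0};  // current round, published by the main thread
    std::atomic<int> done{0};          // number of workers finished this round
    int r1 = 0, r2 = 0;                // what each worker observed

    auto worker = [&](std::atomic<int>& mine, std::atomic<int>& theirs, int& result) {
        for (std::uint64_t i = 1; i <= rounds; ++i) {
            // Wait until the main thread starts round i.
            while (go.load(std::memory_order_acquire) != i) std::this_thread::yield();
            mine.store(1, order);         // store to my flag...
            result = theirs.load(order);  // ...then load the other thread's flag
            done.fetch_add(1, std::memory_order_release);
        }
    };

    std::thread a(worker, std::ref(x), std::ref(y), std::ref(r1));
    std::thread b(worker, std::ref(y), std::ref(x), std::ref(r2));

    std::uint64_t both_zero = 0;
    for (std::uint64_t i = 1; i <= rounds; ++i) {
        // Reset state, then release both workers into round i.
        x.store(0, std::memory_order_relaxed);
        y.store(0, std::memory_order_relaxed);
        done.store(0, std::memory_order_relaxed);
        go.store(i, std::memory_order_release);
        // Wait for both workers before reading their results.
        while (done.load(std::memory_order_acquire) != 2) std::this_thread::yield();
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    a.join();
    b.join();
    return both_zero;
}
```

Calling `count_both_zero(1000000, std::memory_order_relaxed)` may return a nonzero count on one machine and zero on another, and the number changes from run to run, which is exactly why a single quick unit test tells you so little. The same call with `std::memory_order_seq_cst` should always return 0. Some toolchains need `-pthread` to build this.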


Reminds me of a passage from Tracy Kidder's "The Soul of a New Machine" [1]. The guys at Data General were implementing one of the first pipelined 32-bit processors for a minicomputer in the late '70s. In the book, Kidder talks about how they had a gate-level simulator implemented in software that allowed them to troubleshoot timing issues. Makes me wonder if a similar simulator could be useful to test and/or debug these types of issues.

Great book, highly recommend it. It won the Pulitzer.

[1] http://en.wikipedia.org/wiki/The_Soul_of_a_New_Machine



