> 2. Lockfree can be faster -- This is the case most people hope for.
It's not about being faster; it's about never returning control to the OS if you want to achieve soft real-time in one of your threads (a common application being audio processing, where you have to consistently fit as much work as you can into every 1.33 milliseconds for the more demanding users).
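For what it's worth, that ~1.33 ms figure is consistent with one common setup (my assumption, not stated by the parent): a 64-sample audio buffer at a 48 kHz sample rate.

```cpp
// Assumed configuration (my numbers, not the parent's): a 64-sample
// audio buffer at a 48 kHz sample rate.
constexpr double kSamples = 64.0;
constexpr double kRateHz = 48000.0;
// 64 / 48000 s = ~1.333 ms: the deadline the audio thread must hit
// on every single buffer, which is why an unbounded kernel wait is
// unacceptable on that thread.
constexpr double kDeadlineMs = kSamples / kRateHz * 1000.0;
```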
Mutexes act as spinlocks for roughly the first ~1000 cycles on both Linux and Windows. If you're in and out before those ~1000 cycles are up, the typical mutex (be it a pthread_mutex or a Windows CriticalSection) will never hit kernel code.
If you take more than ~1000 cycles, then yes, the OS takes over as a fallback heuristic. But most code guarded by a spinlock won't take more than ~50 cycles.
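A rough sketch of that spin-then-fall-back behavior (illustrative only: the 1000-iteration threshold is just the heuristic mentioned above, and `std::this_thread::yield()` stands in for the kernel-level wait a real adaptive mutex performs):

```cpp
#include <atomic>
#include <thread>

// Sketch of a hybrid "adaptive" lock: spin briefly in user space,
// then fall back to yielding to the scheduler. Real mutexes use a
// proper kernel wait (futex / WaitOnAddress), not yield.
class HybridLock {
    std::atomic<bool> held{false};
public:
    void lock() {
        // Fast path: bounded spinning, never enters the kernel.
        for (int i = 0; i < 1000; ++i) {
            bool expected = false;
            if (held.compare_exchange_weak(expected, true,
                                           std::memory_order_acquire))
                return;
        }
        // Slow path: give the CPU back between attempts.
        bool expected = false;
        while (!held.compare_exchange_weak(expected, true,
                                           std::memory_order_acquire)) {
            expected = false;  // CAS overwrote it with the current value
            std::this_thread::yield();
        }
    }
    void unlock() { held.store(false, std::memory_order_release); }
};
```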
What can the OS do for you if your consuming thread has, for example, taken a lock and is now stalled on a page fault, triggering IO from a spinning disk that could take seconds to finish? Nothing that I'm aware of - you just have to wait until it runs again and releases the lock.
That's why people write lock-free - they don't want a stall or the death of one thread to stall the other threads. Deadlock, livelock, priority inversion.
Read the Wikipedia page on non-blocking algorithms for a starting point on the motivations for lock-free.
I mean, what happens if the writer thread stalls out in the lock free case? The reader thread runs out of things to do and is also going to be scheduled out (eventually).
Like, that's the OS's job. To remove tasks from the "runnable" state when they stall out.
> That's why people write lock-free - they don't want a stall or death of the one thread to stall other threads.
Erm, if you're saying that the reader thread will be highly utilized... that's not likely the case if the writer thread in this "lock free" implementation stalls out.
Let's say the writer thread hits a blocking call and waits on the hard drive for ~10 milliseconds. What do you think will happen to the reader thread in the "lock-free" implementation discussed in this blog post?
> I mean, what happens if the writer thread stalls out in the lock free case? The reader thread runs out of things to do and is also going to be scheduled out (eventually).
Yeah it keeps running, exactly as you say! It can keep working on jobs already on the queue independently.
Are you arguing that you personally don't think it's worth doing lock-free for various practical reasons? Or arguing that's never why anyone does it?
Because the latter is simply false - I've done it for exactly those reasons professionally, both in academia and in industry, so there's one data point for you, and the literature discusses this use-case extensively, so I know other people do as well.
> Are you arguing that you personally don't think it's worth doing lock-free for various practical reasons? Or arguing that's never why anyone does it?
No. Lock-free is great. When it is done correctly.
The code in this blog post was NOT done correctly. It only works in the case of 1-reader + 1-writer, so the "advantages" you're talking about are extremely limited.
You have to keep the overall big picture in mind. Lock-free code is incredibly difficult to write and usually comes at great cost in complexity. However... sometimes lock-free makes life easier (okay, use it in that case). Sometimes lock-free can be faster (okay... use it in that case... if speed is important to you).
But in many, many cases, locked data-structures are just easier to write, test, and debug.
Yeah, but the "queue-of-queues" is a more traditional structure, familiar to people who work with lock-free algorithms.
I don't have time to do a serious code review of everything... but immediately upon looking at that code, it's clearly written to a higher standard: acquire/release fences are used in (seemingly) the correct spots, just as an example.
Like, the real issue I have here is the quality of the code demonstrated in the blog post. A lock-free queue-of-queues implementation makes sense in my brain. What's going on in the blog post ultimately doesn't.
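For contrast, here's roughly what a correctly-fenced 1-reader + 1-writer (SPSC) ring buffer looks like - a sketch of the general technique, not the blog's code or the queue-of-queues code, with the acquire/release pairs in the conventional spots:

```cpp
#include <atomic>
#include <cstddef>

// Sketch of a single-producer / single-consumer ring buffer.
// Correct ONLY for exactly one reader thread and one writer thread.
// Capacity is N-1 (one slot is sacrificed to distinguish full/empty).
template <typename T, size_t N>
class SpscQueue {
    T buf[N];
    std::atomic<size_t> head{0};  // advanced only by the consumer
    std::atomic<size_t> tail{0};  // advanced only by the producer
public:
    bool push(const T& v) {  // producer thread only
        size_t t = tail.load(std::memory_order_relaxed);
        size_t next = (t + 1) % N;
        if (next == head.load(std::memory_order_acquire))
            return false;  // full
        buf[t] = v;
        // Release: publishes the write to buf[t] to the consumer.
        tail.store(next, std::memory_order_release);
        return true;
    }
    bool pop(T& out) {  // consumer thread only
        size_t h = head.load(std::memory_order_relaxed);
        // Acquire: pairs with the producer's release store to tail.
        if (h == tail.load(std::memory_order_acquire))
            return false;  // empty
        out = buf[h];
        // Release: tells the producer the slot is free for reuse.
        head.store((h + 1) % N, std::memory_order_release);
        return true;
    }
};
```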
> Erm, if you're saying that the reader thread will be highly utilized... that's not likely the case if the writer thread in this "lock free" implementation stalls out.
That does not mean the reader thread has nothing to do. In the most common case, the reader thread produces an output and the writer thread sends messages that change how that output is generated - and even if there's no message, the reader thread has to keep going at a steady rate.
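A minimal sketch of that pattern (illustrative names, assuming a single writer and a single reader): the reader applies a control message if one happens to be available and otherwise keeps producing output at its steady rate - it never blocks on the writer.

```cpp
#include <atomic>

// Trivial single-slot message box, safe for exactly one writer
// thread and one reader thread (illustrative, not the blog's code).
struct Mailbox {
    std::atomic<bool> full{false};
    float value = 1.0f;
    bool post(float v) {  // writer thread
        if (full.load(std::memory_order_acquire)) return false;
        value = v;
        full.store(true, std::memory_order_release);  // publish
        return true;
    }
    bool take(float& v) {  // reader thread
        if (!full.load(std::memory_order_acquire)) return false;
        v = value;
        full.store(false, std::memory_order_release);  // slot free again
        return true;
    }
};

// One processing cycle on the reader thread: non-blocking. If the
// writer is stalled, we simply see no message and keep going with
// the current parameters.
void process_block(Mailbox& msgs, float& gain, float* out, int n) {
    float g;
    if (msgs.take(g)) gain = g;                  // apply message if any
    for (int i = 0; i < n; ++i) out[i] *= gain;  // steady-rate output
}
```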