Hacker News new | past | comments | ask | show | jobs | submit login
What makes concurrency so hard? (buttondown.email/hillelwayne)
58 points by BerislavLopac 9 months ago | hide | past | favorite | 41 comments



Concepts from Functional Programming can be used to reduce the state space.

First, do as much work as possible in pure functions. Pure functions can safely be run in parallel, in any order because they do not have side-effects.

Then the side-effects can be applied. This code should be smaller/simpler since any work that can be done in a pure function was already done.

https://grokkingsimplicity.com/ covers this topic in chapter 17: Coordinating timelines. The book even walks through a non-trivial example.


I think it's because people have a hard time seeing things from more than one perspective.

We live our days and react to things based on our perception. We think about our state. Few people think about how things look from another person's perspective. Many people have a hard time really getting into someone else's shoes.

With multithreading, people can reason about what one thread will do. They can also reason about what another thread will do. However, thinking about the various unexpected ways these threads could interact, and designing a protocol for those interactions to limit adverse behavior is something many people are bad at. One perspective at a time is OK. Holding 2 or more at once is tricky.


I don't think viewing from someone else's perspective is comparable. That is a matter of interest. You can do it sync.

Concurrency is more complex as it is multiple things at once, with no guaranteed order of instructions which makes the modelling so much more difficult and introduces explosive amount of edge cases.


Actor model with preemption solves the entire thing like it was never a problem. Three axioms, one mechanism, puff, gone.

But for some strange reason, bona fide preemptive actors exist only in Erlang and derived languages (Elixir, Gleam, etc). All other actor model implementations are cooperative in some regard, which requires some thinking on blocking/nonblocking code in actors etc. Only Erlang abstracts it all away.


No free lunch. Two Erlang processes doing an RPC with each other can deadlock. Of course, you remove RPCs from everything, but that requires quite a bit of thought. You can still get lunch, but it's not free.


> All means of making concurrency 'easier' are concerned first and foremost with managing the state space.

> Data mutations create new steps, which immutable data structures avoid by definition.

This is why I'll already reach for immutability-first, and only mutate in special cases.


In the past, programmers opted for concurrency because creating a thread was the only way to get a continuation/closure type entity, where you could nicely do everything in one tidy scope. So even the thread was preemptive, and thus particularly threatening, it was nicer to write the program.

Also, if you multiplex events onto a single thread, though you get some better guarantees about execution order than you do under preemptive threads, it is still chaotic. Events arrive in any order. Multiple state machines update their state in any order, and hit their important states in any order.

Data being chopped into frames arbitrarily scrambles things. The past million time the program read four bytes from the network socket, it got the entire 32 bit integer in one operation. Today, it didn't notice that the read only returned 3 bytes.

Threads are exciting, especially to young programmers. Look, all you have to do is write little programs that do simple things in a loop, taking some messages here and replying there. Put them together, and the complex behavior of the application will somehow just emerge, as if by magic. And you don't even have to document it anywhere; just document the simple things that the individual threads are doing!


> Also, if you multiplex events onto a single thread, though you get some better guarantees about execution order than you do under preemptive threads, it is still chaotic.

The combination is even worse. There's an impedance mismatch when you have both multiple threads and something that involves callback/async logic. I just discovered that the Rust graphics stack of Rend3/WGPU/Vulkan/GPU has this problem. The goal is that you're busy rendering the scene in one thread, and in other threads you're loading assets into the GPU for stuff you're going to be near soon as you walk or drive around the scene. Down at the Vulkan/GPU level, the model is that you put commands on a command queue, and you get completions back, not necessarily in order. You can have more than one command queue, and the usual setup is a "graphics" queue, which can ask for anything the GPU can do, and a "transfer" queue, which can only move data from CPU memory to GPU memory.

On top of this, there's a constraint that, while adding a command to a queue is cheap, "submitting" a queue and starting it up is expensive. So having only one command on the queue per asset-loading thread is apparently inefficient. This means that you have to reply to transfer requests with something like a future, one that blocks when you actually need the data.

Other than the Unreal Engine people, nobody seems to actually implement all this. Everybody else just has one queue and gives up a sizable amount of performance.


Three words: Communicating Sequential Processes.


I thought of an answer in 2 words reading the title, namely:

Shared state.

Actor model is indeed a nice way to prevent shared state. So are "The Elm Architecture" and state-less web server on top of a db.

I have a hunch "Communicating Sequential Processes" and "Shared state" are somehow describing the same thing in different words (or from a different perspective).


The actor model is still prone to many forms of race conditions, among others, live and dead locks.


They're supposed to be opposed. "Communicate by sharing" vs "share by communicating". Communicating has nicer ownership rules, but not all runtimes make it easy to e.g. communicate raw video frames to a compositor

But yeah in the end it's all Turing machines and you can emulate anything with anything else


It doesn’t make it easy. You can still have dead/live locks.


I prefer plain actor model, it is so simple that it happily abstracts the entire problem away. CSPs require a lot of reasoning beforehand.


Like in Go? I still think it's difficult there. Is it made easy? Sure. But it's not simple


In part because Go isn't true CSP, it's just inspired by it. It's very easy to accidentally share data between threads even when exclusively using channels to communicate. Couple this with the cases where it is better to use shared state with mutexes and you get can bit by them being easy to screw up or forget.


Can you share more than 3 words and explain?


I'd say it's because it's not obvious in the languages we use what is and is not an atomic operation. Different languages do provide different levels of protection by default as well, making it even harder to know what is and is not safe. Personally when I'm in a language that makes things obvious then it becomes easier to reason about. Rust helps, verbosly. I don't have any problem in JavaScript because the runtime usually doesn't allow for your own code to be running concurrently. Each way here has trade-offs.


JavaScript, Lua, and Qt have all given me cases of single-threaded concurrency because you can poll the event loop from inside a function. (I don't remember what loop I was using in Lua)

So you don't get memory unsafety (except in Qt) but you do get immutable variables changing when you're not looking at them.

Actually maybe it was c#. I wrote a coroutine in c# once and I regret that I did not make it an explicit state machine. Having to re-check all my shared state after each await point made it worthless.

Sure state machines look like mashed potatoes compared to "wow it's just communicating processes" threads but it represents the complexity of the program more accurately at a glance. Maybe the domain really is mashed potatoes


I mean, the semi fun thing in C# is you can still YOLO the async state machine to where it is clean to consumers,

But agree, sometimes to solve the problem best you gotta get dirty.


> We do concurrent reasoning every time we drive a car!

> More generally, some studies find that if you frame concurrent systems in human terms ("meatspace modeling"), people get quite good at finding the race conditions. > So while concurrency might be difficult to reason about, I don't think it's because of a fault in our brains.

Indeed, when you drive a car you get rich input, unlike with concurrency where neither the design nor, even more importantly, execution, is presented to you in anything similarly comprehensible


Maybe we need a richer debugger experience.


Definitely, there are some ideas on at least not wasting the time on stepping thru and instead having "rewindable" debug recordings, but then that's still not show you visually that, e.g., this whether the interleaving is forming the unexpected (by you) sequence of A1B2B1A2.


People like to say that CSP solves everything. It's not. In reality everything communicates through a shared memory. Take sound, for example. When a human speaks the shape of sound waves changes in the ear of another human. What's happening is a write access to a shared resource. Another agent should process it accordingly. There is no magical "onMessage" in reality. CSP theory hides the fundamentals, thus it won't be as fast as more basic primitives, combined correctly.


My skepticism of CSP is not that it's slow, slow alone could be fine. Tokio is slow for some things.

My issue is that it pretends things are simple when they're not, and it doesn't encourage the programmer to really think about "what if I start action x and then I'm interrupted by message y?"

And this isn't incompatible with CSP, you'd just have to embed an event loop in each process and never block it. So then you may as well use one loop and dedicated worker threads.

Like I think a common pitfall for novices (I certainly have done this) is thinking of the happy path and then just writing that into a thread.

Worst case, say it's a self-driving car. When I see a green light and there's nothing in my way, I'm gonna hit the accelerator and move 50 feet through the intersection. Lovely.

I write that into a thread and then a senior asks me what if a pedestrian enters while my thread is holding the accelerator. I forgot that I have to re-check for obstacles every frame. Oops. (Contrived, I know)

I think I'm an okay programmer. I think I'm even okay at concurrency.

I just have to ask myself on every line, can this fail? And can it take more than a couple microseconds?

Hashing data for instance is predictable. If I'm swapping and the os faults in memory, that's out of my scope. Hashing can't fail and always finishes in some number of cycles.

Parsing can fail.

File reads can fail and take time. The disk is a shared resource with its own controller, in every sense a read is actually a network call.

So if CSP means someone will hide long-running operations by accident, I don't think it's helpful


Good point, cancelling concurrent actions that started another concurrent actions is extremely complicated. I don't think humans can solve it efficiently in real life. There is usually a mess left behind if plans are reverted or changed significantly.


I have some fun habits to "idiot-proof" things. If I'm paying for something with my credit card, I keep my wallet out until my card is back in it. If I'm cooking something on the stove, the kitchen light and the exhaust fan must be on.

So at any moment, if my wallet is out, I know I must think about my card before I put my wallet back in my purse. And if the kitchen light is on or I hear the fan, I must check the stove regularly.

I have a couple friends who constantly lose phones or wallets, and I simply don't.

It's not intuitive to program but what's worked for me in software is, simply do _everything_ on every tick, and make sure the loop always ticks. Anything you can't do in tick, has to be made into an async task somehow and monitored every frame. Even up to a thousand tasks, "Is this still running? Should I stop it? Should I start it?" is a very cheap question to ask. I believe this is how some embedded folks code, like for a furnace controller.

A lot of multi-threading and TOCTOU vulns also seem to come down to "What if just one thread of yours froze for 60 seconds between any two atomic operations?" If I do something like create a file and then set permissions on it, it feels like it's fast. But actually, an attacker only needs one time slice where the permissions are wrong to get in. Luckily I don't work in security, so I just do my best.


FWIW: My hardware designer friends find software tediously limited because only one thing happens at a time.


Mostly through either self-inflicted pain of languages that got its concurrency primitives wrong or ended up having it too complex to be useful before we arrived upon better ways.

Naturally, this is not to say the language designers were bad at it, far from it! It just took the benefit of the hindsight and time due to non-uniform knowledge propagation throughout the industry to nail them down as can be seen in Rust and C# in the form of async/await and Futures+Tasks and CSP-style in the form of Elixir and Go.

Today, Rust makes it trivial (given the tradeoffs) to achieve such even in bare metal scenarios with custom executors, while C# offers zero-effort model for general purpose scenarios:

    using var http = new HttpClient();
    var req1 = http.GetStringAsync("https://news.ycombinator.com");
    var req2 = http.GetStringAsync("https://lobste.rs");
    Console.WriteLine(string.Concat(await req1, await req2));


(Safe) Rust only gives you data race freedom, you can still deadlock and livelock. Don’t pretend that it’s a solved problem, because it is not even close to that.


I don't, locks are hard. But the impact of locks tends to be way more focused/limited than the area of application of concurrency.


And yet there is dangerous usage of HttpClient (in parent defense the situation is a mess)


HttpClient is thread-safe. The above use case, while not ideal, will work perfectly fine in 95% of situations.

(Of course, the "right" way to use HttpClient is to have it be injected via DI or retrieved from IHttpClientFactory, but if all you are doing is writing a script-like CLI utility, there is no advantage to that)


>What makes concurrency so hard?

AanTythinANgY canTI happeME!n


state


Emergence.


Concurrency is not so hard if you have the right tools / primitives:

- Monads with do-notation

- Atoms

- Immutable data structures


According to the framework the author uses, how would you describe these in terms of state space management (apart from immutability, which the author dealt with)?


> Monads with do-notation

Phrased like this, it sounds like 'the monad abstraction' and 'good syntax for monads' are what helps, which is slightly off the mark.

But particular monads do help:

- The Par monad gives you deterministic parallelism for computations.

- The flipside of marking mutable/non-determistic functions as IO (another monad), is that the functions not marked as such are deterministic, non-interfering etc., so interleaving them differently will not grow the state-space.

- The STM monad gives you transactions, which give you safe, shared mutability without running into the following issue from the article:

> I can use programming constructs like mutexes and barriers to "prune" the state space and give me the behaviors I want, but given how big the state space can be, I have to do a lot of pruning to get the right behaviors. I can make mistakes in implementation, "misshape" the space (like by adding a deadlock), or not notice a buggy state I need to remove. Threads are very error prone.

I'm not sure what 'Atoms' refer to, unless it's running code in an atomic block in STM.

That said, having a good monadic syntax is still great, even if the syntax can't solve semantic concurrency issues. I believe that CompletableFutures were a good way to model async code in Java (and Rx/observables etc, to some extent). But they got resistance from Java programmers who didn't want to flatMap everything all the time. They had semantic issues too (can't cancel?!? what?!). But I think it's largely their syntax that made programmers miss the Thread model of writing straight-line imperative code.


By atoms they might mean runtime-wide symbols that evaluate to themselves, kinda like a singleton that evaluates to itself. Useful for tagging data or states, in BEAM-languages like Erlang or Elixir you commonly use the atoms :ok and :error to tag tuples to make it easy to switch or pattern match into appropriate receiving branches.

You'll also come across them in Lisp-like languages and logic programming.

Edit: Another, maybe better, illustration from the BEAM might be the 'booleans', they're implemented as the atoms :true and :false.


> Phrased like this, it sounds like 'the monad abstraction' and 'good syntax for monads' are what helps, which is slightly off the mark.

> But particular monads do help:

The part the language needs to provide are merely the abstraction and the good syntax which uses it, not the particular monads: you can then implement them as required.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: