
That is how I learned it at uni too. Unfortunately, I have found these definitions are all over the place in some literature.

Pragmatically there doesn't feel like much difference to humans, because even single-threaded concurrent code feels parallel: computers switch between tasks so much faster than we can perceive that rapid switching looks simultaneous. And you are often well advised to treat it as parallel anyhow, at least to some extent, for safety's sake. That is, you may be able to get away with less "locking" in concurrent-but-not-parallel code, but you still can't write it exactly the same as conventionally single-threaded code, or you will get into a lot of trouble very quickly.
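A minimal TypeScript sketch of that trouble (with a made-up balance/withdraw example): even with no OS threads at all, interleaving at an await point breaks a check-then-act invariant, so the code still needs parallel-style discipline.

  // Single-threaded event-loop code, yet still racy: the check and the act
  // are separated by an await, so another task can interleave between them.
  let balance = 100;

  // stand-in async side effect; any await has the same interleaving effect
  function auditLog(): Promise<void> {
    return new Promise<void>((resolve) => setTimeout(resolve, 10));
  }

  async function withdraw(amount: number): Promise<void> {
    if (balance >= amount) {      // check
      await auditLog();           // yields to the event loop here
      balance -= amount;          // act: the invariant may no longer hold
    }
  }

  // Two concurrent withdrawals of 80 both pass the check, leaving
  // balance at -60, with no OS threads involved at all.
  void Promise.all([withdraw(80), withdraw(80)]);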

So pragmatically I don't find a lot of value in distinguishing between "concurrent" and "parallel". Some. But not a lot.

There is a difference, for sure. It just isn't useful in proportion to how eagerly people jump in to correct each other about it.

Think about it not in terms of timing but in terms of instructions. Parallelism requires that different instruction streams can execute simultaneously on independent resources; in JavaScript that requires something like Web Workers or Node's worker_threads/cluster modules. Concurrency merely means one task does not block another, which is less restrictive and is commonly available through the event loop.
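A sketch of that distinction for Node (using the real worker_threads module; the CPU-bound loop is just a stand-in workload):

  // Sketch for Node (ES modules). Concurrency: tasks interleave on one
  // thread via the event loop. Parallelism: a second engine instance on
  // another core, with no shared state, talking only via messages.
  import { Worker, isMainThread, parentPort } from "node:worker_threads";

  if (isMainThread) {
    // concurrent: neither timer blocks the other, but only one callback
    // runs at any instant
    setTimeout(() => console.log("task A"), 0);
    setTimeout(() => console.log("task B"), 0);

    // parallel: this same file is re-run in a worker thread on another core
    const worker = new Worker(new URL(import.meta.url));
    worker.on("message", (n) => console.log("worker computed", n));
  } else {
    let n = 0;
    for (let i = 0; i < 1e8; i++) n += i; // CPU-bound; on the main thread this would starve the event loop
    parentPort!.postMessage(n);
  }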

To sort through the "dialect of jargon" that a piece of educational CS/SWEng writing is using, it helps to put it in the context of its publication. Consider both when it was published, and what kind of work the author of the writing does.

Why "when it was published"?

Well, mainframes were single-processor until the 70s, and then multi-processor (really, massively multi-socket NUMA) thereafter. PCs were single-core until the mid-2000s (multi-socket workstations aside), and then multi-core thereafter. Thus:

• Anyone writing about "concurrency" in some old Comp Sci paper, is probably thinking about "concurrency" as it pertains to their local single-core Minix mainframe, or on their single-core Sparc/NeXT/SGI university workstation — and so is inherently thinking and talking about some form of cooperative or pre-emptive multi-tasking through context-switching on a single core, with or without hardware-assisted process address-space isolation.

• Anyone writing about "concurrency" in some old "industry-sponsored" Software Engineering or (especially) Operational Research paper, was likely working with the early parallel batch-processing mainframes, or perhaps with HPC clusters — and so is much more loose/sloppy with their definitions. In their mental model, there is no context-switching — there is just a cluster-level workload scheduler (itself bound to a core) which assigns workloads to cores; where these workloads are essentially single-threaded or shared-nothing-multi-threaded virtual machines, which own the cores they run on until they finish. (Actually very similar to writing CUDA code for a GPU!) To them, the only kind of concurrency that exists is parallelism, so they just use the words interchangeably.

And why "what kind of work the author does"?

Well, anyone writing about "concurrency" today is writing in a context where everything — even the tiniest microcontrollers — has both multiple cores and the inherent hardware capability for pre-emptive multitasking (if not always the memory for that to make sense, unless your "tasks" can be measured in kilobytes); and yet everything does it in a slightly different way. Which in turn means that:

• If such a person is writing product software, not in control of the deploy environment for their software — then to them, "concurrency" and "parallelism" both just mean "using the multi-threading abstractions provided by the runtime, where these might become OS threads or green threads, might pin their schedulers to a core during CPU work or not, might yield the core during IO or not, who knows." None of these things can be guaranteed — even core count can't be guaranteed — so their definition of what "concurrency" or even "parallelism" will do for them has to be rather weak. To these people, "parallelism" is "nice if you can get it" — but not something they think about much, as they have to write code under the assumption that the software will inevitably get stuck running on a single core at some point (e.g. on a heavily-overloaded system), and they must ensure it won't deadlock or livelock under those conditions (see the first sketch after this list).

• Meanwhile, if such a person comes from a background of writing SaaS software (i.e. software that runs in a big knowable environment), then anything they say about concurrency / parallelism is likely founded on the assumption of having large distributed clusters of big highly-multicore servers, where the chief concern isn't actually achieving higher throughput through parallelism (that part is "easy"), but resolving distributed data races through write-linearization, via artificial concurrency-bottlenecking abstractions like channels, message queues, or actors that hold their own linear inboxes (see the second sketch after this list). For these types, "parallelism" puts them in mind of the "multi-threaded with shared state behind semaphores" model that they want to avoid at all costs to keep their software scalable and ensure distributed fault-tolerance. So this type prefers to talk about designing architectures made of little actors that are individually, intentionally concurrent-but-not-parallel; and then hand-wavingly introducing shared-nothing instances or pools of these trees of little actors, which can live and move within greater parallel distributed clusters.

• And if such a person comes from a background of writing embedded software (i.e. software that runs in a small knowable environment), then their assumptions will likely be founded on a concern for achieving realtime dataflow semantics for at least some parts of the system — requiring some, but not all, of the cores to sit there bound to particular tasks and spin-waiting if they finish their processing step early; while other cores are free to be "application cores", executing arbitrarily-long instruction sequences. To these people, "concurrency" is mostly the frustrating low-level process of handing off data between the realtime and non-realtime worlds, using lockless abstractions like shared ring buffers (see the third sketch after this list); and "parallelism" is mostly just getting the single most expensive part of the application to schedule a pool of almost-identical expensive operations onto a pool of specialized identical cores. (This is the design perspective that made the PS3's Cell architecture seem like a good idea.)
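To make the product-software point concrete, here's a minimal TypeScript sketch of the failure mode: a wait pattern that would eventually make progress under a pre-emptive multi-threaded runtime, but livelocks under a cooperative single-threaded scheduler — plus the yielding version that survives the "assume you'll end up on one core" rule.

  // Livelocks under a cooperative single-threaded scheduler: the spin
  // loop never yields, so the timer that would set `done` can never run.
  let done = false;
  setTimeout(() => { done = true; }, 100);
  // while (!done) {}  // spin-wait: never exits when pinned to one thread

  // The safe version: yield control while waiting.
  async function waitForDone(): Promise<void> {
    while (!done) {
      await new Promise((r) => setTimeout(r, 10)); // lets the timer run
    }
  }

  waitForDone().then(() => console.log("made progress anyway"));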
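And for the SaaS perspective, a minimal sketch (with an in-memory Map standing in for real state) of write-linearization through an actor's inbox: the actor is deliberately concurrent-but-not-parallel, so conflicting writes are resolved by arrival order rather than by locks.

  // An actor with a linear inbox: many senders, one sequential consumer.
  type Write = { key: string; value: string };

  class Actor {
    private inbox: Write[] = [];
    private running = false;
    private state = new Map<string, string>();

    send(msg: Write): void {
      this.inbox.push(msg);
      if (!this.running) void this.drain();
    }

    private async drain(): Promise<void> {
      this.running = true;
      while (this.inbox.length > 0) {
        const msg = this.inbox.shift()!;
        this.state.set(msg.key, msg.value); // writes are linearized: one at a time, in arrival order
        await Promise.resolve();            // yield between messages
      }
      this.running = false;
    }
  }

  // Racing senders no longer race: the inbox imposes a single order.
  const a = new Actor();
  a.send({ key: "user:1", value: "alice" });
  a.send({ key: "user:1", value: "bob" }); // deterministically applied second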
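Finally, for the embedded perspective, a sketch of the lockless handoff: a single-producer/single-consumer ring buffer over a SharedArrayBuffer (the JS stand-in for the shared memory an RTOS would hand you; in practice the buffer would be posted to a worker thread, with push and pop called from different threads).

  // SPSC ring buffer: the producer only ever advances the write index,
  // the consumer only the read index, so no lock is needed; one slot is
  // kept empty to distinguish full from empty.
  const SLOTS = 1024;
  const sab = new SharedArrayBuffer((SLOTS + 2) * 4);
  const ring = new Int32Array(sab); // [0]=write index, [1]=read index, [2..]=data

  // called from the producer (e.g. realtime/audio) thread only
  function push(sample: number): boolean {
    const w = Atomics.load(ring, 0);
    if ((w + 1) % SLOTS === Atomics.load(ring, 1)) return false; // full: drop or retry
    Atomics.store(ring, 2 + w, sample);
    Atomics.store(ring, 0, (w + 1) % SLOTS); // publish only after the data is written
    return true;
  }

  // called from the consumer (application) thread only
  function pop(): number | undefined {
    const r = Atomics.load(ring, 1);
    if (r === Atomics.load(ring, 0)) return undefined; // empty
    const sample = Atomics.load(ring, 2 + r);
    Atomics.store(ring, 1, (r + 1) % SLOTS);
    return sample;
  }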
