Nowadays there are increasingly many memory buses. I think the top-end Epyc uses...

mumblemumble · on Dec 19, 2020

Perhaps I oversimplified, but from what I've seen, the performance benefits of multi-channel memory architectures are pretty variable, and not something a programmer should typically count on.

Partially this is due to configuration variability. A lot of laptops are still configured in single-channel mode. Even high-end laptops tend to be dual-channel setups, even if the CPU itself supports more. I'm inclined to say that high-end CPUs configured for 4 (or more) channels are rare enough that you probably shouldn't expect to get one unless you're developing in-house software and know exactly what kind of hardware you'll be running on.

And partially it's due to contention. All the memory channels in the world won't help you when memory accesses start to pile up on the same bank. In practice, that is going to happen all the time if you're not taking steps to keep it from happening, because that's the kind of unhelpful jerk that stochastic processes are.
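To make the pile-up concrete, here is a toy sketch in C. The address-to-bank mapping (`toy_bank`, 64-byte lines interleaved over 16 banks) is an assumption made up for illustration; real memory controllers use vendor-specific and often hashed mappings, so treat this as a model of the effect, not of any actual hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 64u /* assumed cache-line / interleave granularity */
#define NUM_BANKS  16u /* assumed bank count, purely illustrative */

/* Hypothetical mapping: consecutive 64-byte lines rotate through the banks. */
static unsigned toy_bank(uintptr_t addr)
{
    return (unsigned)((addr / LINE_BYTES) % NUM_BANKS);
}

/* Count how many distinct banks are hit by `count` accesses that start at
 * `base` and advance by `stride` bytes each time. */
unsigned banks_touched(uintptr_t base, size_t stride, size_t count)
{
    unsigned seen[NUM_BANKS] = {0};
    unsigned distinct = 0;

    for (size_t i = 0; i < count; i++) {
        unsigned b = toy_bank(base + i * stride);
        assert(b < NUM_BANKS); /* sanity check before indexing */
        if (!seen[b]) {
            seen[b] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

Under this toy mapping, a 64-byte stride spreads accesses over all 16 banks, while a stride of 1024 bytes (64 × 16) lands every access on the same bank, which is the kind of accidental pile-up described above.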

ncmncm · on Dec 20, 2020

Agreed. I did not mean to detract from your point. Awareness of bus architecture is so limited that vendors usually get away with cheaping out on memory channels, so we often have even fewer than our CPU module could exercise.

In practice, if you have more memory buses, they are more likely to help more processes or threads make simultaneous progress than to speed up any one process. For a single thread to make good use of more memory buses usually requires memory pre-fetch intrinsics and memory sequestration, and even then it is hard to keep more than two or three buses usefully engaged for that thread.
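For anyone who hasn't used pre-fetch intrinsics, a minimal sketch in C, assuming GCC or Clang (which provide `__builtin_prefetch`). The function name `gather_sum` and the `PREFETCH_DISTANCE` value are hypothetical; a useful distance depends on the hardware and has to be measured. The idea is just to request a future, hard-to-predict load early so that several memory requests can be in flight at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning knob: how many iterations ahead to prefetch. */
#define PREFETCH_DISTANCE 8

/* Sum data[idx[0]] + data[idx[1]] + ... + data[idx[n-1]].
 * The indirect access pattern is hard for a hardware prefetcher to guess,
 * so we issue an explicit hint for an element we will need soon. */
uint64_t gather_sum(const uint64_t *data, const size_t *idx, size_t n)
{
    assert((data != NULL && idx != NULL) || n == 0);

    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Args: address, rw = 0 (read), locality = 0 (little reuse). */
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 0);
        }
        sum += data[idx[i]];
    }
    return sum;
}
```

The prefetch is only a hint: the result is the same with or without it, and whether it helps at all is something to benchmark on the target machine.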

So, your point stands.