Fintech has mostly determined that 1 thread can get the job done. See LMAX disru...

Retric · on June 14, 2023

> What problems exist that generate events or commands faster than 500 million per second?

AAA games, Google search, Weather simulation, etc? I mean it depends on what level of granularity you’re talking about, but many problems have a great deal going on under the hood and need to be multi threaded.

bob1029 · on June 14, 2023

I would add a qualifier of "serializable" to those events or commands. This is the crux of why fintech goes this path. Every order affects subsequent orders and you must deal with everything in the exact sequence received.

The cases you noted are great examples of things that do justify going across the PCIe bus or to another datacenter.

Retric · on June 14, 2023

That’s half of it, the other half is each of those events is extremely simple so the amount of computation is viable with a single thread.

If individual threads were dramatically slower the architecture would get unpleasant by necessity. Consider the abomination that is out of order execution on a modern CPU.

spacechild1 · on June 14, 2023

Don't forget about audio!

CyberDildonics · on June 14, 2023

I don't think audio applies here at all.

Each channel can be mixed in parallel, 44,100 samples per second is not much per channel and mixing isn't difficult.

Also most can be cached and don't need to be mixed or filtered in real time because they haven't been changed from the last play.

spacechild1 · on June 14, 2023

Sorry, I should have been more specific. I meant DAWs and computer music environments, not simple audio players. Modern DAWs are heavily multi-threaded.

taeric · on June 14, 2023

I am curious on that. 500 million events per second sounds high. Even for games. That many calculations? Sure. I take "events" to mean user generated, though. And that sounds high.

Same for searches. Difficulty there is size of search space, not searches coming in. Right?

ascar · on June 14, 2023

Google search isn't a good example. AAA games are a great example when you think about graphics. However, most of that is trivially parallelizable, thus "all you need to do" is assign vertices/pixels to different threads (in quotation marks as that's of course not trivial by itself, but a different kind of engineering problem).

However, once you get into simulations you have billions (or multiple orders of magnitude more) elements interacting with each other. When you simulate a wave every element depends on it's neighbors and the finer the granularity the more accurate your simulation (in theory at least).

taeric · on June 14, 2023

Thinking of graphics, though, i would assume most of that is in the GPU side. Simulations do make sense, but I see games like Factorio still focused on single thread first. And then look for natural parallel segments.

That is all to say that millions of events still feels like a lot. I am not shocked to know it can and does happen.

imtringued · on June 14, 2023

There are no good solutions for something like factorio. There are solutions that work but they aren't worth the trouble. My personal recommendation is that you split the world into independent chunks. A big interconnected factorio map is a nightmare scenario because there is hardly anywhere where you can neatly split things up. Just one conveyor belt and you lose. Aka parallelize disconnected subgraphs.

So the game would have to be programmed so that conveyor belts and train tracks can be placed at region boundaries and that there is a hidden buffer to teleport things between regions. Now you need an algorithm to divide your graph to both minimize the imbalance between the number of nodes in the subgraph but also to minimize the edges between subgraphs.

pshc · on June 14, 2023

Just dreaming over here, but if someone had the opportunity to rebuild a Factorio from the ground up, I bet they could design something massively scalable. Something based in cell automata, like how water flow works in Minecraft. Current cell state = f(prev cell state, neighboring cell states).

It would take some careful work to ensure that items didn't get duplicated or lost at junctions, and a back-pressure system for conveyor belt queues. Electrical signals would be transmitted at some speed limit.

crabbone · on June 14, 2023

This is a wrong view of the problem. Often times your application has to be distributed for reasons other than speed: there are only so many PCIe devices you can connect to a single CPU, there are only so many CPU sockets you can put on a single PCB and so on.

In large systems, parallel / concurrent applications are the baseline. If you have to replicate your data as its being generated into geographically separate location there's no way you can do it in a single thread...

preseinger · on June 14, 2023

> God forbid you have to wait on the GPU or network. If you have to talk to those targets, it had better be worth the trip.

few programs are CPU-bound, most programs are bottlenecked on I/O waits like these

jason_wo · on June 14, 2023

As far as I know the LMAX disrupter is a kind of queue/buffer to send data from one thread/task to another.

Typically, some of the tasks run on different cores. The LMAX disruptor is designed such that there is no huge delay due to cache coherency. It is slow to sync the cache of one core to the cache of another core when both cores write to the same address in RAM. The LMAX disruptor is designed that each memory location is (mostly) written to by at most thread/core.

How is the LMAX disrupter relevant for programs with 1 core?

bob1029 · on June 14, 2023

> How is the LMAX disrupter relevant for programs with 1 core?

It is not relevant outside the problem area of needing to communicate between threads. The #1 case I use it for is MPSC where I have something like an AspNetCore/TCP frontend and a custom database / event processor / rules engine that it needs to talk to.