Peredvizhnikov Engine: Lock-free game engine written in C++20 (github.com/eduard-permyakov)
369 points by gjvc on Sept 10, 2023 | 164 comments



I have an Actor framework which uses a vanilla std::deque for method pointers, and to add messages to the queue the locking technique is a Benaphore (the original Futex, which uses an atomic and a locking primitive, with the twist that my locking primitive is a spinlock/mutex combo chosen by retry count). Nothing special. Benchmarks show that the message push function very rarely blocks, and the chance of an OS context switch is also very low, so even though we occasionally get a locked thread swapping out, it is so infrequent that it doesn't justify the cost of a lock-free algorithm.
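
For reference, a bare-bones Benaphore looks roughly like this (a sketch, not my actual code; mine spins for a while before blocking, and std::binary_semaphore stands in here for the OS locking primitive):

    #include <atomic>
    #include <semaphore>  // C++20

    // Minimal Benaphore sketch: the atomic counter keeps uncontended
    // lock/unlock entirely in user space; the semaphore (the "locking
    // primitive") is only touched when two threads actually collide.
    class Benaphore {
        std::atomic<int> count_{0};
        std::binary_semaphore sem_{0};   // starts unsignalled
    public:
        void lock() {
            // Uncontended path: 0 -> 1, no kernel involvement at all.
            if (count_.fetch_add(1, std::memory_order_acquire) > 0)
                sem_.acquire();          // contended: wait for the holder
        }
        void unlock() {
            // If anyone queued up behind us, wake exactly one waiter.
            if (count_.fetch_sub(1, std::memory_order_release) > 1)
                sem_.release();
        }
    };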

In a nutshell, non-lock-free queues are much faster than lock-free queues; however, you must be prepared to accept the very infrequent longer delay due to a context switch (where the lock is not available to anyone). My benchmarks show that the price is acceptable, since so much more performance is available with non-lock-free queues: 10 million messages per second per worker thread can be queued on modern hardware.


> the original Futex

The "Benaphore" was a pretty old idea, but the Futex isn't just "Oh it's a Benaphore but Linux" the essential trick is that you don't actually need a kernel object - your "locking primitive" at all, and that idea is where this goes from "Yeah, everybody knows that" to OK, our OS should add this feature ASAP.

Instead of an OS synchronisation object which is used to handle conflicts, with the futex design the OS carries a list of address -> thread mappings. If a thread T is asleep on a futex at address X, the address X goes in the list pointing to thread T. When the OS is asked to wake the X futex, it walks the list and wakes T.
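
A minimal sketch of that idea (Linux-only, and simplified: it always issues a wake syscall on unlock, where real implementations track a "contended" state to skip it):

    #include <atomic>
    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    // The only kernel-side state is the address -> sleeping-threads mapping.
    class FutexLock {
        std::atomic<int> state_{0};   // 0 = unlocked, 1 = locked

        static long futex(std::atomic<int>* addr, int op, int val) {
            return syscall(SYS_futex, addr, op, val, nullptr, nullptr, 0);
        }
    public:
        void lock() {
            int expected = 0;
            while (!state_.compare_exchange_strong(expected, 1,
                                                   std::memory_order_acquire)) {
                // Sleep only if the word still reads 1; if it changed in the
                // meantime the kernel returns immediately and we retry.
                futex(&state_, FUTEX_WAIT_PRIVATE, 1);
                expected = 0;
            }
        }
        void unlock() {
            state_.store(0, std::memory_order_release);
            futex(&state_, FUTEX_WAKE_PRIVATE, 1);   // wake one sleeper, if any
        }
    };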

The giveaway is the limits. For something like Benaphores you're constrained, because these are a costly OS-wide resource; I think BeOS only allowed 65536 per machine or something. But a Futex is just memory, so there's no reason to have any limit at all.


With a post like this, however interesting, you really should link the code. Not only will it prove your claim, but others like me will find the idea intriguing and immediately want to see how it's done.



It really depends on the specifics. If there's a lot of contention then performance will drop off a cliff. Even atomic instructions can become a bottleneck ( https://stackoverflow.com/q/2538070 ).

I think what you're observing, i.e. that in many cases you can just use a lock and not worry about it (or your variation of that), is true. But there are certain applications/situations where you can do better. Having the consumer pull everything out of the queue with one locking operation, and being careful about how you signal the consumer from the producer(s) (assuming there's a signal), can also make the queue more efficient and higher-throughput, e.g. you shouldn't signal for every item you put in the queue, only when it becomes non-empty.
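
For example, the producer side of that signalling rule could look like this (just a sketch; the consumer is assumed to re-check the queue with a predicate wait before blocking):

    #include <condition_variable>
    #include <mutex>
    #include <utility>
    #include <vector>

    template <class T>
    class Queue {
        std::mutex m_;
        std::condition_variable cv_;
        std::vector<T> items_;
    public:
        void push(T v) {
            bool was_empty;
            {
                std::lock_guard<std::mutex> lk(m_);
                was_empty = items_.empty();
                items_.push_back(std::move(v));
            }
            // Waking the consumer is the expensive part, so only signal on
            // the empty -> non-empty transition, not once per item. This is
            // safe as long as the consumer checks the queue before sleeping,
            // e.g. cv_.wait(lk, [&]{ return !items_.empty(); }).
            if (was_empty)
                cv_.notify_one();
        }
    };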


Aren't lock free data structures more about reducing the impact of contention than throughput under low contention?


Under a lock free algorithm, we promise that across our entire system, eventually some progress is made. I can't necessarily point at any specific thread of execution and say "This thread makes progress" but I can say that forward progress is made for the system as a whole.

Under a wait free algorithm, we promise that every individual thread eventually makes progress.

Suppose I have one really dumb thread which just goes to sleep for 10 seconds in a loop, as well as other threads which do some actual work. If we're a lock free algorithm, it is OK if the 10-second-sleep thread is always the one chosen to make progress; its "progress" consists of sleeping for 10 seconds, too bad, try again next time.

With a wait free algorithm, the threads which actually do useful work will also eventually make at least some progress, despite the 10 second sleeper.

Generally we can't say for sure that we e.g. "reduce impact of contention" only that we definitely make some forward progress, on at least one thread eventually.
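
The textbook way to see the difference is a trivial counter (illustrative only, nothing to do with the engine):

    #include <atomic>

    // Lock-free: some thread's CAS always succeeds, so the system as a whole
    // advances, but any particular thread can keep losing the race.
    void lock_free_increment(std::atomic<int>& counter) {
        int cur = counter.load(std::memory_order_relaxed);
        while (!counter.compare_exchange_weak(cur, cur + 1,
                                              std::memory_order_relaxed)) {
            // cur was refreshed with the current value; retry.
        }
    }

    // Wait-free: every caller finishes in a bounded number of steps,
    // no matter what the other threads are doing.
    void wait_free_increment(std::atomic<int>& counter) {
        counter.fetch_add(1, std::memory_order_relaxed);
    }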


Lock free is kind of an overloaded term that can mean a variety of things depending on what the user is thinking. Usually the goal is being able to make forward progress even if one thread is context switched out by the OS scheduler, which might be called wait-free or non-blocking. Using mutexes (unless you can prevent OS scheduling, interrupts, etc) makes this property impossible to achieve. In general, MPSC queues are super fast and there's no real reason to prefer a locked queue.


> In general, MPSC queues are super fast and there's no real reason to prefer a locked queue.

There's one significant advantage that a locked vector or deque has over MPSC/MPMC queues: the consumer can dequeue all messages in a single operation by locking the vector, swapping it with an empty vector (typically that's just 3 words), and unlocking it again. That's such a simple operation that it will typically be as fast as or even faster than a single pop-one-message operation in an MPSC/MPMC queue. Similarly, if the vector is empty, a producer can push any number of messages in a single, constant-time operation.
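
A sketch of that drain (illustrative only; the swap under the lock is just the two vectors exchanging their internal pointers):

    #include <mutex>
    #include <utility>
    #include <vector>

    template <class T>
    class LockedVectorQueue {
        std::mutex m_;
        std::vector<T> items_;
    public:
        void push(T v) {
            std::lock_guard<std::mutex> lk(m_);
            items_.push_back(std::move(v));
        }
        // Dequeue everything at once; returns an empty vector if nothing was
        // queued. Under the lock this is only a swap of the two vectors'
        // internals (roughly three words each way).
        std::vector<T> drain() {
            std::vector<T> out;
            std::lock_guard<std::mutex> lk(m_);
            out.swap(items_);
            return out;
        }
    };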


All true, although I would quibble with:

> That's such a simple operation that it will typically be as fast as or even faster than a single pop-one-message operation in an MPSC/MPMC queue.

Not if the lock is blocked because one of the writers context switched out! The typical case is good, but the worst case is pretty bad.


Depends what you mean by “the impact of contention”. It is possible, if you build a lock-free system, to end up with threads that keep churning and never make forward progress.


> My benchmarks show that the price is acceptable

Well, except to people who have a hard requirement that there never be this unpredictable, infrequent longer delay you mention.


It's probably fair to claim that zero* game developers have this hard requirement. At best they have a strong desire which might be partially backed up by benchmarks and squinting hard enough.

*: ignoring deliberately esoteric cases, like Doom on a hard real-time system.


It's a requirement for any game that needs a reliable frame rate. Unpredictable delays during frame rendering cause jank.

(This is not to say the OP has any such issues in its context.)


This is usually called "soft realtime" -- you want it to be fast but it's not wrong if it misses deadlines. It doesn't violate correctness of the system to miss a deadline. In a "hard realtime" system, it is a fatal error if the system misses a deadline because correctness is violated. I presume this difference is what OP is talking about. Games are never hard realtime; missing frame deadlines reduces user experience but it doesn't break the game. Hard realtime is things like industrial motion control, automotive and aviation electronics, pacemakers, etc.


I would say that an exception could be made for VR games where any frame rate 'jank' causes an uncomfortable experience and can lead to increased simulator sickness.


VR games almost all use Unity and Unreal. They drop frames left and right. VR platforms use extremely complex time warp algorithms to hide the jank.

So you're not wrong. VR games are much more susceptible to dropped frames causing problems. But it both happens and is hidden remarkably well.


> Games are never hard realtime

Depends how high your standards are! One of the things that makes playing old games on the original hardware really satisfying is how consistent they are.


Those old games could be so consistent because the software and hardware were both so incredibly simple compared to today.

Today, a PC game has to work on a large range of hardware that has come out over the past 5+ years. And there are GPU features that are only available on certain cards, like hardware ray tracing and things like DLSS and FSR for upscaling.

And the game engines are incredibly more complex today to handle modern expectations, with dynamic lighting and shadows, huge maps, etc.

It doesn’t matter what your standards are. Hard realtime just isn’t realistic or even possible any more, except maybe in a game that would be considered truly primitive by today’s standards.


I am sure I have memories of how hard it was to support a range of hardware when I had to write specifically for each piece. When I had to write separately for a Soundblaster card and an Adlib and a Disney SoundSource and a Roland and a Gravis.

And that was just the sound cards. Writing for different hardware became so much easier when OpenGL and DirectX came into being. Suddenly I just had to write to these APIs.

I think I'm disagreeing with you. Supporting multiple hardware configurations way back when was so much harder than doing it today.


Yeah, and Tandy graphics were different from EGA or VGA, just enough that you needed to write different display code for them. And you had to get the user to tell you which ports/IRQ numbers half of their hardware was configured at.

We have way better hardware abstractions in the OS today, which I would agree makes modern development easier overall.


Oh yes, that brings it all back. Having the user tell me the IRQ and also DMAs and... I'm sure there was more. Sometimes the user would end up having to open up the PC and reseat physical jumpers on hardware cards to have the cards use values that were both offered by the software and not clashing with anything else.

Different world. So much easier now.


Oh gosh, the many different variants of SVGA that existed back in the days...

/me shivers at the recollection


> Games are never hard realtime; missing frame deadlines reduces user experience but it doesn't break the game.

It depends on whether the game is multi-player and, if so, how it keeps the different players in sync with each other.

Games that rely on deterministic gameplay can desync (two players don't see the same world state) and abort if one player's simulation drops a frame while the other doesn't.


If you have multiplayer like this, you absolutely do not have a hard requirement on frame latency.

You don't control network latency spikes. Latency tolerance is a hard requirement, or you will have constant problems and likely be unplayable. Or custom networking and hardware stacks, which is deep into esoteric territory.


That's still just a degradation of user experience and not a fatal fault. Indeed, support for running in a desynced mode is written in because they know deadlines can be missed. In hard realtime, deadlines can't be missed. There's no recovery; it's a critical fault and you have to halt or fail over to a backup.


> That's still just a degradation of user experience and not a fatal fault.

I don't know how you'd describe a game spontaneously aborting as not a "fatal fault". Yes, it's not turning off someone's pacemaker, but within the scope of what a game is able to do, kicking the player back to the matchmaking screen in the middle of a game is about as fatal as it gets.


That's only going to happen if the client has such a massive network latency spike that the server thinks it's disconnected. At least half a second, possibly more. You're never going to get that kind of delay from synchronizing threads, unless the process totally deadlocks.

EDIT: Well, there is one situation where you might get delays like that: if the computer is so woefully inadequate to run the game that it consistently misses frame deadlines by several hundred milliseconds. Of course, in such a situation a different concurrent algorithm wouldn't have solved anything anyway.


Literally any multiplayer game will have a "fatal fault" if its connection to the other players and/or server is interrupted for long enough, but it seems disingenuous to describe a system tolerant of delays variable across several orders of magnitude, up to hundreds of milliseconds or even full seconds, as "hard real time" in the sense in which the term is generally understood.


What I don’t understand here is how missing a frame deadline in this situation (<1/60 of a second or 16ms) is intolerable when typical network latency varies constantly by much more than this. It seems to me that if the latency requirements were this strict you could only support LAN multiplayer.


The only scenario I can imagine is some interactive game installation at Disney/etc. But there'd still be no actual benefit over an architecture more lenient to timing deviations.


> Games that rely on deterministic gameplay can desync (two players don't see the same world state) and abort if one player's simulation drops a frame while the other doesn't.

Right. This is why lockstep deterministic (LSD) games are bound to the SLOWEST player's machine.

No LSD game in existence crashes the game if one player's machine falls behind. Instead you either pause the simulation for all players until they catch up or you slow down time itself.

Source: shipped RTS games that were lockstep deterministic.


If you want to support players with ping over 16ms, there's no way you're gonna be synchronizing input from one frame to the output of the same frame for another player; there'll necessarily be some latency hiding, at which point any freeze should be coverable via just treating it as a temporarily-high ping and thus temporarily more latency hiding.


Lockstep simulation can't be hard tied to frame rate. Two computers will never have precisely the same frame timing -- even if they are running identical video modes, clocks are only accurate to within a tolerance and will diverge. The simulation has to allow for simulation ticks running independently from rendering frames.
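
The usual fixed-timestep/accumulator loop does exactly that decoupling (a generic sketch; the tick and render callables are placeholders supplied by the game):

    #include <atomic>
    #include <chrono>

    template <class Tick, class Render>
    void run_game_loop(Tick simulate_tick, Render render,
                       const std::atomic<bool>& running) {
        using clock = std::chrono::steady_clock;
        constexpr auto dt = std::chrono::microseconds(16667);  // 60 sim ticks/s
        auto accumulator = clock::duration::zero();
        auto previous = clock::now();

        while (running.load(std::memory_order_relaxed)) {
            auto now = clock::now();
            accumulator += now - previous;
            previous = now;

            // Run zero or more fixed-size simulation ticks, however long the
            // last render frame happened to take.
            while (accumulator >= dt) {
                simulate_tick(dt);   // deterministic: identical dt every tick
                accumulator -= dt;
            }
            render();                // free-running relative to the tick rate
        }
    }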


Since you can't control the latency of individual packets, let alone guarantee that the hardware timing is identical, this isn't practical as a requirement.

Deterministic gameplay means that after the same x frames with the same external inputs you will end up with the same state y on different hardware. Therefore we only need to make sure the clients stay within a reasonable margin of each other. We can do this by waiting for all players' inputs, i.e. moving at the slowest possible speed, which is the lock-step method. Or we can change our measured frame time, so that each tick (which still advances the same amount of sim time) is seen as faster or slower. That way we can track, from the inputs they send, how many frames the other clients have completed, and slow ourselves down or speed ourselves up to maintain a decent margin. This is the scheme used in rollback networking, since the naïve implementation where this isn't done forces all rollbacks onto the faster machine.
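
Very roughly, the pacing part looks like this (made-up names; real implementations smooth this out much more carefully):

    #include <algorithm>
    #include <cstdint>

    // Stretch or shrink the local tick interval based on how far ahead we
    // are of the slowest remote client, instead of hard-pausing as classic
    // lock-step does. Every tick still advances the same amount of sim time.
    double local_tick_scale(int64_t local_tick, int64_t slowest_remote_tick,
                            int64_t target_lead_ticks) {
        const int64_t lead = local_tick - slowest_remote_tick;
        if (lead <= target_lead_ticks)
            return 1.0;                  // at or behind the margin: full speed
        // The further past the margin we run, the longer we make each local
        // tick, clamped so the faster machine never freezes outright.
        const double excess = double(lead - target_lead_ticks);
        return std::clamp(1.0 + 0.05 * excess, 1.0, 1.5);
    }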


If your program state tracking/management is so tightly coupled to the renderer that missing a frame can desync, then you have much bigger architectural problems anyway.

Network latency is a thing and your state management subsystem absolutely has to be fault tolerant.


"Unpredictable delays during frame rendering cause jank"

This is usually not a serious problem, and the presentation threads will be higher priority than the simulation in order to minimize visual 'jank'.

Lockfree game code ain't fixing jank from the OS.


>Lockfree game code ain't fixing jank from the OS.

Latency is like a chain. Every link matters.

A game engine programmer can't do a thing about the OS, but can still keep latency bounded in the code under their control.


These people are probably not game developers because frame dips are all over the place on console and PC gaming.


Well do we have a benchmark where the price of locking is unacceptable? I think the linked project is a fun theoretical exercise, but I personally don’t accept the assertion that locking is unacceptable without seeing evidence.


I agree; lock-free is cool and all, but oftentimes it is more complex, and not every case justifies the increased complexity.


The problem with vanilla std::deque is that the only acceptable implementation is in clang libc++. On MSVC it's basically an overcomplicated linked list.


Thank you for sharing this.

Could you tell me if your 10 million figure includes batching, or is it a loop that tries to enqueue as many items as possible?


I gather from your comment that lock-free is actually the best choice for video games, where consistent frame timing should be paramount.


When you protect an std::deque with a mutex you would need at least two atomic operations: to lock the queue before pushing, and to unlock the queue after pushing. Because you're using an std::deque it may need to allocate memory during a push, which would happen under the lock, which makes it more likely for a thread to suspend with the lock taken. While the queue is locked other threads will have to wait, possibly even suspend on a futex, and then the unlocking thread would have to wake another thread up.

The most expensive part of any mutex/futex is not locking, it's waking other threads up when the lock is contended. I'm actually surprised you only get 10 million messages per second; is that for a contended or an uncontended case? I would expect more, but it probably depends on the hardware a lot; these numbers are hard to compare.

My actor framework currently uses a lockfree intrusive mailbox [1], which consists of exactly two atomic exchange operations, so pushing a node is probably cheaper than with a mutex. But the nicest part about it is how I found a way to make it "edge triggered". A currently unowned (empty) queue is locked by the first push (almost for free: compared to a classic intrusive MPSC queue [2], the second part of push uses an exchange instead of a store), which may start dequeueing nodes or schedule it to an executor. The mailbox will stay locked until it is drained completely, after which it is guaranteed that a concurrent (or some future) push will lock it. This enables very efficient wakeups (or even eliding them completely when performing symmetric transfer between actors).
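
For context, the push in the classic intrusive MPSC queue from [2] looks roughly like this (a sketch of Vyukov's published design, not my mailbox code; my variant replaces the final store with an exchange so a push into a drained mailbox can detect that it must take ownership):

    #include <atomic>

    struct Node {
        std::atomic<Node*> next{nullptr};
    };

    class IntrusiveMpscQueue {
        Node stub_;                        // dummy node so the list is never empty
        std::atomic<Node*> head_{&stub_};  // producer end (most recently pushed)
        Node* tail_ = &stub_;              // consumer end (pop side omitted here)
    public:
        // Multi-producer push: two atomic operations, no CAS loop, no lock.
        void push(Node* n) {
            n->next.store(nullptr, std::memory_order_relaxed);
            Node* prev = head_.exchange(n, std::memory_order_acq_rel);
            // Link the old head to the new node. Swapping this store for an
            // exchange is the tweak described above that makes the mailbox
            // edge-triggered.
            prev->next.store(n, std::memory_order_release);
        }
    };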

I actually get ~10 million requests/s in a single-threaded uncontended case (that's at least one allocation per request and two actor context switches: a push into the target mailbox, and a push into the requester mailbox on the way back, plus a couple of steady_clock::now() calls when measuring latency of each request and checking for soft preemption during context switches). Even when heavily contended (thousands of actors call the same actor from multiple threads) I still get ~3 million requests/s. These numbers may vary depending on hardware though, so like I said it's hard to compare.

In conclusion, it very much depends on how lockfree queues are actually used and how they are implemented; they can be faster and more scalable than a mutex (a mutex is a lockfree data structure underneath anyway).

I'd agree with you that mutexes are better when protecting complex logic or data structures, however, because using lockfree interactions to make things "scalable" often makes the base performance so low that you'd maybe need thousands of cores to justify the resulting overhead.

[1] https://github.com/snaury/coroactors/blob/a599cc061d754eefea...

[2] https://www.1024cores.net/home/lock-free-algorithms/queues/i...


how do you deal with msvc's `std::deque`? (or maybe you're not using literal std::deque)


What is special about msvc's `std::deque`?


The memory chunks it allocates to store entries are so small that it ends up allocating one chunk for every element for almost every type. The result is a lot of allocation overhead, pointer indirection and various other things that are bad for performance.


Its block size is something stupid like 16 bytes, so it effectively becomes a slower linked list.


See https://devblogs.microsoft.com/oldnewthing/20230810-00/?p=10... which includes confirmation & a brief commentary of the tiny block size.


Whatever the issue is, they support Clang on Linux, i.e. no MSVC standard library.


The lockfree scheduler certainly looks interesting (especially the linearizability of event broadcasts), but I was surprised to see benchmark results in the paper with a peak of 43500 messages/s for 12 pairs of actors (and 12 cores?), with a graph showing ~5000 messages/s for a single core, which is surprisingly low for that kind of benchmark. Unfortunately the engine requires Linux and, more importantly, x86 (due to asm instructions), so I wasn't able to replicate them yet, but I would expect at least ~1 million requests/s per pair of actors (e.g. with Erlang); otherwise the overhead is prohibitively high.

The engine also focuses on message passing, but from experience it's very difficult to work with (state machines are hard, especially when working with multiple downstream actors), and at the core actors are more about isolating state without locks than about message passing. Swift actors did it right in my opinion: method calls instead of messages are not only easier to reason about, they give the runtime additional hints about when context may switch without involving a scheduler at all (any shared state is slow and inhibits scalability).

I actually wrote a header-only library recently (search for "coroactors" if you're interested) that implements something similar to Swift actors with C++20 coroutines, and I thought ~10 million requests/s (when uncontended) or ~1-3 million/s (when contended and depending on the scheduler) was way too high an overhead, especially when compared to normal method calls with shared state protected by mutexes. Coroutines tend to go viral (with more and more functions becoming "async" coroutines), and any non-trivial code base would have a lot of coroutine calls (or messages passed), and that overhead needs to be as low as possible, otherwise you'd spend more time task switching than doing useful work.


It says that it's actor-based, and sending messages to an actor is equivalent to running the actor function under a mutex, and thus reduces parallelism (in other words, you have N threads sending messages, but only 1 thread running the actor code, so it's just as serialized as a mutex), so while it may technically be "fully lock-free", using actors means that there is no parallelization improvement.


Not quite; with mutex based actors an interruption of the actor thread would cause that mutex (and thus that actor code) to remain locked until the original thread resumes. No additional parallelism will allow that actor code to be "restarted" or "resumed" as the mutex owned by the interrupted thread is locked.

This implementation relies heavily on restartable functions in order to allow another parallel thread to pick up and continue the work of an already in progress but otherwise interrupted actor. See page three of the (excellent) design document: https://github.com/eduard-permyakov/peredvizhnikov-engine/bl...

Thus while it might not strictly be "more parallel" (same number of actors), it does seem to be able to make better use of more parallelism for completing the same set of work.


> sending messages to an actor is equivalent to running the actor function under a mutex

Where does it say that? In my understanding actor model means message-passing with asynchronous execution. So quite the contrary, actor model allows N threads executing in parallel given N actors.


Whenever I read any press about actors or goroutines, they say the same thing about preventing races (and the need for explicit locking) through share-nothing concurrency.

It's easy to scatter the computations, but they never go on to explain how to gather them back up.

You're going to render one frame to the screen. Did multiple actors have a write-handle into the frame buffer? What about collisions, does each entity collide with itself, without sharing position information with other actors?


One way or another, there has to be a master thread that gathers things back. So instead of a mutex, you have a supervisor.


How to gather them up? I guess messages again. How easy that is depends on the situation, and yes, I would assume typically not very.


You're right that there is no parallelization improvement, but it does not reduce parallelism either; it's just a different (imo easier) way to think about concurrency.

Because it is easier for me to think about, it is easier for me to see where things will contend on the same resource, which actually helps me improve potential parallelism. Once you recognize a particular opportunity where SMP can speed things up, you can stray away from the actor model a bit and have multiple threads receiving on your message queue, or, if that isn't possible, you can just add more actors and split up the data better.


Does anyone have experience debugging/profiling highly contended critical sections of STM vs a more traditional mutex implementation? At the end of the day something has to mediate concurrent access to shared memory (there are no free lunches), and mutexes are so well optimized, profiled, and understood. I'm unclear if the same applies to STM, where a transaction may need to be retried an unbounded(?!) number of times.


Yes. The mediator in this case is the scheduler: the one that actually calls the asynchronous block, potentially retrying it if it fails. In the OP's code there are atomic blocks, sequential execution of blocks, stateful blocks, etc. for ensuring singular access at a time.

The meat here is scheduler.cpp. It uses C++20 coroutines.

This is like async/await in other languages. The scheduler has a queue of work (coroutines) and a pool of threads (N > 0) to execute those on.

In this case, messages are passed between work that contains the data. No locks are required at the expense of memory footprint.


Thanks!

So the scheduler serializes execution of critical sections?

> In this case, messages are passed between work that contains the data. No locks are required at the expense of memory footprint.

Are messages copied or moved? If moved is there compile time checking for ownership or runtime debugging tools?



Actually, for starvation-free STMs, transactions will be retried a _bounded_ number of times. One example is 2PLSF, but there are several others https://zenodo.org/record/7886718


Interesting, thanks! It’s been years since I’ve read up on STM.



I didn't see mention of how hard it is to debug such an engine.


I don't have time to read through the impl, but the readme makes it sound like a classic distributed system between game threads, where patterns like retry-backoff will be common.


The paper goes into more detail without forcing you to go into the impl but it does seem to be a decent bit more advanced than that.

https://github.com/eduard-permyakov/peredvizhnikov-engine/bl...


As cool as lock-free sounds, in my opinion any amount of non-trivial atomics usage should be accompanied by a formal, ideally machine-checked, proof. Non-sequentially-consistent atomic orderings are so difficult to get right. I've repeatedly seen code that gets them wrong, and the bugs that this creates are the worst.


Where is the game demo? Also, to be considered a game engine nowadays, you need actual tools, exporters (Maya, 3DSMax, etc.), plus who knows what else (collaboration tooling, metrics, alerts).


I disagree—"game engine" does not necessarily mean "thing I can replace Unity or Unreal with".


Is there any "game engine" here though currently? Seems like 'just' a lock-free work scheduler


Not even asking for that much. Any game demo would be appreciated.


>Lock-free

    export std::mutex iolock{};
    export std::mutex errlock{};
    SDL_PollEvent
Not yet at least.


What formal verification, if any, has been done on this (e.g. with TLA+ or similar)? Lock-free/wait-free designs are notoriously bug prone.


I wonder why it is using clang modules instead of proper C++20 modules.

Except for header units, the support is already mostly there.


Any games built with this?


Yes, it’s called C++20 Game Engine Developer.


All those people who complain about project names squatting on existing words have finally been heard :-D


Peredvizhniki [0] were a group of Russian realist painters from the 19th century. Among other things, they organized "traveling" art exhibitions to promote Russian art in the provinces. The name roughly translates as "The travelers".

- [0] https://en.wikipedia.org/wiki/Peredvizhniki


Not all who wander are lost.


I tried it in a zxcvbn simulator, and it qualified as a strong password ;)


My experience is that zxcvbn lacks many non-English dictionaries; it often gives false positives for foreign languages...


[flagged]


> Unless the name is explained somewhere, I'm convinced it's complicated on purpose.

Tell me you only speak English without telling me you only speak English.


From the grammar, I would not assume GP is a native English speaker, much less monolingual.


It seemed perfectly cromulent to me.


I would love to hear what exactly about the grammar doesn't sound native.


The only thing that seems strange is “such commands as” used to describe a single command. I think a more common way of writing it would be “a command like.” But, I don’t think it is at all definitive, it could just be a native English speaker who went with a slightly odd phrasing.


I guess that my pun using "unreal" in a post about a game engine went under the radar and now I look like I can only speak English which is kinda flattering and worrying.


The Russian word can be broken into three sections. The first part is a prefix associated with a transition. The middle means movement and the bit at the end means a person that does that. So even though the group of artists with that name were translated as "the wanderers" or "the itinerants", the word literally means people who move around.


That's an awesome bit of wordplay given the intent of the engine.


Otherwise known as "cd p<tab>."


The painting used is Barge Haulers on the Volga - https://en.m.wikipedia.org/wiki/Barge_Haulers_on_the_Volga

Can someone please expand on the significance of this achievement to someone used to shooting their foot off in C++ in a predominantly single threaded manner?


Never mind that a lot of Volga boats were very single-threaded.

https://www.amusingplanet.com/2021/12/belyana-russias-giant-...


It's a logging framework! :D


Streaming logging framework


You can even see the steamships that replaced the slaves in the background.


They were not slaves, but unionized workers:

https://en.m.wikipedia.org/wiki/Burlak


AFAIU an артель artelʼ usually did not engage in collective bargaining as such, so calling it a union is not really accurate; you might compare it to a guild, but I think there's no implication of a monopoly on a particular trade either. The most accurate description I can think of is perhaps a cooperative combined with a mutual insurance fund.


Agree, I might've stretched it a bit. Your description is more accurate.


Yep, some resemblance to the Teamsters.


Fun fact - this was painted about 8 to 10 years after the abolition of serfdom in the Russian Empire. Coincidentally, it happened two years prior to the US Emancipation Proclamation. I've wondered from time to time how linked the two events were, if at all.


they were not slaves, but something like Russian Teamsters


I hope the author is not picturing developers using his engine.


As an addendum - a fair portion here should be in one way or another familiar with Ilya Repin's Reply of the Zaporozhian Cossacks (aka Zaporozhian Cossacks are Writing a Letter to the Turkish Sultan) https://en.m.wikipedia.org/wiki/Reply_of_the_Zaporozhian_Cos...


Barge hauler 4th from the back appears to be checking his messages.


"Tut tut, those poor Canadian renters..."


It’s quite the stretch to call this a game engine, rather than a tech demo for clever locking strategies.


This. There’s no demonstration of any game made with this “game engine”. The benchmarks are some matrix multiplication and unimpressive message passing.

There might be some cool data container ideas or primitives. Those could have useful applications. But it isn’t really a “game engine”. Nor does it seem like an interesting way to build one.


Indeed. The project has no renderer or game physics code it seems? If so, it is too premature to be a game engine.


I wouldn't even call it clever. The Actor model trivializes avoiding locks, since you're letting everything act on stale information (if the actors process concurrently) and then just iterating through their intended actions against a singleton state and resolving those conflicts with sequential code.
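
The shape of that pattern, for what it's worth (illustrative types only, nothing to do with the engine's actual code):

    #include <vector>

    struct Move  { int entity; float dx, dy; };
    struct World { /* positions, collision grid, ... */ };

    // Placeholder: the real thing would test collisions / priorities here.
    void apply_if_legal(World& /*world*/, const Move& /*m*/) {}

    // Actors decided against a stale snapshot in parallel; one thread now
    // applies the collected intents in order, so conflict resolution is
    // plain sequential code and needs no locks.
    void resolve(World& world, const std::vector<Move>& intents) {
        for (const Move& m : intents)
            apply_if_legal(world, m);
    }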


Thanks for the summary, mate.


Maybe it's not done yet.


Say you are in the middle of building a house. The foundation has been built and they are framing it up.

Is it a house?


No, because otherwise you wouldn't say "the foundation has been built". You'd say the "the house has been built".

Does a house need a foundation? Yes. Is the foundation the same as the house? No.


But you do say, “I am building a house.”


Sure, but you don't build a foundation and say "I have built a house."

They're building a game engine, but it is not an actual engine yet. Just one core component.


If what you are building isn't an actual house, or an actual game engine, then how could anybody claim they had built that particular house, or that particular game engine?

If what you were working on (for the entire time you were working on it) was a completely different thing from a house or from a game engine, how could you claim to have built a house or a game engine? Putatively, what you were working on was an entirely different thing from the result.


No


No, and if you accept it as a house then this is a recipe for developing a bad habit of leaving a trail of half finished projects in your wake.


Can you point at a house and say “I designed this house?” If the house didn’t have some kind of existence while you were even designing it, how could it be that you designed that, particular house?


This model isn't novel; there have been experimental engines using this technique for years now. They're just not practical at scale.


> At the moment, the only supported platform is Linux.

Regardless of your feelings on the status quo, there is one thing you must do when building a game engine if you want it to succeed: support Windows.


I'd love to see the Steam Deck "console" change this status quo.


Most of the Steam Deck games are just running Windows games through Proton, Wine and other compatibility layers.

While that works amazingly well, I tend to prefer games with native Linux and SteamOS builds, even though they're rare.


win32 is the stable ABI for SteamOS, same as any other Linux distro


bullshit, consoles/mobile are bigger markets than PC/Windows

PC is just less than 1/3 of the whole picture

https://www.data.ai/en/insights/mobile-gaming/2022-gaming-sp...


I feel like it's a mistake to consider mobile games as being part of the same market as PC/console games.

Sure they're both "games" but I don't know that they're competing for the same set of users - either people play one or the other, or the people that do crossover in both markets are probably playing PC/console games at home and mobile games on the bus or train.


Got bad news for you about what the console SDKs run on

(Also, the consoles don't run Linux)


From what I know:

XBOX runs a variant of Windows.

PS3/4/5 OSes have all been based on FreeBSD.

Apparently Nintendo Switch runs a proprietary kernel, which is interesting (https://en.wikipedia.org/wiki/Nintendo_Switch_system_softwar...).


A bit more than that since Xbox runs Windows too.


July console sales:

PS5: 1.2m

Switch: 950k

Xbox: 370k

Xbox accounts for just 17% of total console sales in July

Both Switch and PS5 are FreeBSD based

If we count the whole period of the current gen of each vendors, it only accounts for 13%, it's not big

https://www.vgchartz.com/


And what about actual game sales and revenue through things like GamePass. It doesn't help your game if a million people bought a Nintendo Switch if 80% of them only buy Pokémon, Mario Kart and Zelda.


Actually it's only the PS5 that's FreeBSD-based. Switch runs a completely proprietary Nintendo OS that borrows a lot from Android.


A mix of the two

"partially Unix-like via certain components which are based on FreeBSD and Android"

https://en.wikipedia.org/wiki/Nintendo_Switch_system_softwar...


from the same article

"it is based on a proprietary microkernel"

"....Despite popular misconceptions to the contrary, Horizon (the Switch's OS) is not largely derived from FreeBSD code, nor from Android..."


It borrows very little from Android - I think it mostly draws some parts from stagefright.


Also borrows the whole of its compositor "nvnflinger" from Android's "SurfaceFlinger".

I'd call that a pretty significant piece.


A special flavour of Windows.


He's probably referring to the desktop/laptop market, in which case Windows controls something like 90%.


Title is: "Peredvizhnikov Engine is a fully lock-free game engine written in C++20 "

A Game Engine targets various platforms

A "video-game" is not something exclusive to desktop/laptop windows market


Right, but a toolchain that targets anything other than a desktop is not straightforward to find or set up.

Toy game engines like this are in the majority of cases used on desktop.


Rude way of putting it, but yeah, mobile dwarfs everything.


Pretty sure they already succeeded. Wasting effort supporting an advertising platform like windows is irrelevant to whether or not they successfully made a lock-free game engine.


It’s licensed under GPL3, where the developer suggests individually negotiating a license for commercial projects. I am not sure this is a good approach given the early stage of this project. Video game development is already commercially and technically risky enough.

I cannot see game developers lining up to even try out this unproven technology when they have no sense of what the eventual fees will be.


quoting https://github.com/eduard-permyakov/peredvizhnikov-engine

The source code of Peredvizhnikov Engine is freely available under the GPLv3 license. However, I may grant permission to use parts or all of the code under a different license on a case-by-case basis. Please inquire by e-mail.


Actors is one of the worst programming models possible. It's hard to observe actor-based systems and debug them. The protocol complexity (and complete protocol informality) usually brings much more trouble than any kind of locking/synchronization.


I’ve had good success with actor based models. When you say “protocol complexity”, I don’t understand what you mean.

Observation requires good logging, but this isn’t out of line for any complex system. Debugging (as in actual breakpoints in an IDE or post-mortem analysis) can be facilitated with stack tracking (this same problem occurs with async await patterns and is solved in a similar way).

The advantages and disadvantages exist, but I think it’s an extremely effective programming model for many use cases. Formal state-machine programming, that’s the worst model, unless you need it.


For game dev or in general? I hate using actors for general backend stuff, but for game dev maybe it makes more sense.


Actors tend to lead to bloated incomprehensible code. Even when you do want game objects to look like independent message passing actors. Faking it is always easier.

Actors are self defeating because they turn everything into a distributed system.


That's what I was thinking. I'm trying not to write Actors off because I'm not an expert on game dev and can imagine the possibility of an unusually large-scaled game. But yeah, if you're writing a typical game, I imagine you're best off doing it the typical way that's been optimized for.


[flagged]


Forgive me if it’s common knowledge, but what’s wrong with SDL2 in this case? Broadly speaking, I feel as though there is largely positive sentiment around the SDL project, no?


I have nothing against the SDL project. I have everything against the DirectMedia-style API of 1997. I get that people are actively developing with it and that the roadmap towards 3.0 is getting close, but my experience is that it's designed for legacy games, legacy rendering styles, legacy ABIs. No one is shipping games for Dreamcast or PS2.

2/3rds of the library is dead code when working with Vulkan.


TIL Unreal 5 based games were "legacy"


They really only use it for Linux windowing and audio, because there are so many ways to skin a distro.


They definitely use it for more than that, because I'm sure I've seen it ship with Windows games as well.

Also there's exactly 2 ways to get a window on Linux, and those 2 ways are common to pretty much every modern distro. Same as audio.


What is a better alternative?


For what? Working with Vulkan? Or a windowing library? Personally I use a mix of glfw and native windowing on mobile, which is surprisingly simple for a gfx context window and this gets me to triangle.

SDL2 is about the same amount of boilerplate code only in SDL_Thing form.

Now, if I was writing a game that needed to be shipped on everything possible and I can’t afford Unity or Unreal, SDL is a viable choice. MonoGame, mine, and others have used it, it’s battle tested.

I just can’t look at SDL code anymore and say “this is the cleanest api” for anything outside of C99.


As with most SDL usage, it merely provides a rendering context and an input abstraction.

That's almost nothing; it's boilerplate you don't want to write. Which particular lib you grab that boilerplate from, SDL, GLFW, fucking GLEW, whatever, it doesn't matter. SDL is a widely accepted library for doing so and it's fine. If you want more complete input handling you'll be using the platform APIs directly.


And surfaces, and audio… yeah I know. For tinkering and learning, sure. If you’re going to build an engine, pull that boilerplate code within.


Many, arguably most, custom game engines never internalize that boilerplate. Most engine codebases are much smaller and more specialized than the AAA players, and gain nothing from splitting their build into platform-specific layers when they could just link SDL.


[flagged]


Breaking the site guidelines like this will eventually get your main account banned as well, so please don't.

HN is no place for nationalistic flamewar, including when there's an actual war going on—in fact, especially not then:

"Comments should get more thoughtful and substantive, not less, as a topic gets more divisive."

https://news.ycombinator.com/newsguidelines.html


I'm 99% sure the author has never done any completed/published non-trivial game...

... because doing lock-free programming is the least of your problems when creating games.



