The most important pattern to learn in C is to allocate a giant arena upfront and reuse it over and over in a loop. Ideally, there is only one allocation and deallocation in the entire program. As with all things multi-threaded, this becomes trickier. Luckily, web servers are embarrassingly parallel, so you can just have an arena for each worker thread. Unluckily, web servers do a large amount of string processing, so you have to be careful in how you build them to prevent the memory requirements from exploding. As always, tradeoffs can and will be made depending on what you are actually doing.
Short-run programs are even easier. You just never deallocate and then exit(0).
Most games have to do this for performance reasons at some point and there are plenty of variants to choose from. Rust has libraries for some of them, but in c rolling it yourself is the idiom. One I used in c++ and worked well as a retrofit was to overload new to grab the smallest chunk that would fit the allocation from banks of them. Profiling under load let the sizes of the banks be tuned for efficiency. Nothing had to know it wasn't a real heap allocation, but it was way faster and with zero possibility of memory fragmentation.
Most pre-2010 games had to. As a prior gamedev after that period I can confidently say that it is a relic of the past in most cases now. (Not like that I don't care, but I don't have to be that strict about allocations.)
Yeah. Fragmentation was a niche concern of that embedded use case. It had an mmu, just wasn't used by the rtos. I am surprised that allocations aren't a major hitter anymore. I still have to minimize/eliminate them in linux signal processing code to stay realtime.
The normal practical version of this advice that isn't a "guy who just read about arenas post" is that you generally kick allocations outward; the caller allocates.
> Ideally, there is only one allocation and deallocation in the entire program.
Doesn't this techically happen with most of the modern allocators? They do a lot of work to avoid having to request new memory from the kernel as much as possible.
Just because there's only one deallocation doesn't mean it's run only once. It would likely be run once every time the thread it belongs to is deallocated, like when it's finished processing a request.
Short-run programs are even easier. You just never deallocate and then exit(0).