> Doing very small (less than say 32 bytes) allocations is also wasteful just due to the very large amount of data in proportion that will be used just to keep track of that tiny little memory area (within the malloc system). Not to mention fragmentation of the heap.
That not necessarily true. Modern allocators tend to use a bunch of fixed size-buckets.
But given that curl runs on lots of platforms it makes sense to just fix the code.
There has to be a free list somewhere, but a single bucket only needs a bitmap. I think the GP's point is that such a structure amortizes the cost of the free-list metadata over more items, reducing total overhead.
The free list can be stored inside the empty cells, meaning you put the pointer to the next empty cell inside the previous empty cell, and you need a single additional pointer to store the location of the first empty cell. When you free a cell you just make that the first empty cell.
It isn't free, but all that pointer chasing is one step for each allow and one for each free. There is no need to look for a 'best' block on alloc, nor is there a need to search for a best place to insert a block on free.
Moreover, the cache hit you take for an alloc likely would have happened anyways because, presumably, the program that made the allocation wants to write to the allocated memory (in theory, that doesn't imply the CPU has to _read_ the memory, but are there allocators that are that smart?)
For frees, the memory may, but need not be, already be in the cache when free is called.
I'm curious if any allocators with size classes/buckets actually do much pointer chasing. One can theoretically have a free-list for each class, so at most one pointer needs to be dereferenced: work out the size class, load the front of the appropriate list, read its tail pointer and store that as the new front of the list.
That not necessarily true. Modern allocators tend to use a bunch of fixed size-buckets.
But given that curl runs on lots of platforms it makes sense to just fix the code.