Using preload on the stock Ubuntu binary, to give it mimalloc, they got a 44% speedup.
By rebuilding the binary with different compiler options, but not changing malloc, they got a 20% speedup.
If we naively multiply these speedups (1.44 × 1.20), we get about 1.73: 73% faster.
It gets to 1.9 because when you speed up only the program, but leave malloc the same, malloc accounts for a larger share of the run time.
When the faster malloc is applied to a program that is compiled better, it therefore makes a bigger relative contribution than the 44% seen when the allocator was preloaded into the slower program.
To do the math right, we have to look at how much time was saved by each change on its own. If we add those savings and subtract the sum from the original, slow time, we should get a time that corresponds to roughly the 1.9 speedup.
Original time: 4.631
Better compiler options alone: 3.853 (-0.778)
Better allocator alone (preload): 3.209 (-1.422)
Add time saved from both: 2.200
Projected time: 4.631 - 2.200 = 2.431
Projected speedup from both: 4.631/2.431 = 1.905
Bang on!
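
For anyone who wants to check the arithmetic, here is a minimal Python sketch of both calculations. The three timings are the ones quoted above. The variable names are mine, and the assumption that the two savings are independent and simply add up is the same one the projection above makes, not something measured.

```python
# Timings quoted above, in whatever unit the benchmark reported.
original = 4.631
compiler_only = 3.853    # better compiler options, stock malloc
allocator_only = 3.209   # stock binary, mimalloc preloaded

# Naive model: multiply the two individual speedup ratios.
naive_speedup = (original / compiler_only) * (original / allocator_only)

# Additive model (an assumption): each change removes a fixed amount
# of time, and the two savings do not overlap.
saved = (original - compiler_only) + (original - allocator_only)
projected_time = original - saved
additive_speedup = original / projected_time

print(f"naive multiplied speedup: {naive_speedup:.3f}")
print(f"time saved by both:       {saved:.3f}")
print(f"projected time:           {projected_time:.3f}")
print(f"projected speedup:        {additive_speedup:.3f}")
```

The additive model always projects a speedup at least as large as the multiplied ratios do, because it subtracts the full allocator saving from a run time that the compiler change has already shortened.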