My own benchmarks. The trick is SSE support.
Darwin/BSD has that built in, especially since Darwin can guarantee that its chips support SSE. Generic Linux builds do not, and -march=native or switching to SSE-optimized shared libs is rarely used.
But recently bos and alloc_size support also got better in other libcs, the ones built with clang rather than gcc. gcc sucks big time with those optimizations. FreeBSD and Darwin both use clang. This is in the ~60% ballpark.
Every 64-bit x86 OS can guarantee SSE2 support, because it's baked into the x86-64 spec. Every single 64-bit Linux build can and does use SSE2 (unless someone explicitly turned that off for no reason). If there are performance differences, they're not due to chip support or to builds crippled for compatibility.
It covers the size of dynamically allocated (malloc'ed) structures, not just constants. The majority of pointers have an alloc_size but no object size (bos, i.e. __builtin_object_size).
Do you have a source for that? It wouldn't surprise me; GNU code tends to be more bloated than a similar BSD-licensed project.