
Yes. In my older tests on Intel Skylake derivatives I also obtained SMT speedups of around 25%, but on the newer Zen 3 (a 5900X) I have obtained at most a 20% speedup, and that was in the most SMT-friendly task I have ever encountered: the compilation of a big software project. (The comparison was between the optimal parameters with SMT disabled and with SMT enabled, which for a 5900X I determined to be "make -j13" and "make -j24" respectively.)

An example of a multithreaded benchmark that is not SMT-friendly is the GeekBench 6 multithreaded test, where Zen 3 with SMT disabled (12 threads on a 5900X) is slightly faster than with SMT enabled (24 threads on a 5900X).



It's worth noting that compilation is a partially serial task (e.g. linking is often largely single-core). It's entirely possible that going from 4 to 8 threads is much more helpful than 12 to 24 threads, as a 24-thread system will have far more idle threads in comparison. (Of course this is assuming 4c8t Skylake, so a normal consumer i7. Skylake-X had more cores.)
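The effect of a serial portion can be made concrete with Amdahl's law. The sketch below is only an idealized model: the serial fractions are hypothetical values chosen for illustration, not measurements, and the model treats every thread as a full core, so it ignores the fact that SMT siblings share execution resources.

```python
def amdahl_speedup(threads: int, serial_fraction: float) -> float:
    """Ideal speedup over one thread when `serial_fraction` of the work
    (e.g. a single-threaded link step) cannot be parallelized."""
    if threads < 1:
        raise ValueError("threads must be >= 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


def doubling_gain(threads: int, serial_fraction: float) -> float:
    """Factor by which the build gets faster when the thread count doubles."""
    return (amdahl_speedup(2 * threads, serial_fraction)
            / amdahl_speedup(threads, serial_fraction))


if __name__ == "__main__":
    # Hypothetical serial fractions; real builds have to be measured.
    for serial_fraction in (0.02, 0.10):
        print(
            f"serial={serial_fraction:.2f}  "
            f"4->8: x{doubling_gain(4, serial_fraction):.2f}  "
            f"12->24: x{doubling_gain(12, serial_fraction):.2f}"
        )
```

Under this model, doubling from 4 to 8 threads always gains more than doubling from 12 to 24 whenever the serial fraction is above zero, and the gap widens as the serial fraction grows.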


For big projects, as I have mentioned, the linking phase is at most a few percent of the compilation time.

I have CPUs with 48 threads, and for big projects the compilation time decreases monotonically from 1 thread to 48 threads (almost proportionally with the number of threads up to 24 threads, then with a constant, smaller slope up to 48). Published benchmarks show that for big software projects the compilation times decrease monotonically up to 512 threads on a 2-socket motherboard with 128-core CPUs.
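The two-slope shape described above can be written down as a simple piecewise-linear model. This is a rough sketch, not a fit to any data: the core count of 24 is inferred from the 48-thread figure, and `smt_gain` is a hypothetical parameter (0.2 is used as a placeholder, borrowed from the 20% figure quoted earlier for a different CPU).

```python
def relative_throughput(threads: int, cores: int = 24, smt_gain: float = 0.2) -> float:
    """Build throughput relative to one thread, under a two-slope model.

    Up to `cores` threads, each thread gets its own core (slope 1).
    Beyond that, each extra thread is an SMT sibling that adds only
    `smt_gain` of a core (slope `smt_gain`). Threads beyond 2 * cores
    add nothing in this model.
    """
    if threads < 1:
        raise ValueError("threads must be >= 1")
    threads = min(threads, 2 * cores)
    if threads <= cores:
        return float(threads)
    return cores + (threads - cores) * smt_gain


def relative_build_time(threads: int, cores: int = 24, smt_gain: float = 0.2) -> float:
    """Build time relative to the single-threaded build (1.0 at one thread)."""
    return 1.0 / relative_throughput(threads, cores, smt_gain)
```

With any positive `smt_gain` the modeled build time keeps decreasing all the way to 48 threads, just more slowly after 24, which matches the qualitative behavior reported.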

So the compilation of a big software project, like Chromium, Firefox, LLVM, GCC, the Linux kernel, LibreOffice, etc. (all of which have many thousands of files that must be compiled), is one of the tasks that can efficiently use any number of threads that is currently available.

Moreover, there are now linkers that work partially concurrently. Tasks like the relocation of the object files and of the external symbol references can be done completely concurrently for all source files (after the start addresses are known for all object files and the corresponding object files are known for each symbol).
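The split described above (a serial layout pass, then independent per-object relocation) can be sketched as a toy model. Everything here is hypothetical and greatly simplified: the `ObjectFile` and `Relocation` types, the flat 8-byte absolute relocation format, and the base address are invented for illustration and do not correspond to any real linker or object format. Python threads also do not run this CPU-bound work in parallel, so the code shows only the structure of the dependency, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Relocation:
    offset: int   # where in this object's bytes the address must be patched
    symbol: str   # name of the referenced symbol


@dataclass
class ObjectFile:
    name: str
    data: bytearray
    defined: dict = field(default_factory=dict)      # symbol -> offset in this object
    relocations: list = field(default_factory=list)  # list of Relocation


def layout(objects, base=0x1000):
    """Serial phase: assign start addresses and resolve every symbol."""
    symbols = {}
    address = base
    for obj in objects:
        for symbol, offset in obj.defined.items():
            symbols[symbol] = address + offset
        address += len(obj.data)
    return symbols


def relocate(obj, symbols):
    """Patch one object. It writes only to its own buffer and only reads
    the shared symbol table, so no locking is needed between objects."""
    for rel in obj.relocations:
        target = symbols[rel.symbol].to_bytes(8, "little")
        obj.data[rel.offset:rel.offset + 8] = target


def link(objects):
    symbols = layout(objects)              # must finish before any patching
    with ThreadPoolExecutor() as pool:     # one independent task per object
        list(pool.map(lambda obj: relocate(obj, symbols), objects))
    return b"".join(bytes(obj.data) for obj in objects), symbols
```

The only ordering constraint is that `layout` completes first; after that, the per-object tasks share nothing writable, which is why this phase parallelizes cleanly.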



