
Yup. Clock-for-clock, Skylake is about 10-20% faster than Sandy Bridge; IPC improvements have been <5% per year. Most of the apparent improvement since then has come from slowly cranking up the stock clock rates. If you overclock a Sandy Bridge to >4 GHz, which is extremely reasonable, it keeps up just fine with a Skylake in most tasks.
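
As a quick sanity check, the two figures are consistent with each other. A minimal sketch, assuming roughly four years between Sandy Bridge (2011) and Skylake (2015); the `YEARS` value and the `annual_rate` helper are illustrative assumptions, not measurements:

```python
def annual_rate(total_gain, years):
    """Compound annual growth rate implied by a total gain over `years`."""
    return (1.0 + total_gain) ** (1.0 / years) - 1.0

YEARS = 4  # assumption: approximate gap between the two generations

for total in (0.10, 0.20):
    rate = annual_rate(total, YEARS)
    print(f"{total:.0%} total over {YEARS} years is about {rate:.1%} per year")
```

Both ends of the 10-20% range come out under 5% per year when compounded.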

CPU performance is largely "good enough" for most users. OS bloat has finally stopped: Win8.1 is just as fast as Win7 (and is more stable) and Win10 is faster and skinnier. Most users don't do anything intensive and probably wouldn't even notice if you substituted in a low-end processor. For those that do have big needs, GPU offloading has taken off in a big way.

This is kind of unfortunate in other respects. CPU performance (especially single-threaded) is extremely important for high refresh rates. At 144 Hz there's no margin for any weak link in the system. But I recognize that I'm kind of a niche user in that regard.




> Yup. Clock-for-clock, Skylake is about 10-20% faster than Sandy Bridge

Sorry, can you elaborate on this? Is Skylake somehow performing multiple instructions per core per clock cycle?


Sometimes - there are multiple types of execution units in a CPU core (even multiples of the same type), and a thread can dispatch to multiple units at once (superscalar execution). The core can also reorder the instruction stream to keep all the units occupied (out-of-order execution), preemptively execute along the most likely direction a branch will take (speculative execution), etc.

Basically, it's all a massive game to keep all the units of a core busy to execute the desired instruction stream as fast as possible. Over time, successive CPU architectures have gotten better at playing the game: better occupancy, more execution units, and more powerful units (SSE, AVX, etc), which translates into a greater number of instructions executed per clock cycle (IPC).

That's why a Skylake is much faster than a Pentium 4, even though the P4 might run at a higher clockrate. The Skylake has better IPC.
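
To put rough numbers on that: instruction throughput is IPC multiplied by clock rate. A minimal sketch, where the IPC and clock values are made-up illustrative figures (not benchmark results) and `relative_throughput` is a hypothetical helper:

```python
def relative_throughput(ipc, clock_ghz):
    """Billions of instructions per second: IPC times clock rate in GHz."""
    return ipc * clock_ghz

# Hypothetical, illustrative values only; real IPC depends heavily on workload.
p4 = relative_throughput(ipc=1.0, clock_ghz=3.8)
skylake = relative_throughput(ipc=2.5, clock_ghz=3.5)

print(f"P4-like chip:      {p4:.2f} billion instr/s")
print(f"Skylake-like chip: {skylake:.2f} billion instr/s")
print(f"Ratio: {skylake / p4:.2f}x")
```

The lower-clocked chip wins as soon as its IPC advantage outweighs the clock deficit.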

And as a side note: what Hyperthreading does is duplicate the part of the core that manages registers and instruction dispatch for a thread. So you have a second thread that can utilize any execution units that the first thread left unoccupied.
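
A toy model of that idea, heavily simplified: assume a core with a fixed number of issue slots per cycle, and let the second thread fill whatever the first leaves free. The slot count, the per-cycle demand lists, and the `slots_used` function are all assumptions for illustration; real schedulers are far more involved.

```python
ISSUE_WIDTH = 4  # assumption: execution slots available per cycle

def slots_used(primary, secondary=None, width=ISSUE_WIDTH):
    """Count slots filled when `secondary` only gets what `primary` leaves idle."""
    if secondary is None:
        secondary = [0] * len(primary)
    used = 0
    for a, b in zip(primary, secondary):
        first = min(a, width)           # first thread takes what it can
        second = min(b, width - first)  # second thread fills the leftovers
        used += first + second
    return used

# Hypothetical per-cycle demand (how many slots each thread could use).
thread_a = [4, 1, 2, 0, 3, 1]
thread_b = [2, 2, 2, 2, 2, 2]

total = ISSUE_WIDTH * len(thread_a)
print(f"one thread:  {slots_used(thread_a)}/{total} slots busy")
print(f"two threads: {slots_used(thread_a, thread_b)}/{total} slots busy")
```

With these made-up demands the second thread raises occupancy without adding any execution units, which is the whole point of the trick.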

Bulldozer works somewhat similarly: two threads share a single module, and each module has a pair of integer cores that share one floating-point unit. So kinda like a Super-Hyperthreading, where they include a duplicate of (what they hope is) the most needed execution unit. Doesn't always work out in reality though.

https://en.wikipedia.org/wiki/Instructions_per_cycle

https://en.wikipedia.org/wiki/Superscalar_processor

https://en.wikipedia.org/wiki/Out-of-order_execution

https://en.wikipedia.org/wiki/Speculative_execution

https://en.wikipedia.org/wiki/Hyper-threading

https://en.wikipedia.org/wiki/Bulldozer_(microarchitecture)


Very instructive, thank you!


High-end chips have been doing multiple instructions per clock for a while now.

https://en.wikipedia.org/wiki/Superscalar_processor


Added hardware instructions for common software instruction combinations, taking load off the CPU itself and offloading it to a mini-ASIC.

Like AES, floating point, H.264, etc.


Intel CPUs have had special AES instructions for a long time.



