> CPU technology is quite arcane, very high level, there are so many patents, IP...

jokoon · on June 15, 2020

> If such a mechanism existed it would be documented

Why would it? It's internal functionality, and CPUs usually have a one-year warranty or so, and I'm not sure they really guarantee FLOPS, only frequency, I guess. If it's tightly coupled to trade secrets, I would not expect it to be documented. I also doubt that you could find everything you want to know in a CPU's documentation.

> There is no actual evidence

There is the Wikipedia article I mentioned, and the physics alone is enough evidence.

> If a single gate fails in a CPU

I did not say "fail"; I meant "miscalculate". The probability of it happening for any one gate is very low, but it can still happen because of the sheer number of transistors, hence error correction.

> Such redundancy is so incredibly expensive from a power and chip area point of view

Sure it is, so what? At some point all CPUs need it and it becomes necessary. There are billions (I think?) of transistors on a CPU.

rcxdude · on June 15, 2020

Documentation is light on internal details, but both major CPU vendors give extensive documentation on the performance attributes of their processors, such as how many cycles an instruction may take to complete, and neither sees fit to mention even once that an instruction 'may take an arbitrary amount longer as the CPU ages'. Not to mention, these performance attributes are frequently measured by researchers and engineers, and an effect such as instructions taking more cycles on one sample than on another from the same batch has yet to be observed (and it's notable and noted when timing does differ, e.g. between steppings or microcode versions). At least one of the many, many people who investigate this in great detail would have commented on it.

The Wikipedia article you linked makes zero mention of redundant gates as a workaround for reliability issues. The closest it comes is saying that designers must consider the effect, but that is design at the level of the chip's geometry, not its logic. Redundancy doesn't even make good sense as a strategy: the extra cost of redundant logic to work around reliability issues on a smaller node would outweigh the advantages of that node.

One of the greatest things about modern CPUs is how reliably they do work given that you need such a high yield on individual transistors.
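
To put a rough number on "high yield", here is a back-of-the-envelope sketch in Python. It assumes transistor failures are independent and that a single bad transistor ruins the chip (no redundancy), which is a simplification rather than how real yield models work. The function names, the 90% target, and the one-billion transistor count are illustrative assumptions, not vendor figures.

```python
import math


def max_transistor_failure_rate(chip_yield: float, transistor_count: float) -> float:
    """Largest per-transistor failure probability q with (1 - q)**n >= chip_yield.

    Assumes independent failures and no redundancy (illustrative model only).
    """
    # (1 - q)^n = Y  =>  q = 1 - exp(ln(Y) / n).
    # expm1 keeps precision when q is far below machine epsilon relative to 1.
    return -math.expm1(math.log(chip_yield) / transistor_count)


def chip_yield_from_failure_rate(failure_rate: float, transistor_count: float) -> float:
    """Fraction of chips with zero bad transistors under the same assumptions."""
    # log1p avoids rounding 1 - q to exactly 1.0 for tiny q.
    return math.exp(transistor_count * math.log1p(-failure_rate))


if __name__ == "__main__":
    n = 1e9          # assumed transistor count ("billions")
    target = 0.9     # assumed fraction of fully working chips
    q = max_transistor_failure_rate(target, n)
    print(f"per-transistor failure rate must stay below about {q:.3e}")
    print(f"check: yield at that rate = {chip_yield_from_failure_rate(q, n):.3f}")
```

Under these assumptions each transistor has to work with a failure probability on the order of one in ten billion, which is the sense in which the per-transistor yield has to be extremely high.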

jokoon · on June 15, 2020

Thanks for convincing me!