
I think the flops comparison you’ve presented is not fair: for Nvidia it is “tensor” flops, not generic float multiplication (which is roughly 10x lower), while for Intel it is generic float multiplication.

So for the i9 the number would be higher if FMA operations were used, no?
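The FMA point above can be made concrete with a back-of-the-envelope peak-flops estimate. All numbers below are hypothetical illustrative choices (core count, clock, port count), not figures from this thread; the point is only that FMA counts as two flops per lane per cycle.

```python
# Hypothetical peak-flops sketch for an AVX-512 CPU (all numbers assumed).
cores = 8            # assumed core count
ghz = 3.0            # assumed sustained clock in GHz
simd_lanes = 16      # fp32 lanes in a 512-bit vector
fma_units = 2        # assumed FMA execution ports per core
flops_per_fma = 2    # fused multiply-add = one multiply + one add

peak_gflops = cores * ghz * simd_lanes * fma_units * flops_per_fma
print(peak_gflops)  # 1536.0 GFLOPS fp32 with FMA; half that with plain multiplies
```

Dropping `flops_per_fma` to 1 (multiply-only) halves the figure, which is exactly the asymmetry the parent comment is pointing at.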




Tensor flops are significant since this is exactly the use case the hardware was designed for. So IMO the comparison is fair.


That doesn’t make sense. Why is it fair to compare matrix multiplication with generic float operations? It should be either matrix multiplication vs. matrix multiplication, or generic float vs. generic float.


Well, one confounding factor is that CPU flops are more generic and apply to any algorithm, whereas GPU tensor flops, as mentioned, only reach their peak on tensor workloads.

However, when the workload actually is tensor math, both the GPU and the CPU run at their full potential, so the flops comparison ought to be valid.
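One way to test the claim above is to measure achieved flops on the same matmul workload on each device and compare those, rather than datasheet peaks. A minimal sketch, assuming NumPy and an arbitrary illustrative matrix size (the 2·n³ flop count for an n×n matmul is the standard convention):

```python
import time
import numpy as np

# Sketch: measure achieved GFLOPS for a dense fp32 matmul,
# the workload tensor cores are built for. n is an arbitrary choice.
n = 1024
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)

a @ b  # warm-up so timing excludes first-call overhead
t0 = time.perf_counter()
c = a @ b
elapsed = time.perf_counter() - t0

flops = 2 * n**3  # n^3 multiply-add pairs, counted as 2 flops each
print(f"{flops / elapsed / 1e9:.1f} GFLOPS achieved")
```

Running the same measurement with a GPU array library on the other side would give an apples-to-apples matmul-vs-matmul number, which is what the grandparent comment is asking for.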



