That's true, but it's also true that "straight-line" algorithms, the kind with no branches to stall the pipeline, tend to be straightforwardly parallelizable. In that case a scalar CPU isn't the right hardware to be using in the first place, and you want to be comparing vector FPU (SIMD) and/or GPU implementations instead.
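To make that concrete, here is a minimal sketch of the kind of loop I mean. The function name `saxpy_relu` and its signature are made up for illustration, not taken from any library. There is no data-dependent jump in the body and no iteration depends on another, which is what usually lets a compiler or a GPU kernel process many elements at once.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = max(a * x[i] + y[i], 0).
 * Each iteration reads and writes only element i, so iterations are
 * independent. `restrict` promises the compiler that x and y do not
 * overlap, which removes one common obstacle to auto-vectorization. */
void saxpy_relu(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        float v = a * x[i] + y[i];
        /* A simple select like this is typically lowered to a max or
         * blend instruction rather than a conditional jump. */
        y[i] = v > 0.0f ? v : 0.0f;
    }
}
```

Whether this actually gets vectorized depends on the compiler and flags (something like `-O3` with GCC or Clang is a reasonable assumption), so check the generated assembly before benchmarking it against a scalar version.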