The figure "20%" on its own means nothing. You probably mean "20% in this specific benchmark with 'non-huge matrices'" (as the OP put it). Algorithms can perform wildly differently on small-scale and large-scale inputs.
Well, then you still have to test whether it's 20% at other sizes or not. I haven't written the code; I just think you shouldn't have to resort to hyperbole (ten hours, ten days) if the comment above doesn't warrant it.
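For what it's worth, here is a rough sketch of how one could check whether a relative difference holds across input sizes. None of this is the code under discussion: the two pure-Python multiplication routines, the helper names (`matmul_naive`, `matmul_transposed`, `relative_difference`) and the chosen sizes are all hypothetical stand-ins for illustration.

```python
import random
import timeit


def matmul_naive(a, b):
    """Textbook multiplication: index into b column-wise for every entry."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def matmul_transposed(a, b):
    """Same result, but transposes b once so each dot product walks two rows."""
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def random_matrix(n, rng):
    """Square n x n matrix of floats in [0, 1)."""
    return [[rng.random() for _ in range(n)] for _ in range(n)]


def relative_difference(n, repeats=3, seed=0):
    """Fraction of the naive runtime saved by the transposed variant at size n.

    Uses the best of `repeats` runs for each variant to reduce timing noise.
    A negative value means the transposed variant was slower.
    """
    rng = random.Random(seed)
    a, b = random_matrix(n, rng), random_matrix(n, rng)
    t_naive = min(timeit.repeat(lambda: matmul_naive(a, b),
                                number=1, repeat=repeats))
    t_trans = min(timeit.repeat(lambda: matmul_transposed(a, b),
                                number=1, repeat=repeats))
    return (t_naive - t_trans) / t_naive


if __name__ == "__main__":
    # Hypothetical sizes; the point is only to compare several scales.
    for n in (4, 16, 64, 128):
        print(f"n={n:4d}  relative difference: {relative_difference(n):+.1%}")
```

The percentages will vary by machine and by run, so the thing to look at is whether the figure stays roughly constant as `n` grows, not any single number.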