Training and inference are only similar at a high level, not in actual application.
(ETA: In case it's not obvious, I'm agreeing with david-gpu's comment and adding more reasons why training currently differs from inference.)