Hacker News

Google's hardware is for inference, not training.



Volta is for both inference and training, but has an emphasis on inference


thanks for clarifying.


It doesn't matter; the operations are the same in forward and backward mode.

"Made for inference" just means "too slow for training" if you are pessimistic or "optimized for power efficiency" if you are optimistic.

Otherwise, training and inference are basically the same.


You can do inference pretty easily with 8-bit fixed point weights. Now attempt doing the same during training.

Training and inference are only similar at a high level, not in actual application.
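A minimal sketch of the point above, using a simple symmetric int8 quantization scheme (the function names and constants here are illustrative, not any particular library's API): dequantized 8-bit weights stay close enough to the originals for inference, but a typical SGD update is far smaller than one quantization step, so naive 8-bit training makes no progress.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization to int8 (illustrative scheme)."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.array([0.5, -0.25, 0.1], dtype=np.float32)
q, scale = quantize_int8(w)

# Inference is fine: dequantized weights are within half a step of the originals.
assert np.allclose(dequantize(q, scale), w, atol=scale / 2)

# Training is not: a typical update is much smaller than one quantization step,
# so applying it and re-quantizing leaves the stored weights unchanged.
lr, grad = 1e-3, 0.2
update = lr * grad                      # 2e-4, vs. a step size of ~3.9e-3
assert update < scale
q_new, _ = quantize_int8(dequantize(q, scale) - update)
assert np.array_equal(q, q_new)         # the gradient step was lost entirely
```

This is why practical low-precision training schemes keep a higher-precision master copy of the weights and accumulate updates there.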


... because the gradient that is being followed may have a lower magnitude than can be represented in the lower precision.
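A quick numeric illustration of that (using float16 for concreteness; the same issues are more severe for 8-bit formats): small gradients can underflow to zero outright, and even representable updates can be swamped when added to a much larger weight.

```python
import numpy as np

# 1. Underflow: values below float16's smallest subnormal (~6e-8) round to zero,
#    so a gradient that is perfectly representable in float32 simply vanishes.
g = np.float16(1e-8)
assert g == 0.0

# 2. Swamping: an update smaller than half a ulp of the weight is lost on add.
#    The ulp of 1.0 in float16 is ~9.8e-4, so a 1e-4 update changes nothing.
w = np.float16(1.0)
update = np.float16(1e-4)
assert w + update == w
```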


You also need a few other operations for training, such as transpose, which may or may not be fast in a particular implementation.
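To see where the transposes come from, here is the standard backpropagation rule for a linear layer (a textbook sketch, not any specific framework's implementation): the forward pass needs only one matrix product, while the backward pass multiplies by the transposed weights and transposed inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3)).astype(np.float32)   # batch of inputs
W = rng.standard_normal((3, 2)).astype(np.float32)   # layer weights

Y = X @ W          # forward pass: inference needs only this

dY = np.ones_like(Y)   # upstream gradient from the loss
dX = dY @ W.T          # gradient w.r.t. inputs: needs W transposed
dW = X.T @ dY          # gradient w.r.t. weights: needs X transposed
assert dX.shape == X.shape and dW.shape == W.shape
```

Hardware tuned only for the forward product may handle these transposed (or transposed-operand) multiplies much less efficiently.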

(ETA: In case it's not obvious, I'm agreeing with david-gpu's comment, and adding more reasons that training currently differs from inference.)



