> Our new Rust-based platform recently handled millions of inferences a second of a sophisticated ad-targeting model with a median latency of 5 milliseconds, and a yearly infrastructure cost of $130K.
Were these run on CPUs or GPUs? How many of them?
Last time I looked at running TensorFlow models on CPUs, inference was so slow that we had to abandon it.