
The AWS premium for GPU instances is absolutely not worth it. You don't hear about people running local GPU compute clusters because it's not newsworthy -- it's obvious. Put a few workstations behind a switch, fire up torch.distributed (sketch below), and you're done. After two months you've beaten the AWS spend for the same amount of GPU compute, even if the machines only spend 50% of their time training. Timesharing is done with shell accounts and asking nicely. You do not need the huge technical complexity of the cloud: it gets in the way, on top of costing more!
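
For concreteness, here is a minimal sketch of that setup, assuming torchrun and a couple of workstations that can reach each other over the LAN. The model, batch, node counts, and addresses are placeholders for illustration, not anyone's real workload:

    # sketch only: a stand-in model trained with DistributedDataParallel
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        dist.init_process_group(backend="nccl")  # torchrun supplies RANK/WORLD_SIZE
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        model = DDP(torch.nn.Linear(1024, 1024).cuda(local_rank),
                    device_ids=[local_rank])        # placeholder model
        opt = torch.optim.SGD(model.parameters(), lr=1e-3)

        for _ in range(100):
            x = torch.randn(32, 1024, device=local_rank)  # placeholder batch
            loss = model(x).square().mean()
            opt.zero_grad()
            loss.backward()  # DDP all-reduces gradients across the workstations
            opt.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()

Launched on each box with something like the following (node count, process count, and port are assumptions):

    torchrun --nnodes=2 --nproc_per_node=4 \
        --rdzv_backend=c10d --rdzv_endpoint=<head-node-ip>:29500 train.py

That's the entire "cluster".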



What if you want 10x the GPU for one month to build a model?


That's the only scenario I can think of where it comes out clearly in favor of AWS -- you've tested your model at small scale on Colab, you're confident you'll need only a few training runs, you can schedule them in us-east, you can run inference on CPU, and you won't need to rebuild for another eight months (by which point the cards you'd have purchased would be outdated anyway).

It's not an impossible scenario... But imagine the sort of company that trains its own model instead of fine-tuning something from Hugging Face or using an off-the-shelf distillation. (Those can be done reasonably on an average gaming PC, no need for a cluster -- see the sketch below.) Such a company has expensive human resources. They bothered to hire a data scientist and at least a research engineer, if not a full researcher. Were they hired on six-month contracts as well? This is a huge expense, so building a custom model must be an important differentiator -- and it's one-and-done? I don't see it. I think it's going to be an ongoing project, or it shouldn't have been approved in the first place.
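
For scale, a "refinement" of that sort is roughly this much code and fits on one consumer GPU. The model, dataset, and hyperparameters here are placeholder choices for illustration, not a recommendation (fp16 assumes a CUDA card):

    # sketch: fine-tuning a small pretrained model with Hugging Face transformers
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    model_name = "distilbert-base-uncased"           # placeholder model
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                               num_labels=2)

    ds = load_dataset("imdb")                        # placeholder dataset
    ds = ds.map(lambda b: tok(b["text"], truncation=True), batched=True)

    args = TrainingArguments(output_dir="out", per_device_train_batch_size=16,
                             num_train_epochs=1, fp16=True)
    Trainer(model=model, args=args,
            train_dataset=ds["train"].shuffle(seed=0).select(range(10_000)),
            tokenizer=tok).train()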



