
Sure, you probably don't want to do full training runs locally, but there's a lot you can do locally that has a lot of added friction on a GPU cluster or other remote compute resource.

I like to start a new project by prototyping and debugging my training and run config code, setting up the data loading and evaluation pipeline, and hacking around with some baseline models to make sure they can overfit a small subset of my data.
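A minimal sketch of that overfit-a-tiny-subset sanity check, written in PyTorch; the synthetic data, model, and hyperparameters are illustrative placeholders, not the commenter's actual setup:

  # Sanity check: a baseline model should be able to memorize a tiny data subset.
  # All names and hyperparameters below are illustrative placeholders.
  import torch
  import torch.nn as nn

  torch.manual_seed(0)

  # Stand-in for "a small subset of my data": 32 samples, 10 features, 3 classes.
  X = torch.randn(32, 10)
  y = torch.randint(0, 3, (32,))

  model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 3))
  opt = torch.optim.Adam(model.parameters(), lr=1e-2)
  loss_fn = nn.CrossEntropyLoss()

  for step in range(500):
      opt.zero_grad()
      loss = loss_fn(model(X), y)
      loss.backward()
      opt.step()

  # If the data loading and training loop are wired up correctly, the loss
  # should approach zero and the model should fit the subset perfectly.
  acc = (model(X).argmax(dim=1) == y).float().mean()
  print(f"final loss={loss.item():.4f}, train acc={acc:.2%}")

If the model can't drive training loss to near zero on a handful of examples, something is wrong with the pipeline, and that's much cheaper to find out on a laptop than on a cluster.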

After all that's done, it's finally time to scale out to the GPU cluster. But I still do a lot of debugging locally.

Maybe this kind of workflow isn't as necessary if you have a task that's pretty plug-and-play, like image classification, but for nonstandard tasks I think there's a lot of prototyping work that doesn't require hardware acceleration.

Coding somewhat locally is a must for me too, because the cluster I have access to has pretty serious wait times (up to a couple of hours on busy days). Imagine only being able to run the code you're writing a few times a day at most! Iterative development and making a lot of mistakes is how I code; I don't want to go back to the punch card days, where you waited and waited only to end up with a silly error.
