
DeepSpeed [1] is an amazing tool for enabling different kinds of parallelism and optimizations on your model. I would definitely not recommend reimplementing everything yourself.

FairScale [2] is probably a good option too, but I have never tried it myself.

[1]: https://github.com/microsoft/DeepSpeed

[2]: https://github.com/facebookresearch/fairscale



