
Batch size 1 should always work. The trade-off is always "bigger batch size/more efficient" versus "smaller batch size/better results". By using larger batches you reduce overhead, take better advantage of memory caches, and so on, which can give very large speedups. But mathematically it can slow down your training convergence (check out the plots showcasing how smaller batches converge in fewer optimizer steps). This is because when you average over an entire batch, you throw away a little bit of information that could otherwise be used to better learn your model. For intuition on this, imagine that you are doing homework, and you are forced to finish 1000 practice problems before being allowed to look at the answers; if you had looked earlier, you might have been able to correct your mistake earlier and not gotten all 1000 wrong. However, it would have taken you more time to get through those 1000 practice problems, because you would have been looking up the answers and mentally adjusting your model after each problem you did.
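A minimal sketch of the homework analogy, under loud assumptions: a toy one-dimensional problem (estimate the mean of Gaussian data with plain SGD), a fixed budget of 1000 samples, and the same learning rate for every batch size. The function name `run_sgd` and all parameter values are made up for illustration; a real large-batch run would normally retune the learning rate, which this toy deliberately does not do.

```python
import random

def run_sgd(batch_size, sample_budget=1000, lr=0.1, true_mean=3.0, seed=0):
    """Estimate true_mean by SGD on the loss 0.5 * (w - x)**2, x ~ N(true_mean, 1).

    Every batch size sees the same number of samples; only the number of
    optimizer steps (sample_budget // batch_size) differs.
    """
    rng = random.Random(seed)
    w = 0.0  # hypothetical starting point, far from the true mean
    steps = sample_budget // batch_size
    for _ in range(steps):
        batch = [rng.gauss(true_mean, 1.0) for _ in range(batch_size)]
        # Gradient of the loss, averaged over the batch.
        grad = sum(w - x for x in batch) / batch_size
        w -= lr * grad
    return steps, abs(w - true_mean)

for b in (1, 10, 100, 1000):
    steps, err = run_sgd(b)
    print(f"batch={b:5d} steps={steps:5d} |w - mean|={err:.3f}")
```

With batch size 1000 the whole budget buys a single update, so the estimate barely moves from its starting point, while batch size 1 gets 1000 chances to "look at the answers". The sketch counts samples only and says nothing about wall-clock time.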



I think you're confused about the benefits/drawbacks of increasing batch size. The training progress per optimizer step increases with batch size (otherwise the smallest batch would always be optimal). Increasing the batch size improves the quality of a single optimizer step, but the improvement does not scale linearly with batch size: larger-batch training requires more samples to converge than small-batch training, even though it requires fewer steps. Depending on the problem scale, the computational efficiency of larger batches makes this trade-off worth it, because even though you need to process more samples to converge, it will take less wall-clock time due to improved efficiency.
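One way to see why step quality does not scale linearly, sketched on the same kind of toy problem (an assumption, not anything from the thread): for independent samples, the noise in the averaged mini-batch gradient shrinks roughly like 1/sqrt(batch size), so 64 times more samples per step buys only about 8 times less noise. The helper `gradient_noise` and its parameters are hypothetical.

```python
import random
import statistics

def gradient_noise(batch_size, trials=2000, w=0.0, true_mean=3.0, seed=0):
    """Empirical standard deviation of the mini-batch gradient at a fixed w.

    Loss per sample is 0.5 * (w - x)**2 with x ~ N(true_mean, 1), so the
    per-sample gradient is (w - x) and its standard deviation is 1.
    """
    rng = random.Random(seed)
    grads = []
    for _ in range(trials):
        batch = [rng.gauss(true_mean, 1.0) for _ in range(batch_size)]
        grads.append(sum(w - x for x in batch) / batch_size)
    return statistics.pstdev(grads)

for b in (1, 4, 16, 64):
    print(f"batch={b:3d} gradient std={gradient_noise(b):.3f}")
```

Each fourfold increase in batch size should roughly halve the printed standard deviation, which is the diminishing return described above: the per-step cost in samples grows linearly while the per-step noise falls only with the square root.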


Yes, we're saying the same thing. It takes longer for the optimizer to converge when using larger batch sizes, in terms of the number of samples pushed through the model. It takes less time in terms of wall-clock time due to increased efficiency.



