
Is it possible to over-fit optimisation methods? I mean, people concentrate so much on coming up with fancy ways of training a ResNet really quickly, or with huge batch sizes, or whatever. Maybe the methods themselves don't generalise to other networks.

Maybe they do, though. Just a thought.




Yes, over-fitting is definitely something that happens. Here's a paper that demonstrates the problem by testing CIFAR classifiers on a new version of the input data: https://arxiv.org/abs/1806.00451

The flip side is that by having a common problem to tackle it's much easier to compare and contrast different results to figure out what really works. Many approaches that improve results on CIFAR don't scale to ImageNet, and many of the ImageNet papers don't scale to the even larger datasets that people are using now.
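The effect that paper demonstrates can be sketched with a toy experiment (synthetic data, not CIFAR; all names here are made up): tune a classifier's free parameter directly against one test set, then evaluate on a freshly drawn test set from a slightly shifted distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_split(n, shift=0.0):
    # Two 1-D Gaussian classes; `shift` moves class 1's mean, mimicking
    # the mild distribution shift between an original and a new test set.
    x0 = rng.normal(0.0, 1.0, n)
    x1 = rng.normal(2.0 + shift, 1.0, n)
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(n), np.ones(n)])
    return x, y

def accuracy(threshold, x, y):
    # Predict class 1 whenever x exceeds the threshold.
    return float(((x > threshold) == y).mean())

x_old, y_old = make_split(5000)               # "original" test set
x_new, y_new = make_split(5000, shift=-0.5)   # freshly collected test set

# Over-fit the decision threshold to the original test set by direct search.
thresholds = np.linspace(-2.0, 4.0, 601)
best = thresholds[np.argmax([accuracy(t, x_old, y_old) for t in thresholds])]

print(accuracy(best, x_old, y_old))  # high on the set it was tuned on
print(accuracy(best, x_new, y_new))  # noticeably lower on fresh data
```

This is only a caricature of the paper's setup, but it shows the mechanism: a parameter selected by maximising score on a fixed test set carries no guarantee on new draws, and even a small distribution shift erodes the measured accuracy.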


Aha, amazing - I've been wondering for ages how rankings would change with new test data for MNIST (or another common dataset). Thanks for the link!



