
In my master's degree I took a course where the entirety of the grade was replicating a machine learning paper. It wasn't a well-known one, but it still had many citations.

Well, it turned out that the paper left out many key details needed to actually replicate it, and the GitHub link had long since been deleted (and not archived either). After many good-faith efforts to reproduce the results, we realized that the published result was probably due to the authors' choice of a lucky random seed. Or, more likely, to rerunning the experiment many times with different seeds. In reinforcement learning that's not uncommon.
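To see why seed-shopping inflates results, here's a toy sketch (not the paper in question; `run_experiment` is a hypothetical stand-in for a noisy training run). Reporting the best seed looks much better than reporting the mean over seeds:

```python
import random
import statistics

def run_experiment(seed):
    # Hypothetical stand-in for an RL training run: the "score"
    # has a true mean of 0.5 plus seed-dependent noise.
    rng = random.Random(seed)
    return 0.5 + rng.gauss(0, 0.1)

# Run the same experiment under 20 different random seeds.
scores = [run_experiment(seed) for seed in range(20)]

best = max(scores)                  # what a cherry-picked seed would report
mean = statistics.mean(scores)      # the honest estimate
stdev = statistics.stdev(scores)    # seed-to-seed variability

print(f"best seed: {best:.3f}, mean over seeds: {mean:.3f} ± {stdev:.3f}")
```

The best single seed always sits well above the mean, which is why papers that report one run (with an unstated seed) can be unreproducible even when nothing was fabricated.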

Even if I am wrong about this paper, I still believe that part of peer review should involve a third party replicating your results, at least in computer science. It's expensive, but imo it would lend a lot more credibility to results.




I would rather normalize publishing negative results (including refutations of previously published positive results).

Reproducing every single submission is extremely expensive, but per the Pareto principle, no one is ever going to care about the vast majority of publications. It's the ones that lots of people care about that we want to be accurate. Given the right incentives and opportunities, people will seek to reproduce those on their own.



