Companies / researchers in general have no strong requirement to show you any artifacts to reproduce their work. There are incentives to not even provide that information at all.
What we usually see is that someone else puts in the time to reproduce the results themselves. Or, internally, you can see projects moving from one company to another, like MapReduce (Google) being reimplemented as Hadoop at Yahoo.
And then, if enough companies don't want to manage the whole project, the software gets donated to the Apache Software Foundation.
> Companies / researchers in general have no strong requirement to show you any artifacts to reproduce their work.
That's quite literally the opposite of what science is all about! If they don't want others to reproduce their results, they might just as well end each paper with "You can take our word for it!" and skip the details altogether...
What I'm rather interested in are points of comparison. Performance in terms of a chosen metric is one thing, but research gets more useful if it can easily be reproduced. This is the norm in all other sciences - why not in AI research?
If I can see that their approach is 20% better than SOTA, but they require 1M LoC plus 3 weeks of total computation time on a 100-machine cluster with 8 V100s per node, I can safely say - sod it! - use the inferior commercial product instead and add 20% manual effort (since I need to add manual work anyway, as the accuracy isn't 100%).
I worked in a lab where I had to reproduce other people's code in Java. I never finished any of those projects.
For example, GPT-2 would need around $50k to reproduce from scratch. GPT-3 is probably a few orders of magnitude more than that. How would anyone reproduce it unless they are a company? I've seen NVIDIA reproduce some results.
Also, a big part of the problem is that you don't have the datasets, and after the PhD students graduate and the professor gets a job, your access to the datasets goes away, like bitrot.
> If I can see that their approach is 20% better than SOTA, but they require 1M LoC plus 3 weeks of total computation time on a 100-machine cluster with 8 V100s per node, I can safely say - sod it!
8 V100s cost about $20/h, so 100 machines for 2 weeks (allowing for a long training run) will cost roughly $670K. That's the salary of three to five engineers for a year. If your model saves more than that amount of engineering time, it is worth it. It's just a matter of how much use you can get out of it. Of course a model can be reused by different teams and companies, so it could easily be worth the price.
I expect the number I calculated is exaggerated for this task, though; you don't need that much compute for this model. GPT-3 reportedly cost $1.2M per training run, and it is the largest model in existence.
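A minimal back-of-the-envelope sketch in Python of the calculation above: the $20/h per 8-V100 node, 100 nodes, and 2-week run are the figures from this thread; the $150k-$200k engineer salary range is my own ballpark assumption, not something from the paper.

    # Rough cluster cost vs. engineer salaries (all inputs are estimates).
    HOURS_PER_WEEK = 24 * 7

    def cluster_cost(nodes, weeks, rate_per_node_hour):
        # Total cloud cost for running `nodes` machines for `weeks`.
        return nodes * weeks * HOURS_PER_WEEK * rate_per_node_hour

    cost = cluster_cost(nodes=100, weeks=2, rate_per_node_hour=20.0)
    print(f"training run: ${cost:,.0f}")            # ~$672,000
    # Hypothetical $150k-$200k/yr salaries -> roughly 3.4 to 4.5 engineer-years.
    print(f"engineer-years: {cost/200_000:.1f} to {cost/150_000:.1f}")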