Companies / researchers in general have no strong requirement to show you any artifacts to reproduce their work. There are incentives to not even provide that information at all.
What we usually see is that someone else puts in the time to reproduce the results themselves. Or, internally, you can see projects moving from one company to another, like MapReduce (Google) being reimplemented as Hadoop at Yahoo.
And then, if enough companies don't want to manage the whole project, the software gets donated to the Apache Software Foundation.
> Companies / researchers in general have no strong requirement to show you any artifacts to reproduce their work.
That's quite literally the opposite of what science is all about! If they don't want others to reproduce their results, they might just as well end each paper with "You can take our word for it!" and skip the details altogether...
What I'm rather interested in are points of comparison. Performance in terms of a chosen metric is one thing, but research gets more useful if it can easily be reproduced. This is the norm in all other sciences - why not in AI research?
If I can see that their approach is 20% better than SOTA, but they require 1M LoC plus 3 weeks of total computation time on a 100-machine cluster with 8 V100s per node, I can safely say - sod it! - use the inferior commercial product instead and add 20% manual effort (since I need to add manual work anyway, as the accuracy isn't 100%).
I worked in a lab where I had to reproduce other people's code in Java. I never finished any of those projects.
For example, GPT-2 would need around $50k to reproduce from scratch. GPT-3 is probably a few orders of magnitude more than that. How would anyone reproduce it unless they are a company? I've seen NVIDIA reproduce some results.
Also, a big part of the problem is that you don't have the datasets, and after the PhD students graduate and the professor gets a job, your access to the datasets goes away, like bitrot.
> If I can see that their approach is 20% better than SOTA, but they require 1M LoC plus 3 weeks of total computation time on a 100-machine cluster with 8 V100s per node, I can safely say - sod it!
8 V100s cost about $20/h, so 100 machines for 2 weeks (allowing for a long training run) will cost roughly $670K. That's the salary of three to five engineers for a year. If your model saves more than that amount of engineering time, it is worth it. It's just a matter of how much use you can get out of it. Of course a model can be reused by different teams and companies, so it could easily be worth the price.
I expect the number I calculated is exaggerated for this task, though; you don't need that much compute for this model. GPT-3 reportedly cost $1.2M per training run, and it is the largest model in existence.
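A minimal back-of-the-envelope sketch in Python of the calculation above: the $20/h per 8-V100 node, 100 nodes, and 2-week run are the figures from this thread; the $150k-$200k engineer salary range is my own ballpark assumption, not something from the paper.

    # Rough cluster cost vs. engineer salaries (all inputs are estimates).
    HOURS_PER_WEEK = 24 * 7

    def cluster_cost(nodes, weeks, rate_per_node_hour):
        # Total cloud cost for running `nodes` machines for `weeks`.
        return nodes * weeks * HOURS_PER_WEEK * rate_per_node_hour

    cost = cluster_cost(nodes=100, weeks=2, rate_per_node_hour=20.0)
    print(f"training run: ${cost:,.0f}")            # ~$672,000
    # Hypothetical $150k-$200k/yr salaries -> roughly 3.4 to 4.5 engineer-years.
    print(f"engineer-years: {cost/200_000:.1f} to {cost/150_000:.1f}")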