Author here — thanks for your interest! So far we do not know of a way to estima...

samsamoa · on Dec 14, 2018

Another author here. I'll add to Jared's comment above that for long-running experiments (like the ones our Dota team runs), it can be useful to track this statistic in real time to see whether scaling up the experiment would pay off.

taliesinb · on Dec 14, 2018

What's a cheap and unobtrusive way to estimate the B_simple version of the noise scale in real time? Piggyback on Adam's moving mean and variance estimates? Edit: I see that Appendix A has a method for the multi-device training setting, but I'm thinking of single-device training.
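
For what it's worth, the two-batch-size idea from Appendix A can probably be adapted to a single device by treating micro-batches (for example, gradient accumulation steps) as the "small" batch and their average as the "big" batch. The sketch below assumes you can get at each micro-batch's mean gradient as a flat vector. The names `noise_scale_estimates` and `NoiseScaleTracker` and the `decay` value are made up for illustration and are not from the paper's code. Whether this is cheap enough in practice depends on your setup.

```python
import numpy as np


def noise_scale_estimates(micro_grads, micro_batch_size):
    """Unbiased estimates of |G|^2 and tr(Sigma) from k micro-batch gradients.

    micro_grads: array of shape (k, n_params); row i is the mean gradient
                 over micro-batch i (k >= 2, all micro-batches the same size).
    """
    micro_grads = np.asarray(micro_grads, dtype=np.float64)
    k = micro_grads.shape[0]
    if k < 2:
        raise ValueError("need at least two micro-batches")

    b_small = micro_batch_size
    b_big = k * micro_batch_size

    # Squared gradient norm at the small batch size (averaged over micro-batches)
    # and at the big batch size (norm of the averaged gradient).
    g_small_sq = np.mean(np.sum(micro_grads ** 2, axis=1))
    g_big_sq = np.sum(micro_grads.mean(axis=0) ** 2)

    # Two-batch-size estimators: E|G_B|^2 = |G|^2 + tr(Sigma) / B,
    # solved for |G|^2 and tr(Sigma) using B_small and B_big.
    g_sq = (b_big * g_big_sq - b_small * g_small_sq) / (b_big - b_small)
    trace_sigma = (g_small_sq - g_big_sq) / (1.0 / b_small - 1.0 / b_big)
    return g_sq, trace_sigma


class NoiseScaleTracker:
    """Smooths both estimates with separate moving averages, then takes the ratio."""

    def __init__(self, decay=0.99):  # decay is an arbitrary illustrative choice
        self.decay = decay
        self.g_sq = None
        self.trace_sigma = None

    def update(self, micro_grads, micro_batch_size):
        g_sq, trace_sigma = noise_scale_estimates(micro_grads, micro_batch_size)
        if self.g_sq is None:
            self.g_sq, self.trace_sigma = g_sq, trace_sigma
        else:
            d = self.decay
            self.g_sq = d * self.g_sq + (1 - d) * g_sq
            self.trace_sigma = d * self.trace_sigma + (1 - d) * trace_sigma
        # B_simple = tr(Sigma) / |G|^2, computed from the smoothed quantities.
        return self.trace_sigma / self.g_sq
```

Averaging the numerator and denominator separately before dividing matters here, since the per-step estimates are noisy and the per-step |G|^2 estimate can even come out zero or negative.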