google2342's comments

google2342 · on Aug 8, 2019

You shouldn't use the mean when doing benchmarking. Better to use the median or fastest time. Lots of random things can happen on computers (usually in the OS) that can result in some operation taking 1000x longer.
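To make the point above concrete, here is a minimal sketch using only the Python standard library. The timings are made-up illustrative numbers, not real measurements, and `summarize` is a hypothetical helper name. It shows how a single 1000x outlier drags the mean while leaving the median and the fastest time untouched.

```python
import statistics

def summarize(timings):
    """Return the mean, median, and fastest value of a list of timings."""
    return {
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "min": min(timings),
    }

# Hypothetical data: nine runs at 1 ms and one run that took 1000x longer.
timings_ms = [1.0] * 9 + [1000.0]
summary = summarize(timings_ms)

for name, value in summary.items():
    print(f"{name}: {value:.1f} ms")
```

With this input the mean comes out near 100.9 ms, while the median and the minimum both stay at 1.0 ms.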

chronial · on Aug 9, 2019

Interesting point. The %timeit functionality actually used to output the mean of the x fastest runs; they seem to have changed that at some point. The docs still explain the old behavior [1].

I assume that they feel your concerns are addressed because they display the stddev together with the mean, so you can see if there were any extreme outliers.

[1] https://ipython.org/ipython-doc/dev/interactive/magics.html#...
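For anyone who wants both kinds of numbers side by side, a rough sketch with the standard library's `timeit.repeat` is below. This is not how %timeit is implemented; `time_statement` is a hypothetical helper, and the `repeat` and `number` defaults are arbitrary choices for illustration.

```python
import statistics
import timeit

def time_statement(stmt, repeat=7, number=1000):
    """Time `stmt` in `repeat` batches of `number` loops each.

    Returns per-loop seconds: the best batch, the mean across batches,
    and the sample standard deviation across batches.
    """
    totals = timeit.repeat(stmt, repeat=repeat, number=number)
    per_loop = [total / number for total in totals]
    return {
        "best": min(per_loop),
        "mean": statistics.mean(per_loop),
        "stdev": statistics.stdev(per_loop),
    }

result = time_statement("sum(range(100))")
print(f"best:  {result['best'] * 1e6:.3f} us per loop")
print(f"mean:  {result['mean'] * 1e6:.3f} us per loop")
print(f"stdev: {result['stdev'] * 1e6:.3f} us per loop")
```

A large stdev relative to the mean is the hint that some batch was hit by an outlier, in which case the best time is probably the more trustworthy figure.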

jcranmer · on Aug 8, 2019

If you're testing a single-threaded benchmark, then the test statistics aren't going to be meaningfully different in interpretation, especially if you're only asking the question "is A or B faster?" What's more important is that you capture enough runs to characterize the distribution well; if you have that, you'll get meaningful results no matter which statistic you're actually measuring.
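A small sketch of the "is A or B faster?" question, assuming hypothetical timing samples (the numbers and the `verdicts` helper are invented for illustration). When the two distributions are well separated, mean, median, and minimum all pick the same winner; with only a handful of runs, one stray outlier can be enough to flip the mean.

```python
import statistics

def verdicts(a, b):
    """For each statistic, report which sample ("A" or "B") looks faster."""
    stats = {
        "mean": statistics.mean,
        "median": statistics.median,
        "min": min,
    }
    return {name: ("A" if f(a) < f(b) else "B") for name, f in stats.items()}

# Hypothetical timings in milliseconds, not real measurements.
a = [10.1, 10.3, 10.2, 10.4, 10.2, 10.3, 10.1, 10.5]
b = [12.0, 12.2, 11.9, 12.1, 12.3, 12.0, 12.1, 11.8]

clean = verdicts(a, b)            # all three statistics agree
noisy = verdicts(a + [500.0], b)  # one outlier added to A's few runs

print("clean:", clean)
print("noisy:", noisy)
```

In the clean case all three statistics say A. In the noisy case the median and minimum still say A but the mean says B, which is one way of seeing why the number of runs matters.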

google2342 · on Aug 8, 2019

How do you expect someone to justify a statement that's not true? I suppose you could say that it's a fair characterization of a basic decision tree but that doesn't describe modern ML methods.

google2342 · on Aug 8, 2019

That's not a fair characterization. These contractors are mostly used for quality assurance. Once you've trained your model (which is not just if statements), you might send a few thousand examples to raters to judge the true quality of your model.

google2342 · on July 29, 2019

Don't do that to yourself. Better to take care of yourself than to buy a new phone.

google2342 · on July 24, 2019

New grads can start at 200K+ in the right companies.

google2342 · on July 23, 2019

I suspect the winners will already be well established in the field.

mlguy456 · on July 24, 2019

Right, and 25k is their biweekly paycheck.