>when you train a computational model on half of your dataset and test it against the other half (which the author claims is Not Sufficiently Scientific), how is that qualitatively different from training it on all your data, and then "testing it on Nature"
it would be essentially the same thing if you took half the data, made a model, tested it on the other half, got good results, and stopped.
but that's not going to happen. your first model will stink. you'll refine it and try again, and you'll keep doing that. unfortunately, this means your model ends up being dependent on all the data.
the easiest way to prevent this is to build a model and then test it on real-world data that comes in after the model is built. if it doesn't work, tweak the model, and then go find new data again.
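to make the leakage concrete, here's a minimal sketch of that refinement loop (python with numpy and scikit-learn; the data is pure noise, and every name and number is just for illustration). greedily keep whichever feature improves the score on the held-out half, repeat, and the held-out half quietly stops being an honest test, while data generated after the fact still tells the truth:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.normal(size=(n, p))
y = rng.normal(size=n)              # pure noise: there is no real signal

# "data that occurs after you build the model" -- never touched during tuning
X_new = rng.normal(size=(n, p))
y_new = rng.normal(size=n)

half = n // 2
X_tr, y_tr = X[:half], y[:half]     # half to train on
X_te, y_te = X[half:], y[half:]     # half to "test" on, over and over

chosen, best = [], -np.inf
for _ in range(10):                 # "refine it and try again", ten rounds
    best_j = None
    for j in range(p):
        if j in chosen:
            continue
        cols = chosen + [j]
        m = LinearRegression().fit(X_tr[:, cols], y_tr)
        s = r2_score(y_te, m.predict(X_te[:, cols]))
        if s > best:                # keep whatever scores best on the test half
            best, best_j = s, j
    if best_j is None:              # no feature improves the test score; stop
        break
    chosen.append(best_j)

final = LinearRegression().fit(X_tr[:, chosen], y_tr)
print("R^2 on the reused test half:", r2_score(y_te, final.predict(X_te[:, chosen])))
print("R^2 on genuinely new data:  ", r2_score(y_new, final.predict(X_new[:, chosen])))
```

run it and the reused test half will typically report a clearly positive R^2 while the genuinely new data sits around zero or below, even though there was never any signal to find.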
> your first model will stink. you'll refine it and try again, and you'll keep doing that. unfortunately, this means your model ends up being dependent on all the data.
this is exactly what the scientific community as a whole is always doing.