I'm wondering whether the work on adversarial systems (this one being quite interesting) can help us with our giant bugaboo of "OMG, it's overfitted :-(". Right now we model, train, test, fail, and start all over again, usually fiddling with the hyperparameters to boot. What would happen if we turned training into a two-phase approach, with a BEAM/GAN whatnot run on each cycle to measure how 'brittle' the result of the backprop is? The idea is to round down the spikes in the learned model by penalizing the backprop when the fit it lands on is too narrow. Training would take longer, but I'd think we'd throw away fewer sets.
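
To make the idea a bit more concrete, here is a rough sketch of what a "brittleness" score could look like. It is a stand-in, not the adversarial setup itself: instead of a GAN-style critic it just nudges the weights by small random steps and measures how much the loss jumps. The names `brittleness` and `penalized_loss` and the parameters `radius`, `n_probes` and `lam` are all made up for illustration.

```python
import math
import random

def brittleness(loss_fn, w, radius=0.1, n_probes=32, seed=0):
    """Average loss increase when the weights w are nudged by random
    steps of length `radius`. A large value suggests a narrow, spiky fit."""
    rng = random.Random(seed)
    base = loss_fn(w)
    total = 0.0
    for _ in range(n_probes):
        # Random direction, rescaled so every probe has the same length.
        step = [rng.gauss(0.0, 1.0) for _ in w]
        norm = math.sqrt(sum(s * s for s in step)) or 1.0
        probe = [wi + radius * s / norm for wi, s in zip(w, step)]
        # Only count probes that make the loss worse.
        total += max(0.0, loss_fn(probe) - base)
    return total / n_probes

def penalized_loss(loss_fn, w, lam=1.0, **probe_kwargs):
    """Ordinary loss plus lam times the brittleness score (hypothetical weighting)."""
    return loss_fn(w) + lam * brittleness(loss_fn, w, **probe_kwargs)

# Toy comparison: two bowls with the same minimum, one much narrower.
sharp = lambda w: 100.0 * sum(x * x for x in w)
flat = lambda w: sum(x * x for x in w)
w0 = [0.0, 0.0]

sharp_score = brittleness(sharp, w0)
flat_score = brittleness(flat, w0)
print(sharp_score > flat_score)  # the narrow bowl scores as more brittle
```

Both toy losses are zero at `w0`, so an ordinary loss can't tell them apart; the penalty term is what separates the narrow fit from the wide one. Each probe costs an extra loss evaluation, which is where the "training would take longer" part comes from.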