> people who think neural nets are a relatively recent thing, and not something that emerged back in the 1940s-50s
And to bring this full circle... if you really (really) buy into Schmidhuber's argument, then we should consider the genesis of neural networks to date back to around 1800! I think it's fair to say that that might be a little bit of a stretch, but maybe not that much of one.
> Around 1800, Carl Friedrich Gauss and Adrien-Marie Legendre introduced what we now call a linear neural network, though they called it the “least squares method.” They had training data consisting of inputs and desired outputs, and minimized training set errors through adjusting weights, to generalize on unseen test data: linear neural nets!
... except linear neural nets have a very low ceiling on complexity, no matter how big the network is, until you introduce a nonlinearity, which they didn't: stack as many linear layers as you like and the whole thing collapses back into a single linear map. They tried, but it destroys the statistical reasoning, so they threw it out. Also, I don't envy anyone doing that calculation on paper; least squares is already going to suck bad.
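To make the "no matter how big the network is" point concrete, here's a minimal numpy sketch (the toy data and layer sizes are made up for illustration): a stack of linear layers is just a product of weight matrices, so it collapses into one linear map, which is exactly the model class plain least squares already fits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 5 input features, 1 target.
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=(5, 1))
y = X @ w_true + 0.1 * rng.normal(size=(200, 1))

# A "deep" linear network: three weight matrices, no nonlinearity anywhere.
W1 = rng.normal(size=(5, 16))
W2 = rng.normal(size=(16, 16))
W3 = rng.normal(size=(16, 1))

deep_output = X @ W1 @ W2 @ W3

# The whole stack collapses into a single 5x1 matrix: same model class
# as a one-layer linear model, however many layers you add.
W_collapsed = W1 @ W2 @ W3
assert np.allclose(deep_output, X @ W_collapsed)

# And the best single linear map (in squared error) is just least squares.
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print("least-squares weights:", w_ls.ravel())
```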
Until you do that, this method is a version of a (first-order) Taylor series, and the only real advantage is the statistical connection between the outcome and what you're doing (and if you're evil, you might point out that while that statistical connection gives reassurance that what you're doing is correct, despite being a proof it can still point you in the wrong direction).
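For what it's worth, the Taylor-series comparison (as I read it) is just that a first-order expansion is affine in the inputs, i.e. the same bias-plus-weights form that least squares fits:

```latex
% first-order Taylor expansion around x_0 is affine in x
f(x) \approx f(x_0) + \nabla f(x_0)^\top (x - x_0)
           = \underbrace{\big(f(x_0) - \nabla f(x_0)^\top x_0\big)}_{b} + \underbrace{\nabla f(x_0)^\top}_{w^\top}\, x
```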
And if you want to go down that path, kernel-based SVMs do it better than current neural networks. Neural networks throw out the statistical guarantees again.
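A minimal scikit-learn sketch of what "do it better" can look like (assuming sklearn is available; the dataset and parameters are purely illustrative): the kernel supplies the nonlinearity, while the optimization problem underneath stays convex.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC

# A dataset no linear decision boundary can separate well.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Purely linear decision boundary.
linear = LinearSVC().fit(X_train, y_train)

# Same convex optimization problem, but the RBF kernel provides the
# nonlinearity instead of stacked layers and backprop.
kernel = SVC(kernel="rbf").fit(X_train, y_train)

print("linear SVM accuracy:", linear.score(X_test, y_test))
print("RBF-kernel SVM accuracy:", kernel.score(X_test, y_test))
```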
If you want to go really far back with neural networks, there's backprop! You could credit Newton, although I guess Leibniz's chain rule would make him a very good candidate.
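The chain-rule point is easy to see in code. Here's a hand-rolled sketch of a tiny two-layer net (made-up shapes, no particular library): every line of the backward pass is one application of Leibniz's chain rule.

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny two-layer net: x -> W1 -> tanh -> W2 -> prediction.
x = rng.normal(size=(1, 3))
y = rng.normal(size=(1, 1))
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 1))

# Forward pass.
h = np.tanh(x @ W1)
pred = h @ W2
loss = 0.5 * np.sum((pred - y) ** 2)

# Backward pass: each line is one step of the chain rule.
dloss_dpred = pred - y                     # d/dpred of 0.5*(pred - y)^2
dloss_dW2 = h.T @ dloss_dpred              # through pred = h @ W2
dloss_dh = dloss_dpred @ W2.T              # through pred = h @ W2
dloss_dW1 = x.T @ (dloss_dh * (1 - h**2))  # through h = tanh(x @ W1)

print("loss:", loss, "gradient shapes:", dloss_dW1.shape, dloss_dW2.shape)
```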