I think you sum up my feelings about him as well. He's a bit much sometimes but it's hard to deny that he's made monumental contributions to the field.
It's also funny that we laugh at him when we also have a joke that in AI we just reinvent what people did in the 80's. He's just the person being more specific as to what and who.
Ironically, I think the problem is we care too much about credit. It ends up getting hoarded rather than shared. We then just oversell our contributions, because if you make the incremental improvements that literally everyone makes, you get your work rejected for being incremental.
I don't know what it is about CS specifically, but we have a culture problem of attribution and hype. It ranges from building on open source (it's libraries all the way down) while acting like we did it all alone, to jumping on bandwagons as if there's a right and immutable truth to how to do certain things, until the bubbles pop and we laugh at how stupid anyone was to do such a thing. Yet we don't contribute back to the projects that are our foundation, we laugh at the "theory" we stand on, and we listen to the same hype-train people who got it wrong last time instead of turning to those who got it right. Why? It goes directly counter to the ideas of a group who love to claim rationalism, "working from first principles", and "I care what works".
This aspect of the industry really annoys me to no end. People in this field are so allergic to theory (which is ironic, because CS, of all fields, is probably one of the ones in which theoretical investigations are most directly applicable) that they'll smugly proclaim their own intelligence and genius while showing you a pet implementation of ideas that have been around since the 70s or earlier. Sure, most of the time they implement it in a new context, but this leads to a fragmented language in which the same core ideas are implemented N times, each with someone's particular, personal, idiosyncratic terminology choices (see for example, the wide array of names for basic functional data structure primitives like map, fold, etc. that abound across languages).
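To make that last point concrete, here's a tiny Python sketch of the two primitives under the names Python happens to use; the naming notes in the comments are from memory and non-exhaustive, so treat them as illustrative rather than definitive:

    from functools import reduce

    # The same two primitives, under the names Python happens to use.
    # "map" also goes by Select (C#/LINQ), collect (Ruby), transform (C++), fmap (Haskell).
    # "fold" also goes by reduce (Python/JS), foldl/foldr (Haskell/ML), inject (Ruby),
    # Aggregate (C#), accumulate (C++).

    xs = [1, 2, 3, 4]

    squares = list(map(lambda x: x * x, xs))       # map: apply a function elementwise
    total = reduce(lambda acc, x: acc + x, xs, 0)  # fold: combine elements via an accumulator

    print(squares)  # [1, 4, 9, 16]
    print(total)    # 10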
> If you find that you're spending almost all your time on theory, start turning some attention to practical things; it will improve your theories. If you find that you're spending almost all your time on practice, start turning some attention to theoretical things; it will improve your practice.
But yeah, in general I hate how people treat theory, acting as if it has no economic value. Certainly both matter; no one is denying that. But there's a strong bias against theory and I'm not sure why. Let's ask ourselves: what is the economic impact of calculus? What about just the work of Leibniz or Newton? I'm pretty confident that it's significantly north of billions of dollars a year. And what, we want to do less of this type of impactful work? It seems a handful of examples more than covers any money wasted on research that has failed (or "failed").
The problem I see with our field, which leads to a lot of hype, is the belief that everything is simple. This just creates "yes men" and people who do not think, which I think ends up with people hearing "no" when someone is just acting as an engineer. The job of an engineer is to problem solve. That means you have to identify problems! Identifying them and presenting solutions is not "no", it is "yes". But for some reason it is interpreted as "no".
> see for example, the wide array of names for basic functional data structure primitives like map, fold, etc. that abound across languages
Don't get me started... but if a PL person goes on a rant here, just know, yes, I upvoted you ;)
[0] You can probably tell I came to CS from "outside". I have a PhD in CS (ML) but undergrad was Physics. I liked experimental physics because I came to the same conclusion as Knuth: Theory and practice drive one another.
I get weird looks sometimes lately when I point out that "agents" are not a new thing, and that they date back at least to the 1980's and - depending on how you interpret certain things[1] - possibly back to the 1970's.
People at work have, I think, gotten tired of my rant about how people who are ignorant of the history of their field have a tendency to either re-invent things that already exist, or to be snowed by other people who are re-inventing things that already exist.
I suppose my own belief in the importance of understanding and acknowledging history is one reason I tend to be somewhat sympathetic to Schmidhuber's stance.
Another interesting thing I see is how people will refuse to learn history, thinking it will harm their creativity[0].
The problem with these types of interpretations is that they're fundamentally authoritarian, whereas research itself is fundamentally anti-authoritarian. To elaborate: trust, but verify. You trust the results of others, but you replicate and verify. You dig deep and get to the depth (progressive knowledge necessitates higher orders of complexity). If you do not challenge or question results, then yes, I'd agree, knowledge harms. But if you're willing to say "okay, it worked in that exact setting, but what about this change?" then there is no problem[1]. In that setting, more reading helps.
I just find these mindsets baffling... Aren't we trying to understand things? Brute-forcing your way to new and better things is only necessary if you are unable to understand them. We can make so much more impact and work so much faster when we let understanding drive as much as outcomes do.
I think you should have continued reading from where you quoted.
>> Aren't we trying to understand things? ***Brute-forcing your way to new and better things is only necessary if you are unable to understand them. We can make so much more impact and work so much faster when we let understanding drive as much as outcomes do.***
I'm arguing that if you want to "deliver business units to increase shareholder value" that this is well aligned with "trying to understand things."
Think about it this way:
If you understand things:
You can directly address shareholder concerns and adapt readily to market demands. You do not have to search, you already understand the solution space.
If you do not understand things:
You cannot directly address shareholder concerns and must search over the solution space to meet market demands.
Which is more efficient? It is hard to argue that search through an unknown solution space is easier than path optimization over a known solution space. Obviously this is the highly idealized case, but this is why I'm arguing that these are aligned. If you're in the latter situation you advantage yourself by trying to get to the former. Otherwise you are just blindly searching. In that case technical debt becomes inevitable and compounds significantly unless you get lucky. It becomes extremely difficult to pivot as the environment naturally changes around you. You are only advantaged by understanding, never harmed. Until we realize this we're going to continue to be extremely wasteful, resulting in significantly lower returns for shareholders or any other measure of value.
I'm in the same boat. At least there's a couple of us that think this way. I'm always amazed when I run into people who think neural nets are a relatively recent thing, and not something that emerged back in the 1940s-50s. People seem to tend to implicitly equate the emergence of modern applications of ideas with the emergence of the ideas themselves.
I wonder at times if it stems back to flaws in the CS pedagogy. I studied philosophy and literature in which tracing the history of thought is basically the entire game. I wonder if STEM fields, since they have far greater operational emphasis, lose out on some of this.
> people who think neural nets are a relatively recent thing, and not something that emerged back in the 1940s-50s
And to bring this full circle... if you really (really) buy into Schmidhuber's argument, then we should consider the genesis of neural networks to date back to around 1800! I think it's fair to say that that might be a little bit of a stretch, but maybe not that much so.
> Around 1800, Carl Friedrich Gauss and Adrien-Marie Legendre introduced what we now call a linear neural network, though they called it the “least squares method.” They had training data consisting of inputs and desired outputs, and minimized training set errors through adjusting weights, to generalize on unseen test data: linear neural nets!
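To make that framing concrete, here's a minimal numpy sketch of least squares viewed as "training" a single linear layer: made-up inputs and desired outputs, weights adjusted to minimize squared training error, then evaluated on unseen test points (the data and dimensions are invented purely for illustration):

    import numpy as np

    rng = np.random.default_rng(0)

    # Made-up "training data": inputs X and desired outputs y from a noisy linear rule.
    true_w = np.array([2.0, -1.0])
    X_train = rng.normal(size=(100, 2))
    y_train = X_train @ true_w + 0.1 * rng.normal(size=100)

    # Least squares: pick weights w minimizing ||X w - y||^2,
    # i.e. "train" a single linear layer under a squared-error loss.
    w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

    # "Generalize" to unseen test data.
    X_test = rng.normal(size=(10, 2))
    y_test = X_test @ true_w
    print(np.max(np.abs(X_test @ w - y_test)))  # small test error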
... except linear neural nets have a very low ceiling on complexity, no matter how big the network is, until you introduce a nonlinearity, which they didn't. They tried, but it destroys the statistical reasoning, so they threw it out. Also, I don't envy anyone doing that calculation on paper; least squares is already going to suck bad enough.
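That point is easy to check for yourself: without a nonlinearity, stacking linear layers collapses to a single linear layer, because the weight matrices just multiply together. A quick numpy sketch (shapes and values are arbitrary, purely for illustration):

    import numpy as np

    rng = np.random.default_rng(0)

    # Three "layers" of weights with no nonlinearity between them.
    W1 = rng.normal(size=(5, 8))
    W2 = rng.normal(size=(8, 8))
    W3 = rng.normal(size=(8, 3))

    x = rng.normal(size=5)

    deep = ((x @ W1) @ W2) @ W3      # the "deep" linear network
    collapsed = x @ (W1 @ W2 @ W3)   # a single layer using the product of the weights

    print(np.allclose(deep, collapsed))  # True: depth adds nothing without a nonlinearity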
Until you do that, this method is a version of a Taylor series, and the only real advantage is the statistical connection between the outcome and what you're doing (and if you're evil, you might point out that while that statistical connection gives reassurance that what you're doing is correct, it can, despite being a proof, point you in the wrong direction).
And if you want to go down that path, SVM kernel-based networks do it better than current neural networks. Neural networks throw out the statistical guarantees again.
If you want to go really far back with neural networks, there's backprop! Newton, although I guess Leibniz's chain rule would make him a very good candidate too.
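And in that spirit, here's a toy sketch of backprop as nothing more than the chain rule, checked against a finite-difference estimate (the tiny two-parameter "network" and the numbers are made up for illustration):

    import numpy as np

    def forward(w1, w2, x):
        # A tiny two-layer net with a tanh nonlinearity: y = w2 * tanh(w1 * x)
        return w2 * np.tanh(w1 * x)

    def grad_w1(w1, w2, x):
        # Chain rule: dy/dw1 = dy/dh * dh/dw1 = w2 * (1 - tanh(w1*x)^2) * x
        h = np.tanh(w1 * x)
        return w2 * (1.0 - h ** 2) * x

    w1, w2, x = 0.7, -1.3, 2.0
    analytic = grad_w1(w1, w2, x)

    # Central finite difference as a sanity check on the chain-rule gradient.
    eps = 1e-6
    numeric = (forward(w1 + eps, w2, x) - forward(w1 - eps, w2, x)) / (2 * eps)

    print(analytic, numeric)  # the two agree to several decimal places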