> What other statistical machine learning algorithm gives you a state machine as an output?
* hidden Markov models
* autoregressive models
* learned LQR
Sure, some of those have finite memory, but RNNs are limited in practice too; their limit is just not as easily quantified.
I see what you're saying about the single-step operation appearing unique, but I also think RNNs can be viewed through a lens in which they look like "normal" programming concepts such as generators and folding iterators.
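Here's a minimal sketch of that view in plain NumPy (the names `rnn_step`, `run_sequence`, `generate`, and the shapes are just mine for illustration, not any particular library's API): running the cell over a sequence is an ordinary fold, and sampling from it looks like a generator.

```python
import numpy as np

def rnn_step(state, x, W_h, W_x, W_y):
    """One 'tick' of the state machine: fold the next input into the state."""
    new_state = np.tanh(W_h @ state + W_x @ x)
    return new_state, W_y @ new_state

def run_sequence(inputs, state, params):
    """Folding-iterator view: reduce a whole input sequence into outputs."""
    outputs = []
    for x in inputs:
        state, y = rnn_step(state, x, *params)
        outputs.append(y)
    return state, outputs

def generate(state, x, params, n_steps):
    """Generator view: feed each output back in as the next input."""
    for _ in range(n_steps):
        state, y = rnn_step(state, x, *params)
        yield y
        x = y  # assumes output and input dimensions match

# Toy usage with made-up shapes (hidden size 4, input/output size 2):
rng = np.random.default_rng(0)
params = (rng.normal(size=(4, 4)), rng.normal(size=(4, 2)), rng.normal(size=(2, 4)))
_, ys = run_sequence([rng.normal(size=2) for _ in range(3)], np.zeros(4), params)
```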
I'm not familiar with LQRs. Compared with HMMs and linear autoregressive models, I think avoiding the need to predefine the form of the autoregression is a bigger leap than it might appear right now. I'd always been a neural network skeptic, thinking they were excessively complex for no benefit. It seems to me that's no longer true. I'm not certain that'll hold, but it's exciting.
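To make that concrete (a toy illustration of my own, not anything from the thread): a linear AR(p) model bakes the memory structure in as exactly p fixed lags with a fixed linear form, whereas the recurrent sketch above only fixes a state size and learns the recurrence from data.

```python
import numpy as np

def ar_predict(history, coeffs):
    """AR(p): the shape of the memory is predefined as exactly p fixed lags."""
    p = len(coeffs)
    return float(np.dot(coeffs, history[-p:][::-1]))

# Hand-picked AR(2); this model can never look further back than 2 steps.
print(ar_predict([0.1, 0.4, 0.9], coeffs=np.array([0.6, 0.3])))
```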
The particular encoding, matrix or otherwise, is just an implementation detail.