Hmm, I'm not trying to make that suggestion at all. How did I give that impression? I'd like to correct it.
My goal with this article is only to help readers who have wondered how time-domain approaches like Kalman filtering relate to frequency-domain approaches like low-pass filtering, and to connect those dots. It's not a comprehensive article about the magic of Kalman filters.
(I do not have a PhD in control systems, nor do I specialize in it, so take my input with a grain of salt.) I think it's because Kalman filters and low-pass filters are trying to achieve similar but different end goals. A Kalman filter is trying to take a measurement and lower its uncertainty. A low-pass filter is trying to remove high-frequency fluctuations, assuming those fluctuations are noise. As mentioned before, a low-pass filter cannot achieve what a Kalman filter can. The example you show works because the noise is set to the high-frequency fluctuations. I think explaining a multiple-input system is extremely critical to Kalman filters. Without it, the comparison is a bit deceiving. For instance, if you do the same exercise as in the article but measure an angle, you quickly see that a gyroscope + low-pass filter never reaches the measurement quality of a gyroscope + accelerometer. Even more fun, you can increase the noise in the gyroscope + accelerometer setup while keeping the gyroscope + low-pass filter noise the same, and the gyroscope + accelerometer with a Kalman filter would always perform better.
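To make the gyroscope example concrete, here is a minimal sketch, not a reference implementation. Everything numeric in it is an assumption made for illustration: a constant true rotation rate, a constant gyroscope bias, Gaussian sensor noise, and hand-tuned values for q and alpha. The Kalman filter here is scalar and does not model the bias as a state, which a fuller design would do.

```python
import random

def simulate(steps=2000, dt=0.01, gyro_bias=0.05, gyro_std=0.02,
             accel_std=0.1, seed=0):
    """Compare gyro + low-pass against gyro + accelerometer (scalar Kalman).

    All parameter values are illustrative assumptions, not measured data.
    Returns the RMS angle error (rad) of each approach.
    """
    rng = random.Random(seed)
    true_rate = 0.5            # rad/s, assumed constant
    true_angle = 0.0

    lp_rate, lp_angle = 0.0, 0.0
    alpha = 0.1                # low-pass smoothing factor (assumed)

    x, p = 0.0, 1.0            # Kalman angle estimate and its variance
    q = 1e-6                   # process noise variance (hand-tuned assumption)
    r = accel_std ** 2         # measurement noise variance

    sq_lp = sq_kf = 0.0
    for _ in range(steps):
        true_angle += true_rate * dt
        gyro = true_rate + gyro_bias + rng.gauss(0.0, gyro_std)
        accel_angle = true_angle + rng.gauss(0.0, accel_std)

        # Gyro + low-pass: smooth the rate, then integrate. The bias passes
        # straight through the filter, so the angle drifts.
        lp_rate += alpha * (gyro - lp_rate)
        lp_angle += lp_rate * dt

        # Kalman: predict with the gyro rate, correct with the accelerometer.
        x += gyro * dt
        p += q
        k = p / (p + r)
        x += k * (accel_angle - x)
        p *= (1.0 - k)

        sq_lp += (lp_angle - true_angle) ** 2
        sq_kf += (x - true_angle) ** 2

    return (sq_lp / steps) ** 0.5, (sq_kf / steps) ** 0.5

rms_lp, rms_kf = simulate()
print(f"gyro + low-pass RMS error:      {rms_lp:.3f} rad")
print(f"gyro + accelerometer RMS error: {rms_kf:.3f} rad")
```

In this setup the low-pass filter removes the gyroscope's high-frequency noise but cannot see its bias, while the second sensor gives the Kalman filter something to correct the drift against.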
Control theory person here. In the linear case, the resulting "system" is a linear time-invariant system. That's the standard state-space system, which you can convert to obtain a transfer function. And a higher-order transfer function can always be written, through partial fraction decomposition, as a sum of simple lead-lag and second-order filters. So the distinction in the post, and also in the discussion here, is kind of orthogonal semantics. What NASA did was to "extend" the Kalman filter to nonlinear cases (actually their extension was also not so big, but if it works it works). In short, every linear system can be written as a sum of trivial components such as low-pass filters, notches, anti-notches, etc. This is in fact how you implement higher-order controllers in real time.
The MIMO distinction is also not super important, since you can also have MIMO low-pass filters. The real difficulty is obtaining the coefficients of these filters and, hopefully, finding the best ones. That's where you start getting into the optimality and the real contribution of these tools. But as a side effect you must assume that the noise is Gaussian; otherwise you lose many of the nice theoretical guarantees. This is basically the biggest control-theoretic disadvantage of Kalman filters and the reason why other domains keep rediscovering them while in control they are "kinda, sorta" falling out of favor.
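As a small illustration of the partial fraction point, the sketch below splits a third-order transfer function into a sum of first-order sections and checks that the two forms agree. The poles, the numerator, and the helper names are all made up for the example, and the residue formula used only holds for distinct poles and a strictly proper transfer function.

```python
def residues(num, poles):
    """Residues of num(s) / prod(s - p_i), assuming distinct poles."""
    out = []
    for i, p in enumerate(poles):
        denom = 1.0
        for j, other in enumerate(poles):
            if j != i:
                denom *= (p - other)
        out.append(num(p) / denom)
    return out

def h_full(s, num, poles):
    """Evaluate the original higher-order transfer function at s."""
    denom = 1.0
    for p in poles:
        denom *= (s - p)
    return num(s) / denom

def h_sum(s, res, poles):
    """Evaluate the same system as a sum of first-order sections r / (s - p)."""
    return sum(r / (s - p) for r, p in zip(res, poles))

# Hypothetical example: H(s) = 10 / ((s + 1)(s + 2)(s + 5))
poles = [-1.0, -2.0, -5.0]
num = lambda s: 10.0
res = residues(num, poles)

for w in (0.0, 0.5, 3.0, 20.0):
    s = 1j * w
    print(w, abs(h_full(s, num, poles) - h_sum(s, res, poles)))
```

Each term r / (s - p) with a real negative pole is a scaled first-order low-pass, so the third-order filter is literally a weighted sum of three simple ones. Complex pole pairs would be grouped into second-order sections instead.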
Thanks, and I broadly agree with what you say. It's only under specific conditions (system dynamics, process covariance, measurement matrices, and measurement covariance are all static) that the Kalman filter converges to the Wiener filter, as Kálmán says in his paper.
I would contend that the reason a frequency filter doesn't work on a gyroscope is not that it's a MIMO system (frequency-domain techniques can generalize to those; you just end up with n x m transfer functions) but that the system is not static, so it doesn't satisfy the conditions that would cause the Kalman filter to converge to a fixed-coefficient filter.
The title is probably the issue. People read it and go in expecting you to make that case. In seeking brevity it seems you lost sufficient context and nuance.
Even as a static model, it is not a low-pass filter.
In fact, in the static case, a Kalman filter is exactly the same as the recursive least squares estimator, which is provably the best linear unbiased estimator.
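A minimal sketch of that equivalence, for the simplest possible case: a scalar constant state with no process noise (q = 0), initialised from the first measurement. The function name and the data are made up for illustration. Under these assumptions the Kalman gain becomes 1/2, 1/3, 1/4, ..., which is the recursive least squares update for a constant, and the estimate equals the batch mean.

```python
def kalman_constant(measurements, r=1.0):
    """Scalar Kalman filter for a constant state (no process noise)."""
    x = measurements[0]   # initialise from the first measurement
    p = r                 # so the initial variance is the measurement variance
    for z in measurements[1:]:
        k = p / (p + r)       # gain shrinks as 1/2, 1/3, 1/4, ...
        x = x + k * (z - x)   # same form as the recursive least squares update
        p = (1.0 - k) * p
    return x

data = [2.0, 4.0, 3.0, 5.0, 1.0]
estimate = kalman_constant(data)
batch_mean = sum(data) / len(data)
print(estimate, batch_mean)
```

Note that the gain keeps shrinking toward zero here, so this filter never settles into a fixed-coefficient low-pass; it weights every sample equally.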
In the static case, the Kalman filter converges to a fixed (frequency domain) filter after running for a while. The type of frequency behaviour depends on the system properties. The filter it converges to is the Wiener filter for that system.
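A minimal sketch of that convergence, assuming a scalar random-walk state with process variance q and measurement variance r (both values chosen arbitrarily here). The gain sequence settles to a constant, and with a constant gain K the update x = x + K (z - x) is just a first-order low-pass (an exponential moving average). For other system models the limiting filter would have a different frequency response.

```python
import math

def kalman_gains(q, r, p0=1.0, steps=200):
    """Gain sequence of a scalar Kalman filter for a random-walk state."""
    p = p0
    gains = []
    for _ in range(steps):
        p_pred = p + q               # predict: variance grows by process noise
        k = p_pred / (p_pred + r)    # Kalman gain
        p = (1.0 - k) * p_pred       # update: variance shrinks after measuring
        gains.append(k)
    return gains

def steady_state_gain(q, r):
    """Closed-form limit from the scalar Riccati equation P^2 - q P - q r = 0."""
    p_pred = 0.5 * (q + math.sqrt(q * q + 4.0 * q * r))
    return p_pred / (p_pred + r)

q, r = 0.01, 1.0                     # illustrative values, not from the article
gains = kalman_gains(q, r)
k_inf = steady_state_gain(q, r)
print(gains[0], gains[-1], k_inf)
```

After the transient, running the full filter and running x = (1 - k_inf) * x + k_inf * z give the same output, which is the sense in which the time-domain and frequency-domain views meet.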