Right, and the practice of neural networks has significantly overshot the mathematical theory. Most of the techniques we know produce good models have poorly understood theoretical underpinnings: the whole overparameterization thing, for example, or generalization more broadly. There's a lot that "just works" without anyone really knowing why, hence the stumbling around and landing on stuff that works.
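A tiny sketch of the overparameterization puzzle, under my own assumptions (plain numpy, random Gaussian features, a contrived toy setup): a linear model with 10x more parameters than training points interpolates the training set exactly, which classical bias-variance intuition says should be a disaster, yet the minimum-norm solution doesn't blow up on fresh data.

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 training samples, 200 parameters: heavily overparameterized
n, d = 20, 200
X = rng.normal(size=(n, d))
true_w = np.zeros(d)
true_w[:5] = 1.0                     # only 5 features actually matter
y = X @ true_w + 0.1 * rng.normal(size=n)

# With d > n, np.linalg.lstsq returns the minimum-norm solution,
# which fits the training data exactly (zero training error)
w, *_ = np.linalg.lstsq(X, y, rcond=None)
train_resid = np.max(np.abs(X @ w - y))

# Evaluate on fresh data from the same distribution
X_test = rng.normal(size=(1000, d))
y_test = X_test @ true_w
test_rmse = np.sqrt(np.mean((X_test @ w - y_test) ** 2))

print(f"max train residual: {train_resid:.2e}")  # essentially zero: perfect interpolation
print(f"test RMSE: {test_rmse:.3f}")             # finite, not the blowup classical intuition predicts
```

Why interpolating models like this often generalize acceptably (and sometimes well, as in "benign overfitting" / double descent) is exactly one of those questions where the theory is still catching up to the practice.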