Even a dead simple two-layer neural network is a universal approximator. Meaning it can model essentially any relationship, given enough neurons in its hidden layer, with as much accuracy as you want, subject to available resources for training time and model size.
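To make that concrete, here is a minimal sketch in plain NumPy: a two-layer network (one tanh hidden layer, one linear output) fitting a sine curve with vanilla gradient descent. The hidden width, learning rate, and step count are arbitrary toy choices, not recommendations.

```python
import numpy as np

# Toy problem: approximate y = sin(x) on [-pi, pi].
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)

# Two-layer network: tanh hidden layer + linear readout.
# 64 hidden neurons is an arbitrary choice; more neurons = more capacity.
H = 64
W1 = rng.normal(0, 1.0, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.1, (H, 1)); b2 = np.zeros(1)

lr = 0.05
for step in range(5000):
    # Forward pass.
    h = np.tanh(x @ W1 + b1)      # hidden activations
    pred = h @ W2 + b2            # linear readout
    err = pred - y                # residuals

    # Backward pass for mean squared error.
    g_pred = 2 * err / len(x)
    g_W2 = h.T @ g_pred; g_b2 = g_pred.sum(0)
    g_h = g_pred @ W2.T
    g_z = g_h * (1 - h**2)        # tanh derivative
    g_W1 = x.T @ g_z; g_b1 = g_z.sum(0)

    # Plain gradient descent update.
    W1 -= lr * g_W1; b1 -= lr * g_b1
    W2 -= lr * g_W2; b2 -= lr * g_b2

print("final MSE:", float((err**2).mean()))  # typically ends up very small
```

Widen the hidden layer or train longer and the fit gets as tight as you like; that tradeoff between accuracy and resources is the universal approximation claim in miniature.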
Specific deep learning architectures and training variants are shaped by the problems they solve, speeding up training and shrinking model sizes considerably. Both keep getting better, so efficiency improvements are not likely to plateau anytime soon.
They readily handle both statistically driven and pattern driven mappings in the data. Most modelling systems tend to be better at one of those dimensions than the other, or can only adapt along one of them at all.
Learning patterns means they don't just go input->prediction. They learn to recognize and apply relationships in the middle. So when people say they are "just" predictors, that tends to be misleading: they do end up predicting, but they will do whatever information processing is needed in between in order to predict.
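A toy sketch of that "in the middle" part (plain NumPy, arbitrary sizes and seed): XOR is a classic relationship that no direct linear input-to-output map can capture, so the network is forced to build intermediate features in its hidden layer before it can predict.

```python
import numpy as np

# XOR is not linearly separable: to predict it, the network must first
# build intermediate features of the inputs in its hidden layer.
rng = np.random.default_rng(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

H = 8  # arbitrary hidden width for this toy
W1 = rng.normal(0, 1, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

lr = 0.5
for _ in range(10_000):
    h = np.tanh(X @ W1 + b1)        # intermediate representations
    p = sigmoid(h @ W2 + b2)        # final predictions
    g = (p - y) / len(X)            # binary cross-entropy gradient at the logits
    gh = (g @ W2.T) * (1 - h**2)    # backprop into the hidden layer
    W2 -= lr * (h.T @ g); b2 -= lr * g.sum(0)
    W1 -= lr * (X.T @ gh); b1 -= lr * gh.sum(0)

print("predictions:", p.round(2).ravel())  # approaches [0, 1, 1, 0]
# The hidden activations are the learned relationships in the middle:
# features of the inputs, not the inputs or the answer themselves.
print("hidden features:\n", np.tanh(X @ W1 + b1).round(2))
```

The network was only ever graded on its final predictions, yet it invented the intermediate features on its own because it needed them to predict.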
They can learn both hard and soft pattern boundaries. Discrete and continuous relationships. And static and sequential patterns.
They can be trained directly on example outputs, or indirectly via reinforcement (learning what kinds of outputs will be judged well, and generating those). Those are only two of many flexible training schemes.
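A stripped-down sketch of that contrast, on a stand-in "model" that is just preferences over three discrete outputs. Everything here is a toy: the `judge` function is a made-up reward stand-in, and the learning rates and step counts are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()

# 1) Direct training on example outputs (supervised): push probability
#    toward the demonstrated answer, here output index 2.
logits = np.zeros(3)
target = 2
for _ in range(100):
    p = softmax(logits)
    grad = p.copy(); grad[target] -= 1.0   # cross-entropy gradient
    logits -= 0.5 * grad

# 2) Indirect training via reinforcement (REINFORCE-style): sample an
#    output, have a "judge" score it, and reinforce what scored well.
judge = lambda a: 1.0 if a == 0 else 0.0   # hypothetical reward function
logits2 = np.zeros(3)
for _ in range(500):
    p = softmax(logits2)
    a = rng.choice(3, p=p)                 # generate an output
    r = judge(a)                           # judged quality of that output
    grad = -r * (np.eye(3)[a] - p)         # reward-weighted log-prob gradient
    logits2 -= 0.5 * grad

print("supervised:", softmax(logits).round(2))   # mass lands on index 2
print("reinforced:", softmax(logits2).round(2))  # mass lands on index 0
```

Same model, same update machinery; the only difference is whether the learning signal is "match this example" or "this got judged well", which is why the two schemes slot in so interchangeably.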
All those benefits and more make them exceptionally general, flexible, powerful, and efficient tools relative to other classes of universal approximators, for a large and growing range of data-defined problems.