I'm surprised that genetic algorithms running on resource-bounded, general-purpose virtual machines have not surpassed neural nets in their problem-solving ability.
Is the computational advantage of neural nets on GPUs so dramatic?
It comes down to gradients. When you have useful gradients to work with, backprop on a GPU is vastly faster and more directed than a GA. One could uncharitably say that all of the impressive results from NNs in the last decade come from this ability to throw large amounts of data into highly parallel training, rather than from new theory about the expressiveness and capabilities of NNs.
When you don't have, or can't use, a gradient, GAs become the go-to tool for optimization (if only because you don't have much in the way of other options).
GAs can be way, way, way more compute-intensive as your population size increases.
That said, I use GAs regularly and find them an incredibly useful tool for a variety of optimization tasks - just generally not for learning the weights of NNs, where GPU + backprop reigns supreme (or at least works well enough, and much faster than a GA).
You might be interested in the "compact GA", which only requires keeping the % of 1s (or 0s) at each bit position rather than a full population.
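To make that concrete, here's a minimal sketch of a compact GA on a toy OneMax problem (maximize the number of 1s); the update size, generation count, and fitness function are just illustrative:

    import random

    def compact_ga(fitness, n_bits, pop_size=100, generations=2000):
        # Probability vector: p[i] is the current estimate of the fraction of 1s at bit i.
        p = [0.5] * n_bits

        def sample():
            return [1 if random.random() < pi else 0 for pi in p]

        for _ in range(generations):
            a, b = sample(), sample()
            winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
            # Nudge each bit's probability toward the winner by 1/pop_size.
            for i in range(n_bits):
                if winner[i] != loser[i]:
                    p[i] += 1.0 / pop_size if winner[i] else -1.0 / pop_size
                    p[i] = min(1.0, max(0.0, p[i]))
        return [1 if pi >= 0.5 else 0 for pi in p]

    # Toy usage: OneMax, i.e. fitness is simply the number of 1s.
    best = compact_ga(fitness=sum, n_bits=32)

The entire search state is one float per bit, instead of a whole population of bit strings.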
I suspect that with some more engineering and attention from people doing ML, GA-style algorithms could be made about as memory- and compute-efficient as gradient methods, while giving better results and being more widely applicable.
With a genetic algorithm you have to try a bunch of weight variations and see which one works best. With backpropagation you compute the gradient once and immediately know which direction to move every weight. It's hugely more efficient.
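A toy illustration of that difference in per-evaluation efficiency (the quadratic loss, step size, and candidate count are made up purely to show the shape of the two updates):

    import numpy as np

    # Toy loss: squared distance to a target weight vector (illustrative only).
    target = np.array([1.0, -2.0, 0.5])
    loss = lambda w: np.sum((w - target) ** 2)

    w = np.zeros(3)

    # Backprop-style step: one gradient tells every weight which way to move.
    grad = 2 * (w - target)            # analytic gradient of the toy loss
    w_gd = w - 0.1 * grad

    # GA/ES-style step: sample many perturbations and keep the best one.
    candidates = [w + 0.1 * np.random.randn(3) for _ in range(50)]
    w_ga = min(candidates, key=loss)

    print(loss(w_gd), loss(w_ga))      # the single gradient step typically wins here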
GA-style search isn't actually taking multiple samples to decide on a new point to move to - it's taking multiple samples to decide on a smaller region to focus on. This can be more efficient than backprop, depending on how "easy" it is to tell via sampling which subregion has better performance.
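As a toy example of "sampling to pick a region" rather than "sampling to pick a point", here's a gradient-free search that repeatedly keeps the half-interval whose samples look better (the objective and sample counts are arbitrary):

    import random

    f = lambda x: (x - 0.73) ** 2              # toy 1-D objective to minimize; 0.73 is arbitrary
    lo, hi = 0.0, 1.0
    for _ in range(20):
        mid = (lo + hi) / 2
        left = min(f(random.uniform(lo, mid)) for _ in range(8))
        right = min(f(random.uniform(mid, hi)) for _ in range(8))
        lo, hi = (lo, mid) if left < right else (mid, hi)
    print((lo + hi) / 2)                       # homes in near 0.73 without ever computing a gradient

Whether this beats gradient descent depends entirely on how reliably a handful of samples can distinguish the better subregion.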
Traditional GAs have a practical problem for training large models: keeping a large population of model weights in memory is not feasible. If you had to keep 1,000 variations of the GPT-3 weights in memory during training, that's a non-starter. Though people have ideas for how to address that as well (again, see the post I linked).
You can use genetic algorithms to train the weights of a neural network. This is called neuroevolution, and it's competitive on reinforcement learning tasks, where useful gradients are harder to come by...
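A minimal sketch of that idea, evolving the weight vector of a tiny network on a stand-in task (XOR here in place of an RL return; the architecture, population size, and mutation scale are all just illustrative):

    import numpy as np

    # Tiny 2-4-1 network; we evolve its flattened weight vector instead of backpropagating.
    def forward(w, x):
        W1, b1, W2, b2 = w[:8].reshape(2, 4), w[8:12], w[12:16].reshape(4, 1), w[16]
        h = np.tanh(x @ W1 + b1)
        return np.tanh(h @ W2 + b2).ravel()

    # Stand-in task: XOR; fitness is negative error (in RL it would be the episode return).
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0], dtype=float)
    fitness = lambda w: -np.mean((forward(w, X) - y) ** 2)

    rng = np.random.default_rng(0)
    pop = rng.normal(size=(50, 17))              # population of candidate weight vectors
    for _ in range(300):
        scores = np.array([fitness(w) for w in pop])
        parents = pop[np.argsort(scores)[-10:]]  # keep the 10 fittest
        children = parents[rng.integers(0, 10, size=40)] + 0.1 * rng.normal(size=(40, 17))
        pop = np.vstack([parents, children])     # elitism + mutation; no gradients anywhere

    best = pop[np.argmax([fitness(w) for w in pop])]
    print(forward(best, X))                      # should move toward [0, 1, 1, 0]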