The problem with this is that you have to determine what the goals are and how to evaluate, in a meaningful way, whether they are met. A computerized process like this will quickly overfit to its input and be useless for 'actual' intelligence. The only way past this is to gather good information, which requires a real-world presence. It can't be done in simulation.
It's the same reason you can't test in a simulation. Say you wanted to test a lawnmower in a simulation... how hard are the rocks? How deep are the holes? How strong are the blades? How efficient is the battery? If you already know this stuff, then you don't need to test. If you don't know it, then you can't write a meaningful simulation anyway.
That's an interesting argument, but doesn't it assume a small, non-real-world input/goal set?
Dumb example off the top of my head: what if the input was the entire StackOverflow corpus with "accepted" information removed, and the goal was to predict as accurately as possible which answer would be accepted for a given question? Yes, it requires a whole bunch of NLP and domain knowledge, and a "perfect" AI wouldn't get a perfect score because SO posters don't always accept the best answer, but it's big and it's real and it's measurable.
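For concreteness, the benchmark described there could be framed something like this (a toy sketch: the features, weights, and two-question "corpus" are made-up stand-ins for real NLP work over the actual StackOverflow dump):

```python
# Sketch of the "predict the accepted answer" benchmark.
# All features, weights, and data below are hypothetical placeholders.

def predict_accepted(answers, weights):
    """Pick the index of the answer with the highest weighted feature score."""
    def score(a):
        return sum(weights[k] * a[k] for k in weights)
    return max(range(len(answers)), key=lambda i: score(answers[i]))

def accuracy(questions, weights):
    """Fraction of questions where the predicted answer was the accepted one."""
    hits = sum(predict_accepted(q["answers"], weights) == q["accepted"]
               for q in questions)
    return hits / len(questions)

# Toy corpus: each answer is a feature dict; "accepted" is the held-out label.
corpus = [
    {"answers": [{"votes": 3, "has_code": 1}, {"votes": 10, "has_code": 0}],
     "accepted": 1},
    {"answers": [{"votes": 1, "has_code": 0}, {"votes": 2, "has_code": 1}],
     "accepted": 1},
]
print(accuracy(corpus, {"votes": 1.0, "has_code": 0.5}))
```

The point is just that the objective is well-defined and measurable: a single accuracy number against labels that already exist in the data.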
A narrower example: did the Watson team test against the full corpus of previous Jeopardy questions? Did they tweak things based on the resulting score? Could that testing/tweaking have been automated by some sort of GA (genetic algorithm)?
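That test/tweak loop could in principle be automated; a minimal mutate-and-keep-if-better sketch (the benchmark function here is a made-up stand-in for scoring against the Jeopardy corpus, and none of this is anything the Watson team is known to have actually done):

```python
import random

def benchmark(weights):
    # Stand-in for "score against the full corpus of past Jeopardy questions".
    # Here the (hypothetical) optimum is weights near (0.7, 0.3); higher is better.
    return -((weights[0] - 0.7) ** 2 + (weights[1] - 0.3) ** 2)

def tune(generations=200, seed=0):
    """Mutate-and-keep-if-better loop: a degenerate GA with population size 1."""
    rng = random.Random(seed)
    best = (rng.random(), rng.random())
    best_score = benchmark(best)
    for _ in range(generations):
        # Perturb the current best and keep the mutant only if it scores higher.
        candidate = tuple(w + rng.gauss(0, 0.05) for w in best)
        s = benchmark(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score

weights, score = tune()
print(weights, score)
```

Whether this produces anything beyond a system narrowly tuned to that one corpus is exactly the question at issue.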
The point there is that you can make a computer that's very good at predicting StackOverflow results or Jeopardy, but it won't be able to tie a shoe. If you want computers to be skilled at living in the real world, they have to be trained with real-world experiences. There is just not enough information in StackOverflow or Jeopardy to provide a meaningful representation of the real world. You'll end up overfitting to the data you have.
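What "overfitting to the data you have" means in the degenerate limit can be shown with a trivially bad model (a toy illustration):

```python
def memorizer(train):
    """A 'model' that just memorizes its training pairs."""
    table = dict(train)
    return lambda x: table.get(x, 0)  # guesses 0 for anything unseen

train = [(1, 1), (2, 4), (3, 9)]   # y = x**2
test = [(4, 16), (5, 25)]

model = memorizer(train)
train_acc = sum(model(x) == y for x, y in train) / len(train)
test_acc = sum(model(x) == y for x, y in test) / len(test)
print(train_acc, test_acc)  # perfect on what it has seen, useless beyond it
```

A system trained only on StackOverflow or Jeopardy is in the same position with respect to the real world: perfect recall of its narrow corpus tells you nothing about competence outside it.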
The bottom line is that without sensory input, you can't optimize for real world 'general AI'-like results.
So that is not an approach that can be automated.