
Makes me wonder if anyone has looked into using genetic algorithms combined with RL where the genetics determine the reward function.

This seems to be how humans have evolved. Ultimately, all living animals are here thanks to one reward function: having had an uninterrupted chain of reproduction. Our nervous system provides stimuli, and our brain's chemistry provides positive or negative rewards (pleasure or pain) that steer us toward actions that keep that chain unbroken (it's why sex feels good and putting your hand on a stove feels bad).

Presumably, both the reward function within our brain and the signals it interprets (via the nervous system) evolved to find a more optimal combination of inputs and per-input reward scalars, all to maximize this singular goal (reproduction).

Maybe we need to frame RL goals in much simpler terms, and allow genetic algorithms to evolve their own inputs and reward functions on their own.

RL is one of my weakest areas within AI, so I'm sure some of this has been tried before; I'm curious how much, and what the results have been.
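To make the idea concrete, here is a toy sketch of my own (not an established method, and all names and numbers are invented for illustration): an outer genetic loop evolves the weights of a shaped reward, while an inner loop "trains" an agent against that reward alone. Selection acts only on a sparse outer objective the agent never sees, analogous to reproduction in the framing above.

```python
import random

random.seed(0)  # reproducibility for this sketch

TRUE_TARGET = 0.7  # hidden "survival" optimum; the agent never sees it

def shaped_reward(weights, p):
    """Evolved reward: a weighted sum of two hand-picked proxy features."""
    return weights[0] * -abs(p - 0.5) + weights[1] * -abs(p - 1.0)

def inner_loop(weights, steps=50):
    """'Train' an agent (hill-climb a 1-D policy parameter) using only
    the evolved reward, never the true objective."""
    x = 0.0
    for _ in range(steps):
        candidate = x + random.uniform(-0.1, 0.1)
        if shaped_reward(weights, candidate) > shaped_reward(weights, x):
            x = candidate
    return x

def true_fitness(x):
    """Sparse outer objective the 'genes' (reward weights) are selected on."""
    return -abs(x - TRUE_TARGET)

def evolve(pop_size=20, generations=30):
    pop = [[random.random(), random.random()] for _ in range(pop_size)]
    for _ in range(generations):
        # rank genomes by how well an agent trained on their reward survives
        ranked = sorted(pop, key=lambda w: true_fitness(inner_loop(w)),
                        reverse=True)
        parents = ranked[:pop_size // 2]
        # each surviving genome leaves two mutated offspring
        pop = [[g + random.gauss(0, 0.05) for g in p]
               for p in parents for _ in range(2)]
    return max(pop, key=lambda w: true_fitness(inner_loop(w)))

best = evolve()
```

The point of the sketch is the separation of timescales: the inner loop only ever optimizes the proxy reward, and the outer loop decides which proxy rewards are worth having.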




> genetic algorithms combined with RL where the genetics determine the reward function.

I have been working on this problem for years (2+ as a researcher, 2 as a PhD student).

The main issue is that evolution is massively parallel and has had plenty of runtime to get to human-level intelligence.

The person that pushes this evolution/evolved reward point is Andrew G. Barto and his students/collaborators over the years.

Satinder Singh in particular is actively working on gradient based algorithms to find rewards (e.g. https://arxiv.org/abs/2102.06741)

> Maybe we need to frame RL goals in much simpler terms, and allow genetic algorithms to evolve their own inputs and reward functions on their own.

I was checking HN while the current iteration of this algorithm runs (gradient-based now; genetic was my master's thesis). The main complexity is figuring out:

1) What are the sub-goals, e.g. grasping things
2) How to solve those goals, e.g. motor control
3) How to do something useful, e.g. surviving

Balancing those three processes is the current hurdle.

For more info my email is delvermm at mila.quebec


Also, evolution isn't trying to get to human-level intelligence. It's just one out of millions of adaptations that work, it's recent, and it's rare. Change Earth's parameters a little over the past several million years, and maybe we don't evolve.


It's an extremely open-ended process.


> The main issue is that evolution is massively parallel and has had plenty of runtime to get to human-level intelligence.

How many entities are we talking about for substantial evolution? I know roughly 100 billion "humans" have ever lived (not that the category is clear-cut), so I'm guessing this is on the order of trillions of entities to simulate for some evolution (though maybe I'm really underestimating the early tail of the tons and tons of microorganisms and small, short-lived life that got us to this point).

Is the bandwidth of evolution that much larger than what we could possibly simulate with computation, especially for a much simpler world/task than "generally survive"?


Given that a single teaspoon of soil probably has about a billion bacterial organisms in it, I suspect you're many orders of magnitude short.
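A quick order-of-magnitude check of the gap, using rough, commonly cited figures (my own assumptions, not from this thread):

```python
SECONDS_PER_YEAR = 3.15e7

bacteria_alive = 5e30        # rough published estimate of bacteria on Earth
generation_time_s = 1e5      # ~1 day per bacterial generation, very rough
runtime_s = 3e9 * SECONDS_PER_YEAR   # ~3 billion years of evolution

generations = runtime_s / generation_time_s
organism_lifetimes = bacteria_alive * generations   # bacteria alone
humans_ever = 1e11           # the "100 billion humans" figure above

print(f"bacterial generations: {generations:.1e}")
print(f"organism-lifetimes:    {organism_lifetimes:.1e}")
print(f"humans ever lived:     {humans_ever:.0e}")
```

By this rough count, a simulation of "trillions of entities" undershoots the bacterial contribution alone by tens of orders of magnitude.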


The purpose of any scientific field is to generate knowledge, i.e. to actually understand the conceptual underpinnings of something like intelligence.

This idea that all that's necessary for engineering intelligence is to throw some chemicals into a bucket and turn the heat on is a bad one. By that logic you could just write a universe simulator, wait a million years, and maybe solve AI as a side effect. If AI is just evolution, evolution is just genetics, and genetics is just physics, then just solve physics and we're good to go.

It's like someone trying to build a bridge by cobbling things together until it stands up and then praying it doesn't fall down. That's obviously not how engineering works, but that's the attitude we have towards AI.

What AI needs at this point is the very opposite: an actual high-level theory of intelligence, because we haven't really made progress on that front in decades.


The problem with this view is that the outside world contains such an immensely vast amount of data that the problem becomes computationally intractable.

It took ~3 billion years of real evolution to reach humans. This evolution occurred at planet scale, including naturally formed barriers that rose and fell, changing climatic conditions, and even a few stellar events to shake things up. There isn't even a compelling reason to think humans couldn't have arisen anytime within the last ~300 million years, which implies that the probability of intelligent life emerging is low, or that the conditions for it are poorly understood and rare.

Effectively, every attempt to train an agent that has to interact with the real world runs into these problems. The usual response is to make the reward more complex and the simulated environment more realistic, both of which increase the computational cost of the problem faster than the improvements arrive.


Well, the failure of multilayer perceptrons to converge to useful models was down to limited compute, which was eventually solved by the advent of powerful GPUs and CUDA, which popularised repurposing parallel hardware meant for graphics rendering for the linear-algebra operations used in backpropagation.

Maybe the problem isn't that the RL algorithm is wrong, but that it just doesn't work without a 3-billion-year, atom-resolution, planet-scale computer.


> and allow genetic algorithms to evolve their own inputs and reward functions on their own

I've been playing with genetic algorithms for years as a hobby, and this type of approach was a dead end for me: the GA entities would just "game the system," min/maxing the reward in surprising ways.

My latest genetic algorithm creation https://littlefish.fish has performed far better at pattern recognition than I expected. I really think they've got massive potential.


I tried

1 4 9 16 25 16 9 4 1 4 9

It did not predict what I had in mind.


What did you have in mind?


16
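For what it's worth, the sequence fits squares of a base that bounces between 1 and 5 (1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, ...). A minimal generator, assuming that was the intended pattern:

```python
def bouncing_squares(length, lo=1, hi=5):
    """Squares of a base that counts up from lo to hi, then back down."""
    seq, n, step = [], lo, 1
    for _ in range(length):
        seq.append(n * n)
        if n == hi:
            step = -1
        elif n == lo:
            step = 1
        n += step
    return seq

print(bouncing_squares(12))
```

The first 11 terms reproduce the input above, and the 12th comes out to 16, matching the reply.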


> anyone has looked into using genetic algorithms combined with RL where the genetics determine the reward function

There is an introductory guide to Rust, deep learning, and genetic algorithms that I really like, called "Learning to Fly" [0]

[0]: https://pwy.io/en/posts/learning-to-fly-pt1/


I think the stove and sex examples are on the right track, but those are qualities every animal experiences. Well... judging by the face of a dog when he's humping your leg, I'm sure it feels good for him.

Anyway, I think there's another missing ingredient, one that we humans uniquely have: the fear of death, the knowledge that for all our intellect and powers as humans, we will inevitably die. It's better summed up by terror management theory, I believe.


Dogs have achieved an unbroken chain of survival dating back as far as your ancestors have, so they've achieved the same survival goals as you.

They've managed to do so without our intellectual abilities, which goes to show that our goal of making RL algorithms "smart" may be malformed, since high-level, complex, abstract reasoning skills apparently aren't a necessary prerequisite for survival, at least in Earth's environment.


Worker ants end their own chain of survival but are still necessary, so chain of survival among individuals is a flawed metric.


Indeed: the genetics of a living thing aren't evidence of its fitness until it has reproduced; once it has, they are.


I found this video an excellent take on open-ended goals in RL: https://youtu.be/lhYGXYeMq_E


One benefit of genetic algorithms is that they can handle multiple objectives, as in the NSGA-II algorithm. I used it to evolve a neural net in my master's thesis.
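For readers unfamiliar with it, the heart of NSGA-II is ranking by Pareto dominance instead of a single scalar fitness. A minimal sketch of just that core idea (maximisation convention; the full algorithm adds non-dominated sorting into fronts and crowding distance):

```python
def dominates(a, b):
    """True if objective vector a is at least as good as b everywhere
    and strictly better on at least one objective."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(scores):
    """Return the non-dominated subset of a list of objective vectors."""
    return [s for s in scores
            if not any(dominates(t, s) for t in scores if t is not s)]

# e.g. two objectives (accuracy, efficiency), hypothetical values:
scores = [(1.0, 0.2), (0.8, 0.9), (0.5, 0.5), (0.2, 1.0)]
print(pareto_front(scores))  # (0.5, 0.5) is dominated by (0.8, 0.9)
```

Instead of collapsing objectives into one weighted score, selection keeps the whole trade-off surface, which is what makes multi-objective GAs attractive here.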



