Teaching physics to neural networks removes 'chaos blindness' (phys.org)
143 points by JacobLinney on June 23, 2020 | 74 comments



I believe this refers to work presented in this journal article. https://journals.aps.org/pre/abstract/10.1103/PhysRevE.101.0...

Abstract: Artificial neural networks are universal function approximators. They can forecast dynamics, but they may need impractically many neurons to do so, especially if the dynamics is chaotic. We use neural networks that incorporate Hamiltonian dynamics to efficiently learn phase space orbits even as nonlinear systems transition from order to chaos. We demonstrate Hamiltonian neural networks on a widely used dynamics benchmark, the Hénon-Heiles potential, and on nonperturbative dynamical billiards. We introspect to elucidate the Hamiltonian neural network forecasting.
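
For reference, the Hénon-Heiles benchmark mentioned in the abstract has a simple closed-form Hamiltonian. A minimal definition (plain Python, variable names mine) looks like this:

    def henon_heiles_hamiltonian(qx, qy, px, py):
        # H = (px^2 + py^2)/2 + (qx^2 + qy^2)/2 + qx^2*qy - qy^3/3
        kinetic = 0.5 * (px ** 2 + py ** 2)
        potential = 0.5 * (qx ** 2 + qy ** 2) + qx ** 2 * qy - qy ** 3 / 3.0
        return kinetic + potential

The relevant property for this discussion is that orbits in this potential transition from regular to chaotic as the energy increases, which is exactly the regime the paper probes.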


Brings to mind this classic from the Jargon File:

http://www.catb.org/~esr/jargon/html/koans.html

In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. “What are you doing?”, asked Minsky.

“I am training a randomly wired neural net to play Tic-Tac-Toe” Sussman replied.

“Why is the net wired randomly?”, asked Minsky.

“I do not want it to have any preconceptions of how to play”, Sussman said. Minsky then shut his eyes. “Why do you close your eyes?”, Sussman asked his teacher.

“So that the room will be empty.”

At that moment, Sussman was enlightened.


I don’t get it :(


I think it means - just as closing your eyes doesn't mean the room becomes empty, wiring the learning network randomly doesn't mean you'll end up with no pre-conceptions (e.g. the rule system at least will need to be programmed in).


It also doesn't avoid preconceptions directly. It just initializes random ones.


Aren’t preconceptions not random by definition?


I think he's suggesting that any given random set of preconceptions will itself be a 'sample' from the entire preconception-space, and therefore itself a preconception - just one that you didn't involve yourself in choosing.


If your preconceptions were not random relative to the truth, wouldn't you be able to know how to correct them?


I’ve said this before, but I think that a lack of physical modeling might be the key barrier for AV technology. Human drivers have a mental model of physics that they’ve honed for 17-18 hours a day since they were born.


Don't sell biology short like that. Human drivers are born with a mental model of physics that's been honed 24 hours a day since before they were diatoms.


But we're all a blank slate!!!


Even blank slates have specific properties, such as being better or worse at handling various types of information.


In the current context you're skirting heresy here... >.>


This is not at all true..


The slate structure and even size is significantly different.


I don't think that's quite right. I believe that humans are essentially born as blank neural networks; it's the structure, and the graph of connections between brain structures and sensory inputs, that is effectively primed for learning certain tasks that we find to be intuitive.

A baby is not born with the knowledge of body movement, for example, but through natural exploration of the body and environment, almost all physically capable humans learn to walk.


https://en.wikipedia.org/wiki/Fixed_action_pattern

More generally, reptiles are born with nearly all the behaviors they'll need throughout life. Why wouldn't humans be born with some?


>action patterns are said to be produced by the innate releasing mechanism, a "hard-wired" neural network, in response to a sign stimulus or releaser

This is exactly what I'm talking about. Just like a baby deer "instinctively" can walk but wobbles around for the first few hours, what you're seeing is something very similar to a purpose-evolved neural network structure whose weights are being set through the principle of firing and wiring together (I forget what it's called).

I can't believe I got -4 for that!

Edit: Hebbian learning. Point is, it's probably far too much information to encode in DNA, but if you structure your neural network properly, you encode, how could I put it, the general topology of the problem you are attempting to solve, and through reinforcement learning "fill in the blanks" by training weights (or Hebbian learning, which functions similarly).
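
For anyone unfamiliar: Hebbian learning is the "fire together, wire together" rule referred to above. A toy sketch, entirely my own illustration rather than anyone's actual model:

    import numpy as np

    def hebbian_update(weights, pre, post, lr=0.01):
        # Hebb's rule: strengthen each connection in proportion to how often
        # its pre- and post-synaptic units are active at the same time.
        return weights + lr * np.outer(post, pre)

The point being made is that DNA could plausibly encode the wiring topology compactly, while experience fills in the weights via a rule like this.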


There's even a lot of procedural building during fetal development, with limited input.

Brain scans of not-yet-born babies show specific kinds of brain waves.


Your hypothesis feels correct, and is one of three parts that I feel are missing from current deep learning networks.

1) Pre-existing structures that are already specialized for the necessary tasks, but untrained. We kind of mimic this with transfer learning, and by discovering more appropriate general architectures by hand.

2) Training while inferring. We very crudely approximate this by releasing updated models every month, but I think it would be best if it were also performed at the edge. Google has begun doing this; I have hope for 'federated learning' [0].

3) 20+ years of exaflop training.

More narrowly focused on this article: I believe researchers keep finding that models architected to solve the most general case possible consistently perform better on highly specific tasks than models trained only on those specific tasks. Creating models that understand general physics definitely follows that trend. Although I suspect (as I believe you do) that scaling will be hampered without some sort of ML "fixed action patterns".

My thinking about this topic has been strongly guided by a special issue of Scientific American Mind that I read in 2013 [1]. The issue was hard for me to find today because it's not listed in the usual archives, due to being a special edition: Scientific American Mind, September 2013, Volume 22, Issue 3s.

The whole issue is devoted to optical illusions and what they can tell us about how our brain uses evolutionary shortcuts to efficiently determine things in the real world. "In the wild", these shortcuts improve accuracy and speed of inference. But with artificial stimuli, they can lead us astray, and do in the case of artificially generated optical illusions.

As for the -4 (which is the maximum negative you can go on HN) I think some people just saw the first part and clicked downvote at that point.

> I don't think that's quite right. I believe that humans are essentially born as blank neural networks

I wouldn't worry about the vote counter. "Those who play for applause, that's all they'll get." -Wynton Marsalis' dad.

Following up like this to clarify for us idiots is really the best thing to do, maybe editing the original comment for clarity if you really feel like it.

0: https://ai.googleblog.com/2017/04/federated-learning-collabo...

1: https://www.scientificamerican.com/magazine/special-editions...


> A baby is not born with the knowledge of body movement

Anyone who's witnessed a birth can tell you this is wrong.


Eh, I've seen my baby sucking on its thumb, and then the baby jerks and the thumb goes away, baby cries, somehow the wild flailing gets the thumb back to the mouth. Baby happy again, sucking on its own thumb.

I agree that we are not born a blank slate, but at the same time, there's a lot of knowledge missing on a newborn baby.


Even if the baby has trouble getting the thumb into the mouth at first, it has no trouble sucking it. Or any of thousands of other instinctive behaviors.


Yep, the baby knows how to cry when not sucking its thumb, for example, and not cry when it's content.


Perhaps born is the wrong word. That freshly-born baby did have nine months of gestation during which it's undoubtedly been exploring how to move about and sense its very limited environment.


Right but what it's doing in the womb is also lots of moving around ("kicking" and "jumping") so I just don't think this argument makes much sense to anyone who has experienced having a baby.


You don't think moving around is a normal part of a brain learning to move and sense?


“Pushed to prod”


They're capable of movement, I wouldn't say they have knowledge of it though. It can take them a while to even figure out suckling, and deliberate directed movements can take a month or two. Until then they're pretty much flailing randomly and gathering training data.


One thing that makes babies hard to analyze is that it is difficult to tell what they do not know how to do, versus what they are not physically able to do reliably, versus unlearning coping mechanisms for their limited abilities once they start being able to. Flailing randomly could certainly be trying out different movements, but it could also be a lack of hand-eye coordination making it difficult to apply instinctual knowledge.


True, but I think it’s fair to assume that when they’re happily sucking on a finger, suddenly rip their hand away from their face, and then punch themselves in the face repeatedly before starting to cry, that is not deliberate.


We're not a pure blank slate.


We're not even a dirty blank slate.

There are a zillion skills, aptitudes, and personality traits more or less hardcoded in a newborn.

Which is why there is such a thing as "human nature" that is meaningful to talk about.


>Which is why there is such a thing as "human nature" that is meaningful to talk about.

That kind of "human nature" that comes as a conclusion from the fact of what's hardcoded in a newborn baby is only a trivial kind, and not generally what people mean when they talk about "human nature" any more than the fact that babies are born with different eye colours tells us about human nature. Human nature, by definition, is found common to all humans, so a difference in "skills and aptitudes" does not say anything with regards to human nature (or the essence or appearance of it) other than "humans have skills, aptitudes and personality traits hardcoded" (which seems like a very strong claim to me anyway), but that itself would only be a trivial statement. It wouldn't tell us whether it's human nature (in the transhistorical, transsocietal sense) to be cooperative or greedy, violent or peaceful, etc.

Even so, understanding the fact that there is a human nature does not bring us much closer to what that human nature entails. Anthropologists, historians, economists, philosophers, and (some) evolutionary psychologists have a lot to say on the topic. To say that something is "just human nature" requires more evidence than "human nature is unchanging, applicable to all, and transhistorical".


Accepting that there is such a thing as "human nature" is a big thing, and at odds with much current thinking.

Many modern people assume human behavior is an effect of upbringing and social cues. A mental model where it is a mix of upbringing and people's inherent human nature can be shocking to many.

Turkheimer's Three Laws of Behavior Genetics is a good world shaking introduction to this world:

https://teammccallum.wordpress.com/3-laws-of-behaviour-genet...


>Many modern people assume human behavior is an effect of upbringing and social cues. A mental model where it is a mix of upbringing and people's inherent human nature can be shocking to many.

Many modern people don't mean everyone. The actual scientific literature on the nature vs nurture debate shows a surprisingly balanced picture which leaves some room for the hope of change. Still, defining human nature, and using it in a way which is non-trivial, requires a lot of work, especially the more abstract you go. Something along the lines of the quote, "To look at people in capitalist society and conclude that human nature is egoism is like looking at people in a factory where pollution is destroying their lungs and saying that it is human nature to cough."

Maybe we should be receptive to the arguments about human nature rather than (1) assuming it means what we think it does (2) assuming everyone who uses the term shares that meaning (3) assuming we know its essence and appearance right now.


OK, I doubt we disagree on much concretely. I'll happily cosign that this "requires a lot of work" to understand.

For context, this thread started with someone saying "I believe that humans are essentially born as blank neural networks", so that's what I'm arguing against.


I disagree with this, sort of completely, after having been attacked by an animal. There are millions of years of evolution that will wake up and save you from getting hurt, recognizing what's happening and instantly formulating a response to it.


This is clearly wrong. If you put your pinky finger inside the hand of a baby s/he will grab it and hold it. This can be seen very early.


This doesn’t even make sense. There must be some in-built programming on being able to learn new things, if nothing else.


You are likely correct. I think most researchers would agree, however. The bigger issue is actually learning how to form complex models. People want networks to just learn this implicitly, believing that we would likely impose counterproductive models. Other people simply struggle to incorporate models into the training process.


Two Minute Papers has good videos on neural nets learning physical modeling:

https://www.youtube.com/watch?v=2Bw5f4vYL98


Side note, is "AV" to mean "autonomous vehicles" (assumed from context) a common usage? I've only ever heard it mean "audio/visual".


Yes. That usage of the acronym works when there are enough context clues. I think it will supplant the "audio/visual" meaning as autonomous vehicles become more salient. Here is text from an old job posting at Ford quoted on TechCrunch:

"We are seeking exceptional candidates to join our growing Autonomous Vehicle (AV) business team!"

https://techcrunch.com/2019/03/13/ford-is-expanding-its-self...


> autonomous vehicles

...oh that makes so much more sense! -.-


Vehicle dynamics is a fairly accurate science these days (50/50 for the tires)


I'm working on autonomous off-road vehicles, and while this is (probably) true for autonomous cars, dynamics modeling for wheeled robots on rough terrain is another beast where these approaches could very much help.


Is the issue in the surface modelling? I don't think I've ever seen a physical tire model for loose terrain


People in space robotics have been working on that (Moon and Mars rovers need to deal with this). Perception is also a bottleneck; you have to see rocks, roots, grass, and mud and predict their effects on the dynamics.


Sure, but to be clear, I meant physical modeling which includes real-time modeling of all salient objects and surfaces in the immediate and foreseeable environment. I mean going as far as creating a physical model for deer, their range of behavior and speed, weight distribution, predictive modeling for subsequent behavior, etc...


Racing teams and big car manufacturers have incredibly accurate models of vehicle dynamics.


But not outside those teams. If you want to put something together in a few weeks, your options are relatively limited, in that collecting accurate data is fairly hard. The actual dynamics of the car is fairly simple, but the forces applied to it are quite hard to model (I don't know how much Michelin charges to use TameTire, but I'm guessing it's not cheap).


This isn't something that has never been thought of. Jim Keller described many problems like changing lanes as a matter of ballistics.


Why do you need a neural network when you have the Hamiltonian mechanics of the system modeled? I've always understood Lagrangian/Hamiltonian mechanics to be methods of modeling the behavior of a system through the decomposition of the external constraints and forces acting on a body. In other words, you can understand a complex model by doing some calculus on the less complex constituents of the model.

I'm probably misunderstanding what they accomplished, but it sounds like they've increased the accuracy of a neural network model of a system, notably for edge cases, by training it on a complete model of said system.


> it sounds like they've increased the accuracy of a neural network model of a system, notably for edge cases, by training it on a complete model of said system.

Not quite. It's really just that they require the dynamics to be Hamiltonian, which would be highly atypical of the kind of dynamics an otherwise unconstrained neural network would learn. This is reflected in their loss functions, the first of which learns an arbitrary second-order differential equation, the second of which enforces Hamiltonian dynamics.
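
To make that concrete, here is a rough sketch of what those two losses could look like in PyTorch-style code. This is my paraphrase of the description above, not the authors' code, and the exact inputs and outputs in the paper differ slightly:

    import torch

    def baseline_loss(net, qp, qp_dot):
        # Unconstrained net: directly regress the observed rates of change.
        return ((net(qp) - qp_dot) ** 2).mean()

    def hnn_loss(hnet, qp, qp_dot):
        # Hamiltonian net: output a scalar H, differentiate it, and require
        # dq/dt = dH/dp and dp/dt = -dH/dq (Hamilton's equations).
        qp = qp.requires_grad_(True)
        H = hnet(qp).sum()
        dH = torch.autograd.grad(H, qp, create_graph=True)[0]
        dHdq, dHdp = dH.chunk(2, dim=-1)
        q_dot, p_dot = qp_dot.chunk(2, dim=-1)
        return ((q_dot - dHdp) ** 2 + (p_dot + dHdq) ** 2).mean()

Note that the second loss never sees the true Hamiltonian; it only insists that whatever scalar the network learns behaves like one.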

I don't understand how this was considered novel enough to warrant a PRE paper.

Here is a link to the paper:

https://journals.aps.org/pre/pdf/10.1103/PhysRevE.101.062207


For some systems, even with the Lagrangian/Hamiltonian setup, you're solving differential equations with numerical techniques that have error. It might be that the neural network has less error than the standard techniques. This is a guess.
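
For comparison, the standard numerical route is a fixed-step integrator whose error is controlled by the step size. A toy example of my own (symplectic Euler on a 1D harmonic oscillator with H = (p^2 + q^2)/2):

    def symplectic_euler(q, p, h, steps, dHdq, dHdp):
        # Step dq/dt = dH/dp, dp/dt = -dH/dq with a fixed step h.
        # Local truncation error per step is O(h^2), so accuracy is bought
        # by taking more, smaller steps.
        for _ in range(steps):
            p = p - h * dHdq(q)
            q = q + h * dHdp(p)
        return q, p

    q, p = symplectic_euler(1.0, 0.0, h=0.01, steps=1000,
                            dHdq=lambda q: q, dHdp=lambda p: p)

Whether a trained network beats that trade-off per unit of compute is exactly the open question.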


Hamiltonian NNs are not a new thing. There was a NIPS 2019 paper [0] that attempted to do the same for some toy problems.

In general, the idea of including model- or context-based information in neural networks goes along the lines of Kahneman's System I and System II of the human mind. System I is the "emotional" brain that is fast and makes decisions quickly, while System II is the "rational" brain that is slow and expensive and takes time to compute a response. Researchers have been trying to develop ML models that utilize this dichotomy by building corresponding dual modules, but the major challenge remains in efficiently embedding the assumptions of the world dynamics into the models.

[0] https://arxiv.org/abs/1906.01563

[1] https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow


To be frank, this should be the reference; compare it to numerical integration and see which is better.


I love your comment. I follow, and appreciate it, but I could not help but think of this xkcd: https://xkcd.com/793/


> the NAIL team incorporated Hamiltonian structure into neural networks

ML non-expert here. Is this the same as having an extra column in your input data that's a Hamiltonian of the raw input? Or a kind of neuron that can compute a Hamiltonian on an observation? Or something more complicated?

Is this like a specialized 'functional region' in a biological brain (Broca's area, cerebellum)?


Also ML non-expert here. I think this is about a different kind of neuron (your 2nd suggestion). The paper another commenter linked says:

Hamiltonian neural network (HNN) intakes position and momenta {q,p}, outputs the scalar function H, takes its gradient to find its position and momentum rates of change, and minimizes the loss

<latex equation for a modified loss function that differs from traditional NN>

which enforces Hamilton's equations of motion.

https://journals.aps.org/pre/abstract/10.1103/PhysRevE.101.0...


I haven't used HNNs in practice, but it seems that the main difference from common NNs is that the loss function incorporates gradients. It's not a new type of neuron.


Why not shamelessly plug my work here? I see no reason not to.

So, here it is: https://github.com/thesz/nn/tree/master/series

A proof-of-concept implementation of a neural network training process where the loss function is the potential energy in a Lagrangian, and I even incorporated a "speed of light": the "mass" of a particle gets corrected using the Lorentz factor m = m0/sqrt(1 - v^2/c^2).

Everything is done using ideas from a quite interesting paper about the power of lazy semantics: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32....

PS: Proof of concept here means it is grossly inefficient, mainly due to the amount of symbolic computation. Yet it works. In some cases. ;)


This sounds like the opposite of what Richard Sutton seemed to advocate for in his "Bitter Lesson"[0]. I don't know nearly enough to advocate for one thing or the other, but it is fascinating to see that those approaches seem to compete as we venture into the unknown.

[0] http://incompleteideas.net/IncIdeas/BitterLesson.html


They're not the opposite, and both are correct.

Sutton is saying 'over a slightly longer time'.

You can wait 20 more years, and a super-duper deep NN on steroids, running on hardware a million times as big and powerful, would rediscover all of theoretical physics.

Or you could inject some theoretical physics acquired by humans and make DNNs smarter today.


I assume your 20 years is a guesstimate, and I do think it misses the point of Sutton's writing. The trap here is that there's always going to be more computing in the future, so where do we draw the line? The idea is to think differently now, for the pursuit of actual progress down the road. Which, by the way, is exactly what people were doing about 40 years ago, and what put down more than the foundations for all the tricks we're pulling these days.


I see what Sutton said as a "statistical learning and artificial intelligence" researcher in line with what the authors of the physics paper presented as "an application of learning research to computational science and engineering, CSE, surrounding physics".

CSE researchers did not sit down and wait for AI researchers to learn the bitter lesson before they resumed their work.

CSE research goes on independent of whether AI/GOFAI/ML has a winter, a summer, an ice age, or a global warming.

It just so happens that in light of the recent progress of AI/ML, specifically 2012 to 2019, they see the utility of incorporating a tiny bit of ML into their vast array of methods.

The paper shared in this thread is merely another attempt to advance such an incorporation. If it doesn't pan out, they go back to doing CSE on physics without any AI or ML.


That makes sense. As you said, those two sources don't have to be contradicting each other if they complement instead.


Can someone with AI knowledge please clarify - does this mean we can build 'rules based systems' into AI to synthesise intelligence from both domains?

If so, this would be dramatic, no?

If you could teach a translation service 'grammar' and then also leverage the pattern matching, could this be a 'fundamental' new idea in AI application?

Or is this just something specific?


They model a system which they know to be constrained by a closed-form equation called the Hamiltonian. They (cleverly, IMO) force the network’s predictions to be constrained by the Hamiltonian, by choosing the right output and loss function.

I don’t see a way to generalize this to the procedural rule-based systems you describe, unless they too are governed by a fairly simple continuous function like the Hamiltonian.

I don’t know if it was “dramatic”, but it made me really happy.


So can you teach a NN an equation of motion, and if so, would it execute faster than numerically integrating said equation? It could have impacts on physics simulations, although the accuracy might not be as good.


This sounds pretty terrifying.


Careful there, athesyn. No need to offend our computer overlords.


But... why?



