Why Does the Neocortex Have Columns? A Theory of Learning Structure of the World [pdf] (biorxiv.org)
237 points by blacksmythe on Oct 25, 2017 | 73 comments



Jeff Hawkins gets shit on a lot in ML because his theories haven’t produced models with results as good as the mainstream approach, but I’m glad they keep working at it and keep coming up with interesting ideas.

Too much of ML these days is about some NN model that does 0.n% better than SOTA on some specific task. Then you change one tiny parameter and the entire thing breaks, and it turns out we didn’t understand why it was working at all.


I do think his reasoning about, and focus on, what's wrong with ML today is good. Online learning, sequence learning and "motor" feedback are all critical.

Unfortunately he came up with this HTM and appears to be trying to cram all advances into the same framework rather than stepping back and thinking about how those critical features could be implemented differently.

Also, as far as I know, Hebbian learning (like HTM) has achieved ~90% accuracy on MNIST while SOTA is above 99% accuracy. So it's not a small gap between HTM and state of the art.

If HTM was near 98% it would be impressive (learning is faster and transfer better). I think alternatives or hybrids are necessary, but he seems stuck on his pet project.

That said, on the principles and "what's important for intelligence", I think he is right on target, and he articulates those aspects well.


>Jeff Hawkins gets shit on a lot in ML because his theories haven’t produced models with results as good as the mainstream approach, but I’m glad they keep working at it and keep coming up with interesting ideas.

And also because Numenta's work isn't good, empirically-checkable neuroscience either.


Not sure whether that was a jab at Numenta or not, but I do think the combination of these two comments cuts to the heart of the PR problem for Numenta: they're neither trying to do 0.1% better on the classic perception benchmarks than other algorithms do, nor trying to publish accurate, testable descriptions of the true details of meatspace neuroscience.

To me, exploring alternative network architectures and algorithms seems an extremely worthwhile goal even if it's only loosely tethered to actual biology, but from a PR perspective they really need to be better about priming the conversation if they want people to care.

Bad (neuroscience-focused): "We're doing a lot of research on neuroscience, and finding some really interesting stuff, so we built a model that doesn't exactly match the way the brain works but is still interesting. No, we haven't tried to make it work to classify ImageNet test cases, that's not our goal. But look, it's closer to biology, and we have working code that we're playing with!"

Better (ML-focused): "We're developing a novel neural network architecture that performs online unsupervised learning using only local update rules. Though it performs competently at classic benchmarks X, Y, and Z when a small WTA layer is thrown on top, it can also tackle problems A, B, and C that classical deep learning networks can't make any progress on."

To be fair, I'm not even sure if Numenta's networks could perform competently at any classic benchmarks (I'm guessing that if they could, it would take some work to get them to do so), and I have no idea what new problems it could work on. But they really do need to reframe the conversation and emphasize that sort of innovation if they want to be taken more seriously - focusing on neuroscience underpinnings is not a great move if they're not engaged in research that can actually win over neuroscientists, and just pointing out that they're focusing on those things is not a way to win over industry ML folks if they don't have any results to point at.
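To make "online unsupervised learning using only local update rules" with a WTA step a bit more concrete, here is a minimal sketch of generic competitive learning (a Hebbian-style local rule plus winner-take-all). This is not Numenta's HTM and not anyone's actual architecture; the function name `wta_update`, the learning rate, and the toy two-cluster data are all made up for illustration.

```python
import random

def wta_update(weights, x, lr=0.1):
    """One online step: pick the best-matching unit, move only its weights toward x."""
    # Winner-take-all: the unit whose weight vector has the largest dot product with x.
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    winner = max(range(len(weights)), key=scores.__getitem__)
    # Local rule: the update uses only the winner's own weights and the current input.
    weights[winner] = [w_i + lr * (x_i - w_i) for w_i, x_i in zip(weights[winner], x)]
    return winner

# Hypothetical toy stream: inputs arrive one at a time, no labels, no stored dataset.
random.seed(0)
weights = [[random.random() for _ in range(2)] for _ in range(2)]
for _ in range(200):
    center = random.choice([(1.0, 0.0), (0.0, 1.0)])
    x = [c + random.gauss(0, 0.05) for c in center]
    wta_update(weights, x)
```

Nothing here says such a rule would be competitive on a real benchmark. It only shows what "local" and "online" mean in practice: each step sees one input and touches one unit.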


>To be fair, I'm not even sure if Numenta's networks could perform competently at any classic benchmarks (I'm guessing that if they could, it would take some work to get them to do so), and I have no idea what new problems it could work on. But they really do need to reframe the conversation and emphasize that sort of innovation if they want to be taken more seriously - focusing on neuroscience underpinnings is not a great move if they're not engaged in research that can actually win over neuroscientists, and just pointing out that they're focusing on those things is not a way to win over industry ML folks if they don't have any results to point at.

It was definitely a jab, but I've also got some sympathy for their project. I genuinely agree that, well, theoretical and computational neuroscience need to become more genuinely computational! We're seeing an emerging computational paradigm for neuroscience that isn't just about jamming "network architectures" or "neural circuits" together and hoping something works; it supposedly has strong mathematical principles.

Ok, so where's the code? Sincere question. Some papers do simulations in Matlab, R, or Python that's just not shared. This includes even papers that purport to be applying these neuroscience-derived principles to robotics problems.

Computational cognitive science does a bunch better: their custom-built Matlab gets shared!

If we really believe our theories, we should put them to the computational test. If we put them to the test and they don't work well, we should either revise the theories, or revise the benchmarks. Maybe ImageNet classification scores are a bad idea for how to measure precise, accurate sensorimotor inference! New benchmarks for measuring the performance of "real" cognitive systems are a great idea! Let's do it!

But that requires that we do the slow work of trying to merge theoretical/computational neurosci, cognitive science, and ML/AI back together, at least in some subfields. This is challenging, because nobody's gonna give us our own journal for it until a few prestigious people advocate for one.


What part of Numenta's model is wrong? I certainly see some guesses and simplifications in their model, especially in poorly researched areas, but I figure that's the cost of trying to get things to work.


Numenta's HTM theory seems functionally similar to variants of N-gram models (https://en.wikipedia.org/wiki/N-gram), where the model is able to predict future steps based on learned conditional probabilities. Maybe that's why it hasn't worked exceptionally well?
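For anyone unfamiliar with the comparison, here is a minimal sketch of an N-gram predictor: it counts which symbol follows each context and turns the counts into conditional probabilities. It is only an illustration of the N-gram idea, not of HTM itself; the function names and the toy sequence are made up.

```python
from collections import Counter, defaultdict

def train_ngram(sequence, n=2):
    """Count how often each symbol follows each (n-1)-symbol context."""
    counts = defaultdict(Counter)
    for i in range(len(sequence) - n + 1):
        context = tuple(sequence[i:i + n - 1])
        nxt = sequence[i + n - 1]
        counts[context][nxt] += 1
    return counts

def predict_next(counts, context):
    """Return {symbol: P(symbol | context)} estimated from the learned counts."""
    seen = counts.get(tuple(context))
    if not seen:
        return {}  # unseen context: this plain model cannot predict anything
    total = sum(seen.values())
    return {sym: c / total for sym, c in seen.items()}

# Toy usage: after "ab" the training sequence continued with "c" twice and "d" once.
model = train_ngram(list("abcabcabd"), n=3)
print(predict_next(model, "ab"))
```

The fixed context length and the empty answer for unseen contexts are the usual weak points of plain N-gram models.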


Cargo cult programming abounds in a lot of the ML stuff I read. It's complex and complicated, and I think it happens because a lot of the research happens in corporate environments with top-down pressure. I have worked in computational modeling of sensory motor systems, and it is really challenging work. This is way more interesting than banal model performance bragging!


As a neuroscientist who just started doing ML research, I would call this paper cargo cult programming. If you cobble together a hodgepodge of ideas from neuroscience and build a network to accomplish some trivial task with no baseline to compare it to, I find it really difficult to take anything away from that. Ignoring the cortical column aspect, I'm not particularly convinced that Hawkins's model is a better approximation of biology than a typical deep neural network, just different (and likely far less capable, if you were to apply it to a challenging task). Why not start with a network that we know works and make a biologically-inspired change, and then see if that improves performance on a well-studied problem? If it does, then you have 1) an improvement on the previous network and 2) weak evidence that your idea of how the brain works may be right, if we assume that the brain is a highly optimized information processing device.


> Jeff Hawkins gets shit on a lot in ML because his theories haven’t produced models with results as good as the mainstream approach, but I’m glad they keep working at it and keep coming up with interesting ideas.

I just ran across a paper [1] from Numenta on time series anomaly detection using HTM last night, which provides a benchmark [2] against some existing approaches. (But it seems to me there is no NN-based approach among them.)

[1] Unsupervised real-time anomaly detection for streaming data http://www.sciencedirect.com/science/article/pii/S0925231217...

[2] https://github.com/numenta/NAB


Show me how HTM solves any multi-step process. Tic-tac-toe. Anything.


I thought he gets shit because he patented his approach instead of making it available for everyone (please do correct me if I'm wrong).



That's the copyright license, which is orthogonal to patent issues.

Edit: Apparently not, GPL v3 has clauses regarding patent grants in it.


https://www.gnu.org/licenses/rms-why-gplv3.en.html

It looks like it includes some patent information.


Huh, I didn't realise that. That 'patents' clause is pretty powerful - if my (very quick) reading is right, any time you contribute to a GPL3 project, any patents that you own which would apply to that project are basically set free for anyone to use. Cool for FOSS advocates but I can see why a lot of businesses are wary of anything using GPL3.


The initial GPLv3 draft was a copy of the clause from the Apache license:

"each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work,"

The final GPLv3 text says:

"Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version."

As can be seen, they are very similar. The key difference between Apache, a fairly popular license with businesses, and GPLv3 in the realm of patents is the Novell-Microsoft pact clause. It's not about patents that you own, but rather patents that you license from others.

"If you convey a covered work, knowingly relying on a patent license,... you must either ... arrange to deprive yourself of the benefit of the patent license for this particular work or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients."

This clause is the crux of the issue. Large companies with large legal departments have complex patent deals with competitors. Depriving yourself of the patent license would mean opening yourself up to being sued, and the other option would mean renegotiating with the competitor to grant the patent to everyone.


From a mostly outside point of view, I've wondered a lot about this. It seems a lot of ML/DL research that goes on nowadays amounts to educated guessing and tinkering i.e. "Based on our notion of what seemed right, we tried this other configuration of layers and/or activation functions and look; it worked".

I imagine at some point that the kind of research discussed in the linked post will begin to pan out more and more and will eventually lead to another period of rapid progress like that which followed Hinton's work on DL.


Full paper here: https://www.biorxiv.org/content/biorxiv/early/2017/09/28/162...

I always enjoy reading analysis/ideas/intuitions about how the brain works, because it provides inspiration for machine learning improvements that can be applied in the real world.

That said, I’m still optimistically waiting for Numenta (and Geoff Hinton’s capsule theory) to set the new bar at one of the many difficult image/speech/language/etc recognition challenges.

Ideas are great, but at the end of the day science moves forward when we measure our ideas against reality. (To the credit of this paper, it does make a series of predictions, though those seem extremely difficult to measure in biological systems for the time being.)


Skeptical. Previously the brain was a machine consisting of drives and pulleys, just like the state of technology of the day. Now the brain is a computer that runs deep learning models. It would be one thing to just say the cortex has columns. It's another to go on and model what those columns are doing with linear algebra. The coincidence that this picture emerges at the same time as current fads in tech is too great to ignore

I am more interested in the work that is identifying different kinds of cells, e.g. place cells.


> The coincidence that this picture emerges at the same time as current fads in tech is too great to ignore

It's no coincidence. I see it as a kind of reframing of the issue from a different perspective/approach, and for that we use whatever seems the best explanatory framework we have at hand at the time. Gotta start somewhere; you can't try to paint a picture without a frame and at least some colors.

One can easily see how the frameworks we use keep getting more complex, from drives and pulleys to computers and NNs, so there clearly is some progress in how we are trying to describe the workings of the brain/consciousness.

In the end, it's also about what the actual goal is here: trying to understand the human brain/consciousness, or trying to artificially create "consciousness", or rather something resembling it. The latter doesn't necessarily require the former.


Not consciousness, but intelligence. Big difference.


Not sure why you were downvoted. It's important to keep the two terms separate for the sake of discourse, even if "ultimately" they turn out to be the same.

Can something be more conscious but less intelligent than something else, or vice versa? We don't know for sure yet.


Are there commonly agreed definitions for those terms in the AI/ML community? I don't know, that's why I'm never sure what terms to use.


Cognitive scientists define consciousness as the ability to have subjective experiences. The definition is not mathematically concrete, but sufficient to differentiate from intelligence.


Is it? My personal theory is that consciousness scales: we're more conscious than a bug, but a bug is still definitely conscious.

Then like A.I. overlords / borg hive mind more conscious still.


How far down does the scale go? Are brainless animals like oysters conscious? What about the mycelium network underneath a forest?


Well, maybe a better way to state it: less 'conscious' and more 'things they're conscious of.'


Maybe it does, but does intelligence require consciousness?


And: Is the question empirical or analytic? It is hard to discuss such matters without an agreed framework, definitions. The definitions for these terms are a long-standing analytic conundrum.

Abstractly, it is hard to understand how any performance could require consciousness unless consciousness is defined as a class of performance. If it is, then the question boils down to "Is performance class I a proper subset of performance class C?" Which is, prima facie, an analytic question.


Previously the world was made from Earth, Water, Fire and Air. Then quarks and electrons. Now they are talking about how the entire universe is a computer, and how each Planck area is just a qubit.

The coincidence that this picture emerges at the same time as current fads in tech is too great to ignore.


I agree, it is suspicious, though let's distinguish between hard science and soft sciences, and between observation and theory


Alternatively if your real goal is just to develop some novel ML techniques, it would be better to not make claims about how the brain works and instead just try to draw inspiration using these thought experiments more as a 'muse'


> The coincidence that this picture emerges at the same time as current fads in tech is too great to ignore

Neural networks weren't nearly as much of a fad back in 2004 when Hawkins published On Intelligence. They did a fantastic job of illustrating how the neocortex may be a hierarchy of these pattern recognizers (columns). It's a highly engaging read!


Unfortunately you're skeptical since you're mistaking philosophy for science.

Here: [https://arxiv.org/pdf/1706.06208.pdf], for example - they use deep learning models in order to classify the behaviour of in vivo neurons.


I do believe that our understanding of how our brain works is very close to what ML is doing.

There is probably a lot to learn about the underlying core structure but the basics will not change that much anymore.


In the sense that there are neurons firing off each other, sure. But that has been known (and modelled in hardware) since the early 1950s.

There is however a qualitative difference in the topology and function, or our very modest wetware wouldn't have been able to outperform about every deep learning model out there.


Sax Russell, is that you?

A practically identical train of thought from that character shows up in one of KSR's Mars books. Always found it an excellent point.


Also laid out quite well in the 1990 movie "Mindwalk". [1]

A movie still worth watching even though it is dated.

1. http://www.imdb.com/title/tt0100151/?ref_=nv_sr_3


Do you have a reference for "when the brain was drives and pulleys"? I've never heard of such a theory.


I think he's referring to this https://en.wikipedia.org/wiki/Mechanism_(philosophy)

In particular, this Descartes quote "I should like you to consider that these functions (including passion, memory, and imagination) follow from the mere arrangement of the machine’s organs every bit as naturally as the movements of a clock or other automaton follow from the arrangement of its counter-weights and wheels."


> place cells

Suiting, given that your username means "place" in German :)


I'm not an expert, but isn't the causality reversed? Neural Nets were developed based on research of the physical structures in the brain, right? If that's the case, it makes sense to make analogies between the two systems.


Yes and no. Neural nets are somewhat inspired by neurons, but it's a pretty loose relation, and the more NNs improve, the less similar to the brain they look. Individual neurons are much more complex than the nodes of a neural network, and even if research into NNs provides insight into the way the brain works, it may be that the analogues of NN nodes are clusters or regions of neurons, or something more mathematically abstract.


Great to see Numenta in the news again. On Intelligence was the book that got me excited about biologically based machine learning. The methods in that book are very different from anything else I’ve seen in current “trendy” ML.


Here are some video resources to help explain this theory:

- https://www.youtube.com/watch?v=BvJJn9VS4rk
- https://www.youtube.com/watch?v=-h-cz7yY-G8


Here's another very interesting talk from Jeff Hawkins about modelling the neocortex: https://www.youtube.com/watch?v=izO2_mCvFaw


The world needs more work being done in the way Hawkins and co. are doing it, and less in the mold of most deep learning/ML work. Why? He's actually trying to connect building intelligent machines with biology. This is a huge problem, but so few are working on it. Rather, we are all distracted by deep learning because of its recent successes in very specific problem areas. In a few years, when we run into its limits, Hawkins and people doing work like this will have a chance to shine (if they produce something that works, of course).


DeepMind does a ton of work in meat neuroscience, and is actually publishing useful research in the field (a recent if somewhat controversial review is at http://www.cell.com/neuron/fulltext/S0896-6273(17)30509-3). And they're about the most prominent deep learning group I can think of.


But wait a minute... the world is already doing quite a good deal of work on this type of problem. Maybe not enough, but it's been going on.

https://en.wikipedia.org/wiki/Computational_neuroscience


Computer vision and NLP are pretty broad problem areas with many applications. These technologies are already powerful enough to influence the world in many positive ways. I wouldn’t say that actually realizing this potential is a “distraction”.


There is also a summary at: https://blog.acolyer.org


... probably not by pure coincidence.

p.s. specific dated link https://blog.acolyer.org/2017/10/25/why-does-the-neocortex-h...


> Error! Problem, or Page Not Found

> Sorry, the page you were looking for does not exist.

Link is broken. From browsing, I believe the correct link is

https://numenta.com/papers-videos-and-more/resources/layers-...


The link works fine if you disable javascript. Still weird though


Understanding how the brain works means understanding how we humans see and understand the world and ourselves within it. The brain organically and selectively chooses what to learn and which information should be retained. Our brains receive training from birth: first the family provides some of its own training, then school and the world. The brain never stops.

Great job Numenta !!


I'm getting "Error! Problem, or Page Not Found Sorry, the page you were looking for does not exist."

Edit: Weird, I just closed my browser and tried again, and the article looks like it flashed into the screen then was replaced by the error message. Happens repeatedly on Chrome on Android...


I'm seeing that too. Ok, we'll change the URL from https://numenta.com/papers/why-does-the-neocortex-have-layer... to the pdf of the paper. There's also a summary at https://blog.acolyer.org/2017/10/25/why-does-the-neocortex-h... that other commenters have mentioned.


I'm probably talking out of my ass but I'm somewhat suspect of a cortex having straight up columns. I'm curious whether these are artifacts of the fact that linear algebra seems to be the dominant algebra in ML/modeling of human perception. Recently I've been dipping my feet into geometric algebra which seems to be the superior algebra for just about anything you can think of (human perception but also like all of physics, Maxwell's 4 equations are reduced to a single equation in GA) and it's particularly better for reasoning about spaces which this seems to be all about.

And unlike linear algebra it actually makes sense (e.g. why is cross product only in three dimensions?, wtf are determinants esp. in the context of matrix division all about?).

This blog post introduces GA and talks about its relationship to human perception.

https://slehar.wordpress.com/2014/03/18/clifford-algebra-a-v...
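On the cross-product question, one common way to see it is through the wedge (exterior) product, which GA builds on. It works in any dimension and yields n(n-1)/2 components, which happens to equal n only when n = 3. A minimal sketch, with function names made up for illustration:

```python
from itertools import combinations

def wedge(a, b):
    """Bivector components (a_i*b_j - a_j*b_i) for every index pair i < j."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return {(i, j): a[i] * b[j] - a[j] * b[i]
            for i, j in combinations(range(len(a)), 2)}

def cross_from_wedge(a, b):
    """In 3D only, the three bivector components can be relabelled as a vector."""
    w = wedge(a, b)
    return [w[(1, 2)], -w[(0, 2)], w[(0, 1)]]

print(cross_from_wedge([1, 0, 0], [0, 1, 0]))   # the familiar 3D cross product
print(len(wedge([1, 2, 3, 4], [5, 6, 7, 8])))   # number of components in 4D
```

In 4D the result has six components, so it can no longer be disguised as another vector in the same space.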


Cortical columns exist [1]; think perpendicular to the surface of the cortex [2, 2nd image, or just image search "cortical column"]. This is due to the morphology of cortical neurons. My understanding is that a column can be thought of as a functional unit, and passing information across columns adds complexity.

Of course biology is messy and there's tons of variation depending on which brain region you're looking at, but Visual Cortex was one of the earliest places this was observed. It gets complicated and detailed quickly, and I'm somewhat out of my element here.

[1] https://en.wikipedia.org/wiki/Cortical_column

[2] http://www.mbfbioscience.com/blog/2012/01/neurolucida-helps-...


There are, of course lateral connections between columns, but the columns are very real.

I don't see what geometric algebra has to do with it, as the grandparent suggests.


They are performing spatial reasoning and GA is a better algebra for that.


I mean I don't see why geometric algebra bears on the factual existence of column-like structure perpendicular to the cortical surface in real brains. Geometric algebra may bear on what those columns do, but it is not relevant to whether the columns exist. This is a matter of observing network wiring from brain preparations.


I think it has something to do with the power of building hierarchical layers of abstraction, to make increasingly precise predictions about the world.


I would have assumed it was reducing complexity (filtering out noise and compressing information), not adding it?


Linear algebra is equivalent to geometric algebra so you wouldn't find columns less likely in GA.

Also, GA is usually hyped by non-mathematicians -- people who don't understand that it's just an alternative notation that doesn't add much to linear algebra.

> Clifford Algebra, a.k.a. Geometric Algebra, is a most extraordinary synergistic confluence of a diverse range of specialized mathematical fields

> It is the very simplicity and generality of Clifford Algebra that confirms its “truth”

kook alarms ringing


> Linear algebra is equivalent to geometric algebra

That depends on your definition of equivalent. They have the same computational power but things make sense in one and less in the other.

> Also, GA is usually hyped by non-mathematicians

What a loaded statement.

> kook alarms ringing

You aren't the first one to say it. Nonetheless, the author is a researcher at Harvard it seems http://cns-alumni.bu.edu/~slehar/Lehar.html

Don't let the form fool you. You can also read any of the many books on the topic.


> Research Fellow in Ophthalmology, Harvard University

And I'm sure he's a fine ophthalmologist.

> I am an independent researcher with a novel theory of mind and brain, inspired by the observed properties of perception. These observations are confirmed by some peculiar anomalies in phenomenal perspective. The implications of these observations are that the foundational assumptions of neuroscience are fundamentally in error, and that an alternative paradigm of neurocomputation will have to be formulated to account for the properties of consciousness and perception.

If it quacks like a quack...


Mammalian cortex (well, in the species that have been studied closely) has both laminar and columnar structures. If you are interested in the biology side of things, take a look at

  http://hubel.med.harvard.edu/book/bcontex.htm

Caveat: it's pretty old, and is probably not quite right in a number of ways. But IMHO still a reasonable lay-person's introduction to what we know about the organization of the early visual pathway.

(Whether this has got anything to do with ML, I'm not in a position to say.)


You're confusing Gibbs-Heaviside vectors with linear algebra. Linear algebra is the study of linear spaces and linear operators between them. Endowing those spaces with a norm or a dot product is usually as far as linear algebra courses go.


Are you saying that cross product isn't a part of linear algebra?


I am.

I suppose you could say that any linear space equipped with some product on it falls into linear algebra, in which case it would be, but an obscure corner at best. Its sole application is electromagnetism in a non-relativistic formulation. That's a hugely important application if you're a physicist, but compared to the number of applications of linear algebra it's basically negligible.

The inner product is the only one usually taught in linear algebra, and that's because it's central to talking about representations of linear operators in bases.


What definition of linear algebra doesn't include the Clifford functor?



