Why Neurons Have Thousands of Synapses: a Theory of Sequence Memory in Neocortex (arxiv.org)
68 points by p1esk on Nov 5, 2015 | 18 comments



Hawkins' mantra is sequence-based memory. I love this guy and his work. And although I think his company has had success with sequence-based AI, it still doesn't totally answer how the brain works. His model basically works by taking encoded symbols and predicting the next encoded symbol. This works great for things with patterns (power consumption, traffic patterns, etc.), but it doesn't work with language. You will not be able to predict what I am going to say, or predict my answer based on the question, at least not very well. The problem is that he needs to figure out meta-computation. His product, NuPIC, takes in a string of words, encodes them into a semantic representation, and tries to predict what's next. But concepts are larger than a single word, and until it can build a working semantic representation of what it's currently processing, it will not prove to be a useful model for describing how the brain works, or at least one we can compare to our brain.
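For concreteness, here is a minimal sketch of what "predict the next encoded symbol" means in the simplest possible case. This is not Numenta's algorithm, just a first-order transition-count predictor, but it shows why this style of model is happy with repetitive signals and has nothing to say about open-ended questions:

    from collections import Counter, defaultdict

    class NextSymbolPredictor:
        """Toy sequence memory: count transitions, predict the most frequent successor."""
        def __init__(self):
            self.transitions = defaultdict(Counter)

        def learn(self, sequence):
            for prev, nxt in zip(sequence, sequence[1:]):
                self.transitions[prev][nxt] += 1

        def predict(self, symbol):
            followers = self.transitions[symbol]
            return followers.most_common(1)[0][0] if followers else None

    p = NextSymbolPredictor()
    p.learn(["low", "low", "high", "low", "low", "high"])   # e.g. hourly power consumption
    print(p.predict("high"))   # -> "low": strongly patterned data is easy
    # A question/answer pair is not a repeating surface pattern, which is the objection above.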


If you study the Numenta algorithm, you will see that the prediction is not only for the next step, and that the prediction is fed forward to another layer, where it becomes the input of a more general and stable prediction.

Also, there is strong evidence that prediction is the main (if not the only) function of the brain. I would recommend Hohwy's book (https://books.google.fr/books/about/The_Predictive_Mind.html...) for an explanation of this idea.


> This works great for things with patterns, but it doesn't work with language.

Sure it does.

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

> Hawkins' mantra is sequence based memory.

Note that Hawkins wrote "On Intelligence" in 2004.

Schmidhuber wrote “Learning complex extended sequences using the principle of history compression” in 1992, and “Long short-term memory”, with Hochreiter, in 1997.

Today, an LSTM-based RNN can answer your Gmail.

http://people.idsia.ch/~juergen/rnn.html

http://googleresearch.blogspot.com/2015/11/computer-respond-...


Sorry! To clarify what I mean by "it doesn't work with language": you can't ask it a question and then expect it to come up with an answer. For most meaningful questions, you can't sequence an answer from its question in a general way. This might be in part because Hawkins isn't trying to mimic the brain, but to learn how it works and apply the algorithms it uses to our needs and wants.


> you can't ask it a question and then expect it to come up with an answer

Why not? If the model was trained on a large enough corpus of texts, it will have seen lots of answers to your question, or similar questions. It can, in principle, extract the important features of those answers, and present them to you as an answer. This is pretty much how the majority of humans would answer "the most meaningful" questions.


NuPIC showcases its "What did the fox eat" use case, which uses cortical.io's SDRs of words. In this example it "asks the question" what the fox eats after teaching it what other animals eat, and without ever seeing "fox" before, it's able to accurately say what it eats, since it groups foxes with coyotes, etc., which have a similar SDR. However, this is a sequence... They fed it sentences only in the form "a eats b" and then fed it "fox eats", and it replied "rodent" or whatever... The SDRs have to be sequences in order for the HTM to work. So if you ask "Guess how many balls are in this jug", you have to do an implied computation to guess how many balls fit inside the jug, and that implied computation is not the word sequence (or SDR sequence). Even if you gave it an infinite number of similar questions and answers like this, it would never figure out a general way to answer the question, which means you could always stump it by giving a different sized jug and a different sized ball.
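To make the fox example concrete, here is a rough sketch of the mechanism being described, with made-up bit sets standing in for cortical.io's word SDRs (the real ones are large sparse binary vectors, so the overlap numbers below are purely illustrative). The point is that the generalization to "fox" comes entirely from SDR similarity to "coyote", not from any computation over the question:

    # Hypothetical word SDRs as sets of active bit indices.
    sdr = {
        "coyote": {3, 7, 19, 42, 77, 101},
        "fox":    {3, 7, 19, 42, 85, 120},   # overlaps heavily with "coyote"
        "cow":    {5, 11, 60, 63, 90, 130},
        "rodent": {2, 30, 33, 71, 88, 140},
        "grass":  {6, 12, 55, 64, 92, 133},
    }

    def overlap(a, b):
        return len(sdr[a] & sdr[b])

    # Training sentences of the fixed form "X eats Y".
    learned = {"coyote": "rodent", "cow": "grass"}

    def answer(animal):
        # Predict by reusing the learned successor of the most similar known animal.
        best = max(learned, key=lambda known: overlap(animal, known))
        return learned[best]

    print(answer("fox"))   # -> "rodent", purely from SDR similarity to "coyote"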


H in HTM stands for "Hierarchical". This means that it can, in principle, construct more general, more abstract SDRs (patterns) based on the many low-level SDRs it sees. Thus it's plausible to see ideas forming out of sentences, or something similar. This is how it can build a world model, and then run your input through multiple levels of abstraction, as many as needed to give a high-confidence answer.
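A cartoon of the hierarchical part, under the assumption that the higher layer simply keeps whatever bits recur across a lower-level sequence; real HTM layers do this with learned cell states rather than this literal trick, so treat it only as the shape of the idea:

    # A higher layer sees a pattern that stays stable across a whole phrase,
    # while the lower layer changes at every word.

    def stable_pool(sdrs):
        """Bits that recur across the sequence survive; the result changes
        slowly even though the word-level input changes at every step."""
        counts = {}
        for s in sdrs:
            for bit in s:
                counts[bit] = counts.get(bit, 0) + 1
        return {bit for bit, c in counts.items() if c >= 2}

    words_level = [{1, 4, 9, 20}, {4, 9, 31}, {1, 4, 9, 55}]   # three successive word SDRs
    phrase_level = stable_pool(words_level)
    print(phrase_level)   # -> {1, 4, 9}: the higher layer predicts over these slower patterns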


My example was also spoon-feeding questions and answers to the HTM. If you were to just give it a corpus of text and then ask it a question about that text, it would not be able to formulate an answer that has any meaning and follows proper English grammar. If you disagree, you can respond with a working example.


> If you disagree, you can respond with a working example.

I am the working example. I can read texts and answer questions. HTM is an attempt to explain how I do that. Obviously, it's an incomplete theory, but its main ideas make sense to me. You got a better theory?


I don't have a better theory, but that wasn't my point. It may be that there is a setup to read and understand language, but there has never been code presented based on this technology that does it. This is why I'm interested to see how you could set it up to make it work. If you could prove me wrong, then great! I'd love to see it.


HTM is a theory. It's very incomplete at this point, because our understanding of the brain is very incomplete. Numenta publishes their code as they discover more and more about the brain. Once the system is complete enough to process language, the code will be able to process language.


A noamrl peosrn can uranesntdd this whuiott too much dftfliciuy . I scseput there is hveay rniaelce on the ecixtetapon of waht wdors lileky come nxet . I thnik taht liniestng to seceph wkors the same wya.
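For reference, the scrambling above keeps the first and last letter of each word and shuffles the interior, which is easy to reproduce:

    import random

    def scramble(word):
        # Keep the first and last letter, shuffle the interior.
        if len(word) <= 3:
            return word
        middle = list(word[1:-1])
        random.shuffle(middle)
        return word[0] + "".join(middle) + word[-1]

    sentence = "a normal person can understand this without too much difficulty"
    print(" ".join(scramble(w) for w in sentence.split()))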


Sure, Numenta seemed heavily focused on temporal sequences, but in "On Intelligence" Hawkins describes a more general picture. Prediction should be done not only on the past but on the context. The context includes other variables that also have to be predicted (e.g. predict the next word in a sentence based on a prediction of where this conversation is going)

I think that predicting the next word in a sentence does actually get to the point of the issue, which is why the Hutter Prize [0] is defined the way it is.

The human brain (1) operates sequentially on words, (2) keeps an internal state containing semantic information about what's been heard or seen, (3) learns that semantic representation, and (4) is excellent at predicting the next word in a sentence, and does so continually; (5) it's difficult to argue it's not learning to predict the next word in a sentence (i.e. optimised for that task).

Looking at that list, I'd say the only differences between what the brain is doing and what algorithms like these (in general) are doing are the particular algorithms for learning the semantic representations in working memory and performing prediction, and, less importantly, the form of the input (humans have more cues available, including taking actions to get more information).

(Admittedly parsing a complicated sentence can require jumping backwards, but spoken language will have a very limited nesting level, and this just means including a small piece of the input in the working memory.)

In the last week I've actually been reading about algorithms for this (though I didn't know NuPIC was applied to sentences, thanks very much for mentioning it). Aside from the range of RNN-based systems that mietek mentioned, here are a couple of more recent papers, related to word2vec, on representing the semantic and syntactic content of a sentence as vectors: [1], [2]. Both are based on optimising for prediction of the next words in a sequence. Now I agree it's dubious to try to reduce a sentence to a small fixed-length representation, whether that's e.g. 2400 real numbers in [2] or a pattern of neuron activations in an RNN, but from the experimental results (which admittedly aren't always so great) they seem to be encoding something meaningful, though far too much information gets thrown away.
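As a toy illustration of what "a small fixed-length representation" looks like (not the actual Paragraph Vector [1] or Skip-Thought [2] models, which learn their vectors by predicting nearby words and sentences), here is the crudest possible baseline: an average of made-up word vectors, compared by cosine similarity:

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = ["the", "fox", "eats", "a", "rodent", "cow", "grass"]
    word_vec = {w: rng.standard_normal(50) for w in vocab}   # pretend these were learned

    def sentence_vec(sentence):
        # Every sentence, whatever its length, becomes one 50-dimensional vector.
        return np.mean([word_vec[w] for w in sentence.split()], axis=0)

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine(sentence_vec("the fox eats a rodent"),
                 sentence_vec("the cow eats grass")))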

[0] http://prize.hutter1.net

[1] Quoc V. Le, Tomas Mikolov, 2014, Distributed Representations of Sentences and Documents, http://arxiv.org/abs/1405.4053

[2] Ryan Kiros et al., 2015, Skip-Thought Vectors, http://arxiv.org/abs/1506.06726

Edit: added a bit more about context


> It has been previously proposed that non-linear properties of dendrites enable neurons to recognize multiple patterns. In this paper we extend this idea by showing that a neuron with several thousand synapses arranged along active dendrites can learn to accurately and robustly recognize hundreds of unique patterns of cellular activity, even in the presence of large amounts of noise and pattern variation.

I didn't RTA, and I'm clearly no expert, but what surprises me here is that a single neuron can apparently "learn" patterns. I was previously under the impression that only networks-of-neurons could learn patterns (as in artificial neural networks). Does this mean that the ANN community should rethink their mimetic approach?


Neural network "neurons" have nothing to do with biological neurons.

What seems more likely from this paper is that one neuron == one small neural network.


I haven't read the paper and IANABiophysicist, but I have read that fuzzy-logic-like non-linear functions can actually be computed at the junctions where dendrite branches come together... and there are quite a lot of those per neuron. Of course many components of a neuron behave non-linearly to some degree. So yes, quite a lot like a whole ANN.
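Here's a toy version of the kind of neuron the paper describes, stripped of all biophysics and heavily simplified: each dendritic segment stores one sparse pattern, a segment fires when enough of its synapses see active input, and the neuron recognizes a pattern if any segment fires. Even this caricature recognizes hundreds of stored patterns and tolerates a fair amount of noise:

    import random

    N_CELLS = 2048            # size of the input population
    SYNAPSES_PER_SEGMENT = 40
    THRESHOLD = 25            # segment fires if this many of its synapses see active input

    random.seed(1)
    patterns = [set(random.sample(range(N_CELLS), SYNAPSES_PER_SEGMENT)) for _ in range(100)]
    segments = [set(p) for p in patterns]   # one segment "learned" per pattern

    def neuron_recognizes(active_cells):
        return any(len(seg & active_cells) >= THRESHOLD for seg in segments)

    # Present pattern 7 with noise: drop 10 of its bits, add 10 random ones.
    noisy = set(list(patterns[7])[10:]) | set(random.sample(range(N_CELLS), 10))
    print(neuron_recognizes(noisy))                                   # True
    print(neuron_recognizes(set(random.sample(range(N_CELLS), 40))))  # almost surely False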






