There has been some nay-saying on the other thread (and some here too). One of the main objections seems to be that they used a video stream to calibrate against the subject's brain, and so essentially eavesdropped on the brain.
Brains of different people are not in one-to-one correspondence. They do not have the same number of cells, and even if they did, it is not known whether the same information would be encoded in exactly the same cell. So some form of calibration on test images/video seems unavoidable. However, in spite of the person-to-person variation, there might be some common ground that allows some level of extrapolation from one person to another.
On a different note, artificial intelligence has always had this PR problem. Whenever it becomes possible to solve a problem that has been labeled AI, it appears less impressive, because now we understand how it can be done. This has happened with computer vision, reasoning, chess, and now Jeopardy. AI is a moving frontier: it consists of things we do not understand well enough, and whenever we do understand something, it is taken out of AI.
Another PR problem has been the difficulty of acknowledging that solving an AI task and replicating how a human does it are different tasks. The former may be approached via the latter, but that is not necessary. That said, it would indeed be more impressive and fair if the AI problems (vision, chess, Jeopardy, etc.) were solved with systems that consume no more power than a human brain does.
Brains of different people are not in one-to-one correspondence. They do not have the same number of cells, and even if they did, it is not known whether the same information would be encoded in exactly the same cell.
This is true, of course, but irrelevant to this work. At the level they're working at (fMRI scans, which have a resolution on the order of 0.5-4 mm or so depending on the temporal resolution, etc.), you can't resolve individual cells anyway, so you don't have to be concerned about those kinds of individual variations.
Visual activity in many parts of the brain follows a retinotopic map, where signals from nearby locations on the retina are processed in nearby regions of the brain. So, while you would have to calibrate some details, a lot of things would be constant between brains.
Inter-subject variability is a huge problem in this kind of work.
As you say, they're operating at a much larger scale than individual cells (100k-1m cells in each voxel). Likewise, some early visual processing areas are broadly organized kind of like big, noisy bitmaps on the surface of the brain.
But for sophisticated machine learning-style analyses like these, the gross differences in representation and morphology (especially at higher processing levels in the brain) make it very hard to pool the data across multiple people. That's why they prefer to use many sessions from a small number of participants rather than a single session from each of many participants (the standard approach).
[I worked on applying machine learning methods to fMRI for my PhD]
Indeed, and this is more true for the retinotopic map. For cognitive activity there seems to be quite a bit of variability. In fact, it's not a trivial task to distinguish a cognitively active brain from one that is in a waking but resting state.
The comment that you quote wasn't made with reference to their work specifically but to any sufficiently accurate technology that can read through your eyes via the brain.
This is a remarkable engineering feat, but not a novel motion model. It's known that V1 contains a topographic representation of the visual field, and that subsequent visual cortices encode features like motion. As they mention in the paper, though, dreams evoke activity in higher-level cortices and not so much in the early ones, making it difficult to tell whether this method can be used to, e.g., visualize dreams.
What's with the sporadic clusters of watermark-like words in the video? I wish they would show how this material was collected, or give some indication of whether what they show is raw or edited. It's obviously at least cropped to be composed in the same way as the original video, to make the comparison easier. This makes me feel like not everything about this is being shown, and the implicit dishonesty (or at least incompleteness) makes me skeptical.
A different site covering this same story mentions that the methodology actually involves the computer scanning something like 18 million YouTube videos, simulating which areas of the brain each of these videos would be likely to affect, and comparing that to what was recorded during the subjects' fMRI sessions. So the watermark-like words are coming from the YouTube clips, not from the subjects' brains themselves.
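To make that matching idea concrete, here is a rough sketch of how I picture it, not the authors' actual pipeline. Everything in it is an assumption for illustration: the linear "encoding model" (`weights`), the random clip features, the use of Pearson correlation as the comparison, and the step of averaging the best-matching clips into one frame.

```python
import numpy as np

def rank_clips(clip_features, weights, observed):
    """Rank library clips by how well their simulated voxel response
    matches the observed response (Pearson correlation per clip)."""
    predicted = clip_features @ weights                 # (n_clips, n_voxels)
    p = predicted - predicted.mean(axis=1, keepdims=True)
    o = observed - observed.mean()
    scores = (p @ o) / (np.linalg.norm(p, axis=1) * np.linalg.norm(o) + 1e-12)
    order = np.argsort(scores)[::-1]                    # best match first
    return order, scores

def reconstruct(clip_frames, order, top_k):
    """Average the frames of the top_k best-matching clips (an assumption
    of this sketch; it is why the result comes out blurry)."""
    return clip_frames[order[:top_k]].mean(axis=0)

# Toy data standing in for the clip library and one fMRI measurement.
rng = np.random.default_rng(0)
n_clips, n_features, n_voxels = 200, 16, 50
clip_features = rng.normal(size=(n_clips, n_features))  # hypothetical features
weights = rng.normal(size=(n_features, n_voxels))       # hypothetical per-subject model
clip_frames = rng.random(size=(n_clips, 8, 8))          # one tiny frame per clip

true_clip = 42                                          # the clip "being watched"
observed = clip_features[true_clip] @ weights + 0.1 * rng.normal(size=n_voxels)

order, scores = rank_clips(clip_features, weights, observed)
blurry = reconstruct(clip_frames, order, top_k=10)
```

In a scheme like this, the output is built entirely from library clips, so anything burned into those clips (such as overlaid text) can bleed into the averaged result.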
I had the same reaction. Even though I grasp scientifically what they do, seeing that video is an almost spooky experience.
I know it cannot be used to visualize what we dream/hallucinate yet; it only shows what the subject directly sees. But if that's the next step... wow, it would change communication and content creation as we know it if we could just dream up videos in our mind and upload them :-)
The potential that we could one day share exactly what we see in our imagination is so awesome, it's almost hard to imagine it! (Irony not really intended)
And what about "uploading" yourself to a computer? What happens if one day we can create exact copies of a person in silicon?
At least they aren't working on interpreting audio signals into the playback too. Once they start doing that, we'll all have to close our eyes and put our fingers in our ears whenever we are around a police officer because we would effectively be recording their actions and could be arrested in several states for being in their presence. :)
Yes, this is a significant exaggeration, but the scary thing is that it is easy to see how that could become a viable interpretation of the law if this technology advanced to the point of being used outside a lab.
Very cool indeed. It seems to extract information relative to images being directly perceived at the moment. I bet that accessing memories will be much harder though, due to more complex encoding (as opposed to just reading the visual "input buffer").
We'll probably need the visual memories anyway, because the "input stream" tends to be a sprawling mess that only looks so good because the brain does some pretty awesome inference and pattern matching to construct a high quality scene.
I'm not sure how far on the AI scale that would be.
It will probably be harder, but not as much harder as you might think. Handling new complexity requires new brain structures, which in turn uses up energy that could be spent elsewhere. Other things being equal, the brain tends to re-use as much as it can. There is some evidence that the brain re-uses the same structures for visual memories as it does for visual processing. I wish I could quickly find you a reference.
Richard Granger has some good articles on brain structures, and he is a very accessible writer. (For a start, Google for his paper "Engines of the Brain".)
Of course, take the "brain re-uses useful structures" hypothesis with a grain of salt. There is some evidence for it in some brain structures, but it is by no means conclusive.
It's a great achievement, but I wonder whether the film industry could make use of this. I don't think we could re-create HD video from our brains. Maybe we could generate a blurry video, because the brain won't keep every detail of every object we see unless we are looking at the object at that moment.
If our brains stored all that detailed information, they would be as slow as the computers we have now.
Interesting and extraordinary discovery that offers an alternative to verbal communication not just to those who literally cannot vocalize but also to those who at times, for whatever reason, cannot "be on the same page". Whether a gap is generational, cultural, gender, or whatever, mind-reading technology presents a different way for all of us to bridge understanding.
Yes, I think it would. I don't have a reference right now, but I remember reading a study a few years ago about how they detected V1 (primary visual cortex) activity during dreaming very similar to that when awake.