- we get tons of data, just not all textual. We have visual (~30fps at much better than HD resolution all day long), audio (again, better than CD quality all day long), smell, taste, and touch, not to mention internal senses (balance, pain, muscular feedback, etc). By the time a baby is 6 months old, she's seen and processed a lot of data. I don't know if it's more than Google's 18B pages, but it's a lot.
- we get correlated data. Google has to use a ton of pages for language because it only gets usage, not context. Much (most?) of the meaning in language comes from context, but using text you only get the context that's explicitly stated. Speech is so economical because humans get to factor in the speaker, the relationship with the speaker, body language, tone of voice, location, recent events, historical events, shared experiences, etc, etc, etc. Humans have a million ways to evaluate everything they read or hear, and without that, you need a ton of text to make sure you cover those situations.
- we have a mental model. Everything we do or learn adds to the model we have of the world, either by explicit facts (a can of Coke has 160 calories) or by relative frequencies (there are no purple cows but a lot of brown ones). My model of automobile engines is very crude and inaccurate while my model of programming is very good. Also, because I have (or can build) a model, I have a way to evaluate new data. Does this add anything to a part of my model (pg's essays did this for me)? Does it confirm a part of the model that I wasn't sure about (more experimental data)? Does it contradict a weakly held belief? Does it contradict a strongly held belief? Is it internally consistent? Is the source trustworthy?
This mental model might just be a bunch of statistically relevant correlations, but that sounds like neurons with positive or negative attractions of varying strength. Kind of like a brain. I believe Jeff Hawkins is on to something (see On Intelligence http://www.amazon.com/o/asin/0805078533/pchristensen-20), but there needs to be correlated data (like vision/hearing/touch are correlated) and the ability to evaluate data sources.
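The "neurons with positive or negative attractions of varying strength" idea above can be sketched with a toy Hebbian-style update: a connection weight strengthens when two signals fire together and weakens when they disagree. Everything here (the learning rate, the two "senses", the update rule) is illustrative, not a claim about Hawkins's actual model:

```python
import random

random.seed(0)

LEARNING_RATE = 0.1

def update(weight, a, b, rate=LEARNING_RATE):
    # Map 0/1 activity to -1/+1 so agreement pushes the weight up
    # and disagreement pushes it down.
    return weight + rate * (2 * a - 1) * (2 * b - 1)

w_correlated = 0.0    # "vision" and "hearing" reporting the same event
w_uncorrelated = 0.0  # "vision" vs. an unrelated signal

for _ in range(1000):
    event = random.random() < 0.5
    vision = int(event)
    hearing = int(event)                # perfectly correlated with vision
    noise = int(random.random() < 0.5)  # unrelated coin flip
    w_correlated = update(w_correlated, vision, hearing)
    w_uncorrelated = update(w_uncorrelated, vision, noise)

print(round(w_correlated, 1))    # grows steadily: 100.0
print(round(w_uncorrelated, 1))  # random walk, stays near zero
```

The point is just that correlated streams (vision/hearing/touch of the same event) produce strong stable weights for free, while uncorrelated noise washes out, which is one reason correlated data is worth so much more per byte than isolated text.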
I agree that if humans can do it, machines can do it, but I think you're vastly underestimating the amount and quality of data humans get.
Yes, I think you do have a point, but I don't think it's about things like visual resolution and the amount of data it generates. It may be about the much greater variety of data we see and about our ability to experiment and interact with the world around us in order to test our beliefs.
So maybe you could say it's about the quality of information, not just the amount of data of one particular kind.
In any event, this is a debate that is only at the very beginning. I don't claim to have come to a conclusion. I just think those brute-force statistical techniques are not the end of the road but rather a practical workaround for the brittleness and the complexity of traditional rule-based systems.
Don't want to be pedantic here, but your info on our visual bandwidth is a bit out of date. We actually only process about 10M/sec of visual data. Your brain does a very good job of fooling your conscious self, but what you perceive as HD-quality resolution is really only gathered in the narrow cone of your current focal point. The rest of what you "see" is of much lower bandwidth and mostly a mental trick. We also don't store very much of this sensory data for later processing.
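Taking that ~10M/sec figure at face value (reading it as ~10 MB/s), a back-of-envelope comparison against the 18B-page corpus mentioned above is easy to run. Every constant here is a rough guess for illustration, not a measured value:

```python
# Back-of-envelope only; every constant is a rough assumption.
VISUAL_BYTES_PER_SEC = 10_000_000   # reading "10M/sec" as ~10 MB/s
WAKING_HOURS_PER_DAY = 12
DAYS = 180                          # ~6 months, per the baby example above

visual_bytes = VISUAL_BYTES_PER_SEC * WAKING_HOURS_PER_DAY * 3600 * DAYS

PAGES = 18_000_000_000              # "Google's 18B pages"
TEXT_BYTES_PER_PAGE = 10_000        # ~10 KB of text per page, a guess

web_bytes = PAGES * TEXT_BYTES_PER_PAGE

print(f"6 months of vision: ~{visual_bytes / 1e12:.0f} TB")  # ~78 TB
print(f"18B pages of text:  ~{web_bytes / 1e12:.0f} TB")     # ~180 TB
```

Under these (very shaky) assumptions the two are within the same order of magnitude, which suggests the interesting difference isn't raw volume but, as argued above, correlation and quality.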