These CPUs/GPUs are doing very inefficient digital simulations of analogue processes. If anything, the human brain has significantly more 'processing power' than AlphaGo when you restrict the computation to the activation of linear threshold units.
It is typical for ML systems to surpass human performance while having very different characteristics in what they get right/wrong. For example, in ImageNet, DCNNs gained a lot of points by distinguishing different breeds of dogs with subtle visual differences that are hard for humans without training.
I think AlphaGo also demonstrates some of these non-human characteristics as a consequence of its Monte Carlo Tree Search and optimization objective: the brilliant moves, as well as the obvious slack moves/mistakes mentioned by the commentators.
I suspect that we are not close to the perfect game, as proving perfect play requires expanding the enormous search tree, and we have neither analytical nor brute-force solutions.
Seconded. Another extreme example is the human brain, which I don't think we understand well enough to hold "accountable" within any mathematical bounds, yet we trust it to make complex decisions. Statistical characterization of an AI system's behavior is a better objective than ones based on inherently biased symbolic systems. Just because that's the way humans communicate doesn't mean it's the best way to do it.
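By "statistical characterization" I mean something as simple as the sketch below: treat the system as a black box, sample inputs from the distribution you actually care about, and report an empirical failure rate with an interval. The names `model`, `sample_input` and `true_label` are just placeholders I made up.

```python
def estimate_error_rate(model, sample_input, true_label, n=10_000):
    """Empirically characterize a black-box decision-maker: error rate on
    inputs drawn from the deployment distribution, with a rough 95% interval."""
    errors = 0
    for _ in range(n):
        x = sample_input()                 # draw from the distribution we care about
        if model(x) != true_label(x):      # compare against ground truth
            errors += 1
    p = errors / n
    half_width = 1.96 * (p * (1 - p) / n) ** 0.5   # normal-approximation interval
    return p, (max(0.0, p - half_width), min(1.0, p + half_width))
```

No claim about *why* the system behaves that way, but you get a bound you can actually act on.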
Brains don't do FLOPS. It is a completely different type of "calculation". Computational benchmarks are only directly comparable when the representations of input and output are the same.
That's very true, but in some ways it's even more remarkable: there exists an alternative form of calculation/computation that is "magical" in certain domains when compared to all the standard CS we know, and we don't have a clue what it is.
It's not an apple/apple comparison. It's an apple/unicorn comparison.
I don't know; if you counted every modulation in every wire in a computer (CPU, RAM, cache activations, etc.), you might get a significant multiple of its FLOPS count, right?
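Very rough back-of-envelope, with numbers that are illustrative guesses rather than measurements:

```python
# Count low-level signal transitions instead of FLOPS (all figures are guesses).
clock_hz = 3e9            # ~3 GHz clock
transistors = 2e9         # transistors on a modern CPU die
activity_factor = 0.1     # fraction of transistors switching per cycle (a guess)

transitions_per_sec = clock_hz * transistors * activity_factor
flops = 1e11              # order-of-magnitude FLOPS for a desktop CPU

print(f"signal transitions/s ~ {transitions_per_sec:.1e}")
print(f"ratio to FLOPS       ~ {transitions_per_sec / flops:.1e}x")
```

So yes, counted that way you get many orders of magnitude more "events" than FLOPS, which is exactly why the comparison depends on what you decide to count.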
>Whilst it is unable to assign one symbol with multiple nouns, I think these are more engineering issues than anything. The overall architecture of NELL can be made smarter with horizontally scalable knowledge inference additions.
It is not a trivial problem.[1] In fact, representing knowledge as an ontology of human-language strings with a fixed schema has fundamental limitations. In many situations it simply fails; disambiguation is just one of them. Others include trustworthiness, temporal information, newly invented phrases, etc.
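To make that concrete, here is a toy sketch (the types are mine, not NELL's) of how a fixed string-triple schema loses exactly the information I listed, and how every new requirement forces another ad-hoc field:

```python
from dataclasses import dataclass
from typing import Optional

# A fixed string-triple schema, roughly what an ontology of language strings stores.
@dataclass
class Triple:
    subject: str
    relation: str
    obj: str

facts = [
    Triple("Apple", "isA", "Fruit"),          # which "Apple"? the schema can't say
    Triple("Apple", "isA", "Company"),
    Triple("Obama", "isPresidentOf", "USA"),  # true only for a time interval
]

# Patching the schema: each limitation becomes another field, and nothing here
# tells you how to populate or reason over them.
@dataclass
class RicherFact:
    subject_id: str                    # disambiguated entity id, not a surface string
    relation: str
    obj_id: str
    confidence: float                  # trustworthiness of the source
    valid_from: Optional[str] = None   # temporal validity
    valid_to: Optional[str] = None
```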
So-called reasoning by analogy is just gradient descent approaching a local optimum in an optimization problem. Reasoning "from first principles" is just providing an approximate (ideally convex) model and solving it analytically for the global optimum. The problem is generally very hard (#P-hard), both in formulating it and in solving it. A global optimum is of course better in theory, but the quality of your objective function and constraints can easily offset that advantage. If you can come up with a simple linear programming formulation for the battery example, that's great, but most likely you won't, due to the presence of competitor, market and policy constraints.
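A minimal sketch of the contrast, with an invented toy objective: local improvement by gradient steps versus writing down an explicit convex model you can solve outright.

```python
# "Reasoning by analogy": local gradient steps on whatever objective you can evaluate.
def f(x):
    # non-convex toy objective: gradient descent only finds a local optimum
    return (x**2 - 1)**2 + 0.3 * x

def grad(x, eps=1e-6):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

x = 0.9                        # start from a "relatively good" known point
for _ in range(200):
    x -= 0.01 * grad(x)        # small local improvements
print("settles at a local optimum near x =", round(x, 3))

# "From first principles": write down an explicit convex model instead,
# e.g. minimize (x - 2)**2, whose global optimum x* = 2 is available in closed form.
```

The catch is the comment's point: the convex model is only as good as the objective and constraints you managed to write down.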
I agree, but would like to add the aspect of uncertainty. When reasoning from analogy, you usually have good statistical knowledge about the properties of the problem. For example, the targeted product category may already exist and customer behavior is known, so predicting what would happen in some nearby configuration is usually somewhat accurate. On the other hand, in the first-principles case you need a very accurate theory, because you are "far away" from what exists currently. If the theory in question concerns physics, as for Musk, then this can work well. If it is about social sciences and you need to predict the behavior of customers from some kind of first-principles model (e.g. rational-agent models), then you are very likely to make mispredictions, since human behavior is complicated and the theories are inaccurate and overly general.
With uncertainty you just change your objective function to an evaluation over all possible scenarios, each scaled by its probability. It doesn't change how optimization works. Again, this formulation is only easy to solve in very limited contexts. For example, in economics people have been working with oversimplified supply-demand curves precisely because they are usually the dominating factors, and in practice a sufficiently accurate model works just fine. This model only gives insight into why different ways of reasoning work; it does not provide a panacea.
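Concretely, something like this, where the scenarios, probabilities and prices are all invented for illustration:

```python
# Score each candidate decision by its probability-weighted outcome.
scenarios = [
    {"prob": 0.6, "demand": 100},   # likely market scenario
    {"prob": 0.3, "demand": 60},    # competitor undercuts us
    {"prob": 0.1, "demand": 20},    # policy change kills demand
]

def profit(units_built, demand, unit_cost=5.0, price=9.0):
    sold = min(units_built, demand)
    return sold * price - units_built * unit_cost

def expected_profit(units_built):
    return sum(s["prob"] * profit(units_built, s["demand"]) for s in scenarios)

best = max(range(0, 121, 10), key=expected_profit)
print(best, round(expected_profit(best), 1))
```

Same optimization machinery as before; the uncertainty just moved into the objective.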
The good thing about gradient descent is that you do NOT need a model; you just need to focus on a few parameters and figure out the direction of best improvement from a current, relatively good point, where the other billions of parameters are already accounted for and assumed independent of the direction you are moving in.
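In other words, something like the finite-difference probing below: tweak only the knobs you care about, keep everything else fixed at the current operating point. The `evaluate` function and its parameters are placeholders for whatever black-box measurement you actually have.

```python
# Model-free local improvement: probe a couple of knobs around the current
# (already decent) operating point and move in the best direction found.
def evaluate(params):
    # placeholder objective; in reality this is an experiment or measurement
    price, ad_spend = params["price"], params["ad_spend"]
    return -(price - 7.3) ** 2 - 0.5 * (ad_spend - 2.1) ** 2

def improve(params, keys=("price", "ad_spend"), step=0.1, rounds=50):
    for _ in range(rounds):
        for k in keys:                       # only the few parameters we care about
            base = evaluate(params)
            params[k] += step
            if evaluate(params) <= base:     # no improvement: try the other direction
                params[k] -= 2 * step
                if evaluate(params) <= base:
                    params[k] += step        # neither direction helps: stay put
    return params

print(improve({"price": 6.0, "ad_spend": 3.0}))
```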
It seems you are assuming there are direct observations. Musk is talking about generating hypothetical observations from a model (the cost of the battery is bounded from below by the cost of the battery materials). This sort of bounding does not always work outside physics, because the uncertainties are so ill-behaved.
Ah, sorry, no. I'm just interested in AI in general, and constrained optimization is what I have found to be the most general model for solving, and giving insight into, AI problems, or any problems that require "intelligence".
If you're asking what I think you're asking, then what you're looking for is called 'numerical analysis'. Specifically, the grandparent was describing this [1]. Here is an OCW link to a good primer on various introductory numerical analysis methods for engineering [2].
This is also why Google is paying millions of dollars for a quantum computer. Being able to solve complex optimization problems efficiently translates, at least partially, into access to higher intelligence.
This reminds me of the difference between pointers and actual values: it is more efficient to memorize where the detailed and possibly complex information can be found than to memorize the information itself. Memorizing all the long-winded data just doesn't give you much of an advantage anymore in this era of information explosion. I think it is fairly natural that most people accept this trade-off.
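In code terms, roughly this (an entirely made-up example, just to show the trade-off):

```python
# "Value" strategy: memorize the details themselves.
memorized_value = {"train to airport": "departs 06:12, 06:42, 07:12 from platform 3"}

# "Pointer" strategy: memorize only where the details live.
memorized_pointer = {"train to airport": "check the transit app / station board"}

# The pointer is smaller and never goes stale when the timetable changes;
# the cost is an extra lookup every time you need the details.
```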
This is comparable to the invention of cars vs. walking. Cars might have increased the rate of obesity, but we certainly did not forget how to walk. It is very interesting to see how humans adapt to these big changes even though the rate of biological evolution is nowhere near the rate of change in technology.