Cracking Go (2007) (ieee.org)
58 points by nl on March 20, 2016 | 20 comments



Here's the relevant quote:

"Nevertheless, I believe that a world-champion-level Go machine can be built within 10 years, based on the same method of intensive analysis—brute force, basically—that Deep Blue employed for chess."

A question: I was under the impression that AlphaGo's techniques would not be classified as "brute force", and if this is the case, is this indicative of how this field has shifted in thought since 2007?

Also, how relevant are the techniques described in the article nowadays? i.e.:

"When human players search through the Go game tree, they generally check the live-or-dead status of each stone only once, then in effect cache the result in their memories. In other words, they don't check again unless they have good reasons to do so. " (seems like memoization)


I could be mistaken but this is what I gathered from what the people who worked on AlphaGo said in various interviews and videos:

AlphaGo does not use brute force at all. It uses a neural network to find interesting moves, and this in effect limits the search space.

In addition, when AlphaGo "reads" moves (i.e., performs a tree search) it doesn't go all the way to the end of the game. It stops somewhere and asks another neural network to "evaluate" the board position and figure out who it is favorable for. I'm thinking the neural network also places some kind of probability on its evaluation, so it might give something like "this board position is 58% favorable for black".
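
If that's right, a toy sketch of the division of labor might look like this (the policy_net/value_net interfaces are stand-ins I made up, not AlphaGo's actual code):

    # Stop the search at a fixed depth and ask a value net for a win
    # probability, instead of playing the game out to the end.
    def read_ahead(position, depth, policy_net, value_net):
        if depth == 0 or position.is_terminal():
            return value_net(position)  # e.g. 0.58 = "58% favorable"
        best = 0.0
        for move in policy_net.top_moves(position):  # pruned candidates
            # The opponent moves next, so flip the probability.
            best = max(best, 1.0 - read_ahead(position.play(move), depth - 1,
                                              policy_net, value_net))
        return best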

In that way, it's very similar to how a professional player plays the game.

One difference I have noticed, which is probably the weakness that Lee Sedol exposed in game #4, is that AlphaGo does not do a thorough reading of "delicate" or "complicated" situations. I believe that a top professional would look at a complex situation and start considering moves that are not usually considered otherwise.

Another thing is that a professional would imagine a board position they want to arrive at, then search for a sequence of moves that allows them to reach that board position. This is especially the case in complex situations like the one in game #4, where AlphaGo made a mistake.


That doesn't sound different from chess AIs to me. Chess AIs can't "go all the way to the end of the game" until very late in the game either (it's simply impossible with current, or even projected, computing power). They too "stop somewhere" to "evaluate" the board position and figure out how favorable it is for them:

    For most chess positions, computers cannot look ahead to all possible final
    positions. Instead, they must look ahead a few plies and compare the possible
    positions, known as leaves. The algorithm that evaluates leaves is termed the
    "evaluation function", and these algorithms are often vastly different
    between different chess programs.
https://en.wikipedia.org/wiki/Computer_chess#Leaf_evaluation

(It's worth noting that if computers could "go all the way to the end of the game", they could play perfectly, which they can't.)
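
A minimal sketch of that scheme (the evaluate function and board interface are made up for illustration):

    # Negamax: look ahead a fixed number of plies, then score the "leaves"
    # with the evaluation function. A position's score is the negation of
    # the opponent's best score one ply deeper.
    def negamax(position, plies, evaluate):
        if plies == 0 or position.is_terminal():
            return evaluate(position)
        return max(-negamax(position.play(m), plies - 1, evaluate)
                   for m in position.legal_moves())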

While there are similarities to how professional players play (either game), there are important differences. In particular, professional human players usually look ahead relatively little, but have developed a very accurate intuition for evaluating the board position and for what "feels" like the right next move. This intuition is expected to still be more accurate than the evaluation functions of even the best computer players, which is why anti-computer tactics typically involve "playing conservatively for a long-term advantage": https://en.wikipedia.org/wiki/Anti-computer_tactics


Interesting. I think, though, the difference from chess is that in Go there's no simple known heuristic for evaluating a board position.

AlphaGo does it with a neural network, and as far as I can tell, it seems to be doing a better job at it than top professionals.

All the Go AIs before AlphaGo (as far as I can tell) relied on hand-coded heuristic rules.


"the difference with chess is there's no simple known heuristic to evaluate a board position"

While it is indeed hand-coded in most chess engines (I'm sure there are some experimental ones, but that's not the path that leads to beating human grandmasters), these heuristics are anything but "simple".


That's correct, but with the strong chess engines, the heuristic for evaluating a position is hard-coded.


AlphaGo does do a tree search as you say, but the "value" of the stopping node is half the neural network you mention, and the other half comes from Monte Carlo rollouts by medium-strength "players" (around 2d amateur). These "players" are really a quick version of the policy network that figures out which moves to play in the first place.
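
For reference, the AlphaGo paper mixes those two leaf estimates with a weight λ (0.5 in the paper). Roughly, with stand-in function names:

    # Leaf evaluation from the paper: V(s) = (1 - lam) * v(s) + lam * z,
    # where v(s) is the value network's estimate and z is the outcome of a
    # fast rollout by the weakened policy network.
    LAM = 0.5

    def leaf_value(position, value_net, fast_rollout):
        v = value_net(position)     # learned win-probability estimate
        z = fast_rollout(position)  # 1.0 if the rollout is won, else 0.0
        return (1 - LAM) * v + LAM * z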


In fact, the paper mentions that the policy network (without MCTS) alone performs reasonably well and can beat most other Go bots.


AlphaGo used Monte Carlo methods, which I believe qualify as "brute force"—you generate a bunch of random positions, since the full game tree is way too large.


AlphaGo uses neural networks trained on human play to generate candidate moves, and then Monte Carlo Tree Search to evaluate them deeply. The neural nets are a strong player by themselves.

AlphaGo primarily refutes this essay rather than supporting it. Prior to AlphaGo we had many Monte Carlo bots, but none of them approached professional-level play, much less world-class play. A strategy other than brute force was required to get there.

---

AlphaGo's loss in game four is interesting, by the way; I'm not sure how you would fix it in their model. It appears that it lost because it failed to predict the (creative, unusual) wedge move in the neural nets, did not spend enough time evaluating it in MCTS, didn't give a strong response to it (which was available through MCTS if it had looked), and didn't understand afterwards that it was now dramatically losing the game and played some very strange moves.


Actually, AlphaGo found that move but rated it at a 1-in-10,000 chance of being played. Possibly because of that, it didn't evaluate the possible continuations. However, it also took almost another ten moves before the value network decided that it was losing the game.
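
That tiny prior matters because the paper's tree policy scales each move's exploration bonus by its prior probability, so a 1-in-10,000 move is almost never expanded. A rough sketch (the constant is illustrative):

    import math

    # PUCT-style bonus from the AlphaGo paper:
    # u(s, a) = c_puct * P(s, a) * sqrt(total_visits) / (1 + visit_count)
    def exploration_bonus(prior, visit_count, total_visits, c_puct=5.0):
        return c_puct * prior * math.sqrt(total_visits) / (1 + visit_count)

    print(exploration_bonus(1e-4, 0, 10000))  # 0.05  -- barely explored
    print(exploration_bonus(0.3, 0, 10000))   # 150.0 -- explored early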


Thanks.

I found this article[1], (linked from the original) that also mentioned this:

"In recent years, many programmers have tried to get around this problem with Monte Carlo simulation, a statistical means of finding the best first move from a vast database of the games that might begin from a given position. That method is also used a bit in AlphaGo, together with the tree-generating methods of yore. But the key improvement is AlphaGo’s use of deep neural networks to recognize patterns."

1. http://spectrum.ieee.org/tech-talk/computing/networks/alphag...


Monte Carlo for Go was itself a major step forward; it pushed the best bots from about 1-5k to about 2d (amateur). It's not brute force; it's statistics!
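
The core of that pre-AlphaGo Monte Carlo idea, as a toy (the position interface is made up):

    import random

    # Estimate a move by playing many uniformly random games from it and
    # counting wins; pick the candidate with the best win rate.
    def win_rate(position, move, player, n_rollouts=1000):
        wins = 0
        for _ in range(n_rollouts):
            p = position.play(move)
            while not p.is_terminal():
                p = p.play(random.choice(p.legal_moves()))
            wins += (p.winner() == player)
        return wins / n_rollouts

    def best_move(position, player):
        return max(position.legal_moves(),
                   key=lambda m: win_rate(position, m, player))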



Hassabis said somewhere that AlphaGo evaluates 10,000 positions per second. That's not brute force by today's standards. It has two neural networks that decide where to look for moves and how good a position is, and then it does a Monte Carlo search from there. I believe the positions in the MC search tree are not counted as part of those 10k positions.


Others are arguing that because AlphaGo does use a tree search, his prediction was correct. I would say the idea of the tree search was correct, but the methodology he proposed in the rest of the article was further optimization of the techniques behind Deep Blue, not what AlphaGo employed. In fact, his proposal was largely that improved computer speed and some clever Go-specific optimizations would be what cracked Go.

The article talks about two extremes. One is trying only a few moves while being intelligent about the choices:

Specifically, they wanted computers to examine only playing sequences that were meaningful according to some human reasoning process. In computer chess this policy, known as selective search, never really made progress. The reason is that humans are extremely good at recognizing patterns; it is one of the things that we do best.

The other is using raw brute force to test all positions:

The idea was to let computers do what they do best, namely, calculate. A simple legal-move generator finds all the permissible moves in a position, considers all the possible responses, and then repeats the cycle. Each cycle is called a ply, each generation of new possibilities is called a node—that is, a branching point in a rapidly widening tree of analysis. The branches terminate in “leaf,” or end positions.

The solution was a balance: a very dumb evaluation, plus searching lots of positions:

Deep Blue typically looked 12 plies ahead in all variations (and 40 or more plies in selective lines), generating around 170 million leaf nodes per second. Next, the program would evaluate each of these positions by counting “material,” that is, the standard values of the chess pieces. For example, a pawn is worth one point, a knight or bishop three, and so on. Then it added points for a range of positional factors, chosen with the help of human grandmasters.
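
In code, that "count material, then add positional factors" evaluation is roughly (piece values are the standard ones the article quotes; the board interface and positional term are stand-ins):

    # Positive scores favor White; pawn 1, knight/bishop 3, rook 5, queen 9.
    PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

    def evaluate(board):
        score = 0
        for piece, color in board.pieces():  # made-up interface
            value = PIECE_VALUES.get(piece, 0)
            score += value if color == "white" else -value
        return score + positional_terms(board)

    def positional_terms(board):
        # Stand-in for Deep Blue's grandmaster-tuned factors
        # (mobility, king safety, pawn structure, ...).
        return 0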

In the middle of the article, he talks a lot about applying Go-specific knowledge to the pruning process to make searching a large number of positions easier, by caching knowledge the program has discovered about how the game is going (e.g., this group of stones is dead).

At the end of the article, he talks about computing power, and assumes a good Go AI will use custom hardware and search far more positions than Deep Blue:

My gut feeling is that with some optimization a machine that can search a trillion positions per second would be enough to play Go at the very highest level. It would then be cheaper to build the machine out of FPGAs (field-programmable gate arrays) instead of the much more expensive and highly unwieldy full-custom chips.

It's important to note that AlphaGo searched fewer positions than Deep Blue did. He discounts the methodology of "selective search" because it relies on humans' ability to recognize patterns. However, this is effectively how AlphaGo worked. It is able to do very well without using a tree search at all (beating other Go programs).

So to answer your question, "is this indicative of how this field has shifted in thought since 2007?": this whole article seems to show a transition in thought. He has the right idea: searching positions and having knowledge about which positions are good and bad is what lets AlphaGo achieve professional-level play. However, he completely misses that neural networks would capture the human quality of pattern recognition, and that no Go-specific search optimizations would be needed.


The comments here are really missing the forest for the trees. The point is: here is one computer Go researcher predicting computer Go dominance would be achievable by 2017. I keep reading in the media that "Go experts thought it would take decades," but the fact is it's hard to find actual quotations from computer Go experts.

You can easily find many professional Go players who didn't think Go programs would surpass humans for decades, but these people aren't engineers. This would be like asking a human off the street for predictions about AGI.

It doesn't really matter whether or not the researcher predicted the exact methods of AlphaGo (he didn't - while AlphaGo uses tree search methods very extensively, just as important were the policy/value nets). Does anyone really believe that the people with predictions about the timeframe of AGI are going to predict the exact methods used in AGI?


Here is a 2013 paper that tried to empirically estimate the rate of improvement of Go playing software: https://intelligence.org/files/AlgorithmicProgress.pdf

>Go programs have improved about one stone per year for the last three decades. Hardware doublings produce diminishing Elo gains on a scale consistent with accounting for around half of all progress.

They don't give a clear prediction for when it was expected to exceed the best humans, but AlphaGo was clearly a discontinuous jump in ability.

According to this article, experts were extremely skeptical of this even a year ago: http://www.wired.com/2014/05/the-world-of-computer-go/

>After the match, I ask Coulom when a machine will win without a handicap. “I think maybe ten years,” he says. “But I do not like to make predictions.”...

>Even with Monte Carlo, another ten years may prove too optimistic. And while programmers are virtually unanimous in saying computers will eventually top the humans, many in the Go community are skeptical. “The question of whether they’ll get there is an open one,” says Will Lockhart, director of the Go documentary The Surrounding Game. “Those who are familiar with just how strong professionals really are, they’re not so sure.”


Right. The only quotation is "I think maybe ten years," by Rémi Coulom. Obviously the community was mixed. But if you believed the way the media has been spinning it, you'd think the computer Go community collectively thought it would not be possible.

Otherwise, the opinions of human Go players at the time (who were not developers of computer Go programs) aren't relevant, nor is the opinion of a WIRED article writer.


Here I was hoping someone was announcing major vulnerability results for the Go language or runtime. Another time...



