Finding surprising moves in chess games (github.com/cyhsm)
135 points by mdp on Dec 14, 2016 | 35 comments



If I understand your code in analyse_evaluations correctly, you're defining a "surprising move" as one whose evaluation changes a lot when it's considered at a higher depth. So if a "human" (really Stockfish at depth 5) evaluates a move as +1 and a "computer" (Stockfish, now at depth 11) evaluates the same move as +5, the move is surprising.
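
Roughly, in code (a minimal sketch with python-chess and a local Stockfish binary; the function name and its defaults are my guess at the approach, not the repo's actual code):

    import chess
    import chess.engine

    def surprise_score(engine, board, move, shallow=5, deep=11):
        """How much better (in centipawns) a move looks at deep search than at shallow."""
        board.push(move)
        mover = not board.turn  # the side that just played `move`
        info = engine.analyse(board, chess.engine.Limit(depth=shallow))
        shallow_cp = info["score"].pov(mover).score(mate_score=10000)
        info = engine.analyse(board, chess.engine.Limit(depth=deep))
        deep_cp = info["score"].pov(mover).score(mate_score=10000)
        board.pop()
        return deep_cp - shallow_cp  # large positive: the "human" underrates the move

    engine = chess.engine.SimpleEngine.popen_uci("stockfish")  # binary on PATH assumed
    print(surprise_score(engine, chess.Board(), chess.Move.from_uci("e2e4")))
    engine.quit()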

This is pretty interesting, but I'm not sure it fully captures all the nuances of what a surprising move is. You might be able to classify a move as tactically surprising if it only becomes clear after depth 7 that the resulting position is favorable. However, in my opinion, truly surprising moves are ones that carry plans I hadn't even considered. Hence this methodology misses moves that are positionally surprising, since those don't produce such a drastic change in evaluation across depths. I'm not sure where you would start to figure that one out though :)

That being said, this is really cool work!


You can filter for this:

Take a database of games from pro players; take the set of all moves where Stockfish at depth 5 agrees that the move actually played is optimal; then filter for the moves where Stockfish at depth 11 has a different opinion that results in a big gain in evaluation. What you get is a list of moves that would surprise pro players under time pressure.
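
A rough sketch of that filter, assuming python-chess, a PGN database, and a Stockfish binary (the 100 cp "big gain" threshold is just a placeholder):

    import chess.engine
    import chess.pgn

    SHALLOW, DEEP, GAIN_CP = 5, 11, 100  # placeholder thresholds

    def eval_after(engine, board, move, depth):
        """Evaluation, from the mover's point of view, of the position after `move`."""
        board.push(move)
        info = engine.analyse(board, chess.engine.Limit(depth=depth))
        cp = info["score"].pov(not board.turn).score(mate_score=10000)
        board.pop()
        return cp

    def surprising_moves(pgn_path, engine):
        hits = []
        with open(pgn_path) as f:
            while (game := chess.pgn.read_game(f)) is not None:
                board = game.board()
                for played in game.mainline_moves():
                    # Keep positions where the shallow search agrees with the pro...
                    if engine.play(board, chess.engine.Limit(depth=SHALLOW)).move == played:
                        deep_best = engine.analyse(board, chess.engine.Limit(depth=DEEP))["pv"][0]
                        # ...but the deep search finds something clearly better.
                        if deep_best != played and (eval_after(engine, board, deep_best, DEEP) -
                                                    eval_after(engine, board, played, DEEP)) >= GAIN_CP:
                            hits.append((board.fen(), board.san(deep_best)))
                    board.push(played)
        return hits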

I wouldn't be surprised if professional chess players are all running a version of this against individual known opponents before a tournament to probe for weaknesses.

A harder problem would be to cross-reference this final list with the post-game opinions published by professional commentators and identify major discrepancies. This would be the "wouldn't have thought of it in a million years" list.


The list that's returned still mainly contains tactical surprises, where Stockfish at depth 5 misevaluates the position at its search horizon. What I'm trying to say is that some moves in a position aren't tactically surprising (a piece sacrifice, a crazy attacking move, etc.) but positionally surprising (a long maneuver to get a piece to a certain square that I hadn't thought of). These positionally surprising moves aren't captured by this methodology because they don't involve large swings in evaluation as the depth changes.

As to your second point, one issue with how computer chess affects the modern scene is that playing the "best" move in any given position isn't representative of how humans play. Humans carry out plans and evaluate positions to the best of their ability, but the heuristics and procedures they use aren't the same as a computer's. For example, Karjakin didn't prepare for his match against Carlsen last month by playing a bunch of games against Stockfish. Rather, he probably analyzed Carlsen's past games and opening choices to come up with a strategy.

I do think you can come up with a way to prepare against individual known opponents by identifying weaknesses programmatically. You can model a human's approach to chess as a parameterized evaluation (material, king safety, pawn structure, etc.) that takes in the current position and returns a move, alongside Stockfish's evaluation, which returns the "best" move. With this, it's possible that you could build a neural network that learns to play very similarly to a certain player, using their past games as a training set and comparing the chosen move to Stockfish's move. The network could learn to mimic the heuristics that individual uses to make decisions, and playing against this new AI would be great practice for preparing against a specific opponent.
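
A minimal sketch of the supervised half of that idea, in PyTorch (the position encoding, move vocabulary, and layer sizes are all placeholders I made up):

    import torch
    import torch.nn as nn

    N_MOVES = 4096  # from-square x to-square, ignoring promotions for simplicity

    # Toy mimicry model: a position encoded as 12 piece planes on an 8x8 board,
    # classified over a fixed move vocabulary.
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(12 * 8 * 8, 512),
        nn.ReLU(),
        nn.Linear(512, N_MOVES),
    )
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    def train_step(positions, played_moves):
        """positions: (batch, 12, 8, 8) float tensor; played_moves: (batch,) move
        indices taken from the target player's own games."""
        optimizer.zero_grad()
        loss = loss_fn(model(positions), played_moves)
        loss.backward()
        optimizer.step()
        return loss.item()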


I'm not sure I follow your point about tactical vs positional surprise. Surely the ultimate goal of the positional surprise is the same as the tactical surprise - you get an advantage at the end of an expected series of moves. Otherwise what's the point of getting into a surprising position that's not better than the conventional one?

My question is: is there any difference here that can't be solved by, say, upping the ply count?

On humanlike chess-AI: have an adversarial network that works to classify human vs machine players, and optimize for humanness * strength-of-play in the AI?


The difference is that the positional sacrifice is less tangible. A space advantage, a tempo advantage, more mobile pieces, improved cohesion/coordination of pieces (Kasparov was legendary for taking this last kind of advantage and turning it into a lethal attack). It's a dynamic advantage rather than a static/permanent advantage, which also means there's a risk of that advantage dissipating as the game drags on.

These advantages aren't the kind where you can sit back and let the game play out confident of winning. It's a deliberate unbalancing of the equilibrium of the position, and one where this temporary dynamic advantage needs to be used to create a longer-lasting and static advantage.


Would it be fair to say you are trying to optimize for future positions where you aren't sure you will win, but which resemble certain archetypal positions or share certain advantageous features (i.e. have a high probability of transforming into conventionally advantageous situations)?

I'm sure the chess AIs are full of this sort of knowledge internally, though, in the form of evaluation heuristics and search optimizations. Perhaps the issue is translating it into a human-usable format.


Indeed, chess engines do have heuristics to include positional advantage in their evaluation of a board, so they "know" in some way that a doubled pawn is disadvantageous or that development of pieces or attacking central squares is beneficial, much as humans know these things.
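
For illustration, a toy version of two such heuristics (a sketch with python-chess; the weights are invented):

    import chess

    def toy_positional_eval(board):
        """Two toy heuristics: a doubled-pawn penalty and a central-control bonus."""
        score = 0
        for color, sign in ((chess.WHITE, 1), (chess.BLACK, -1)):
            pawns = board.pieces(chess.PAWN, color)
            for file in range(8):
                on_file = sum(1 for sq in pawns if chess.square_file(sq) == file)
                if on_file > 1:
                    score -= sign * 15 * (on_file - 1)  # doubled/tripled pawns hurt
            for sq in (chess.D4, chess.E4, chess.D5, chess.E5):
                score += sign * 5 * len(board.attackers(color, sq))  # central pressure
        return score  # centipawn-ish, positive favors White

    print(toy_positional_eval(chess.Board()))  # 0 by symmetry at the start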

I've never heard experts discuss this, but I bet it's true that human beings still succeed in appreciating many of these benefits at a higher level of abstraction than machines do. An argument for this is that computers needed an extremely large advantage in explicit search depth to be able to beat human grandmasters. So the humans had other kinds of advantages going for them and most likely still do. One of those advantages that seems plausible is more sophisticated evaluation of why a position is strong or weak, without explicit game tree searches.

I looked at the Stockfish code very briefly during TCEC, and it appears that a number of the evaluation heuristics not based on material (captures) are hand-coded from human reasoning about chess positions. But if I understood correctly, they are also running machine learning over huge numbers of simulated games to empirically reweight these heuristics, so if a particular heuristic turns out to help win games, it can be weighted as more valid / higher priority.
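
Schematically, that reweighting might look something like this (a crude self-play sketch, not Stockfish's actual tuning setup; play_game is a stand-in):

    import random

    def tune_weights(weights, play_game, rounds=1000, step=0.1):
        """Crude empirical reweighting: nudge one heuristic weight and keep the
        change if the nudged engine wins a small match against the old one.
        `play_game(w_new, w_old)` is a stand-in returning 1 if w_new's side wins."""
        for _ in range(rounds):
            i = random.randrange(len(weights))
            trial = list(weights)
            trial[i] += random.choice((-step, step))
            wins = sum(play_game(trial, weights) for _ in range(10))
            if wins > 5:
                weights = trial
        return weights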

You could imagine that there are some things that human players know tacitly or explicitly that Stockfish or other engines still have no representation of at all, and they might contribute quite a bit to the humans' strength.


Perhaps the positional sacrifice can be identified by similar means. The most superficial measurement of a position is the material left on the board, so when the superficial measurement and a deeper positional measurement diverge, we have something positional.
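
In code, that comparison might look like this (a sketch with python-chess; the piece values and depth are arbitrary):

    import chess
    import chess.engine

    PIECE_CP = {chess.PAWN: 100, chess.KNIGHT: 300, chess.BISHOP: 300,
                chess.ROOK: 500, chess.QUEEN: 900}

    def material_balance(board):
        """Raw material count in centipawns, positive favors White."""
        return sum(PIECE_CP.get(p.piece_type, 0) * (1 if p.color == chess.WHITE else -1)
                   for p in board.piece_map().values())

    def positional_component(engine, board, depth=11):
        """Deep evaluation minus raw material; a large gap hints at positional compensation."""
        info = engine.analyse(board, chess.engine.Limit(depth=depth))
        deep_cp = info["score"].pov(chess.WHITE).score(mate_score=10000)
        return deep_cp - material_balance(board)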

I think one of Kasparov's games against Karpov in the New York portion of one of their World Championship matches involved Kasparov sacrificing a queen for positional compensation on the black side of a King's Indian. It would be interesting to see what this project thinks of that game.


This is pretty much what I wanted to say.

What counts as a surprising move varies greatly between players and machines. Here's a good example:

http://www.chessgames.com/perl/chessgame?gid=1064780

Capa's move 10 here (Bd7) is completely surprising to the vast majority of players and computers. It breaks most of the standard 'rules' of development and space control. However, it doesn't move the needle in terms of tactical significance at all. To me, that's a surprising move.


Surprising may be a bit subjective. I know that I am all too often surprised by my opponent - and not in a good way. This may be a good way to study what kind of patterns have interesting weaknesses.


Surprising is very subjective. I was playing chess in a cabin full of people and found the checkers game next to me more entertaining - most likely because of the people. My moves were not quite random, but since I wasn't really paying attention they frustrated my opponent, who couldn't make sense of what I was doing. They were either not very logical or so brilliant that he didn't know what I was up to, and it was really getting into his head. Surprising moves? Yes. Good moves? Not so much.


This seems like a perfect candidate to test with some well-known games.

Fischer's 17...Be6 in the Game of the Century, for example.

Or 15...Nxf2 (again Fischer and Byrne): http://www.chessgames.com/perl/chessgame?gid=1008419&kpage=1

It might be interesting to see how many blunders this identifies compared with traditional analysis. (Looking at the "surprising bad move" vs "surprising good move")

In any case, very interesting work. Others have some good ideas for additional checks/considerations, and it would be interesting to see how this evolves!


A good game to test this on is http://www.chessgames.com/perl/chessgame?gid=1139126 .

Tal played the move 19. Rf6!!, which instantly wins the game. However (at least when I tested this a few months ago), Stockfish takes a fairly long time to recognize this - it prefers the more conservative 19. c4 instead.


I just went through the game and man, 19. Rf6!! took me by surprise! What an amazing move!

I also would love to see a chess engine that plays ridiculous moves in the beginning to throw people off guard (i.e. greatly increase the variance in possible evaluations) and then crushes the opponent by playing all the computer moves, while the human player has to tiptoe through a minefield of possible blunders because the position is so wacky.


Note that Stockfish, which this project uses, just won the TCEC computer chess championship.

https://en.wikipedia.org/wiki/TCEC

There are other championships out there, but this is an indication that Stockfish is among the very strongest chess-playing entities in the world.


Hi! I'm the author of this little library. Somebody apparently took it from my reddit thread (https://www.reddit.com/r/chess/comments/5i54lu/how_to_quanti...) and linked it here. Thank you for doing that, I hadn't thought this would interest so many people!

I hope I can answer some of your questions given more time, but I have to say there are some very cool suggestions here and in the reddit thread that I have to keep in mind.


I've been thinking about adding a game-analysis feature to my Tarrasch Chess GUI (https://triplehappy.com). I was toying with the idea of using Stockfish plus a poor engine, as opposed to Stockfish deep plus Stockfish shallow. I was actually going to use my own Tarrasch Toy engine (about 1500 Elo) as a reasonable proxy for a poor human player. My engine has fixed depth, so when set to very low depth it will routinely just do the obvious thing and fall into simple fork traps etc., which would make it possible to auto-generate simple explanatory variations like "Not $bad_move_played_by_my_engine because of $refutation_variation_pointed_out_by_stockfish."
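
The template generation might look roughly like this (a sketch assuming both engines speak UCI through python-chess; my Toy engine would need a UCI wrapper, and a real version would first confirm the weak move actually loses evaluation):

    import chess
    import chess.engine

    def explain_trap(board, weak, strong):
        """Generate a "Not X because of Y" note from a weak engine's choice."""
        bad = weak.play(board, chess.engine.Limit(depth=2)).move
        board.push(bad)
        info = strong.analyse(board, chess.engine.Limit(depth=15))
        board.pop()
        refutation = info.get("pv", [])[:4]  # first few moves of the punishment
        return "Not %s because of %s" % (board.san(bad),
                                         board.variation_san([bad] + refutation))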

Like you, I've noticed that often when looking at engine analysis the obvious human move doesn't appear as one of the first choices, which usually means there is an opportunity to get the machine to show you some tactics.

So essentially it's your idea, modulated by my assumption that it would be difficult to get Stockfish to play sufficiently badly for the Stockfish-deep-plus-Stockfish-shallow approach.

But anyway, congrats to you for actually getting there first and making the idea work.


Really cool idea and interesting visualization! I am working on similar / related stuff if you'd like to collaborate further. dsjoerg at gmail.com. I participated in https://www.kaggle.com/c/finding-elo


You might also add the condition that best-move(deep-search) != best-move(shallow-search). IMO this is what really makes a move surprising - it looks bad, but is good.
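
In code, the condition is something like this (sketch with python-chess):

    import chess.engine

    def looks_bad_but_is_good(engine, board, move, shallow=5, deep=11):
        """The condition above: the deep search picks the move, the shallow search rejects it."""
        shallow_best = engine.play(board, chess.engine.Limit(depth=shallow)).move
        deep_best = engine.play(board, chess.engine.Limit(depth=deep)).move
        return move == deep_best and move != shallow_best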

Btw, I've never seen Rxd4!! from Kasparov-Topalov described as a "natural" move. I'm guessing the author meant "unnatural". :)


I read <irony>.


I think it's a bit of humor.


I wish it had examples! I'd be more interested in a move that complicates the game. In fact, what is the most complicated position in chess? (Legal or not)


Probably positions stemming from the Botvinnik variation of the Semi-Slav. Conventional chess wisdom doesn't really help, and finding a way forward is pure calculation.

Then also, maybe, lines of the "irrational system" of the French Defence, Winawer variation (the Qg4 lines, with Black sacrificing both the g7 and h7 pawns).

Playing with nuances about what "complicated" means, perhaps also Nimzowitsch's "Immortal Zugzwang" game, where White is absolutely helpless on an almost full board of pieces.


The "Legal or not" part of your question is interesting. I am guessing you mean a "legal" board is one which was derived from the starting position following the legal moves. How could this be determined, short of doing a brute force search from the starting position?

This is something I've thought about when looking at chess problems, whether such problems could actually occur in a real game.


I wonder if existing computer chess engines could be modified to do retrograde analysis: find some heuristic for positions that are "obviously" reachable via legal moves, and then see whether the retrograde analysis can get back to one of those.

I have a feeling this is often tractable, because we could probably construct a cooperative scenario involving the sacrifice of arbitrary pieces at the beginning of the game (maybe via a series of knight captures?), and then the main questions are confirming how particular pawns got past each other and ensuring that no stalemates occurred while positioning the kings.


Sounds computationally intensive. It reminds me of an experiment I read about some time ago, tangentially: http://theinvisiblegorilla.com/blog/2012/02/15/how-experts-r...


Thanks for the link! That in turn reminds me of how Fischer advocated randomizing starting positions in order to eliminate the role of opening theory.

https://en.wikipedia.org/wiki/Chess960

(because apart from the structure part that the link you gave mentions, experts seem to know a huge amount of explicit opening theory, in the form of memorized opening lines and particular chess masters' views of their tactical consequences)

I'm getting more curious about the reachability question.

There are some discussions at

https://chess.stackexchange.com/questions/4830/how-many-lega...

https://mathoverflow.net/questions/138133/what-proportion-of...

https://www.chess.com/forum/view/general/how-many-different-...

https://en.wikipedia.org/wiki/Chess#Combinatorics_of_chess_a...

Also related:

https://en.wikipedia.org/wiki/Proof_game

That includes a retrograde solver for openings, but that's not exactly what I was envisioning - I was thinking of a retrograde solver for endgames, which would then need something akin to an inverse tablebase ("yes, this family of positions is known to be legal!" instead of "this position is a win for Black").


It's an interesting question. Sometimes it is easy to check that a position is unreachable.

For instance, a white king on h8 and black rooks on a7 and a8 (unless I'm mistaken).

It almost seems easier to prove a position guilty (when it is) than to verify that it has a perfect record (i.e., that it's legal).
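
python-chess already implements the "guilty" direction as a set of cheap necessary checks; a small sketch (note that it says nothing about reachability from the starting position):

    import chess

    # A white pawn on its own back rank can never occur in a legal game.
    # board.status() flags cheap necessary conditions like this.
    board = chess.Board("k7/8/8/8/8/8/8/K3P3 w - - 0 1")
    print(board.is_valid())                                       # False
    print(bool(board.status() & chess.STATUS_PAWNS_ON_BACKRANK))  # True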


I don't understand what "legal or not" means here. Does that include making the board 1000x bigger and increasing the number of pieces? Seems like by changing the rules of the game you could make a position arbitrarily complicated.


It's meant to be left open to interpretation, but I was thinking of not changing the board size: add pinned pieces, compose tactics, make the game tree as wide as possible.


The starting position?


No, add a pawn.


What? You mean have a non-legal starting position?

Well yeah, I could also add a Rubik's cube to the white pieces.


Fair point. But we'll eventually settle on some rules.

The thought is something like this: people actually compose chess puzzles for fun, without regard to whether they could appear in an actual game; there are even chess puzzle composition competitions.

Could one write a program to make a monster puzzle (or better, the most complicated) with forced moves?

That is: maximize X in "white to move and mate in X moves", restricted to the standard board size and standard chess rules.


My wish was granted!



