AI beats human poker champions

mattmaroon · on July 9, 2008

They at least used a couple very good heads up limit players this time, rather than Phil Laak and some other guy nobody has heard of.

ivankirigin · on July 9, 2008

Would you agree with me that they are playing a different game?

If you remove the human element, the players can't use one of their skills.

mattmaroon · on July 9, 2008

Different than live, really similar to online.

Heads up limit hold'em is also a tiny subset of poker, and (so I've been told) the easiest to program a good AI for. Get that bot beating top players at a 9 handed NL table and I'll stop playing online entirely.

gizmo · on July 9, 2008

I think 5 card draw is the easiest game (for humans and computers alike).

mattmaroon · on July 10, 2008

Hmm, that may be true, but there probably aren't five humans left who still play it for the bot to face.

Actually I'd think 5 card stud would be far easier still.

jm4 · on July 9, 2008

Exactly. When playing against a machine, a human can't use one of his skills, but that skill no longer applies anyway. There's no need to read an opponent or employ deception because a machine doesn't understand this.

Any serious poker player knows that the best strategy for winning is to make your opponent make mistakes while minimizing your own. One of the primary tools for accomplishing this is the use of deception.

Once the game is reduced to calculating the odds at any given moment and betting when they're in your favor, even a simple machine should be able to beat a human as long as it makes reliable calculations.

It should really come as no surprise that a machine can beat a human at a task that machines are really good at.

neilc · on July 9, 2008

> Once the game is reduced to calculating the odds at any given moment and betting when they're in your favor, even a simple machine should be able to beat a human as long as it makes reliable calculations.

You clearly don't have much of a grasp of what makes computer poker difficult. Read the U of A papers on poker AI:

http://www.cs.ualberta.ca/~games/poker/papers/AIJ02.html

http://www.cs.ualberta.ca/~games/poker/papers/ICML99.html

http://www.cs.ualberta.ca/~games/poker/papers/davidson.msc.h...

gizmo · on July 9, 2008

Oh c'mon. Good poker players know all the odds of all hands with an error of at most 2%. Especially in heads-up.

If it's just about number crunching then the first computer AI would be in Omaha Hi-Lo or 7 card stud - because these poker variants have much more complex odds.

jm4 · on July 10, 2008

Hmm... Maybe poker AI is about more than number crunching. Anyway... Interesting tidbit: craps is structured so that the pass line bettor will lose 50.7% of the time. That 0.7% provides casinos with billions in profits every year. I would think that any decent poker player would have to be able to calculate odds with better than a 2% error rate.
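The craps figure quoted above can be checked by direct enumeration. This is a minimal sketch, assuming the standard pass line rules (7 or 11 wins on the come-out roll, 2, 3 or 12 loses, any other total becomes the point and must be rolled again before a 7) and an even-money payout; the function and variable names are made up for illustration.

```python
from fractions import Fraction

# Number of ways to roll each total with two fair six-sided dice.
ways = {total: 6 - abs(total - 7) for total in range(2, 13)}

def pass_line_win_probability() -> Fraction:
    """Exact probability that a pass line bet wins under the assumed rules."""
    p = Fraction(0)
    for total, n in ways.items():
        roll = Fraction(n, 36)
        if total in (7, 11):        # come-out natural: immediate win
            p += roll
        elif total in (2, 3, 12):   # come-out craps: immediate loss
            continue
        else:                       # point is set: must repeat it before a 7
            p += roll * Fraction(n, n + ways[7])
    return p

win = pass_line_win_probability()   # 244/495
lose = 1 - win                      # 251/495, about 50.7%
house_edge = lose - win             # 7/495, about 1.4% of each even-money bet

print(float(win), float(lose), float(house_edge))
```

The loss rate matches the 50.7% in the comment. Under these assumptions the casino's expected take per unit bet is the gap between losing and winning, roughly 1.4%, which is about twice the 0.7% margin over a coin flip.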

mattmaroon · on July 10, 2008

Poker is not craps, and it's certainly not a game where people sit around a table and whoever calculates the odds best gets the money. There's absolutely no parallel there whatsoever.

Have you ever even played?

mattmaroon · on July 9, 2008

It calculates odds by assigning a range of holdings to you based on your play, so deception is still very useful.

jm4 · on July 9, 2008

Not necessarily. Basically, the only deception available is overplaying or underplaying a hand. There are also other factors that determine whether it's worthwhile to bet. I'm no expert, but I think the size of the ante or current bet relative to the pot is going to make a bigger impact. A human can manipulate some of these factors to affect the way a machine plays, but this is at a great risk and various human mistakes are inevitable. The machine may be tricked into betting or folding when it shouldn't, but it can be counted on to get the arithmetic right every time. I'm not sure the risk is worth the potential reward in this situation. I'm thinking a human is going to make far more mistakes than the machine.

mattmaroon · on July 10, 2008

Deception isn't twisting your Oreo cookie one way when you've got the nuts all night, then doing it the same way later when you're bluffing. Deception is playing every hand the same way, such that your opponent can't deduce what you hold from your betting patterns. For instance, though it would clearly be a losing strategy, if someone raised at every possible opportunity, their range would always be 100% of all possible hands and you'd never have the slightest clue what they hold.

This is no less relevant against a bot than it is against a human. Odds are no more relevant against a bot than they are against a human.

The math is so trivial to calculate as to be irrelevant. And if you think humans make far more mistakes than machines, you should try downloading poki and playing a full ring game.
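The point about ranges can be made concrete with a Bayes update. The sketch below is only an illustration: the three hand classes, the prior, and the raise frequencies are invented numbers, not anything taken from Polaris or poki.

```python
def update_range(prior, raise_prob):
    """Bayes update of an opponent's range after seeing a raise.

    prior:      hand class -> probability before the raise
    raise_prob: hand class -> P(raise | hand class), assumed known
    """
    unnormalized = {h: prior[h] * raise_prob[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}

# Hypothetical prior over coarse hand classes.
prior = {"strong": 0.2, "medium": 0.3, "weak": 0.5}

# A straightforward player raises mostly with strong hands.
straightforward = {"strong": 0.9, "medium": 0.3, "weak": 0.0}
# The player from the comment raises at every opportunity.
always_raises = {"strong": 1.0, "medium": 1.0, "weak": 1.0}

narrowed = update_range(prior, straightforward)
unchanged = update_range(prior, always_raises)
print(narrowed)
print(unchanged)
```

Against the straightforward player a raise removes the weak hands from the range entirely. Against the player who always raises, the posterior equals the prior, so the raise carries no information, which is the sense of "deception" described above.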

gizmo · on July 9, 2008

Either you play optimally OR you (semi)bluff.

You can't have both. If you play optimally you lose. If you only bluff you lose. So you have to find a balance. And when you've found it, the other player will change his style so you're in the dark once again.

And the deception also affects your image (your perceived playing style). So the bets you make affect future hands, and the expected profit from future hands. That's what makes the game interesting.

What you describe is a game where you play against one of N players each hand, and you don't know which one. And every hand you get a different anonymous opponent. So basically you have no prior knowledge about your opponent, and your opponent doesn't know you. That would be a really boring game, and mostly number crunching.

mattmaroon · on July 10, 2008

So the bot can probably play optimally better than a human, but optimal play just ends up in a draw (or a loss if there's rake) for both parties.

It sounds like they've finally gotten the bot to where it can exploit human tendencies. If so, that's seriously impressive.

globalrev · on July 9, 2008

Game theory dude.

aggieben · on July 9, 2008

I would expect this to be true of any probability-based game, particularly where:

1) the distinct human advantage of unquantifiable sensory perception is taken away (i.e., a person can't "read" a computer's body language to make his probabilistic estimations more accurate), thus reducing the game to a number-crunching contest. We haven't been winning those contests since the 40s.

2) enough time is injected into the experiment.

gizmo · on July 9, 2008

You have clearly never played poker.

Suppose the PC has 2 aces. It knows it has the best hand (worst case: tie against the last two aces), so it wants to put all the money in. However, the human will just fold because the computer is so predictable. Result: the PC essentially wastes its big hands because of the predictable behavior.

So the computer has to start bluffing, and semibluffing, just like humans. However, with every bluff you make you're wasting money (as bluffs are by definition sub-optimal play). So you have to estimate how much you gain in the future by making that bluff. But if the PC uses simple heuristics then the human can once again get the upper hand. E.g. the human can make outrageous bluffs, knowing that the PC knows the bluffs are unprofitable and therefore unlikely. That way the PC will lose once again.

Poker is not trivial - and people without any number crunching skills can win from the best number crunchers if they're predictable.

aggieben · on July 9, 2008

> You have clearly never played poker.

What is it with people that think this kind of quip buys some sort of credibility? This is the second time today that someone has started a response with "Clearly, you have never...". I have played poker. I have also baked (reference to other thread).

Anyway, I don't buy what you're selling. It doesn't seem that it would be hard to make the computer be unpredictable, unless suddenly humans become good at figuring out the machinations of a pseudo-random number generator instinctively, and in real-time. Make the computer do irrational things at random intervals. The whole point is that a human player, not knowing what cards the computer has, will have little introspection into the computer's intended course of action. I suppose you could tell some things, depending on the rules of the game (i.e., if the computer traded 3 cards in a 5-card hand, you might assume he had a bad hand).

Essentially, it seems that you just said "it's easy to beat the computer if it plays stupid". That doesn't have anything to do with this. The computer can be both a perfect number cruncher and entirely unpredictable, and given enough dealt hands, I don't see how you could beat it.

gizmo · on July 9, 2008

The remark wasn't meant to establish my credibility; I was just saying you make elementary mistakes. My hunch is you have no experience playing poker. This doesn't necessarily make you wrong, but it doesn't exactly work to your advantage.

Are you really saying that the computer doesn't need to use any psychology - and can become "world champion" just by playing a mathematically perfect game with some random bluffs thrown in?

You've got to be kidding me.

(Just for the record - I'm talking NL hold'em here. If you're talking about 5 card draw you have a point, as that game doesn't have the same opportunity for psychology.)

aggieben · on July 9, 2008

Well, if it's any comfort, your hunch is correct. I'm not very experienced.

My observation, perhaps incorrect, is that poker outcomes are almost entirely based on psychological manipulation, not on probabilities - because people can be psychologically manipulated. In this regard, poker is like Risk. There's an element of probability, but it only rarely tends to dominate outcomes (I am an experienced Risk player). It's possible to write Risk-playing software that is very difficult to beat except in the rarest of circumstances where luck dominates, because you can't convince the computer to invade Asia or fortify in Australia when it would really be against his interests. I see poker in the same way.

Perhaps I'm over-trivializing the difficulty of writing such a system for poker, but I suppose in essence, I am indeed arguing that you could create software that could just ignore the psychological aspects of the game, and instead play a total numbers game, and that over time it should be a winner. I guess I can concede that if you don't ignore the psychological aspects and try to use them in formulating strategy, that would be very difficult indeed; I just don't see the point.

gizmo · on July 9, 2008

Thanks for explaining your argument.

Let's suppose you're right - that playing by the numbers is sufficient. In this case you would not have to keep track of player histories, as you consider every hand without context. But this can only result in one thing: your play style will be constant. You will always take the best action with probability P and (semi)bluff with probability 1 - P. The human notices this and starts playing very aggressively. As a result the PC will fold winning hands and lose many blinds. What the PC should do is change tactics. Get aggressive or start laying traps (e.g. slowplay a great hand so it looks weak, to lure the aggressive human player into bluffing a lot of money).

If the computer always plays optimally from hand to hand it cannot deduce changes in playing style and will therefore always lose. From this I can only conclude that psychological aspects can never be ignored in poker.

dejb · on July 10, 2008

Actually this is not correct. The perfect unbeatable game theoretical model actually doesn't take into consideration the history of the other player. There's maths and stuff by John Nash to prove this. The game theory perfect model will always play as if it is playing a perfect opponent.

In a way you could say that the mathematically perfect player is so on guard against tricks by an opponent that even after 20 consecutive raises by an opponent it still doesn't alter its strategy. Of course the strategy will not lose just because a player raises once, and consequently it also won't lose to a player who raises all the time. However, by doing this the algorithm will miss out on substantial chances to exploit an imperfect opponent's weaknesses (in this case being way too aggressive). As a poker pro there is a (small) chance this 'perfect' player wouldn't even beat the rake against fish. However it would not lose (on average) to the best player in the world (and would probably win).

This might all seem to go against your intuition and experience as someone who is probably a really good poker player but it has been mathematically proven as part of game theory. Of course finding/creating this 'perfect' playing strategy is probably more complicated than 'solving' chess which most people believe to be impossible.

gizmo · on July 10, 2008

You're right. History doesn't matter and cannot possibly matter in a perfect game. After all, by using the same strategy for more than a few hands the player gives information away unnecessarily. So in a perfect game people will switch tactics at any moment, which removes the history element altogether.

So the reason why changing tactics every hand is ineffective in the real world is because people believe that the way you played hands before gives information about your play in the future: your table image. But you only work on your table image in the first place so you can exploit it later. When playing against a perfect opponent you won't sacrifice opportunity to maintain an image, because the opponent will always play in the same way (the opponent knows you can change tactics at any moment).

However! The perfect strategy is still absolutely useless in a tournament and most other games. Because exploiting weakness and amassing chips is the objective for the first hours, and the perfect model is bad for that.

Still, it's all fascinating. Appreciate your explanation.

aggieben · on July 9, 2008

I never really meant that the calculations should be without context. One could keep a "running distribution" to update P (probably a bunch of Ps, actually) as you play.
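One way to read the "running distribution" idea is as a Beta-Bernoulli estimate that is updated after every showdown. This is a sketch under that assumption; the class name, the uniform prior, and the idea of only counting hands that reach showdown are illustrative choices, not something described in the thread.

```python
class BluffEstimate:
    """Running estimate of how often an opponent's bet turns out to be a bluff.

    Stores Beta(bluffs, value) pseudo-counts; starting at (1, 1) is a
    uniform prior over the unknown bluff frequency.
    """

    def __init__(self, prior_bluffs: float = 1.0, prior_value: float = 1.0):
        self.bluffs = prior_bluffs
        self.value = prior_value

    def observe(self, was_bluff: bool) -> None:
        """Record one showdown where the opponent had bet."""
        if was_bluff:
            self.bluffs += 1
        else:
            self.value += 1

    @property
    def mean(self) -> float:
        """Posterior mean of the bluff frequency."""
        return self.bluffs / (self.bluffs + self.value)


estimate = BluffEstimate()
for was_bluff in [True, True, False, True]:
    estimate.observe(was_bluff)
print(estimate.mean)  # (1 + 3) / (2 + 4)
```

A real bot would presumably keep one such estimate per situation (street, position, bet sequence), which is the "bunch of Ps" mentioned above.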

Maybe we're now just looking at the same thing from different angles. You say "use psychology" and I say "use numbers", but in the end, maybe it's the same thing and it's harder than I'm giving it credit for.

Thanks for the poker lesson, by the way. Next time I see a game I'll be more interested.

gizmo · on July 9, 2008

Ah-hah. But that's the problem. So you need some kind of distribution for each of the different tactics the player plays. And then you need some heuristics to predict which tactic is going to follow which tactic. It's a slippery slope, I guess.

I don't know where to draw the line myself. It can be argued that as soon as something is modeled by a computer it ceases to be psychology and becomes mundane math. But I do still believe that most tactics people use to trick others with need to show up in the code in some way.

I won't use the "Clearly you've never done x" line again. It was cheap and uncalled for. Thanks for calling me out on that.

mattmaroon · on July 10, 2008

So psychology and math in poker are sort of two sides of the same coin. Generally, psychology means narrowing your opponent's range of possible holdings as much as possible, while preventing him from doing the same to you. You do this to enable yourself to make better math-based decisions, and cause him to make worse ones.

If you can accurately pinpoint your opponent's range, and predict how he'd react with the hands in that range on future rounds, the game becomes a rather trivial math problem for a computer. It can simply enumerate all of the possibilities and determine the correct course of action.

Unfortunately narrowing a person's range and predicting how they'd play in future rounds is very hard to do for a human, and presumably even harder for a computer.
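Once a range is pinned down, the remaining arithmetic really is small. The sketch below computes the expected value of a call on the last betting round, assuming no further action and made-up equities against each part of the range; `call_ev` and its parameters are hypothetical names.

```python
def call_ev(range_probs, equity_vs, pot, to_call):
    """Expected value of calling a bet on the final round.

    range_probs: hand class -> probability the opponent holds it
    equity_vs:   hand class -> our chance of winning against it
    pot:         chips already in the pot, including the opponent's bet
    to_call:     chips we must put in to call
    Returns (ev, equity) measured relative to folding (EV 0).
    """
    equity = sum(p * equity_vs[h] for h, p in range_probs.items())
    ev = equity * pot - (1 - equity) * to_call
    return ev, equity

# Hypothetical numbers: the opponent is betting a strong hand half the time.
opponent_range = {"strong": 0.5, "bluff": 0.5}
our_equity = {"strong": 0.1, "bluff": 0.9}

ev, equity = call_ev(opponent_range, our_equity, pot=3, to_call=1)
break_even = 1 / (3 + 1)  # equity needed to call: to_call / (pot + to_call)
print(ev, equity, break_even)
```

The hard part, as the comment says, is producing `opponent_range` in the first place; everything after that is a weighted sum.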

MaysonL · on July 9, 2008

If you think it's so easy, Poker Academy has an API...

cousin_it · on July 9, 2008

The computer's strategy does not have to be deterministic - it can randomize. Moreover, even deterministic strategies aren't necessarily predictable. For example: bad hand - always bluff, moderate hand - always fold, good hand - always go ahead. This is actually a Nash equilibrium strategy for one of the simplified poker models. Within that model, no human player can repeatably win against this strategy, and yes, it's a provable mathematical fact that has nothing to do with psychology.

gizmo · on July 9, 2008

If you have a link, I'd like to read it. I can't think of any Nash equilibrium where bluffing all bad hands works.

cousin_it · on July 10, 2008

The Von Neumann poker model. See http://www.math.ucla.edu/~tom/papers/poker1.pdf , pages 7-8, theorem 2. In a nutshell, it works because the opponent can't be sure if you have a very bad hand or a very good hand. Yes, it's a simplified model, but my game theory book says a computer analysis of straight poker gives the same result: when given the worst possible hand, you should always bluff.
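The "can't be sure if you have a very bad hand or a very good hand" argument can be illustrated with an even simpler toy game than the Von Neumann model in the linked paper. Assume the bettor holds either an unbeatable hand or a hopeless one, and the caller holds a hand that beats only bluffs. The indifference conditions then pin down both frequencies. The function names below are made up for this sketch.

```python
from fractions import Fraction

def equilibrium_frequencies(pot, bet):
    """Indifference frequencies in the toy 'nuts or nothing' betting game."""
    pot, bet = Fraction(pot), Fraction(bet)
    bluff_share = bet / (pot + 2 * bet)  # fraction of all bets that are bluffs
    call_freq = pot / (pot + bet)        # how often the caller must call
    return bluff_share, call_freq

def caller_ev_of_calling(pot, bet, bluff_share):
    """Caller wins pot + bet against a bluff, loses bet against the nuts."""
    return bluff_share * (pot + bet) - (1 - bluff_share) * bet

def bluffer_ev_of_bluffing(pot, bet, call_freq):
    """Bluffer wins the pot when the caller folds, loses bet when called."""
    return (1 - call_freq) * pot - call_freq * bet

bluff_share, call_freq = equilibrium_frequencies(pot=2, bet=1)
print(bluff_share, call_freq)  # 1/4 2/3
print(caller_ev_of_calling(2, 1, bluff_share))   # 0
print(bluffer_ev_of_bluffing(2, 1, call_freq))   # 0
```

At these frequencies neither side gains by deviating: calling and folding are worth the same to the caller, and bluffing and giving up are worth the same to the bettor. The bluffs are what force the caller to pay off the good hands at all.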

mattmaroon · on July 10, 2008

That might be optimal from a certain standpoint, but it's not optimal at all for exploiting another human's weaknesses. Your goal in poker isn't to break even or even to win, it's to win the most possible.

schtog · on July 10, 2008

No one said it wasn't. The fact is that a game-theoretically optimal player will beat all players that don't play game-theoretically optimally.

So yes, you are right that a perfectly playing bot might not beat a really weak player for as much as the best human players would, but that is not the question here. The question is whether a bot can beat the best human players, and if it plays GT-optimal and the human doesn't, well, then the bot will win.

And it doesn't matter if human players can still beat the weak players for more. If people find out they are playing against robots they will quit playing (well, I'd assume they would). The goal of malicious bots wouldn't be to beat the high-stakes games but rather to fish out the small ponds. And then the pyramid falls and you will have to get a real job ;)

mattmaroon · on July 13, 2008

Isn't it incredibly difficult to build a bot that plays game-theoretically optimal, other than in rather simple situations (like large portions of a heads up limit match)? My friends who understand such things tell me not to worry about it just yet.

And nothing can make me get a real job.

schtog · on July 14, 2008

It might be. I am not THAT knowledgeable about game theory and poker AI; I was just straightening out the concepts.

From what I understand 3-player poker is much more like 9-player poker than HU and NL is much more complicated than limit.

dangoldin · on July 9, 2008

Well, the fact that the machine is using probability and is playing rationally can be used by the opposing player.

llimllib · on July 9, 2008

more at http://www.overcomingbias.com/2008/07/poker-vs-chess.html and http://www.cs.ualberta.ca/~games/poker/man-machine/.

mattmaroon · on July 9, 2008

"far more people play poker than chess"

Is that true? Certainly in the US, but worldwide?

coglethorpe · on July 9, 2008

Well, I've heard that many online players are from Sweden, and I think tour players as well. Other countries have poker rooms and allow online gambling openly.

schtog · on July 10, 2008

Most people here need to look into what OPTIMAL in a game-theoretic sense means.

It doesn't mean optimal in the way people normally use it.

A player playing game-theoretically optimal poker will beat ALL other players. That is, however, not the same thing as playing the most exploitative poker against every single player.

So the perfect bot would beat every other player in the world, including the best human. But if the bot and the best human each play the worst poker player in the world, the best human might beat that player for more money than the bot would.

mattmaroon · on July 10, 2008

Won't a player playing game-theoretically optimal tie with all other players? That's what my math-nerd poker friends all tell me.

schtog · on July 10, 2008

In roshambo, for example, that is the case, but not in poker.

Poker has dominated strategic choices, so someone playing game-theoretically optimally will beat someone who doesn't.

Of course if his opponent is also playing GT-optimally they will tie.

Then of course there is rake to consider so if the opponent plays close enough to GT-optimal you might both still lose if the rake is high.
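The roshambo half of this claim is easy to check numerically. A minimal sketch, using the usual win = +1, loss = -1, tie = 0 payoffs; the names are illustrative.

```python
from fractions import Fraction

# PAYOFF[a][b] is the row player's payoff; 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def expected_payoff(row, col):
    """Expected payoff to the row player for two mixed strategies."""
    return sum(row[i] * col[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))

uniform = [Fraction(1, 3)] * 3   # the equilibrium strategy
always_rock = [1, 0, 0]          # a maximally exploitable opponent
always_paper = [0, 1, 0]         # the best exploitative reply to it

print(expected_payoff(uniform, always_rock))       # 0
print(expected_payoff(always_paper, always_rock))  # 1
```

The uniform strategy only ties even against an opponent who always throws rock, because no roshambo move is dominated. An exploitative player wins every round against the same opponent, which is the gap between "optimal" and "most profitable" being discussed in this subthread.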

steveplace · on July 10, 2008

I'm really surprised no one has yet to mention this:

codingthewheel.com is currently going through an explanation on how to build a winning poker bot online. Here's the latest post.

http://www.codingthewheel.com/archives/how-i-built-a-working...

wallflower · on July 9, 2008

> BioTools Inc. (Edmonton, Alberta) has built previous versions of Polaris into a downloadable poker coach called the Poker Academy.

lpgauth · on July 9, 2008

Wonder if this was no-limit hold'em? If it is, it's impressive. If not, well, the probabilities are not that challenging.

MaysonL · on July 9, 2008

Duplicate limit hold-em (same hands played at other table, with human and machine reversed)