For those interested in the intersection of computer science and economics, I highly recommend the freely available Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations by Yoav Shoham and Kevin Leyton-Brown [1].
Using Rock-Paper-Scissors and an AI engine I stumbled across a bit of a flaw in the textbook answer around this game.
The solution is not necessarily that both players will play all three options randomly. It is only necessary for ONE player to play randomly in order to reach an equilibrium. The other player at that point is free to pick any strategy they choose, including playing a single option 100% of the time.
I’m pretty sure that’s incorrect. I haven’t had time to look through the text yet but it sounds like you’re confusing the one-shot equilibrium (1/3 probability of playing any one of the three available strategies), each deploying simultaneously, and the iterated game scenario (in which one player could be random, and you argue the other could continually play the same strategy repeatedly, but then the other player would latch on and play the matching countermove, pushing them both back towards randomness).
EDIT: It’s not nice to vote people down just because they politely point out you’re demonstrably wrong.
Lets do then.
Player 1 randomizes (1/3, 1/3, 1/3). Player 2 randomizes (1, 0, 0). Does player 1 want to deviate? Yes! Randomizing is not the best strategy if player 2 is going to be playing the same pure strategy all the time, player 1 would be better off if she were to play the best response to whatever (1, 0, 0) is. -> Not an equilibrium.
You’re mixing up equilibrium payoff and Nash Equilibrium (which is a strategy profile). Yes, the payoffs of player A randomizing and player B playing scissors is the same as both players randomizing, but observe that this profile of strategies fails the ex-post facto definition of Nash Equlibrium, i.e. in retrospect player A would want to change their strategy choice to “play rock every time.”
I’m going to sound very arrogant but: I don’t need to set up the scenario you describe because optimal strategies in identical games have been proven since the 1940s when Oskar Morgenstern and John Von Neumann published their seminal Theory of Games and Economic Behaviour (1944).
If you are getting results that are in conflict with that which is mathematically proven (provided certain circumstantial criteria are met), then you can be sure that you have not exactly matched the circumstantial criteria assumed as a perquisite for the proof. So basically you’re in another situation where the theorem makes no claim to hold and, unsurprisingly, doesn’t.
That's ok, except that you really have a one-player game now.
If the first player is locked into playing randomly forever, irrespective of what the second player does, then for the remaining player any strategy is an "equilibrium" and can't be improved (or worsened). We're just talking about roulette or something.
But I'm a little disappointed that someone would downvote the comment above. The "something is wrong with the theory" claim is very premature (ok, it's wrong), but the important thing is now we are fighting with the text instead of just citing it. That's a step forward, not a step back.
That's not an equilibrium; as soon as one player deviates from a true 1/3 random strategy, the other player has a theoretical ability to punish.
But note the relationship between computation and exploitation here: if the deviating player makes its deviation difficult to detect (e.g. using a PRNG stream cipher, which is completely deterministic but may look random), then the cost of making good predictions to take advantage of the deviation may exceed the payoff.
The point is that if your strategy becomes a known quantity, it can be strictly dominated by your opponent choosing the strategy that wins against it. This would push you both back towards random choice (whether iterated or one-off).
A Nash equilibrium says that each player plays optimally given what all the other players do. If one player plays deterministically, it's no longer optimal for the other one to play randomly, so it's not an equilibrium.
If all players write down the strategy they are going to play on a sheet of paper and show them to all other players (can include randomisation) then no player would have a strict incentive to alter their strategy.
With rock paper scissors, if me and my opponent both write down 1/3,1/3,1/3 randomisation, then neither of us would strictly benefit from doing something else.
Your [1/3,1/3,1/3], [1,0,0] is not a Nash Equilibrium. Whilst the second player does not have a strict incentive to change strategies, the first player can switch to [0,0,1] and win every time.
This is generally true for Nash equilibria. If a a player in the equilibrium plays a randomized combination of strategies, then each pure strategy in the combination must have the same expected payoff. Otherwise he could profit by switching to only playing the highest-value one.
The abstract makes it sound not highly technical. You may be fine with just linear algebra and calculus 1.
Edit- quote from the introduction:
“The technical material includes logic, probability theory, game theory, and optimization.[...]the goal has been to gather the most important elements from each discipline and weave them together into a balanced and accurate introduction to this broad field. The intended reader is a graduate student or an advanced undergraduate, prototypically, but not necessarily, in computer science”
[1] http://www.masfoundations.org/mas.pdf