Thursday, December 25, 2008

The General Two-Person, Zero-Sum Game

The games in this chapter have no equilibrium points; if you are to win anything like the amount that you should, you must start second-guessing your opponent in earnest.

A strategy that prescribes the selection of a pure strategy by means of random device is called a mixed strategy.

Once a player resorts to randomizing, not only can he/she not lose (on average), no matter how well an opponent plays; he/she will not gain, however badly an opponent plays. The more capable your opponent, then, the more attractive the randomizing procedure.

von Meumann's Minimax Theorem

One can assign to every finite, two-person, zero-sum game a value V: the average amount that player I can expect to win from player II if both players act sensibly. This predicted outcome is plausible, for three reasons:
  1. Player I will not settle for anything less than V.
  2. Player I can prevented from getting more than V.
  3. Since the game is zero-sum, player II is motivated to limit I's average return to V.
We can treat all two-person, zero-sum games as though they had equilibrium points. The only difference between games with actual equilibrium points and those without them is that in one case you can use a pure strategy and obtain the value of the game with certainty, while in the other case you must use a mixed strategy and you obtain the value of the game on average.

Calculating Mixed Strategies

A strategy may be dominated by either a pure strategy or a mixed one.

  1. Calculate the maximin and minimax - if they are equal, this is a game with exact equilibrium point(s). Otherwise, go on to step 2.
  2. Eliminate all strategies that are dominated.
  3. Assign probabilities to each of your strategies so that the outcome of the game will be the same on average whatever your opponent does. Assume your opponent does the same; if the outcome when you use this mixed strategy is the same as the outcome when your opponent uses his/her mixed strategy and if all probabilities are nonnegative, you have solved the game.
If there is a gap between the outcomes or if some of the probabilities are negative, reexamine the games for dominated strategies; if there are none, then this method failed.

For example,




Your Opponent


D
E

A
15
10

B
6
15
You
C
5
20

1/5 A and 4/5 C7
18

p A and (1-p)C
15*p + 5*(1-p)10*p + 20*(1-p)
Your strategy B is dominated by the mixed strategy of A and C, such as play A 1/5 of the time and C 4/5 of the time. Then B is eliminated.

Then let's assume that you play A with probability p and C with (1-p), and your opponent plays D with probability r and E with (1-r). We need 15*p + 5*(1-p) = 10*p + 20*(1-p). Therefore, p = 3/4 and the corresponding payoff is 12.5 on average.

By a similar calculation, you find that your opponent should play each of his/her strategies half the time and the payoff is the same 12.5.

Notes

Player I can be stopped from getting any more; moreover, player II is motivated to stop I from getting more. If I chooses another strategy, he/she is gambling. If II also gambles, there is no telling what will happen. The minimax strategies are attractive in that they offer security. The appeal of security is a matter of personal taste.

If should be emphasized that the minimax strategy is essentially defensive, and when you use it, you often eliminate any chance of doing better.

Experimental studies show that people approach minimax strategies by "learning" during the game. They "learned" how to react effectively to the specific behavior of the experimenter, and adopt defensive strategies when meeting clever opponent or aggressive ones to exploit the others.

The weakest part of the theory is undoubtedly the assumption that a player should always act so as to maximize the average payoff. The justification for the assumption is that, in the long run, not only the average return but the actual return will be maximized. This is because any tricks or pure strategy will be recognized and exploited by the opponent. But if a game is played only once, ofter a strategy that maximizes the average return is not desirable, much less compelling.

In summary,
  1. Minimax strategy is secure but defensive, thus not always be the choice of people (depends on the level of the opponent).
  2. It relies on the concept of average payoff, which implicitly assumes repeated games. For game just goes once, it may not be suitable.
  3. The pre-requirement of zero-sum is strict and not always be the truth in real life. For this issue, refer to the next chapter of utility theory.

Some Thoughts

  1. Do we really need to eliminate dominated strategies? Maybe after solving the maxtrix, their corresponding probability equals to 0.
  2. The method may fail. Then what to do next?
  3. The definition of payoff matrix may be a sujective process.

No comments: