Computing Minmax; Dominance - PDF Free Download

Computing Minmax; Dominance CPSC 532A Lecture 5 Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 1

Lecture Overview 1 Recap 2 Linear Programming 3 Computational Problems Involving Maxmin 4 Domination 5 Fun Game 6 Iterated Removal of Dominated Strategies Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 2

What are solution concepts? Solution concept: a subset of the outcomes in the game that are somehow interesting. There is an implicit computational problem of finding these outcomes given a particular game. Depending on the concept, existence can be an issue. Solution concepts we ve seen so far: Pareto-optimal outcome Pure-strategy Nash equilibrium Mixed-strategy Nash equilibrium Other Nash variants: weak Nash equilibrium strict Nash equilibrium maxmin strategy profile minmax strategy profile Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 3

Mixed Strategies Define a strategy s i for agent i as any probability distribution over the actions A i. pure strategy: only one action is played with positive probability mixed strategy: more than one action is played with positive probability these actions are called the support of the mixed strategy Let the set of all strategies for i be S i Let the set of all strategy profiles be S = S 1... S n. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 4

Best Response and Nash Equilibrium Our definitions of best response and Nash equilibrium generalize from actions to strategies. Best response: s i BR(s i) iff s i S i, u i (s i, s i) u i (s i, s i ) Nash equilibrium: s = s 1,..., s n is a Nash equilibrium iff i, s i BR(s i ) Every finite game has a Nash equilibrium! [Nash, 1950] Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 5

Maxmin and Minmax Definition (Maxmin) The maxmin strategy for player i is arg max si min s i u i (s 1, s 2 ), and the maxmin value for player i is max si min s i u i (s 1, s 2 ). Definition (Minmax, 2-player) In a two-player game, the minmax strategy for player i against player i is arg min si max s i u i (s i, s i ), and player i s minmax value is min si max s i u i (s i, s i ). We can also generalize minmax to n players. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 6

Minmax Theorem Theorem (Minimax theorem (von Neumann, 1928)) In any finite, two-player, zero-sum game, in any Nash equilibrium each player receives a payoff that is equal to both his maxmin value and his minmax value. 1 Each player s maxmin value is equal to his minmax value. By convention, the maxmin value for player 1 is called the value of the game. 2 For both players, the set of maxmin strategies coincides with the set of minmax strategies. 3 Any maxmin strategy profile (or, equivalently, minmax strategy profile) is a Nash equilibrium. Furthermore, these are all the Nash equilibria. Consequently, all Nash equilibria have the same payoff vector (namely, those in which player 1 gets the value of the game). Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 7

Saddle Point: Matching Pennies Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 8

Linear Programming A linear program is defined by: a set of real-valued variables a linear objective function a weighted sum of the variables a set of linear constraints the requirement that a weighted sum of the variables must be greater than or equal to some constant Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 10

Linear Programming Given n variables and m constraints, variables x and constants w, a and b: maximize subject to n w i x i i=1 n a ij x i b j i=1 x i {0, 1} j = 1... m i = 1... n These problems can be solved in polynomial time using interior point methods. Interestingly, the (worst-case exponential) simplex method is often faster in practice. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 11

Computing equilibria of zero-sum games minimize U1 subject to u 1 (a 1, a 2 ) s a2 2 U 1 a 1 A 1 a 2 A 2 2 = 1 a 2 A 2 s a2 s a 2 2 0 a 2 A 2 variables: U1 is the expected utility for player 1 s a2 2 is player 2 s probability of playing action a 2 under his mixed strategy each u 1 (a 1, a 2 ) is a constant. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 13

Computing equilibria of zero-sum games minimize U1 subject to u 1 (a 1, a 2 ) s a2 2 U 1 a 1 A 1 a 2 A 2 2 = 1 a 2 A 2 s a2 s a 2 2 0 a 2 A 2 s 2 is a valid probability distribution. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 13

Computing equilibria of zero-sum games minimize U1 subject to u 1 (a 1, a 2 ) s a2 2 U 1 a 1 A 1 a 2 A 2 2 = 1 a 2 A 2 s a2 s a 2 2 0 a 2 A 2 U 1 is as small as possible. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 13

Computing equilibria of zero-sum games minimize U1 subject to u 1 (a 1, a 2 ) s a2 2 U 1 a 1 A 1 a 2 A 2 2 = 1 a 2 A 2 s a2 s a 2 2 0 a 2 A 2 Player 1 s expected utility for playing each of his actions under player 2 s mixed strategy is no more than U 1. Because U 1 is minimized, this constraint will be tight for some actions: the support of player 1 s mixed strategy. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 13

Computing equilibria of zero-sum games minimize U1 subject to u 1 (a 1, a 2 ) s a2 2 U 1 a 1 A 1 a 2 A 2 2 = 1 a 2 A 2 s a2 s a 2 2 0 a 2 A 2 This formulation gives us the minmax strategy for player 2. To get the minmax strategy for player 1, we need to solve a second (analogous) LP. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 13

Computing Maxmin Strategies in General-Sum Games Let s say we want to compute a maxmin strategy for player 1 in an arbitrary 2-player game G. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 14

Computing Maxmin Strategies in General-Sum Games Let s say we want to compute a maxmin strategy for player 1 in an arbitrary 2-player game G. Create a new game G where player 2 s payoffs are just the negatives of player 1 s payoffs. The maxmin strategy for player 1 in G does not depend on player 2 s payoffs Thus, the maxmin strategy for player 1 in G is the same as the maxmin strategy for player 1 in G By the minmax theorem, equilibrium strategies for player 1 in G are equivalent to a maxmin strategies Thus, to find a maxmin strategy for G, find an equilibrium strategy for G. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 14

Domination Let s i and s i be two strategies for player i, and let S i be is the set of all possible strategy profiles for the other players Definition s i strictly dominates s i if s i S i, u i (s i, s i ) > u i (s i, s i) Definition s i weakly dominates s i if s i S i, u i (s i, s i ) u i (s i, s i) and s i S i, u i (s i, s i ) > u i (s i, s i) Definition s i very weakly dominates s i if s i S i, u i (s i, s i ) u i (s i, s i) Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 16

Equilibria and dominance If one strategy dominates all others, we say it is dominant. A strategy profile consisting of dominant strategies for every player must be a Nash equilibrium. An equilibrium in strictly dominant strategies must be unique. Consider Prisoner s Dilemma again not only is the only equilibrium the only non-pareto-optimal outcome, but it s also an equilibrium in strictly dominant strategies! Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 17

Traveler s Dilemma Two travelers purchase identical African masks while on a tropical vacation. Their luggage is lost on the return trip, and the airline asks them to make independent claims for compensation. In anticipation of excessive claims, the airline representative announces: We know that the bags have identical contents, and we will entertain any claim between $180 and $300, but you will each be reimbursed at an amount that equals the minimum of the two claims submitted. If the two claims differ, we will also pay a reward R to the person making the smaller claim and we will deduct a penalty R from the reimbursement to the person making the larger claim. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 19

Traveler s Dilemma Action: choose an integer between 180 and 300 If both players pick the same number, they both get that amount as payoff If players pick a different number: the low player gets his number (L) plus some constant R the high player gets L R, R = 5. Play this game once with a partner; play with as many different partners as you like. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 20

Traveler s Dilemma Action: choose an integer between 180 and 300 If both players pick the same number, they both get that amount as payoff If players pick a different number: the low player gets his number (L) plus some constant R the high player gets L R, R = 5. Play this game once with a partner; play with as many different partners as you like. Now set R = 180, and again play with as many partners as you like. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 20

Traveler s Dilemma What is the equilibrium? Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 21

Traveler s Dilemma What is the equilibrium? (180, 180) is the only equilibrium, for all R 2. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 21

Traveler s Dilemma What is the equilibrium? (180, 180) is the only equilibrium, for all R 2. What happens? Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 21

Traveler s Dilemma What is the equilibrium? (180, 180) is the only equilibrium, for all R 2. What happens? with R = 5 most people choose 295 300 with R = 180 most people choose 180 Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 21

Dominated strategies No equilibrium can involve a strictly dominated strategy Thus we can remove it, and end up with a strategically equivalent game This might allow us to remove another strategy that wasn t dominated before Running this process to termination is called iterated removal of dominated strategies. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 23

Iterated Removal of Dominated Strategies: Example L C R U 3, 1 0, 1 0, 0 M 1, 1 1, 1 5, 0 D 0, 1 4, 1 0, 0 Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 24

Iterated Removal of Dominated Strategies: Example L C R U 3, 1 0, 1 0, 0 M 1, 1 1, 1 5, 0 D 0, 1 4, 1 0, 0 R is dominated by L. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 24

Iterated Removal of Dominated Strategies: Example L C U 3, 1 0, 1 M 1, 1 1, 1 D 0, 1 4, 1 Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 25

Iterated Removal of Dominated Strategies: Example L C U 3, 1 0, 1 M 1, 1 1, 1 D 0, 1 4, 1 M is dominated by the mixed strategy that selects U and D with equal probability. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 25

Iterated Removal of Dominated Strategies: Example L C U 3, 1 0, 1 D 0, 1 4, 1 Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 26

Iterated Removal of Dominated Strategies: Example L C U 3, 1 0, 1 D 0, 1 4, 1 No other strategies are dominated. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 26

Iterated Removal of Dominated Strategies This process preserves Nash equilibria. strict dominance: all equilibria preserved. weak or very weak dominance: at least one equilibrium preserved. Thus, it can be used as a preprocessing step before computing an equilibrium Some games are solvable using this technique. Example: Traveler s Dilemma! What about the order of removal when there are multiple dominated strategies? strict dominance: doesn t matter. weak or very weak dominance: can affect which equilibria are preserved. Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 27