Walras-Bowley Lecture 2003

Size: px

Start display at page:

Download "Walras-Bowley Lecture 2003"

Coleen Jones
5 years ago
Views:

1 Walras-Bowley Lecture 2003 Sergiu Hart This version: September 2004 SERGIU HART c 2004 p. 1

2 ADAPTIVE HEURISTICS A Little Rationality Goes a Long Way Sergiu Hart Center for Rationality, Dept. of Economics, Dept. of Mathematics The Hebrew University of Jerusalem SERGIU HART c 2004 p. 2

3 Most of this talk is based on joint work with Andreu Mas-Colell (Universitat Pompeu Fabra, Barcelona) SERGIU HART c 2004 p. 3

4 Most of this talk is based on joint work with Andreu Mas-Colell (Universitat Pompeu Fabra, Barcelona) All papers and this presentation are available on my home page SERGIU HART c 2004 p. 3

5 Papers SERGIU HART c 2004 p. 4

6 Papers Econometrica (2000) Journal of Economic Theory (2001) Economic Essays (2001) Games and Economic Behavior (2003) American Economic Review (2003) SERGIU HART c 2004 p. 4

7 PART I Introduction: Dynamics SERGIU HART c 2004 p. 5

8 Dynamics SERGIU HART c 2004 p. 6

9 Dynamics Learning SERGIU HART c 2004 p. 6

10 Dynamics Learning START: prior beliefs STEP: observe update (Bayes) optimize (best-reply) REPEAT SERGIU HART c 2004 p. 6

11 Dynamics Evolution SERGIU HART c 2004 p. 7

12 Dynamics Evolution populations each individual fixed action ( gene ) frequencies of each action in the population mixed strategy SERGIU HART c 2004 p. 7

13 Dynamics Evolution populations each individual fixed action ( gene ) frequencies of each action in the population mixed strategy Change: Selection higher payoff higher frequency Mutation random and relatively rare SERGIU HART c 2004 p. 7

14 Dynamics Adaptive Heuristics SERGIU HART c 2004 p. 8

15 Dynamics Adaptive Heuristics rules of thumb myopic simple stimulus response, reinforcement behavioral, experiments non-bayesian SERGIU HART c 2004 p. 8

16 Dynamics Adaptive Heuristics rules of thumb myopic simple stimulus response, reinforcement behavioral, experiments non-bayesian Example: Fictitious Play SERGIU HART c 2004 p. 8

17 Dynamics Adaptive Heuristics rules of thumb myopic simple stimulus response, reinforcement behavioral, experiments non-bayesian Example: Fictitious Play (Play optimally against the empirical distribution of past play of the other player) SERGIU HART c 2004 p. 8

18 ?. jevolution. jheuristics. Learning? SERGIU HART c 2004 p. 9

19 Rationality. jevolution. jheuristics. Learning Rationality SERGIU HART c 2004 p. 9

20 Rationality. jevolution. jheuristics. Learning Rationality most of this talk SERGIU HART c 2004 p. 9

21 Can simple adaptive heuristics lead to sophisticated rational behavior? SERGIU HART c 2004 p. 10

22 Game N-person game in strategic (normal) form Players i = 1, 2,..., N SERGIU HART c 2004 p. 11

23 Game N-person game in strategic (normal) form Players i = 1, 2,..., N For each player i: Actions s i in S i SERGIU HART c 2004 p. 11

24 Game N-person game in strategic (normal) form Players i = 1, 2,..., N For each player i: Actions s i in S i For each player i: Payoffs (utilities) u i (s) u i (s 1, s 2,..., s N ) SERGIU HART c 2004 p. 11

25 Dynamics Dynamics Time t = 1, 2,... SERGIU HART c 2004 p. 12

26 Dynamics Dynamics Time t = 1, 2,... At time t each player i chooses an action s i t in S i SERGIU HART c 2004 p. 12

27 PART II Regret Matching SERGIU HART c 2004 p. 13

28 Advertisement DON T YOU FEEL A PANG OF REGRET? 47.15% YIELD 47.15% January May % 4.43% XYZ Nasdaq Dow Jones Don t wait! Ask your broker today Haaretz June 3, 2003 SERGIU HART c 2004 p. 14

29 Regret Matching REGRET MATCHING = Switch next period to a different action with a probability that is proportional to the regret for that action SERGIU HART c 2004 p. 15

30 Regret Matching REGRET MATCHING = Switch next period to a different action with a probability that is proportional to the regret for that action REGRET = increase in payoff had such a change always been made in the past SERGIU HART c 2004 p. 15

31 Regret U = average payoff up to now SERGIU HART c 2004 p. 16

32 Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played SERGIU HART c 2004 p. 16

33 Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k SERGIU HART c 2004 p. 16

34 Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : s (u ] it = j i (k, s i t ) u i (s t ) + SERGIU HART c 2004 p. 16

35 Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : s (u ] it = j i (k, s i t ) u i (s t ) + SERGIU HART c 2004 p. 16

36 Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : s (u ] it = j i (k, s i t ) u i (s t ) + SERGIU HART c 2004 p. 16

37 Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : s (u ] it = j i (k, s i t ) u i (s t ) + SERGIU HART c 2004 p. 16

38 Regret U = average payoff up to now V (k) = average payoff if action k had been played instead of the current action j every time in the past that j was played R(k) = [V (k) U] + = regret for action k R(k) R i T (j k) = [ 1 T ) t T : s (u ] it = j i (k, s i t ) u i (s t ) + SERGIU HART c 2004 p. 16

39 Regret Matching Next period play: Switch to action k with a probability that is proportional to the regret R(k) (for k j) SERGIU HART c 2004 p. 17

40 Regret Matching Next period play: Switch to action k with a probability that is proportional to the regret R(k) (for k j) Play the same action j of last period with the remaining probability SERGIU HART c 2004 p. 17

41 Regret Matching Next period play: Switch to action k with a probability that is proportional to the regret R(k) (for k j) Play the same action j of last period with the remaining probability σ(k) σ i T+1 (k) = cr(k), for k j σ(j) σ i T+1 (j) = 1 k j cr(k) SERGIU HART c 2004 p. 17

42 Regret Matching Next period play: Switch to action k with a probability that is proportional to the regret R(k) (for k j) Play the same action j of last period with the remaining probability σ(k) σ i T+1 (k) = cr(k), for k j σ(j) σ i T+1 (j) = 1 k j cr(k) c = a fixed positive constant (so that the probability of not switching is > 0) SERGIU HART c 2004 p. 17

43 Regret Matching Theorem Theorem If all players play Regret Matching then the joint distribution of play converges to the set of CORRELATED EQUILIBRIA of the game SERGIU HART c 2004 p. 18

44 Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T SERGIU HART c 2004 p. 19

45 Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T SERGIU HART c 2004 p. 19

46 Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = 1 SERGIU HART c 2004 p. 19

47 Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = z 1 SERGIU HART c 2004 p. 19

48 Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = /2 1/2 0 0 z 2 SERGIU HART c 2004 p. 19

49 Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = /3 1/3 0 0 z 3 SERGIU HART c 2004 p. 19

50 Joint Distribution of Play Joint distribution of play z T = The relative frequencies that the N-tuples of actions have been played up to time T T = 10 3/10 0 2/10 1/10 3/10 1/10 z 10 SERGIU HART c 2004 p. 19

51 Joint Distribution of Play Note 1: The fact that the players randomize independently at each period does not imply that the joint distribution is independent! T = 10 3/10 0 2/10 1/10 3/10 1/10 z 10 SERGIU HART c 2004 p. 19

52 Joint Distribution of Play Note 1: The fact that the players randomize independently at each period does not imply that the joint distribution is independent! Note 2: Players observe the joint distribution (the history of play) SERGIU HART c 2004 p. 19

53 Joint Distribution of Play Note 1: The fact that the players randomize independently at each period does not imply that the joint distribution is independent! Note 2: Players observe the joint distribution (the history of play) Note 3: Players react to the joint distribution (patterns, coincidences, communication, signals,...) SERGIU HART c 2004 p. 19

54 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) SERGIU HART c 2004 p. 20

55 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: SERGIU HART c 2004 p. 20

56 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals SERGIU HART c 2004 p. 20

57 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium SERGIU HART c 2004 p. 20

58 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) SERGIU HART c 2004 p. 20

59 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) convex combinations of Nash equilibria SERGIU HART c 2004 p. 20

60 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) convex combinations of Nash equilibria Butterflies play the Chicken Game ( Speckled Wood Pararge aegeria) SERGIU HART c 2004 p. 20

61 Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 SERGIU HART c 2004 p. 21

62 Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 a Nash equilibrium SERGIU HART c 2004 p. 21

63 Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 another Nash equilibriumi SERGIU HART c 2004 p. 21

64 Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 L 0 1/2 1/2 0 a (publicly) correlated equilibrium SERGIU HART c 2004 p. 21

65 Correlated Equilibria "Chicken" game LEAVE STAY LEAVE 5, 5 3, 6 STAY 6, 3 0, 0 L S L 1/3 1/3 S 1/3 0 another correlated equilibrium after signal L play LEAVE after signal S play STAY SERGIU HART c 2004 p. 21

66 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) convex combinations of Nash equilibria Butterflies play the Chicken Game ( Speckled Wood Pararge aegeria) SERGIU HART c 2004 p. 22

67 Correlated Equilibrium A Correlated Equilibrium is a Nash equilibrium when the players receive payoff-irrelevant signals before playing the game (Aumann 1974) Examples: Independent signals Nash equilibrium Public signals ( sunspots ) convex combinations of Nash equilibria Butterflies play the Chicken Game ( Speckled Wood Pararge aegeria) Boston Celtics front line SERGIU HART c 2004 p. 22

68 Correlated Equilibrium Signals (public, correlated) are unavoidable SERGIU HART c 2004 p. 23

69 Correlated Equilibrium Signals (public, correlated) are unavoidable Bayesian Rationality Correlated Equilibrium (Aumann 1987) SERGIU HART c 2004 p. 23

70 Correlated Equilibrium Signals (public, correlated) are unavoidable Bayesian Rationality Correlated Equilibrium (Aumann 1987) A joint distribution z is a correlated equilibrium u(j, s i )z(j, s i ) u(k, s i )z(j, s i ) s i s i for all i N and all j, k S i SERGIU HART c 2004 p. 23

71 Regret Matching Theorem [recall] Theorem If all players play Regret Matching then the joint distribution of play converges to the set of CORRELATED EQUILIBRIA of the game SERGIU HART c 2004 p. 24

72 Regret Matching Theorem CE = set of correlated equilibria z T = joint distribution of play up to time T distance(z T, CE) 0 as T (a.s.) SERGIU HART c 2004 p. 25

73 Regret Matching Theorem CE = set of correlated equilibria z T = joint distribution of play up to time T distance(z T, CE) 0 as T (a.s.) z T is approximately a correlated equilibrium (or z T is a correlated approximate equilibrium) from some time on (for all large enough T ) SERGIU HART c 2004 p. 25

74 Regret Matching Theorem Proof z T is a correlated equilibrium there is no regret: RT i (j k) = 0 for all players and all actions SERGIU HART c 2004 p. 26

75 Regret Matching Theorem Proof z T is a correlated equilibrium there is no regret: RT i (j k) = 0 for all players and all actions Regret Matching all regrets converge to 0 (Proof: Blackwell Approachability for the vector of regrets + approximate eigenvector probabilities by transition probabilities) SERGIU HART c 2004 p. 26

76 Regret Matching Theorem Proof z T is a correlated equilibrium there is no regret: RT i (j k) = 0 for all players and all actions Regret Matching all regrets converge to 0 (Proof: Blackwell Approachability for the vector of regrets + approximate eigenvector probabilities by transition probabilities) Note: z T converges to the set CE, not to a point SERGIU HART c 2004 p. 26

77 Remarks Correlating device: the history of play SERGIU HART c 2004 p. 27

78 Remarks Correlating device: the history of play Other procedures leading to correlated equilibria: Foster Vohra 1997 Calibrated Learning: best-reply to calibrated forecasts Fudenberg Levine 1999 Conditional Smooth Fictitious Play Eigenvector strategy SERGIU HART c 2004 p. 27

79 Remarks Correlating device: the history of play Other procedures leading to correlated equilibria: Foster Vohra 1997 Calibrated Learning: best-reply to calibrated forecasts Fudenberg Levine 1999 Conditional Smooth Fictitious Play Eigenvector strategy Not heuristics! SERGIU HART c 2004 p. 27

80 Behavioral Aspects Behavioral aspects of Regret Matching: SERGIU HART c 2004 p. 28

81 Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior SERGIU HART c 2004 p. 28

82 Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team SERGIU HART c 2004 p. 28

83 Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team The higher would have been the payoff from another action the higher the tendency to switch to it SERGIU HART c 2004 p. 28

84 Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team The higher would have been the payoff from another action the higher the tendency to switch to it Small probability of switching (the status quo bias") SERGIU HART c 2004 p. 28

85 Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team The higher would have been the payoff from another action the higher the tendency to switch to it Small probability of switching (the status quo bias") Stimulus-response, reinforcement SERGIU HART c 2004 p. 28

86 Behavioral Aspects Behavioral aspects of Regret Matching: Commonly used rules of behavior Never change a winning team The higher would have been the payoff from another action the higher the tendency to switch to it Small probability of switching (the status quo bias") Stimulus-response, reinforcement No beliefs (defined directly on actions) No best-reply (better-reply?) SERGIU HART c 2004 p. 28

87 Behavioral Aspects Similar to models of learning, experimental and behavioral economics: SERGIU HART c 2004 p. 29

88 Behavioral Aspects Similar to models of learning, experimental and behavioral economics: Bush Mosteller 1955 Erev Roth 1995, 1998 Camerer Ho 1997, 1998, SERGIU HART c 2004 p. 29

89 Behavioral Aspects Similar to models of learning, experimental and behavioral economics: Bush Mosteller 1955 Erev Roth 1995, 1998 Camerer Ho 1997, 1998, SERGIU HART c 2004 p. 29

90 Behavioral Aspects Similar to models of learning, experimental and behavioral economics: Bush Mosteller 1955 Erev Roth 1995, 1998 Camerer Ho 1997, 1998, N. Camille et al, The Involvement of the Orbitofrontal Cortex in the Experience of Regret Science May 2004 (304: ) SERGIU HART c 2004 p. 29

91 PART III Generalized Regret Matching SERGIU HART c 2004 p. 30

92 Questions How special is Regret Matching? SERGIU HART c 2004 p. 31

93 Questions How special is Regret Matching? Why does conditional smooth fictitious play work? SERGIU HART c 2004 p. 31

94 Questions How special is Regret Matching? Why does conditional smooth fictitious play work? Any connections? SERGIU HART c 2004 p. 31

95 Generalized Regret Matching Regret Matching = Switching probabilities are proportional to the regrets: σ(k) = cr(k) SERGIU HART c 2004 p. 32

96 Generalized Regret Matching Regret Matching = Switching probabilities are proportional to the regrets: σ(k) = cr(k) Generalized Regret Matching = Switching probabilities are a function of the regrets: σ(k) = f(r(k)) SERGIU HART c 2004 p. 32

97 Generalized Regret Matching Regret Matching = Switching probabilities are proportional to the regrets: σ(k) = cr(k) Generalized Regret Matching = Switching probabilities are a function of the regrets: σ(k) = f(r(k)) f is a sign-preserving function: f(0) = 0, and x > 0 f(x) > 0 f is a Lipschitz continuous function (in fact, much more general: f k,j, potential) SERGIU HART c 2004 p. 32

98 Generalized Regret Matching Theorem If all players play Generalized Regret Matching then the joint distribution of play converges to the set of correlated equilibria of the game SERGIU HART c 2004 p. 33

99 Generalized Regret Matching Theorem If all players play Generalized Regret Matching then the joint distribution of play converges to the set of correlated equilibria of the game Proof: Universal approachability strategies + Amotz Cahn, M.Sc. thesis, 2000 SERGIU HART c 2004 p. 33

100 Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) SERGIU HART c 2004 p. 34

101 Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching SERGIU HART c 2004 p. 34

102 Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching m = : Positive probability only to actions with maximal regret SERGIU HART c 2004 p. 34

103 Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching m = : Positive probability only to actions with maximal regret Conditional Fictitious Play SERGIU HART c 2004 p. 34

104 Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching m = : Positive probability only to actions with maximal regret Conditional Fictitious Play But: Not continuous SERGIU HART c 2004 p. 34

105 Special Cases Play probabilities proportional to the m-th power of the regrets (f(x) = cx m, for m 1) m = 1: Regret Matching m = : Positive probability only to actions with maximal regret Conditional Fictitious Play But: Not continuous Therefore: Smooth Conditional Fictitious Play SERGIU HART c 2004 p. 34

106 PART IV Unknown Game SERGIU HART c 2004 p. 35

107 Unknown Game The case of the Unknown game : The player knows only Its own set of actions Its own past actions and payoffs SERGIU HART c 2004 p. 36

108 Unknown Game The case of the Unknown game : The player knows only Its own set of actions Its own past actions and payoffs The player does not know the game (other players, actions, payoff functions, history of other players actions and payoffs) SERGIU HART c 2004 p. 36

109 Proxy Regret Unknown game Unknown regret (The player does not know what the payoff would have been if he had played a different action k) SERGIU HART c 2004 p. 37

110 Proxy Regret Unknown game Unknown regret (The player does not know what the payoff would have been if he had played a different action k) Proxy Regret for k: Use the payoffs received when k has been actually played in the past SERGIU HART c 2004 p. 37

111 Proxy Regret Unknown game Unknown regret (The player does not know what the payoff would have been if he had played a different action k) Proxy Regret for k: Use the payoffs received when k has been actually played in the past Theorem. If all players play strategies based on proxy regret, then the joint distribution of play converges to the set of correlated equilibria of the game SERGIU HART c 2004 p. 37

112 PART V Uncoupled Dynamics SERGIU HART c 2004 p. 38

113 Nash Equilibrium Question: Adaptive heuristics Nash equilibria? SERGIU HART c 2004 p. 39

114 Nash Equilibrium Question: Adaptive heuristics Nash equilibria? In SPECIAL classes of games: YES SERGIU HART c 2004 p. 39

115 Nash Equilibrium Question: Adaptive heuristics Nash equilibria? In SPECIAL classes of games: YES Fictitious play, Regret-based,... Two-person zero-sum games Two-person potential games Supermodular games... SERGIU HART c 2004 p. 39

116 Nash Equilibrium Question: Adaptive heuristics Nash equilibria? In SPECIAL classes of games: YES Fictitious play, Regret-based,... Two-person zero-sum games Two-person potential games Supermodular games... In GENERAL games: NO SERGIU HART c 2004 p. 39

117 Nash Equilibrium Question: Adaptive heuristics Nash equilibria? In SPECIAL classes of games: YES Fictitious play, Regret-based,... Two-person zero-sum games Two-person potential games Supermodular games... In GENERAL games: NO WHY? SERGIU HART c 2004 p. 39

118 Uncoupled Dynamics General dynamic for 2-person games: ẋ(t) = F ( x(t), y(t) ; u 1, u 2 ) ẏ(t) = G ( x(t), y(t) ; u 1, u 2 ) x(t) (S 1 ), y(t) (S 2 ) SERGIU HART c 2004 p. 40

119 Uncoupled Dynamics General dynamic for 2-person games: ẋ(t) = F ( x(t), y(t) ; u 1, u 2 ) ẏ(t) = G ( x(t), y(t) ; u 1, u 2 ) Uncoupled dynamic: ẋ(t) = F ( x(t), y(t) ; u 1 ) ẏ(t) = G ( x(t), y(t) ; u 2 ) x(t) (S 1 ), y(t) (S 2 ) SERGIU HART c 2004 p. 40

120 Uncoupled Dynamics Adaptive ( rational ) dynamics (best-reply, better-reply, payoff-improving, monotonic, fictitious play, regret-based, replicator dynamics,...) are uncoupled SERGIU HART c 2004 p. 41

121 Uncoupled Dynamics Adaptive ( rational ) dynamics (best-reply, better-reply, payoff-improving, monotonic, fictitious play, regret-based, replicator dynamics,...) are uncoupled Uncoupledness is a natural informational condition SERGIU HART c 2004 p. 41

122 Nash-Convergent Dynamics Consider a family of games, each having a unique Nash equilibrium (no coordination problems ) SERGIU HART c 2004 p. 42

123 Nash-Convergent Dynamics Consider a family of games, each having a unique Nash equilibrium (no coordination problems ) A dynamic is Nash-convergent if it always converges to the unique Nash equilibrium Regularity conditions: The unique Nash equilibrium is a stable rest-point of the dynamic SERGIU HART c 2004 p. 42

124 Impossibility There exist no uncoupled dynamics which guarantee Nash convergence SERGIU HART c 2004 p. 43

125 Impossibility There exist no uncoupled dynamics which guarantee Nash convergence There are simple families of games whose unique Nash equilibrium is unstable for every uncoupled dynamic SERGIU HART c 2004 p. 43

126 Impossibility Adaptive ( rational ) dynamics (best-reply, better-reply, payoff-improving, monotonic, fictitious play, regret-based, replicator dynamics,...) are uncoupled SERGIU HART c 2004 p. 44

127 Impossibility Adaptive ( rational ) dynamics (best-reply, better-reply, payoff-improving, monotonic, fictitious play, regret-based, replicator dynamics,...) are uncoupled cannot always converge to Nash equilibria SERGIU HART c 2004 p. 44

128 Nash vs Correlated Correlated equilibria Uncoupled dynamics SERGIU HART c 2004 p. 45

129 Nash vs Correlated Correlated equilibria Uncoupled dynamics Nash equilibria Coupled dynamics SERGIU HART c 2004 p. 45

130 Nash vs Correlated Correlated equilibria Uncoupled dynamics Nash equilibria Coupled dynamics Law of Conservation of Coordination There must be coordination either in the equilibrium concept or in the dynamic SERGIU HART c 2004 p. 45

131 Summary SERGIU HART c 2004 p. 46

132 Where Do We Go From Here? SERGIU HART c 2004 p. 47

133 Where Do We Go From Here? Dynamics and equilibria Which equilibria? Which dynamics? SERGIU HART c 2004 p. 47

134 Where Do We Go From Here? Dynamics and equilibria Which equilibria? Which dynamics? Correlated equilibria: theory and practice Coordination Communication Bounded complexity SERGIU HART c 2004 p. 47

135 Where Do We Go From Here? Dynamics and equilibria Which equilibria? Which dynamics? Correlated equilibria: theory and practice Coordination Communication Bounded complexity Experiments, empirics Theory SERGIU HART c 2004 p. 47

136 Where Do We Go From Here? Dynamics and equilibria Which equilibria? Which dynamics? Correlated equilibria: theory and practice Coordination Communication Bounded complexity Experiments, empirics Theory Joint distribution of play (instead of just the marginals) SERGIU HART c 2004 p. 47

137 Summary SERGIU HART c 2004 p. 48

138 Summary Therejis a simple adaptive heuristic always leading to correlated equilibria SERGIU HART c 2004 p. 48

139 Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) SERGIU HART c 2004 p. 48

140 Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) There are many adaptive heuristics always leading to correlated equilibria SERGIU HART c 2004 p. 48

141 Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) There are many adaptive heuristics always leading to correlated equilibria (Generalized Regret Matching) SERGIU HART c 2004 p. 48

142 Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) There are many adaptive heuristics always leading to correlated equilibria (Generalized Regret Matching) There can be jno adaptive heuristics always leading to Nash equilibria SERGIU HART c 2004 p. 48

143 Summary Therejis a simple adaptive heuristic always leading to correlated equilibria (Regret Matching) There are many adaptive heuristics always leading to correlated equilibria (Generalized Regret Matching) There can be jno adaptive heuristics always leading to Nash equilibria (Uncoupledness) SERGIU HART c 2004 p. 48

144 Summary ADAPTIVE HEURISTICS SERGIU HART c 2004 p. 49

145 Summary BEHAVIORAL ADAPTIVE HEURISTICS SERGIU HART c 2004 p. 49

146 Summary BEHAVIORAL ADAPTIVE HEURISTICS CORRELATED EQUILIBRIA SERGIU HART c 2004 p. 49

147 Summary BEHAVIORAL ADAPTIVE HEURISTICS CORRELATED EQUILIBRIA Regret Matching SERGIU HART c 2004 p. 49

148 Summary BEHAVIORAL ADAPTIVE HEURISTICS CORRELATED EQUILIBRIA Generalized Regret Matching SERGIU HART c 2004 p. 49

149 Summary BEHAVIORAL ADAPTIVE HEURISTICS CORRELATED EQUILIBRIA Uncoupledness SERGIU HART c 2004 p. 49

150 Question Can simple adaptive heuristics lead to sophisticated rational behavior? SERGIU HART c 2004 p. 50

151 Answer Q Can simple adaptive heuristics lead to sophisticated rational behavior? YES! SERGIU HART c 2004 p. 50

152 Answer Q Can simple adaptive heuristics lead to sophisticated rational behavior? YES! in time... SERGIU HART c 2004 p. 50

153 Summary Macro BEHAVIORAL RATIONAL SERGIU HART c 2004 p. 51

154 Summary Macro BEHAVIORAL RATIONAL ADAPTIVE HEURISTICS SERGIU HART c 2004 p. 51

155 Title ADAPTIVE HEURISTICS ( A Little Rationality Goes a Long Way) SERGIU HART c 2004 p. 52

156 Title ADAPTIVE HEURISTICS (A Little Rationality Goes a Long Way) Rationality Takes Time SERGIU HART c 2004 p. 52

157 Title ADAPTIVE HEURISTICS (A Little Rationality Goes a Long Way) Rationality Takes Time Regret?... No Regret! SERGIU HART c 2004 p. 52

158 Title ADAPTIVE HEURISTICS (A Little Rationality Goes a Long Way) Rationality Takes Time Regret?... No Regret! SERGIU HART c 2004 p. 52

Correlated Equilibria: Rationality and Dynamics

Correlated Equilibria: Rationality and Dynamics Sergiu Hart June 2010 AUMANN 80 SERGIU HART c 2010 p. 1 CORRELATED EQUILIBRIA: RATIONALITY AND DYNAMICS Sergiu Hart Center for the Study of Rationality Dept