Calibration and Nash Equilibrium

Size: px

Start display at page:

Download "Calibration and Nash Equilibrium"

Joella Richards
5 years ago
Views:

1 Calibration and Nash Equilibrium Dean Foster University of Pennsylvania and Sham Kakade TTI September 22, 2005

2 What is a Nash equilibrium? Cartoon definition of NE: Leroy Lockhorn: I m drinking because she is driving. Loretta Lockhorn: I m driving because he is drinking. Technical definition of NE: If everyone else will play the Nash equilibrium, then I should play it also. Holds for all players in a game. Equilibrium of what process?

3 Calibration: A form of unbiasedness Suppose in a long (conceptually infinite) sequence of weather forecasts, we look at all those days for which the forecast probability of precipitation was, say, close to some given value p and then determine the long run proportion f of such days on which the forecast event (rain) in fact occurred. If f = p the forecaster may be termed well calibrated. Dawid [1982] A minimal condition for performance On sequence: A constant forecast of.5 is calibrated A constant forecast of.6 is not calibrated

4 Calibration: A form of unbiasedness see handout Left graph Bridge players Forecasts of winning a contract that was just bid. Expert bridge players are more calibrated than beginners Note: some experts play hands with 0 chance of winning! Right graph College students sports is more about utility than about probability. (I want my team to win.)

5 Bridge players

6 College students

7 Learning in games Learning models for games: Two players repeatedly play a game Each views the sequence of the other person s plays as data Each predicts what the other play will do Each then plays a best response to the prediction We will discuss the equilbrium resulting from calibrated learning models

8 Traditional test functions for calibration Sequential prediction (t is time) X t is forecast by p t Traditional calibration, means 1 T T t=1 (X t p t ) w(p t ) 0 holds for all possible w(). Note: The class of w() can be restricted to indicator functions. Oakes proved without randomization, calibration is impossible. With randomization calibration is possible.

9 New test functions X t sequence to be forecast by p t Weak calibration, means 1 T T t=1 (X t p t ) w(p t ) 0 for all w() which are continuous function.

10 Achieving weak calibration via polynomial regression Algorithm: Fit the model X t = d i=0 on X 1,..., X T 1 to estimate the β s. β i p i t + noise Solve fixed point equation: p T = d i=0 β i p i T (If no solution exists, use arbitary rule, say p T = 0.5.) Use p T to forecast X T. Theorem: p T is approximately weakly calibrated.

11 Algorithm: Solve the fixed point equation p T = d i=0 ˆβ i p i T where the ˆβ s are determined by a polynomial regression of X 1,..., X T 1 on p 1,..., p T 1. Theorem: p T is approximately weakly calibrated. Proof: Lemma (1991): regression does as well as any linear combination. Thus p T will predict as well as any polynomial of p T. Hence no polynomial change of p T will do better. Trivia: I talked about this lemma the last time I was here (1988).

12 Games as a good application for paranoid data analysis Learning in games has extensive literature Both emprical and theoretical Two players repeatedly play a game Do they converge to playing an equilibrium? Typical learning setup: Player i uses p i,t to predict other s play at the round t Player i computes best response distribution s i (p i,t ) Player i then randomly action S i from this distribution

13 Individual vs Public calibration Game setting for calibration X i,t is the observable that player i cares about at time t p i,t is a forecast of X i,t Individual calibration: ( i) 1 T T t=1 (X i,t p i,t ) w(p i,t ) 0 Public calibration: ( i) 1 T T t=1 (X i,t p i,t ) w( p t ) 0

14 Sharp vs. smooth best response s i (p i,t ) is the distrubtion player i will use for making a play at time t. Sharp best response means s i maps to corners of simplex Used in orginal research on learning requires randomized forecasts to get convergence results Obviously p i,t must be protected from being leaked Smooth best reply restricts s i ( ) to be Lipschitz Only close to optimal Randomization is now in the play

15 Observables Game setup: Take X i = S i (i.e. all actions but player i) p i,t is forecast of X i,t Individual calibration: ( i) 1 T T t=1 (X i,t p i,t ) w(p i,t ) 0 Public calibration: ( i) 1 T T t=1 (X i,t p i,t ) w( p t ) 0

16 Convergence Suppose players play a smooth best reply to forecast p i,t. Traditional calibration correlated equilibria Public calibration Nash equilibria Speed of convergence is related to dimension of the Hilbert space of the testing functions For individual: dimension (1/ɛ) an For public: dimension is (1/ɛ) nan Hence convergence is slow in both cases. Need lower dimensional space, but what can be changed?

17 Proof: Public calibration converges to NE Truth prediction via calibration Truth is independent Given p each player is in fact playing independently ɛ-rationality ɛ-br to prediction p i includes information about what all other players will do Independence + ɛ-rationality = ɛ-ne.

18 Utility estimation Take X i,t to be the vector of potential payoffs S i is the vector of everyone else s play u i,t (k) = u i (k, S i,t ) X i,t = (u i,t (1),..., u i,t (a)) Calibration of utilities correlated equilibria Public calibration of utilities Nash equilibria

19 Conclusion: Today s talk in historical context Method Forecast probability Forecast utility Least squares doesn t converge doesn t converge (F. 91) Blackwell CE CE Approach- Calibration No regret ability (F. and Vohra, 97) (F. and Vohra 97) (Hart and Mas-Colell 00) Exhaustive NE NE search Hypothesis testing Regret testing (F. and Young 03) (F. and Young 05) (Germano & Lugosi 05) Public NE NE methods Weak calibration Weak utility estimation (Kakade and F. 04) (Kakade and F. 05)

Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics

Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics Sergiu Hart August 2016 Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics Sergiu Hart Center for the Study of Rationality