Equilibria for games with asymmetric information: from guesswork to systematic evaluation
1 Equilibria for games with asymmetric information: from guesswork to systematic evaluation
Achilleas Anastasopoulos, EECS Department, University of Michigan
February 11, 2016
2 Joint work with Deepanshu Vasal (PhD student, graduating May 2016) and Prof. Vijay Subramanian
3 Decentralized decision making in dynamic systems
- Communication networks
- Sensor networks
- Social networks
- Queuing systems
- Energy markets
- Wireless resource sharing
- Repeated online advertisement auctions
- Competing sellers/buyers
4 Salient features
- Multiple agents (cooperative or strategic)
- Objective: maximize expected (social or self) reward
- Underlying system state (not perfectly observed)
- Agents make observations (asymmetric information) and take actions partially affecting the future state
5 Classification of problems
Symmetric information:
- Teams: Markov decision processes (MDP) or partially observed MDPs (POMDP)
- Games: subgame-perfect equilibrium (SPE), Markov-perfect equilibrium (MPE)
6 Classification of problems (continued)
Asymmetric information:
- Teams: common information approach [Nayyar, Mahajan, Teneketzis, 2013] (IEEE Control Theory Axelby paper award)
7 Classification of problems (complete picture)
- Teams, symmetric information: MDP or POMDP
- Games, symmetric information: subgame-perfect equilibrium (SPE), Markov-perfect equilibrium (MPE)
- Teams, asymmetric information: common information approach [Nayyar, Mahajan, Teneketzis, 2013] (IEEE Control Theory Axelby paper award)
- Games, asymmetric information: perfect Bayesian equilibrium (PBE), sequential equilibrium (SE) and refinements. No methodology!?
8 Model
- Discrete-time dynamical system with N strategic agents over a finite horizon T
- Player i privately observes her (static²) type X^i ∈ 𝒳^i, where P(X) = ∏_{i=1}^N Q^i(X^i) and X = (X^1, X^2, ..., X^N) ∈ 𝒳
- Player i takes action A^i_t ∈ 𝒜^i, which is publicly observed
- Player i's observations: private X^i; common A_{1:t-1} = (A_1, A_2, ..., A_{t-1}) = (A^j_k)_{j ∈ N, k ≤ t-1}
- Randomized actions A^i_t ~ σ^i_t(· | X^i, A_{1:t-1})
- Instantaneous reward R^i(X, A_t)
- Player i's objective: max_{σ^i} E^σ { Σ_{t=1}^T R^i(X, A_t) }
² Generalization to dynamic types is straightforward.
9 Concrete example: a public goods game³
- Two players each take an action to either contribute (A^i_t = 1) or not contribute (A^i_t = 0) to the production of a public good
- Player i's type (private information) is her cost of contributing: X^i ∈ {L, H}, where the X^i are i.i.d. with P(X^i = H) = q (assume 0 < L < 1 < H < 2)
- If either player contributes, the public good is produced and both players enjoy utility 1 (free riding is possible)
- Per-period rewards (R^1(X^1, A_t), R^2(X^2, A_t)):

                            contribute (A^2_t = 1)   don't contribute (A^2_t = 0)
  contribute (A^1_t = 1)    (1 - X^1, 1 - X^2)       (1 - X^1, 1)
  don't (A^1_t = 0)         (1, 1 - X^2)             (0, 0)

- Each player's action A^i_t ~ σ^i_t(· | X^i, A_{1:t-1})
³ Adapted from [Fudenberg and Tirole, 1991, Example 8.3]
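The per-period rewards above can be written as a single function. A minimal sketch; the numeric values of L and H are assumptions, chosen only to satisfy 0 < L < 1 < H < 2:

```python
# Hedged sketch of the public goods stage rewards; L_COST and H_COST are
# assumed illustrative values (any 0 < L < 1 < H < 2 works).
L_COST, H_COST = 0.6, 1.4

def stage_rewards(x1, x2, a1, a2):
    """Per-period rewards (R^1, R^2): the good is produced if anyone contributes,
    both players then enjoy utility 1, and each contributor pays her own cost."""
    produced = 1 if (a1 == 1 or a2 == 1) else 0
    r1 = produced - (x1 if a1 == 1 else 0)
    r2 = produced - (x2 if a2 == 1 else 0)
    return r1, r2
```

For example, `stage_rewards(L_COST, H_COST, 0, 1)` reproduces the free-riding cell (1, 1 - X^2) of the table.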
10 Classification of problems: we first revisit teams with symmetric information (MDP/POMDP)
11 Team with perfect observation of X
- X is observed by everyone
- Single team objective R(X, A_t) = Σ_{i ∈ N} R^i(X, A_t):

                            contribute (A^2_t = 1)   don't contribute (A^2_t = 0)
  contribute (A^1_t = 1)    2 - X^1 - X^2            2 - X^1
  don't (A^1_t = 0)         2 - X^2                  0

- Optimal decisions are myopic (just look at the instantaneous reward) and are functions of the current system state X = (X^1, X^2):
  (A^1_t, A^2_t) = (1, 0) if (X^1, X^2) = (L, H)
                   (0, 1) if (X^1, X^2) = (H, L)
                   (1, 0) or (0, 1) if (X^1, X^2) = (L, L)
                   (1, 0) or (0, 1) if (X^1, X^2) = (H, H)
- What about time-varying types, e.g., Q(X_{t+1} | X_t) or Q(X_{t+1} | X_t, A_t)? This becomes an MDP
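The myopic rule can be checked mechanically by maximizing the instantaneous team reward over the four action pairs. A small sketch; the cost values are illustrative assumptions:

```python
# Sketch of the myopic team rule; cost values are illustrative assumptions.
L_COST, H_COST = 0.6, 1.4

def team_reward(x1, x2, a1, a2):
    """Instantaneous team reward R = R^1 + R^2 from the table."""
    produced = 1 if (a1 or a2) else 0
    return 2 * produced - (x1 if a1 else 0) - (x2 if a2 else 0)

def myopic_team_action(x1, x2):
    """Maximize the instantaneous team reward over the four action pairs
    (ties, e.g. at (L, L) or (H, H), are broken arbitrarily)."""
    pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    return max(pairs, key=lambda a: team_reward(x1, x2, *a))
```

`myopic_team_action(L_COST, H_COST)` returns (1, 0), matching the (L, H) case of the displayed rule.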
14 Team with no observation of X
- X is not observed at all (symmetric information)
- Single team objective R(X, A_t) = Σ_{i ∈ N} R^i(X, A_t)
- Previous actions are not informative about X
- Same as before with average rewards (w.r.t. the prior belief P(X^i = H) = q), where E[X^i] = qH + (1-q)L:

                            contribute (A^2_t = 1)   don't contribute (A^2_t = 0)
  contribute (A^1_t = 1)    2 - 2(qH + (1-q)L)       2 - (qH + (1-q)L)
  don't (A^1_t = 0)         2 - (qH + (1-q)L)        0

- Optimal decisions are constant: (A^1_t, A^2_t) = (1, 0) or (0, 1)
- What about time-varying types, e.g., Q(X_{t+1} | X_t) or Q(X_{t+1} | X_t, A_t)? This becomes a POMDP
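Under the prior alone, the optimal constant decision can be found by comparing the four expected team rewards. A hedged sketch with assumed values of q, L, H:

```python
# Hedged sketch: expected team rewards under the prior only; q, L, H are assumed.
q, L_COST, H_COST = 0.5, 0.6, 1.4
mean_cost = q * H_COST + (1 - q) * L_COST   # E[X^i] = qH + (1 - q)L

def expected_team_reward(a1, a2):
    """E[R(X, A)] for a constant action pair: utility 2 if produced,
    minus the expected cost E[X^i] per contributor."""
    produced = 2 if (a1 or a2) else 0
    return produced - mean_cost * (a1 + a2)

best = max([(0, 0), (0, 1), (1, 0), (1, 1)], key=lambda a: expected_team_reward(*a))
```

Since 0 < E[X^i] < 2, a single contributor beats both "no one contributes" and "both contribute", consistent with the constant optimum (1, 0) or (0, 1).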
17 Classification of problems: next, games with symmetric information (SPE, MPE)
18 Game with perfect observation of X
[game tree over the type profiles LL, LH, HL, HH]
- Players know exactly which branch they are on at each stage of the game
- Subgame-perfect equilibrium (SPE): given any history (path), players see a continuation game (subgame) and do not want to deviate
- Algorithm: backward induction
19 Game with perfect observation of X
[game tree over the type profiles LL, LH, HL, HH]
- Here, at each stage of the game, the continuation game is the same
- The SPE strategy profile does not depend on the entire history of actions but only on the state X
- Even with time-varying states, a similar algorithm (backward induction) can be used
20 Classification of problems: next, teams with asymmetric information (common information approach)
21 Decentralized team problem
- Player i's observations: private X^i; common A_{1:t-1}
- Randomized actions A^i_t ~ σ^i_t(· | X^i, A_{1:t-1})
- Design objective for the entire team: max_σ E^σ { Σ_{t=1}^T R(X, A_t) }, e.g., R(X, A_t) = Σ_{i ∈ N} R^i(X, A_t)
Problems to be addressed⁴:
1. Presence of common information A_{1:t-1} and private information X^i for agent i
2. Decentralized, non-classical information structure (this is not an MDP/POMDP-like problem!)
3. The domain of the policies A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}) increases with time
⁴ All of these have been addressed in [Nayyar, Mahajan, Teneketzis, 2013]
23 A simple but powerful idea
A policy σ^i_t(· | X^i, A_{1:t-1}) can be interpreted in two equivalent ways:
1. A function σ^i_t mapping (A_{1:t-1}, X^i) to Δ(𝒜^i)
2. A function ψ^i_t mapping A_{1:t-1} to mappings from 𝒳^i to Δ(𝒜^i)
26 A simple but powerful idea
In the first interpretation, the policies to be designed (σ^i)_{i ∈ N} have an inherently asymmetric information structure: agent i feeds (X^i, A_{1:t-1}) into σ^i_t to produce A^i_t
27 A simple but powerful idea
In the second interpretation, each agent's action A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}) can be thought of as a two-stage process:
1. Based on the common information A_{1:t-1}, select prescription functions Γ^i_t : 𝒳^i → Δ(𝒜^i) through the mapping ψ^i: Γ^i_t = ψ^i_t[A_{1:t-1}], with Γ_t = (Γ^1_t, Γ^2_t)
2. The action A^i_t is determined by evaluating Γ^i_t at the private information X^i, i.e., A^i_t ~ Γ^i_t(· | X^i)
Overall: A^i_t ~ Γ^i_t(· | X^i) = ψ^i_t[A_{1:t-1}](· | X^i) = σ^i_t(· | X^i, A_{1:t-1})
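The two-stage process can be sketched as follows. The particular rule inside `psi_t` is an invented toy; only the structure (common information → prescription → evaluation at the private type) mirrors the slides:

```python
# Toy sketch of the two-stage (prescription) interpretation; all names and the
# rule inside psi_t are assumptions, only the structure follows the slides.
import random

def psi_t(common_history):
    """Stage 1: map common info A_{1:t-1} to a prescription Γ^i_t : 𝒳^i -> Δ(𝒜^i).
    This toy rule has the low-cost type contribute less often once someone
    has already contributed; the high-cost type never contributes."""
    contributed_before = any(1 in a for a in common_history)
    if contributed_before:
        return {'L': {0: 0.5, 1: 0.5}, 'H': {0: 1.0, 1: 0.0}}
    return {'L': {0: 0.0, 1: 1.0}, 'H': {0: 1.0, 1: 0.0}}

def act(gamma_i, x_i, rng=random):
    """Stage 2: evaluate the prescription at the private type and sample A^i_t."""
    dist = gamma_i[x_i]
    return rng.choices(list(dist), weights=list(dist.values()))[0]
```

Composing the two stages, `act(psi_t(history), x_i)` realizes A^i_t ~ ψ^i_t[A_{1:t-1}](· | X^i) = σ^i_t(· | X^i, A_{1:t-1}).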
31 Transformation to a centralized problem
- Generating A^i_t is a "dumb" evaluation A^i_t ~ Γ^i_t(· | X^i) (nothing to be designed here)
- The control problem boils down to selecting prescription functions Γ^i_t = ψ^i_t[A_{1:t-1}] through a policy ψ = (ψ^i_t)_{i ∈ N, t ≤ T}
- The decentralized control problem has been transformed into a centralized control problem with a fictitious common agent who observes A_{1:t-1} and takes actions Γ_t
- Last issue to address: the increasing domain 𝒜^{t-1} of the pre-encoder mappings ψ_t
32 Introduction of an information state
- We would like to summarize A_{1:t-1} in a quantity (state) with a time-invariant domain
- Consider the dynamical system with
  state: (X, A_{t-1}); observation: A_{t-1}; action: Γ_t;
  reward: E{R(X, A_t) | X, A_{1:t-1}, Γ_{1:t}} = Σ_{a_t} Γ_t(a_t | X) R(X, a_t) := R̃(X, Γ_t)
- This is a POMDP! Define the posterior belief Π_t ∈ Δ(𝒳):
  Π_t(x) := P(X = x | A_{1:t-1}, Γ_{1:t-1}) for all x ∈ 𝒳
- One can show that Π_t can be updated using common information only:
  Π_{t+1} = F(Π_t, Γ_t, A_t) (Bayes law)
- For this problem the belief also factors into its marginals, Π_t(x) = ∏_{i ∈ N} Π^i_t(x^i), with Π^i_{t+1} = F(Π^i_t, Γ^i_t, A^i_t) for all i ∈ N
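The per-coordinate update Π^i_{t+1} = F(Π^i_t, Γ^i_t, A^i_t) is just Bayes law on one marginal. A minimal sketch; the zero-probability convention for off-equilibrium actions is one common choice, not something specified on the slide:

```python
def belief_update(pi_i, gamma_i, a_i):
    """One coordinate of Π_{t+1} = F(Π_t, Γ_t, A_t):
    Π^i_{t+1}(x) ∝ Γ^i_t(a^i | x) · Π^i_t(x).
    If the observed action has zero probability under Γ^i_t (off-equilibrium),
    Bayes law is silent; keeping the prior is one convention (an assumption here)."""
    unnorm = {x: gamma_i[x][a_i] * p for x, p in pi_i.items()}
    z = sum(unnorm.values())
    if z == 0:
        return dict(pi_i)
    return {x: p / z for x, p in unnorm.items()}
```

With a separating prescription (L contributes, H does not), observing A^i_t = 1 drives the belief to put all its mass on L.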
35 Characterization of the optimal team policy
From standard POMDP results, the optimal policy is Markovian, i.e.,
  Γ_t = (Γ^i_t)_{i ∈ N} = ψ_t[A_{1:t-1}] = θ_t[Π_t]
  A^i_t ~ Γ^i_t(· | X^i) = θ^i_t[Π_t](· | X^i) = m^i_t(· | X^i, Π_t)
and can be obtained by backward dynamic programming (DP):
  θ_t[π_t] = γ_t* = arg max_{γ_t} E { R(X, A_t) + V_{t+1}(F(π_t, γ_t, A_t)) | π_t, γ_t }
  V_t(π_t) = max_{γ_t} E { R(X, A_t) + V_{t+1}(F(π_t, γ_t, A_t)) | π_t, γ_t }
over the space of beliefs π_t ∈ Δ(𝒳) and prescriptions γ_t ∈ ∏_{i ∈ N} { 𝒳^i → Δ(𝒜^i) }.
In the public goods example: π_t ≡ (π^1_t(H), π^2_t(H)) ∈ [0,1]² and γ_t ≡ (γ^1_t(0|H), γ^1_t(0|L), γ^2_t(0|H), γ^2_t(0|L)) ∈ [0,1]⁴
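The backward DP above can be organized as a generic skeleton over a finite grid of beliefs and prescriptions. Everything here is an illustrative assumption (in the actual problem the belief and prescription spaces are continuous, and `expected_step` stands in for E{R(X, A_t) + V_{t+1}(F(π_t, γ_t, A_t)) | π_t, γ_t}):

```python
def backward_dp(T, beliefs, prescriptions, expected_step):
    """Generic backward DP skeleton over a finite grid of beliefs.
    expected_step(t, pi, gamma, V_next) should return
    E{R + V_{t+1}(F(pi, gamma, A_t)) | pi, gamma}; here it is a placeholder."""
    V = {pi: 0.0 for pi in beliefs}          # terminal condition V_{T+1} ≡ 0
    policy = {}
    for t in range(T, 0, -1):
        newV = {}
        for pi in beliefs:
            best = max(prescriptions, key=lambda g: expected_step(t, pi, g, V))
            policy[(t, pi)] = best           # θ_t[π_t] on the grid
            newV[pi] = expected_step(t, pi, best, V)
        V = newV
    return policy, V
```

In the public goods example, `beliefs` would discretize [0,1]² and `prescriptions` would discretize [0,1]⁴, per the parameterization above.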
37 Summary of the team problem
The introduction of prescription functions was crucial.
We gained:
- Decentralized non-classical information structure → POMDP
- A^i_t ~ θ^i_t[Π_t](· | X^i), and θ can be obtained using DP
We gave up:
- The fictitious common agent does not observe X^i
- It can only maximize the average reward-to-go E{ Σ_{t'=t}^T R(X, A_{t'}) | A_{1:t-1} } before seeing private information
- This is not a problem in teams, since we are interested in maximizing the average reward
39 Classification of problems: finally, games with asymmetric information (PBE, SE and refinements; no methodology!?)
40 Perfect Bayesian equilibria (PBE)
Player 1's perspective: [game tree over the type profiles LL, LH, HL, HH]
- After an action history (e.g., 01), the continuation is not a proper subgame: player 1 needs a belief on the opponent's type X² (conditioned on the observed history) to evaluate her expected future reward
- SPE is therefore not the appropriate equilibrium concept; this leads to perfect Bayesian equilibrium (PBE)
43 Perfect Bayesian equilibria (PBE)
A PBE is an assessment (σ*, μ*) of a strategy profile σ* and beliefs μ* satisfying (a) sequential rationality and (b) consistency:
(a) For every t ≤ T, agent i ∈ N, information set (A_{1:t-1}, X^i), and unilateral deviation σ^i:
  E^{μ*, σ^{i*} σ^{-i*}} { Σ_{t'=t}^T R^i(X, A_{t'}) | A_{1:t-1}, X^i } ≥ E^{μ*, σ^i σ^{-i*}} { Σ_{t'=t}^T R^i(X, A_{t'}) | A_{1:t-1}, X^i }
(b) Beliefs μ* should be updated by Bayes law (whenever possible) given σ* and satisfy further consistency conditions [Fudenberg and Tirole, 1991, ch. 8]
Due to the circular dependence of μ* and σ*, finding a PBE is a large fixed-point problem (no time decomposition).
44 Ideas from teams: structured equilibrium strategies σ ≡ θ
Useful idea from teams: instead of considering equilibria with general strategies σ = (σ^i_t)_{i ∈ N, t ≤ T} of the form A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}), consider equilibria with structured strategies θ = (θ^i_t)_{i ∈ N, t ≤ T} of the form
  A^i_t ~ Γ^i_t(· | X^i) = θ^i_t[Π_t](· | X^i) = m^i_t(· | X^i, Π_t)
where Π_{t+1} = F(Π_t, Γ_t, A_t) = F(Π_t, θ_t[Π_t], A_t) = F^θ_t(A_{1:t}) (essentially Bayes law), so that σ ≡ θ.
Note: although equilibrium strategies are structured, unilateral deviations may be anything.
45 Parenthesis: are structured strategies restrictive?
Lemma. For any given strategy profile σ = (σ^i)_{i ∈ N}, there exists a structured strategy profile θ ≡ m = (m^i)_{i ∈ N} with the players receiving the same average rewards under both σ and m.
Bottom line: structured strategy profiles m are a sufficiently rich class, so we can concentrate on equilibria within this class.
Caveat: each m^i depends on the entire σ = (σ^i)_{i ∈ N}, so unilateral deviations in σ^i result in multilateral deviations in m.
47 Ideas from teams: beliefs μ
Recall that in a PBE, μ is a set of beliefs on the unobserved types X^{-i}, for each agent i and each private history (information set) (A_{1:t-1}, X^i).
Consider beliefs that (a) are functions of the common history A_{1:t-1} only, and (b) are generated from a common belief in product form μ_t[A_{1:t-1}](X) = ∏_{j ∈ N} μ^j_t[A_{1:t-1}](X^j).
So, for each agent i and each history (A_{1:t-1}, X^i), the belief on X^{-i} is ∏_{j ∈ N \ {i}} μ^j_t[A_{1:t-1}](X^j).
In addition, given strategies σ ≡ θ, these beliefs are updated as
  μ^i_{t+1}[A_{1:t}] = F(μ^i_t[A_{1:t-1}], θ^i_t[μ_t[A_{1:t-1}]], A^i_t)
with μ^i_t[A_{1:t-1}] playing the role of Π^i_t and θ^i_t[μ_t[A_{1:t-1}]] the role of Γ^i_t.
Bottom line: all consistency conditions are satisfied automatically.
49 Summary so far
- We have motivated the use of structured (equilibrium) strategies σ ≡ θ: A^i_t ~ σ^i_t(· | A_{1:t-1}, X^i) = θ^i_t[μ_t[A_{1:t-1}]](· | X^i), with Γ^i_t = θ^i_t[μ_t[A_{1:t-1}]]
- We have restricted attention to a class of beliefs μ that remain independent and are updated as μ^i_{t+1}[A_{1:t}] = F(μ^i_t[A_{1:t-1}], θ^i_t[μ_t[A_{1:t-1}]], A^i_t)
- A PBE (σ*, μ*) ≡ (θ*, μ*), even in this restricted class, is still the solution of a large fixed-point equation: the circularity between θ* and μ* is still present
- How can we find θ* with a simple algorithm? Beliefs and policies are decoupled by considering policies for all possible beliefs π, not just for μ*
51 First erroneous attempt
Recall the DP equation from the team problem: for each t = T, T-1, ..., 1 and for every π_t ∈ Δ(𝒳), solve the maximization
  θ_t[π_t] = γ_t* = arg max_{γ_t} E^{π_t, γ_t} { R(X, A_t) + V_{t+1}(F(π_t, γ_t, A_t)) }
What is the logical extension to games? Transform it into a best-response-type equation: fix γ^{-i}_t ≡ γ^{-i*}_t and maximize over γ^i_t, for all i ∈ N:
  γ^i*_t ∈ arg max_{γ^i_t} E^{π_t, γ^i_t γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, A_t)) }
53 First erroneous attempt: what is the catch?
  γ^i*_t ∈ arg max_{γ^i_t} E^{π_t, γ^i_t γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, A_t)) } for all i ∈ N
Why erroneous? The reward-to-go is not conditioned on the entire history (A_{1:t-1}, X^i) of player i, but only on the part A_{1:t-1} ≡ Π_t. This was fine in teams, but it is not sufficient to prove sequential rationality in games:
  E^{μ*, σ^{i*} σ^{-i*}} { Σ_{t'=t}^T R^i(X, A_{t'}) | A_{1:t-1}, X^i } ≥ E^{μ*, σ̃^i σ^{-i*}} { Σ_{t'=t}^T R^i(X, A_{t'}) | A_{1:t-1}, X^i }
55 Special case⁵
Consider dynamical systems for which the belief update is prescription-independent, i.e., Π_{t+1} = F(Π_t, A_t).
In that case the backward process decomposes and conditioning on X^i is irrelevant.
A strong statement can be made for this special case: for every PBE there exists a structured PBE that corresponds to an SPE of an equivalent symmetric-information game.
⁵ [Nayyar, Gupta, Langbort, Başar, 2014], [Gupta, Nayyar, Langbort, Başar, 2014]
56 Second erroneous attempt
Condition on X^i in the backward induction step, to be consistent with the sequential rationality condition. For each t = T, T-1, ..., 1 and for every π_t ∈ Δ(𝒳), solve the following one-step fixed-point equation for all i ∈ N and all x^i ∈ 𝒳^i:
  γ^i*_t ∈ arg max_{γ^i_t} E^{π_t, γ^i_t(· | x^i) γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, A_t), x^i) | x^i }
Note that in this case the reward-to-go is V^i_t(π_t, x^i).
57 Second erroneous attempt: explanation
  E{·} = Σ_{a_t, x^{-i}} γ^i_t(a^i_t | x^i) γ^{-i*}_t(a^{-i}_t | x^{-i}) π^{-i}(x^{-i}) ( R^i(x^i x^{-i}, a_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, a_t), x^i) )
This is an unusual fixed-point equation: there is dependence on γ^i_t(· | x^i), but also on the entire γ^i_t(·) (inside the belief update).
Unfortunately, this results in a fixed-point solution θ with γ_t = θ_t[π_t, x], so the resulting policy is of the form A^i_t ~ Γ^i_t(· | X^i) = θ^i_t[Π_t, X](· | X^i), which is not implementable (the strategy of player i requires the unknown private information X^{-i}).
59 Third erroneous attempt
Condition on X^i in the backward induction step (to be consistent with sequential rationality) and optimize only over part of the prescription. For each t = T, T-1, ..., 1 and for every π_t ∈ Δ(𝒳), solve the following one-step fixed-point equation for all i ∈ N and all x^i ∈ 𝒳^i:
  γ^i*_t(· | x^i) ∈ arg max_{γ^i_t(· | x^i)} E^{π_t, γ^i_t(· | x^i) γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t(· | x^i) γ^{i*}_t(· | ·) γ^{-i*}_t, A_t), x^i) | x^i }
This results in a fixed-point solution θ with γ_t* = θ_t[π_t] for all π_t ∈ Δ(𝒳).
Unfortunately, it does not work in the proof: something more fundamental is going on...
An algorithm for PBE evaluation: backward recursion

For each $t = T, T-1, \ldots, 1$ and for every $\pi_t \in \Delta(\mathcal{X})$ solve the following one-step fixed-point equation: for all $i \in \mathcal{N}$ and for all $x^i \in \mathcal{X}^i$,

\[
\tilde{\gamma}_t^i(\cdot \mid x^i) \in \arg\max_{\gamma_t^i(\cdot \mid x^i)} \mathbb{E}^{\pi_t,\, \gamma_t^i(\cdot \mid x^i)\tilde{\gamma}_t^{-i}} \left\{ R^i(X, A_t) + V_{t+1}^i\big(F(\pi_t, \tilde{\gamma}_t^i \tilde{\gamma}_t^{-i}, A_t), x^i\big) \,\middle|\, x^i \right\}
\]

This results in an FP solution $\theta$ with $\gamma_t = \theta_t[\pi_t]$ for all $\pi_t \in \Delta(\mathcal{X})$.

This is not a best-response type of equation: $\tilde{\gamma}_t^i$ is present on both the left- and right-hand sides.

Intuition: find the $\tilde{\gamma}_t^i(\cdot \mid x^i)$ that is optimal under the unperturbed belief update! Remember the core concept in PBE...
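The per-stage fixed point on this slide can be approximated numerically. Below is a minimal sketch, not the authors' implementation: a toy two-player, binary-type, binary-action stage game in which the candidate prescription `gamma` appears both as the object being improved and inside the belief update `F`, mirroring the self-referential structure of the equation. The reward table `R`, the placeholder continuation value `V_next`, and the damped iteration scheme are all illustrative assumptions.

```python
import numpy as np

# Toy instantiation (all numbers hypothetical): 2 players, binary types and
# actions, independent type beliefs pi[i][x], stage reward R[i][x1,x2,a1,a2],
# and a placeholder continuation value V_next standing in for V_{t+1}.
n_players, n_types, n_actions = 2, 2, 2
rng = np.random.default_rng(0)
R = rng.uniform(0, 1, size=(n_players, n_types, n_types, n_actions, n_actions))
pi = np.array([[0.5, 0.5], [0.5, 0.5]])

def V_next(belief, i, x_i):
    # Placeholder: in the real backward recursion this is the t+1 value function.
    return 0.1 * belief[i][x_i]

def belief_update(pi, gamma, a):
    # F(pi, gamma, a): Bayes rule per player; types remain independent.
    new = np.empty_like(pi)
    for i in range(n_players):
        w = pi[i] * gamma[i][:, a[i]]
        new[i] = w / w.sum() if w.sum() > 0 else pi[i]
    return new

def stage_value(pi, gamma, i, x_i, a_i_dist):
    # E{ R^i + V^i_{t+1}(F(pi, gamma, A), x_i) | x_i } when type x_i of player i
    # randomizes with a_i_dist and the belief update uses the candidate gamma.
    j = 1 - i
    total = 0.0
    for x_j in range(n_types):
        for a_i in range(n_actions):
            for a_j in range(n_actions):
                p = pi[j][x_j] * a_i_dist[a_i] * gamma[j][x_j, a_j]
                a = (a_i, a_j) if i == 0 else (a_j, a_i)
                x = (x_i, x_j) if i == 0 else (x_j, x_i)
                nb = belief_update(pi, gamma, a)
                total += p * (R[i][x[0], x[1], a[0], a[1]] + V_next(nb, i, x_i))
    return total

def solve_fixed_point(pi, iters=200):
    # Damped iteration: gamma appears both as the unknown being improved and
    # inside the belief update -- the self-referential feature of the equation.
    gamma = np.full((n_players, n_types, n_actions), 1.0 / n_actions)
    for _ in range(iters):
        new = gamma.copy()
        for i in range(n_players):
            for x_i in range(n_types):
                vals = [stage_value(pi, gamma, i, x_i, np.eye(n_actions)[a])
                        for a in range(n_actions)]
                best = np.eye(n_actions)[int(np.argmax(vals))]
                new[i][x_i] = 0.5 * gamma[i][x_i] + 0.5 * best
        gamma = new
    return gamma

gamma_star = solve_fixed_point(pi)
```

Each row of `gamma_star` is a distribution over actions for one (player, type) pair; with a zero continuation value the iteration would reduce to an ordinary best response, which is exactly the special case discussed later in the talk.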
An algorithm for PBE evaluation: forward recursion

From the backward recursion we have obtained $\tilde{\theta} = (\tilde{\theta}_t^i)_{i \in \mathcal{N},\, t \in \mathcal{T}}$.

For each $t = 1, 2, \ldots, T$ and for every $i \in \mathcal{N}$, $a_{1:t}$, and $x^i$:

\[
\tilde{\sigma}_t^i(a_t^i \mid a_{1:t-1}, x^i) := \underbrace{\tilde{\theta}_t^i[\mu_t^*[a_{1:t-1}]]}_{\Gamma_t^i}(a_t^i \mid x^i)
\]
\[
\underbrace{\mu_{t+1}^*[a_{1:t}]}_{\Pi_{t+1}} := F\big(\underbrace{\mu_t^*[a_{1:t-1}]}_{\Pi_t},\, \underbrace{\tilde{\theta}_t[\mu_t^*[a_{1:t-1}]]}_{\Gamma_t},\, a_t\big)
\]

In fact we can obtain a family of PBEs for any type distribution $\prod_{i \in \mathcal{N}} Q^i(x^i)$ with appropriate initialization of $\mu_1^*$.
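Mechanically, the forward pass just walks the tree of public action histories, reading a prescription from the backward-recursion output at the current belief and pushing the belief through the update map. A schematic sketch (the objects `theta`, `F`, and the belief are hypothetical stand-ins, not the paper's concrete data structures):

```python
from itertools import product

T, actions = 2, (0, 1)  # toy horizon and per-player action set

def forward_recursion(theta, F, mu_1):
    # sigma[t][hist]: prescription used after public history a_{1:t-1};
    # mu[t][hist]: common belief there. theta[t] maps belief -> prescription,
    # F(belief, gamma, a) is the belief update.
    mu = {0: {(): mu_1}}
    sigma = {}
    for t in range(T):
        sigma[t] = {}
        mu[t + 1] = {}
        for hist, belief in mu[t].items():
            gamma = theta[t](belief)      # Gamma_t = theta_t[Pi_t]
            sigma[t][hist] = gamma        # sigma_t(.|a_{1:t-1}, x) = gamma(.|x)
            for a in product(actions, repeat=2):   # every joint action profile
                mu[t + 1][hist + (a,)] = F(belief, gamma, a)
    return sigma, mu

# Dummy stand-ins just to exercise the recursion:
theta = {t: (lambda belief: "gamma") for t in range(T)}
F = lambda belief, gamma, a: belief     # identity belief update for illustration
sigma, mu = forward_recursion(theta, F, "prior")
```

The point of the sketch is the data flow: strategies and beliefs are defined history by history, entirely from $\tilde{\theta}$ and $F$, with no further optimization in the forward pass.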
Main Result

Theorem. $(\tilde{\sigma}, \mu^*)$ generated by the backward/forward algorithm (whenever it exists) is a PBE, i.e., for all $i$, $t$, $a_{1:t-1}$, $x^i$, and $\sigma^i$,

\[
\mathbb{E}^{\tilde{\sigma}_{t:T}^i \tilde{\sigma}_{t:T}^{-i},\, \mu_t^*} \left\{ \sum_{n=t}^{T} R^i(X, A_n) \,\middle|\, a_{1:t-1}, x^i \right\}
\;\geq\;
\mathbb{E}^{\sigma_{t:T}^i \tilde{\sigma}_{t:T}^{-i},\, \mu_t^*} \left\{ \sum_{n=t}^{T} R^i(X, A_n) \,\middle|\, a_{1:t-1}, x^i \right\}
\]

and $\mu^*$ satisfies the consistency conditions.
Sketch of the proof

Independence of types and the specific DP equation are crucial in proving the result (modified comparison principle, backward induction):

- The specific DP guarantees that the unperturbed reward-to-go (LHS) at time $t$ is the obtained value function $V_t^i = R^i + V_{t+1}^i$.
- The specific DP guarantees that unilateral deviations with a fixed belief update reduce $V_t^i$.
- The induction step reduces $V_{t+1}^i$ to the (perturbed) reward-to-go at time $t+1$.
- Independence of types guarantees that the resulting expression is exactly the (perturbed) reward-to-go at time $t$ (RHS).
Comments on the new per-stage FP equation

- This is not a best-response type of FP equation (due to the presence of $\tilde{\gamma}^i$ on both the LHS and RHS of the equation).
- Standard tools for existence of a solution (e.g., Brouwer, Kakutani) do not apply (problem with continuity of the $V(\cdot)$ functions).
- Existence can be shown for a special case [Ouyang, Tavafoghi, Teneketzis, 2015] where $R^i(X, A_t)$ does not depend on the player's own type $X^i$. In that case prescriptions $\Gamma_t^i(\cdot \mid X^i) = \Gamma_t^i(\cdot)$ do not depend on the private type $X^i$ and the FP equation reduces to a best response. No signaling! Essentially reduces to the model $\Pi_{t+1} = F(\Pi_t, A_t)$.
Current/Future work

Model generalizations:
- Types are independent controlled Markov processes (controlled by all actions): $P(X_t \mid X_{1:t-1}, A_{1:t-1}) = \prod_{i \in \mathcal{N}} Q^i(X_t^i \mid X_{t-1}^i, A_{t-1})$ [Vasal, Subramanian, A, 2015b]
- Dependent types with strategic independence [Battigalli, 1996]
- Types are observed through a noisy channel (even by the same user), $Q(Y_t^i \mid X_t^i)$. Example: the informational-cascades literature.
- Infinite horizon and continuous action spaces.

Existence results: prove existence for the simplest non-trivial class of problems. Core issue: the per-stage FP equation is not a best response.

Dynamic mechanism design (indirect mechanisms with message space smaller than type space).
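To make the generalized type dynamics above concrete, here is a small sampling sketch; the kernel `Q` and all sizes are made up for illustration. Each player's type transitions through its own kernel $Q^i$, driven by the joint action, so types remain independent conditional on the public action history.

```python
import numpy as np

# Hypothetical factored kernel: Q[i][x_prev, a1, a2] is a distribution over
# player i's next type, implementing
# P(X_t | X_{t-1}, A_{t-1}) = prod_i Q^i(X_t^i | X_{t-1}^i, A_{t-1}).
rng = np.random.default_rng(1)
n_players, n_types, n_actions = 2, 2, 2
Q = rng.dirichlet(np.ones(n_types),
                  size=(n_players, n_types, n_actions, n_actions))

def step_types(x, a):
    # Each player's type transitions independently given the joint action a.
    return tuple(int(rng.choice(n_types, p=Q[i][x[i], a[0], a[1]]))
                 for i in range(n_players))

x = (0, 1)
for t in range(5):
    x = step_types(x, (t % 2, (t + 1) % 2))
```

The factorization is what preserves the conditional independence of types that the proof of the main result relies on.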
Thank you!
Extra: FP equations

First attempt:
\[
\left.\begin{aligned}
\gamma^1 &= f^1(\gamma^2, \pi) \\
\gamma^2 &= f^2(\gamma^1, \pi)
\end{aligned}\right\}
\;\Rightarrow\; \gamma = f(\gamma, \pi) \;\Rightarrow\; \gamma = \theta(\pi)
\]

Second attempt:
\[
\left.\begin{aligned}
\gamma^1 &= f^{1H}(\gamma^2, \pi) \qquad & \gamma^1 &= f^{1L}(\gamma^2, \pi) \\
\gamma^2 &= f^{2H}(\gamma^1, \pi) \qquad & \gamma^2 &= f^{2L}(\gamma^1, \pi)
\end{aligned}\right\}
\;\Rightarrow\;
\gamma = f^{LL}(\gamma, \pi),\;\; \gamma = f^{LH}(\gamma, \pi),\;\; \gamma = f^{HL}(\gamma, \pi),\;\; \gamma = f^{HH}(\gamma, \pi)
\;\Rightarrow\; \gamma = \theta(\pi, x)
\]
Extra: FP equations

Third attempt:
\[
\left.\begin{aligned}
\gamma_1^H &= f^{1H}(\gamma_1^L, \gamma_2, \pi) \\
\gamma_1^L &= f^{1L}(\gamma_1^H, \gamma_2, \pi) \\
\gamma_2^H &= f^{2H}(\gamma_2^L, \gamma_1, \pi) \\
\gamma_2^L &= f^{2L}(\gamma_2^H, \gamma_1, \pi)
\end{aligned}\right\}
\;\Rightarrow\;
\left\{\begin{aligned}
\gamma_1 &= f^1(\gamma_1, \gamma_2, \pi) \\
\gamma_2 &= f^2(\gamma_1, \gamma_2, \pi)
\end{aligned}\right\}
\;\Rightarrow\; \gamma = f(\gamma, \pi) \;\Rightarrow\; \gamma = \theta(\pi)
\]

Correct:
\[
\left.\begin{aligned}
\gamma_1^H &= f^{1H}(\gamma_1, \gamma_2, \pi) \\
\gamma_1^L &= f^{1L}(\gamma_1, \gamma_2, \pi) \\
\gamma_2^H &= f^{2H}(\gamma_2, \gamma_1, \pi) \\
\gamma_2^L &= f^{2L}(\gamma_2, \gamma_1, \pi)
\end{aligned}\right\}
\;\Rightarrow\;
\left\{\begin{aligned}
\gamma_1 &= f^1(\gamma_1, \gamma_2, \pi) \\
\gamma_2 &= f^2(\gamma_1, \gamma_2, \pi)
\end{aligned}\right\}
\;\Rightarrow\; \gamma = f(\gamma, \pi) \;\Rightarrow\; \gamma = \theta(\pi)
\]
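Once the per-type equations are collapsed into a single map as on these slides, evaluating $\theta(\pi)$ amounts to solving $\gamma = f(\gamma, \pi)$; when $f$ happens to be a contraction, plain iteration suffices. A toy illustration with a made-up contraction `f` (not the game's actual map, which in general need not be contractive):

```python
import numpy as np

# Illustration only: a hypothetical contraction f standing in for the collapsed
# per-stage map; theta(pi) solves gamma = f(gamma, pi) by fixed-point iteration.
def f(gamma, pi):
    return 0.5 * np.tanh(gamma) + 0.3 * pi   # |f'| <= 0.5, so iteration converges

def theta(pi, tol=1e-10, max_iter=1000):
    gamma = np.zeros_like(pi)
    for _ in range(max_iter):
        new = f(gamma, pi)
        if np.max(np.abs(new - gamma)) < tol:
            return new
        gamma = new
    return gamma

pi = np.array([0.2, 0.8])
gamma_star = theta(pi)
```

For the actual per-stage equation no such contraction property is known, which is precisely why the existence question flagged in the talk remains open.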
Battigalli, P. (1996). Strategic independence and perfect Bayesian equilibria. Journal of Economic Theory, 70(1).

Fudenberg, D. and Tirole, J. (1991). Game Theory. MIT Press, Cambridge, MA.

Gupta, A., Nayyar, A., Langbort, C., and Başar, T. (2014). Common information based Markov perfect equilibria for linear-Gaussian games with asymmetric information. SIAM Journal on Control and Optimization, 52(5).

Nayyar, A., Gupta, A., Langbort, C., and Başar, T. (2014). Common information based Markov perfect equilibria for stochastic games with asymmetric information: Finite games. IEEE Transactions on Automatic Control, 59(3).

Nayyar, A., Mahajan, A., and Teneketzis, D. (2013). Decentralized stochastic control with partial history sharing: A common information approach. IEEE Transactions on Automatic Control, 58(7).

Ouyang, Y., Tavafoghi, H., and Teneketzis, D. (2015). Dynamic oligopoly games with private Markovian dynamics. Available at www-personal.umich.edu/~tavaf/oligopolygames.pdf.

Vasal, D., Subramanian, V., and Anastasopoulos, A. (2015a). A systematic process for evaluating structured perfect Bayesian equilibria in dynamic games with asymmetric information. Technical report.

Vasal, D., Subramanian, V., and Anastasopoulos, A. (2015b). A systematic process for evaluating structured perfect Bayesian equilibria in dynamic games with asymmetric information. In American Control Conference. (Accepted for publication/presentation.)
On Equilibria of Distributed Message-Passing Games Concetta Pilotto and K. Mani Chandy California Institute of Technology, Computer Science Department 1200 E. California Blvd. MC 256-80 Pasadena, US {pilotto,mani}@cs.caltech.edu
More informationEconomics of Networks Social Learning
Economics of Networks Social Learning Evan Sadler Massachusetts Institute of Technology Evan Sadler Social Learning 1/38 Agenda Recap of rational herding Observational learning in a network DeGroot learning
More informationLearning from Sequential and Time-Series Data
Learning from Sequential and Time-Series Data Sridhar Mahadevan mahadeva@cs.umass.edu University of Massachusetts Sridhar Mahadevan: CMPSCI 689 p. 1/? Sequential and Time-Series Data Many real-world applications
More informationOptimal Control of Partiality Observable Markov. Processes over a Finite Horizon
Optimal Control of Partiality Observable Markov Processes over a Finite Horizon Report by Jalal Arabneydi 04/11/2012 Taken from Control of Partiality Observable Markov Processes over a finite Horizon by
More informationA Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty
2011 IEEE International Conference on Robotics and Automation Shanghai International Conference Center May 9-13, 2011, Shanghai, China A Decentralized Approach to Multi-agent Planning in the Presence of
More informationLearning Equilibrium as a Generalization of Learning to Optimize
Learning Equilibrium as a Generalization of Learning to Optimize Dov Monderer and Moshe Tennenholtz Faculty of Industrial Engineering and Management Technion Israel Institute of Technology Haifa 32000,
More information9 Forward-backward algorithm, sum-product on factor graphs
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 9 Forward-backward algorithm, sum-product on factor graphs The previous
More informationDeceptive Advertising with Rational Buyers
Deceptive Advertising with Rational Buyers September 6, 016 ONLINE APPENDIX In this Appendix we present in full additional results and extensions which are only mentioned in the paper. In the exposition
More information