Equilibria for games with asymmetric information: from guesswork to systematic evaluation


1 Equilibria for games with asymmetric information: from guesswork to systematic evaluation Achilleas Anastasopoulos EECS Department University of Michigan February 11, 2016

2 Joint work with Deepanshu Vasal (PhD student, graduating May 2016) and Prof. Vijay Subramanian
Achilleas Anastasopoulos (U of Michigan), "Equilibria for games with asymmetric information: from guesswork to systematic evaluation", February 11, 2016, 44 slides

3 Decentralized decision making in dynamic systems
- Communication networks
- Sensor networks
- Social networks
- Queuing systems
- Energy markets
- Wireless resource sharing
- Repeated online advertisement auctions
- Competing sellers/buyers

4 Salient features
- Multiple agents (cooperative or strategic)
- Objective: maximize expected (social or self) reward
- Underlying system state (not perfectly observed)
- Agents make observations (asymmetric information) and take actions that partially affect the future state

5 Classification of problems
Symmetric information:
- Teams: Markov decision processes (MDP) or partially observed MDPs (POMDP)
- Games: subgame-perfect equilibrium (SPE), Markov-perfect equilibrium (MPE)

6 Classification of problems
Symmetric information:
- Teams: Markov decision processes (MDP) or partially observed MDPs (POMDP)
- Games: subgame-perfect equilibrium (SPE), Markov-perfect equilibrium (MPE)
Asymmetric information:
- Teams: common information approach [Nayyar, Mahajan, Teneketzis, 2013] (IEEE Control Theory Axelby paper award)

7 Classification of problems
Symmetric information:
- Teams: Markov decision processes (MDP) or partially observed MDPs (POMDP)
- Games: subgame-perfect equilibrium (SPE), Markov-perfect equilibrium (MPE)
Asymmetric information:
- Teams: common information approach [Nayyar, Mahajan, Teneketzis, 2013] (IEEE Control Theory Axelby paper award)
- Games: perfect Bayesian equilibrium (PBE), sequential equilibrium (SE) and refinements. No methodology!?

8 Model
- Discrete-time dynamical system with N strategic agents over a finite horizon T
- Player i privately observes her (static²) type X^i ∈ X^i, where P(X) = ∏_{i=1}^N Q^i(X^i) and X = (X^1, X^2, ..., X^N) ∈ X
- Player i takes action A^i_t ∈ A^i, which is publicly observed
- Player i's observations: private: X^i; common: A_{1:t-1} = (A_1, A_2, ..., A_{t-1}) = (A^j_k)_{j∈N, k≤t-1}
- Action (randomized): A^i_t ~ σ^i_t(· | X^i, A_{1:t-1})
- Instantaneous reward: R^i(X, A_t)
- Player i's objective: max_{σ^i} E^σ { Σ_{t=1}^T R^i(X, A_t) }
² Generalization to dynamic types is straightforward.

9 Concrete example: a public goods game³
- Two players take action to either contribute (A^i_t = 1) or not contribute (A^i_t = 0) to the production of a public good
- Player i's type (private information) is her cost of contributing: X^i ∈ {L, H}, where the X^i are i.i.d. with P(X^i = H) = q (assume 0 < L < 1 < H < 2)
- If either player contributes, the public good is produced and the utility enjoyed is 1 for both players (free riding is possible)
- Per-period rewards (R^1(X^1, A_t), R^2(X^2, A_t)):

                                 contribute (A^2_t = 1)   don't contribute (A^2_t = 0)
  contribute (A^1_t = 1)         (1 - X^1, 1 - X^2)       (1 - X^1, 1)
  don't contribute (A^1_t = 0)   (1, 1 - X^2)             (0, 0)

- Each player's action: A^i_t ~ σ^i_t(· | X^i, A_{1:t-1})
³ Adapted from [Fudenberg and Tirole, 1991, Example 8.3]
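The payoff table above can be sketched in a few lines of code; the numerical costs below are illustrative values I chose satisfying the slide's assumption 0 < L < 1 < H < 2, not parameters from the talk.

```python
# Per-period rewards of the public goods game, assuming the
# illustrative costs L = 0.4, H = 1.5 (any 0 < L < 1 < H < 2 works).
L_COST, H_COST = 0.4, 1.5

def reward(x, a):
    """(R^1, R^2): the good is produced if anyone contributes;
    a contributor additionally pays her private cost x[i]."""
    produced = 1 if (a[0] == 1 or a[1] == 1) else 0
    return tuple(produced - x[i] * a[i] for i in range(2))

# Reproduces the table, e.g. both types L and both contributing:
print(reward((L_COST, L_COST), (1, 1)))  # (0.6, 0.6) = (1 - L, 1 - L)
```

Note the free-riding structure: a non-contributor still collects utility 1 whenever the other player contributes.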

10 Classification of problems
Symmetric information:
- Teams: Markov decision processes (MDP) or partially observed MDPs (POMDP)

11 Team with perfect observation of X
- X is observed by everyone
- Single team objective: R(X, A_t) = Σ_{i∈N} R^i(X, A_t)

                                 contribute (A^2_t = 1)   don't contribute (A^2_t = 0)
  contribute (A^1_t = 1)         2 - X^1 - X^2            2 - X^1
  don't contribute (A^1_t = 0)   2 - X^2                  0

12 Team with perfect observation of X
- X is observed by everyone
- Single team objective: R(X, A_t) = Σ_{i∈N} R^i(X, A_t) (team reward table as on the previous slide)
- Optimal decisions are myopic (just look at the instantaneous reward) and are functions of the current system state X = (X^1, X^2):
  (A^1_t, A^2_t) = (1, 0)            if (X^1, X^2) = (L, H)
                   (0, 1)            if (X^1, X^2) = (H, L)
                   (1, 0) or (0, 1)  if (X^1, X^2) = (L, L)
                   (1, 0) or (0, 1)  if (X^1, X^2) = (H, H)

13 Team with perfect observation of X
- (same as the previous slide, plus:)
- What about time-varying types, e.g., Q(X_{t+1} | X_t) or Q(X_{t+1} | X_t, A_t)? → MDP
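The myopic team optimum above can be checked by brute force: for each type profile, maximize the instantaneous team reward over joint actions. A minimal sketch, again with the illustrative costs L = 0.4, H = 1.5 (my choice, not the talk's):

```python
from itertools import product

L, H = 0.4, 1.5  # illustrative costs with 0 < L < 1 < H < 2

def team_reward(x, a):
    """R(X, A) = R^1 + R^2: total utility 2*produced minus costs paid."""
    produced = int(a[0] == 1 or a[1] == 1)
    return 2 * produced - x[0] * a[0] - x[1] * a[1]

# With X publicly observed, the team optimum is myopic: per type
# profile, just pick the joint action maximizing the stage reward.
for x in product([L, H], repeat=2):
    best = max(product([0, 1], repeat=2), key=lambda a: team_reward(x, a))
    print(x, best, team_reward(x, best))
```

For (L, H) this returns (1, 0) as on the slide; for (L, L) and (H, H) the two single-contributor actions tie, matching the "(1, 0) or (0, 1)" cases.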

14 Team with no observation of X
- X is not observed at all (symmetric information)
- Single team objective: R(X, A_t) = Σ_{i∈N} R^i(X, A_t)
- Previous actions are not informative of X
- Same as before with average rewards (w.r.t. the prior belief P(X^i = H) = q):

                                 contribute (A^2_t = 1)    don't contribute (A^2_t = 0)
  contribute (A^1_t = 1)         2 - 2(qH + (1-q)L)        2 - (qH + (1-q)L)
  don't contribute (A^1_t = 0)   2 - (qH + (1-q)L)         0

  where qH + (1-q)L = E[X^i] is the mean contribution cost

15 Team with no observation of X
- (same as the previous slide, plus:)
- Optimal decisions are constant: (A^1_t, A^2_t) = (1, 0) or (0, 1)

16 Team with no observation of X
- (same as the previous slide, plus:)
- What about time-varying types, e.g., Q(X_{t+1} | X_t) or Q(X_{t+1} | X_t, A_t)? → POMDP
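The averaged table can be sketched the same way: since actions carry no information about X, costs enter only through the mean E[X^i], and the best decision is the constant one maximizing the averaged matrix. The values L, H, q below are illustrative assumptions.

```python
from itertools import product

L, H, q = 0.4, 1.5, 0.5  # illustrative costs and prior P(X^i = H)
mean_cost = q * H + (1 - q) * L  # E[X^i]

def expected_team_reward(a):
    """Average of R(X, A) over the prior: only E[X^i] matters."""
    produced = int(a[0] == 1 or a[1] == 1)
    return 2 * produced - mean_cost * (a[0] + a[1])

# Optimal decision is constant: exactly one player contributes.
best = max(product([0, 1], repeat=2), key=expected_team_reward)
print(best, expected_team_reward(best))
```

With these numbers a single contributor yields 2 - 0.95 = 1.05, beating both "nobody contributes" (0) and "both contribute" (0.1), consistent with the slide's "(1, 0) or (0, 1)" answer.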

17 Classification of problems
Symmetric information:
- Games: subgame-perfect equilibrium (SPE), Markov-perfect equilibrium (MPE)

18 Game with perfect observation of X
[Game tree over the type profiles LL, LH, HL, HH with per-branch payoffs 1-L, 1-H, 1, 0]
- Players know exactly what branch they are on at each stage of the game
- Sub-game perfect equilibrium (SPE): given any history (path), players see a continuation game (sub-game) and do not want to deviate
- Algorithm: backward induction

19 Game with perfect observation of X
[Game tree over the type profiles LL, LH, HL, HH with per-branch payoffs]
- Here, at each stage of the game, the continuation game is the same → the SPE strategy profile does not depend on the entire history of actions but only on the state X
- Even with time-varying states, a similar algorithm (backward induction) can be used

20 Classification of problems
Asymmetric information:
- Teams: common information approach

21 Decentralized team problem
- Player i's observations: private: X^i; common: A_{1:t-1}
- Action (randomized): A^i_t ~ σ^i_t(· | X^i, A_{1:t-1})
- Design objective for the entire team:
  max_σ E^σ { Σ_{t=1}^T R(X, A_t) },   e.g., R(X, A_t) = Σ_{i∈N} R^i(X, A_t)

22 Decentralized team problem
- Player i's observations: private: X^i; common: A_{1:t-1}
- Action (randomized): A^i_t ~ σ^i_t(· | X^i, A_{1:t-1})
- Design objective for the entire team:
  max_σ E^σ { Σ_{t=1}^T R(X, A_t) },   e.g., R(X, A_t) = Σ_{i∈N} R^i(X, A_t)
Problems to be addressed⁴:
1. Presence of common (A_{1:t-1}) and private (X^i) information for agent i
2. Decentralized, non-classical information structure (this is not an MDP/POMDP-like problem!)
3. The domain of the policies A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}) increases with time
⁴ All these have been addressed in [Nayyar, Mahajan, Teneketzis, 2013]

23 A simple but powerful idea
A policy σ^i_t(· | X^i, A_{1:t-1}) can be interpreted in two equivalent ways:

24 A simple but powerful idea
A policy σ^i_t(· | X^i, A_{1:t-1}) can be interpreted in two equivalent ways:
1) A function σ^i_t of A_{1:t-1} and X^i to Δ(A^i)
[Block diagram: (X^i, A_{1:t-1}) → σ^i_t → Δ(A^i_t)]

25 A simple but powerful idea
A policy σ^i_t(· | X^i, A_{1:t-1}) can be interpreted in two equivalent ways:
1) A function σ^i_t of A_{1:t-1} and X^i to Δ(A^i)
2) A function ψ^i_t of A_{1:t-1} to mappings from X^i to Δ(A^i)
[Block diagrams: (X^i, A_{1:t-1}) → σ^i_t → Δ(A^i_t), and A_{1:t-1} → ψ^i_t → (X^i → Δ(A^i))]

26 A simple but powerful idea
In the first interpretation, the policies to be designed (σ^i)_{i∈N} have an inherent asymmetric information structure
[Block diagram: X^1 and A_{1:t-1} feed σ^1_t producing A^1_t; X^2 and A_{1:t-1} feed σ^2_t producing A^2_t]

27 A simple but powerful idea
In the second interpretation, each agent's action A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}) can be thought of as a two-stage process

28 A simple but powerful idea
In the second interpretation, each agent's action A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}) can be thought of as a two-stage process:
1. Based on the common information A_{1:t-1}, select prescription functions Γ^i_t : X^i → Δ(A^i) through the mapping ψ^i, i.e., Γ^i_t = ψ^i_t[A_{1:t-1}]
[Block diagram: A_{1:t-1} → ψ_t → Γ_t = (Γ^1_t, Γ^2_t)]

29 A simple but powerful idea
In the second interpretation, each agent's action A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}) can be thought of as a two-stage process:
1. Based on the common information A_{1:t-1}, select prescription functions Γ^i_t : X^i → Δ(A^i) through the mapping ψ^i, i.e., Γ^i_t = ψ^i_t[A_{1:t-1}]
2. The actions A^i_t are determined by evaluating Γ^i_t at the private information X^i, i.e., A^i_t ~ Γ^i_t(· | X^i)
[Block diagram: A_{1:t-1} → ψ_t → Γ_t = (Γ^1_t, Γ^2_t); X^1 → Γ^1_t → A^1_t; X^2 → Γ^2_t → A^2_t]

30 A simple but powerful idea
In the second interpretation, each agent's action A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}) can be thought of as a two-stage process:
1. Based on the common information A_{1:t-1}, select prescription functions Γ^i_t : X^i → Δ(A^i) through the mapping ψ^i, i.e., Γ^i_t = ψ^i_t[A_{1:t-1}]
2. The actions A^i_t are determined by evaluating Γ^i_t at the private information X^i
Overall: A^i_t ~ Γ^i_t(· | X^i) = ψ^i_t[A_{1:t-1}](· | X^i) = σ^i_t(· | X^i, A_{1:t-1})
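The two-stage idea can be illustrated in a few lines. Everything concrete here (the function `sigma`, the type labels, the probabilities) is a hypothetical stand-in; only the σ ↔ (ψ, Γ) correspondence is the point.

```python
# Minimal sketch of the two equivalent policy interpretations.

def sigma(x, history):
    """Interpretation 1: sigma^i_t maps (X^i, A_{1:t-1}) to P(A^i_t = 1).
    An arbitrary example rule, not a policy from the talk."""
    return 0.9 if x == "L" else min(1.0, 0.1 * len(history))

def psi(history):
    """Interpretation 2: from common information alone, select a whole
    prescription Gamma^i_t : X^i -> P(A^i_t = 1), before seeing X^i."""
    return {x: sigma(x, history) for x in ("L", "H")}

history = (0, 1, 1)     # publicly observed past actions A_{1:t-1}
gamma = psi(history)    # stage 1: the "common agent" picks Gamma^i_t
# stage 2: the action distribution is Gamma^i_t evaluated at the
# private type X^i, and it coincides with interpretation 1:
assert gamma["H"] == sigma("H", history)
```

The key design point, as the next slides exploit, is that `psi` consumes only public information, so every agent can compute every other agent's prescription.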

31 Transformation to a centralized problem
[Block diagrams: the decentralized policies (X^i, A_{1:t-1}) → σ^i_t → A^i_t are replaced by A_{1:t-1} → ψ_t → Γ_t = (Γ^1_t, Γ^2_t), followed by X^i → Γ^i_t → A^i_t]
- Generation of A^i_t is a dumb evaluation A^i_t ~ Γ^i_t(· | X^i) (nothing to be designed here)
- The control problem boils down to selecting the prescription functions Γ^i_t = ψ^i_t[A_{1:t-1}] through a policy ψ = (ψ^i_t)_{i∈N, t∈T}
- The decentralized control problem has been transformed into a centralized control problem with a fictitious common agent who observes A_{1:t-1} and takes actions Γ_t
- Last issue to address: the increasing domain A^{t-1} of the pre-encoder mappings ψ_t

32 Introduction of an information state
We would like to summarize A_{1:t-1} in a quantity (state) with a time-invariant domain

33 Introduction of an information state
We would like to summarize A_{1:t-1} in a quantity (state) with a time-invariant domain
Consider the dynamical system with
- state: (X, A_{t-1})
- observation: A_{t-1}
- action: Γ_t
- reward: E{R(X, A_t) | X, A_{1:t-1}, Γ_{1:t}} = Σ_{a_t} Γ_t(a_t | X) R(X, a_t) := R(X, Γ_t)

34 Introduction of an information state
We would like to summarize A_{1:t-1} in a quantity (state) with a time-invariant domain
Consider the dynamical system with
- state: (X, A_{t-1}); observation: A_{t-1}; action: Γ_t
- reward: E{R(X, A_t) | X, A_{1:t-1}, Γ_{1:t}} = Σ_{a_t} Γ_t(a_t | X) R(X, a_t) := R(X, Γ_t)
This is a POMDP! Define the posterior belief Π_t ∈ Δ(X):
  Π_t(x) := P(X = x | A_{1:t-1}, Γ_{1:t-1}) for all x ∈ X
- One can show that Π_t can be updated using common information: Π_{t+1} = F(Π_t, Γ_t, A_t) (Bayes law)
- (*) For this problem the belief also factors into its marginals, Π_t(x) = ∏ Π^i_t(x^i), with Π^i_{t+1} = F(Π^i_t, Γ^i_t, A^i_t), i ∈ N
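The per-player update Π^i_{t+1} = F(Π^i_t, Γ^i_t, A^i_t) is just Bayes law with the prescription as the likelihood. A minimal sketch (the numerical prior and prescription are my illustrative assumptions, and the off-path convention of keeping the prior is one common choice, not necessarily the talk's):

```python
def belief_update(pi_i, gamma_i, a_i):
    """One step of Pi^i_{t+1} = F(Pi^i_t, Gamma^i_t, A^i_t).
    pi_i: dict type -> prob; gamma_i: dict type -> (dict action -> prob).
    Valid per player because types are independent."""
    unnorm = {x: gamma_i[x][a_i] * p for x, p in pi_i.items()}
    z = sum(unnorm.values())
    if z == 0:                 # off-path action: Bayes law is silent;
        return dict(pi_i)      # keep the prior (one common convention)
    return {x: u / z for x, u in unnorm.items()}

# Example: prior q = 0.5 on H; H-types contribute rarely, L-types often.
pi = {"L": 0.5, "H": 0.5}
gamma = {"L": {0: 0.2, 1: 0.8}, "H": {0: 0.9, 1: 0.1}}
post = belief_update(pi, gamma, 1)  # the player was seen contributing
print(post)  # posterior mass shifts toward L
```

Because the prescription Γ^i_t is itself computable from common information, every agent can run this update and they all agree on Π_t.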

35 Characterization of the optimal team policy
From standard POMDP results, the optimal policy is Markovian, i.e.,
  Γ_t = (Γ^i_t)_{i∈N} = ψ_t[A_{1:t-1}] = θ_t[Π_t]
  A^i_t ~ Γ^i_t(· | X^i) = θ^i_t[Π_t](· | X^i) = m^i_t(· | X^i, Π_t)
and can be obtained using backward dynamic programming (DP):
  θ_t[π_t] = γ*_t = argmax_{γ_t} E {R(X, A_t) + V_{t+1}(F(π_t, γ_t, A_t)) | π_t, γ_t}
  V_t(π_t) = max_{γ_t} E {R(X, A_t) + V_{t+1}(F(π_t, γ_t, A_t)) | π_t, γ_t}
over the space of beliefs π_t ∈ Δ(X) and prescriptions γ_t, with γ^i_t : X^i → Δ(A^i), i ∈ N

36 Characterization of the optimal team policy
(same DP characterization as on the previous slide, plus:)
In the public goods example: π_t ↔ (π^1_t(H), π^2_t(H)) ∈ [0,1]² and γ_t ↔ (γ^1_t(0|H), γ^1_t(0|L), γ^2_t(0|H), γ^2_t(0|L)) ∈ [0,1]⁴
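One backward-DP step of this recursion can be sketched for the public goods example. This is a deliberately crude illustration under my own assumptions: the last stage (so V_{T+1} = 0, making the step myopic), illustrative costs L = 0.4 and H = 1.5, a single belief point, and a coarse grid standing in for the continuous prescription space [0,1]^4.

```python
import itertools

L, H = 0.4, 1.5
GRID = [0.0, 0.5, 1.0]  # coarse stand-in for contribution probabilities

def team_reward(x, a):
    produced = int(a[0] == 1 or a[1] == 1)
    return 2 * produced - x[0] * a[0] - x[1] * a[1]

def stage_value(pi, gamma):
    """E{R(X, A_t) | pi_t, gamma_t} at the last stage (V_{T+1} = 0).
    pi = (pi1(H), pi2(H)); gamma[i][x] = P(A^i_t = 1 | X^i = x)."""
    val = 0.0
    for x1, p1 in ((H, pi[0]), (L, 1 - pi[0])):
        for x2, p2 in ((H, pi[1]), (L, 1 - pi[1])):
            for a1, a2 in itertools.product([0, 1], repeat=2):
                pa = ((gamma[0][x1] if a1 else 1 - gamma[0][x1])
                      * (gamma[1][x2] if a2 else 1 - gamma[1][x2]))
                val += p1 * p2 * pa * team_reward((x1, x2), (a1, a2))
    return val

# theta_T[pi] at one belief point: search the discretized prescriptions.
pi = (0.5, 0.5)
candidates = [({H: g1h, L: g1l}, {H: g2h, L: g2l})
              for g1h, g1l, g2h, g2l in itertools.product(GRID, repeat=4)]
best = max(candidates, key=lambda g: stage_value(pi, g))
print(best, round(stage_value(pi, best), 3))
```

Earlier stages would replace V_{T+1} = 0 by the previously computed value function evaluated at the updated belief F(π_t, γ_t, A_t), and the outer loop would sweep a grid of beliefs π_t.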

37 Summary of the team problem
- Introduction of prescription functions was crucial
- We gained:
  - Decentralized non-classical information structure → POMDP
  - A^i_t ~ θ^i_t[Π_t](· | X^i), and θ can be obtained using DP

38 Summary of the team problem
- Introduction of prescription functions was crucial
- We gained:
  - Decentralized non-classical information structure → POMDP
  - A^i_t ~ θ^i_t[Π_t](· | X^i), and θ can be obtained using DP
- We gave up:
  - The fictitious common agent does not observe X^i
  - It can only maximize the average reward-to-go E{ Σ_{t'=t}^T R(X, A_{t'}) | A_{1:t-1} } before seeing private information
  - This is not a problem in teams, since we are interested in maximizing the average reward

39 Classification of problems
Asymmetric information:
- Games: perfect Bayesian equilibrium (PBE), sequential equilibrium (SE) and refinements. No methodology!?

40 Perfect Bayesian equilibria (PBE)
Player 1's perspective
[Game tree over the type profiles LL, LH, HL, HH]

41 Perfect Bayesian equilibria (PBE)
Player 1's perspective
[Game tree over the type profiles LL, LH, HL, HH]
- This is not a proper sub-game
- Player 1 needs a belief on X^2 to evaluate her expected future reward

42 Perfect Bayesian equilibria (PBE)
Player 1's perspective
[Game tree over the type profiles LL, LH, HL, HH]
- This is not a proper sub-game
- Player 1 needs a belief on X^2 (conditioned on the observed actions 01) to evaluate her expected future reward
- SPE is not the appropriate equilibrium concept! → Perfect Bayesian equilibrium (PBE)

43 Perfect Bayesian equilibria (PBE)
A PBE is an assessment (σ*, µ*) of strategy profiles σ* and beliefs µ* satisfying (a) sequential rationality and (b) consistency:
(a) For every t ∈ T, agent i ∈ N, information set (A_{1:t-1}, X^i), and unilateral deviation σ^i,
  E^{µ*, σ^{i*} σ^{-i*}} { Σ_{t'=t}^T R^i(X, A_{t'}) | A_{1:t-1}, X^i } ≥ E^{µ*, σ^i σ^{-i*}} { Σ_{t'=t}^T R^i(X, A_{t'}) | A_{1:t-1}, X^i }
(b) The beliefs µ* should be updated by Bayes law (whenever possible) given σ*, and satisfy further consistency conditions [Fudenberg and Tirole, 1991, ch. 8]
Due to the circular dependence of µ* and σ*, finding a PBE is a large fixed-point problem (no time decomposition)

44 Ideas from teams: structured equilibrium strategies σ*
Useful idea from teams: instead of considering equilibria with general strategies σ = (σ^i_t)_{i∈N, t∈T} of the form A^i_t ~ σ^i_t(· | X^i, A_{1:t-1}), consider equilibria with structured strategies θ = (θ^i_t)_{i∈N, t∈T}, where
  A^i_t ~ Γ^i_t(· | X^i) = θ^i_t[Π_t](· | X^i) = m^i_t(· | X^i, Π_t)
  Π_{t+1} = F(Π_t, Γ_t, A_t) = F(Π_t, θ_t[Π_t], A_t) = F^θ_t(A_{1:t}) (essentially Bayes law)
so that σ ↔ θ
Note: although equilibrium strategies are structured, unilateral deviations may be anything

45 Parenthesis: are structured strategies restrictive?
Lemma. For any given strategy profile σ = (σ^i)_{i∈N}, there exists a structured strategy profile θ ↔ m = (m^i)_{i∈N} with the players receiving the same average rewards under both σ and m.

46 Parenthesis: are structured strategies restrictive?
Lemma. For any given strategy profile σ = (σ^i)_{i∈N}, there exists a structured strategy profile θ ↔ m = (m^i)_{i∈N} with the players receiving the same average rewards under both σ and m.
Bottom line: structured strategy profiles m are a sufficiently rich class, so we can concentrate on equilibria within this class.
Caveat: each m^i depends on the entire σ = (σ^i)_{i∈N}, so unilateral deviations in σ^i result in multilateral deviations in m

47 Ideas from teams: beliefs µ*
Recall that in a PBE, µ* is a set of beliefs on the unobserved types X^{-i}, for each agent i and each private history (information set) (A_{1:t-1}, X^i)
Consider beliefs that (a) are only functions of the common history A_{1:t-1}, and (b) are generated from a common belief in product form:
  µ_t[A_{1:t-1}](X) = ∏_{j∈N} µ^j_t[A_{1:t-1}](X^j)
So, for each agent i and each history (A_{1:t-1}, X^i), the belief on X^{-i} is ∏_{j∈N\{i}} µ^j_t[A_{1:t-1}](X^j)
In addition, given strategies σ ↔ θ, these beliefs are updated as
  µ^i_{t+1}[A_{1:t}] = F(µ^i_t[A_{1:t-1}], θ^i_t[µ_t[A_{1:t-1}]], A^i_t)   (i.e., Π^i_{t+1} = F(Π^i_t, Γ^i_t, A^i_t))

48 Ideas from teams: beliefs µ*
(same as the previous slide, plus:)
Bottom line: all consistency conditions are satisfied automatically.

49 Summary so far
- We have motivated the use of structured (equilibrium) strategies:
  A^i_t ~ σ^i_t(· | A_{1:t-1}, X^i) = θ^i_t[µ_t[A_{1:t-1}]](· | X^i), i.e., σ ↔ θ with Γ^i_t = θ^i_t[Π_t]
- We have restricted attention to a class of beliefs µ that remain independent and are updated as
  µ^i_{t+1}[A_{1:t}] = F(µ^i_t[A_{1:t-1}], θ^i_t[µ_t[A_{1:t-1}]], A^i_t)   (Π^i_{t+1} = F(Π^i_t, Γ^i_t, A^i_t))
- A PBE (σ*, µ*) ↔ (θ*, µ*) even in this restricted class is still the solution of a large fixed-point equation: the circularity between θ and µ is still present

50 Summary so far
(same as the previous slide, plus:)
- How can we find θ* with a simple algorithm? Beliefs and policies are decomposed by considering the policies for all possible beliefs π, not just for µ*

51 First erroneous attempt
Recall the DP equation from the team problem: for each t = T, T-1, ..., 1 and for every π_t ∈ Δ(X), solve the following maximization problem:
  θ_t[π_t] = γ*_t = argmax_{γ^i_t γ^{-i}_t} E^{π_t, γ^i_t γ^{-i}_t} { R(X, A_t) + V_{t+1}(F(π_t, γ^i_t γ^{-i}_t, A_t)) }
What is the logical extension to games?

52 First erroneous attempt
Recall the DP equation from the team problem: for each t = T, T-1, ..., 1 and for every π_t ∈ Δ(X), solve:
  θ_t[π_t] = γ*_t = argmax_{γ^i_t γ^{-i}_t} E^{π_t, γ^i_t γ^{-i}_t} { R(X, A_t) + V_{t+1}(F(π_t, γ^i_t γ^{-i}_t, A_t)) }
What is the logical extension to games? Transform it into a best-response type equation: fix γ^{-i}_t = γ^{-i*}_t and maximize over γ^i_t, for all i ∈ N:
  γ^{i*}_t ∈ argmax_{γ^i_t} E^{π_t, γ^i_t γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, A_t)) }

53 First erroneous attempt: what is the catch?
  γ^{i*}_t ∈ argmax_{γ^i_t} E^{π_t, γ^i_t γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, A_t)) },  for all i ∈ N
Why erroneous?

54 First erroneous attempt: what is the catch?
  γ^{i*}_t ∈ argmax_{γ^i_t} E^{π_t, γ^i_t γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, A_t)) },  for all i ∈ N
Why erroneous? Explanation: the reward-to-go is not conditioned on the entire history (A_{1:t-1}, X^i) of user i, but only on the part A_{1:t-1} → Π_t. This was OK in teams, but it is not sufficient to prove sequential rationality in games:
  E^{µ*, σ^{i*} σ^{-i*}} { Σ_{t'=t}^T R^i(X, A_{t'}) | A_{1:t-1}, X^i } ≥ E^{µ*, σ^i σ^{-i*}} { Σ_{t'=t}^T R^i(X, A_{t'}) | A_{1:t-1}, X^i }

55 Special case⁵
Consider dynamical systems for which the belief update is prescription-independent, i.e., Π_{t+1} = F(Π_t, A_t)
In that case the backward process decomposes, and conditioning on X^i is irrelevant
A strong statement can be made for this special case: for every PBE there exists a structured PBE that corresponds to an SPE of an equivalent symmetric-information game
⁵ [Nayyar, Gupta, Langbort, Başar, 2014], [Gupta, Nayyar, Langbort, Başar, 2014]

56 Second erroneous attempt
Condition on X^i in the backward induction step, to be consistent with the sequential rationality condition: for each t = T, T-1, ..., 1 and for every π_t ∈ Δ(X), solve the following one-step fixed-point equation, for all i ∈ N and all x^i ∈ X^i:
  γ^{i*}_t ∈ argmax_{γ^i_t} E^{π_t, γ^i_t(·|x^i) γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, A_t), x^i) | x^i }
Note that in this case the reward-to-go is V^i_t(π_t, x^i)

57 Second erroneous attempt: explanation
  E{·} = Σ_{a_t, x^{-i}} γ^i_t(a^i_t | x^i) γ^{-i*}_t(a^{-i}_t | x^{-i}) π^{-i}(x^{-i}) ( R^i(x^i x^{-i}, a_t) + V^i_{t+1}(F(π_t, γ^i_t γ^{-i*}_t, a_t), x^i) )
This is an unusual fixed-point equation: there is dependence on γ^i_t(· | x^i), but also on the entire γ^i_t(· | ·) (inside the belief update)

58 Second erroneous attempt: explanation
(expansion of E{·} as on the previous slide)
This is an unusual fixed-point equation: there is dependence on γ^i_t(· | x^i), but also on the entire γ^i_t(· | ·) (inside the belief update)
Unfortunately, this results in a fixed-point solution θ with γ*_t = θ_t[π_t, x], so the resulting policy is of the form
  A^i_t ~ Γ^i_t(· | X^i) = θ^i_t[Π_t, X](· | X^i),
which is not implementable (it requires the unknown private information X^{-i} for the strategy of agent i)

59 Third erroneous attempt
Condition on X^i in the backward induction step, to be consistent with sequential rationality, and optimize only over part of the prescription: for each t = T, T-1, ..., 1 and for every π_t ∈ Δ(X), solve the following one-step fixed-point equation, for all i ∈ N and all x^i ∈ X^i:
  γ^{i*}_t(· | x^i) ∈ argmax_{γ^i_t(·|x^i)} E^{π_t, γ^i_t(·|x^i) γ^{-i*}_t} { R^i(X, A_t) + V^i_{t+1}(F(π_t, γ^i_t(·|x^i) γ^{i*}_t(·|·) γ^{-i*}_t, A_t), x^i) | x^i }

60 Third erroneous attempt
(one-step fixed-point equation as on the previous slide)
This results in a fixed-point solution θ with γ*_t = θ_t[π_t] for all π_t ∈ Δ(X)
Unfortunately, it does not work in the proof: something more fundamental is going on...

61 An algorithm for PBE evaluation: backward recursion

For each t = T, T-1, ..., 1 and for every π_t ∈ Δ(X), solve the following one-step fixed-point equation for all i ∈ N and for all x^i ∈ X^i:

γ̃_t^i(·|x^i) ∈ arg max_{γ_t^i(·|x^i)} E^{π_t, γ_t^i(·|x^i) γ̃_t^{-i}} { R^i(X, A_t) + V_{t+1}^i(F(π_t, γ̃_t^i γ̃_t^{-i}, A_t), x^i) | x^i }

This results in an FP solution θ with γ̃_t = θ_t[π_t] for all π_t ∈ Δ(X).

This is not a best-response type of equation: γ̃_t^i is present on both the left- and right-hand side.

Intuition: find the γ_t^i(·|x^i) that is optimal under the unperturbed belief update! Remember the core concept in PBE...
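A minimal numeric sketch of this backward step, under simplifying assumptions: two players, binary types and actions, independent type priors, and a generic continuation term V_next passed in. All names and the toy reward below are hypothetical illustrations, not the paper's algorithm; the point is only to show the shape of iterating the per-stage map.

```python
import numpy as np

# Hypothetical toy instance: 2 players, binary private types and binary actions.
# gamma[i] is a 2x2 row-stochastic matrix with gamma[i][x_i, a_i] = Pr(a_i | x_i).
# pi[j] is player j's (independent) type distribution, as assumed in the talk.

def stage_value(i, x_i, a_probs_i, gamma_other, pi_other, R, V_next):
    """Expected reward-to-go of player i with type x_i playing the action
    distribution a_probs_i, when the opponent follows prescription gamma_other."""
    val = 0.0
    for x_j in range(2):
        for a_i in range(2):
            for a_j in range(2):
                p = pi_other[x_j] * a_probs_i[a_i] * gamma_other[x_j, a_j]
                val += p * (R(i, x_i, x_j, a_i, a_j) + V_next(i, x_i, a_i, a_j))
    return val

def solve_stage_fixed_point(R, pi, V_next, n_iter=50):
    """Naive iteration of the per-stage map: each type of each player
    re-optimizes against the current prescription profile. With V_next == 0
    this is just a stage Bayesian-equilibrium computation; in general (gamma
    appearing inside the belief update) convergence is NOT guaranteed --
    the slide's point is that this is not a plain best-response equation."""
    gamma = [np.full((2, 2), 0.5) for _ in range(2)]
    for _ in range(n_iter):
        new = []
        for i in range(2):
            j = 1 - i
            g = np.zeros((2, 2))
            for x_i in range(2):
                vals = [stage_value(i, x_i, np.eye(2)[a], gamma[j], pi[j], R, V_next)
                        for a in range(2)]
                g[x_i, int(np.argmax(vals))] = 1.0
            new.append(g)
        gamma = new
    return gamma

# Toy reward: matching one's own type is dominant, so the fixed point reveals types.
R = lambda i, x_i, x_j, a_i, a_j: 1.0 if a_i == x_i else 0.0
pi = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]
gamma = solve_stage_fixed_point(R, pi, V_next=lambda i, x_i, a_i, a_j: 0.0)
```

With the dominant-action toy reward, each type x^i puts all mass on a^i = x^i after one sweep.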

62 An algorithm for PBE evaluation: forward recursion

From the backward recursion we have obtained θ = (θ_t^i)_{i∈N, t∈T}. For each t = 1, 2, ..., T and for every i ∈ N, A_{1:t}, and X^i:

σ_t^{*i}(A_t^i | A_{1:t-1}, X^i) := θ_t^i[ μ_t^*[A_{1:t-1}] ](A_t^i | X^i)    (the bracketed quantity is the prescription Γ_t^i)

μ_{t+1}^*[A_{1:t}] := F( μ_t^*[A_{1:t-1}], θ_t[μ_t^*[A_{1:t-1}]], A_t )    (Π_{t+1} obtained from Π_t and Γ_t)

In fact we can obtain a family of PBEs for any type distribution ∏_{i∈N} Q^i(X^i) with appropriate initialization of μ_1^*.
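The forward pass above can be sketched in a few lines, assuming θ and the belief-update map F are already available from the backward pass. All names are illustrative; θ, F, and the toy inputs below are hypothetical stand-ins, not the paper's implementation.

```python
# Hypothetical sketch of the forward recursion: theta[t] maps the common
# belief to a prescription profile; F is the belief-update map. Rolling
# forward along a realized public action history yields the equilibrium
# strategies sigma*_t and the consistent beliefs mu*_t.

def forward_recursion(theta, F, mu1, actions):
    """theta: list over t of maps belief -> prescription profile (gamma^1, ..., gamma^N);
    F: (belief, prescription profile, joint action) -> next belief;
    mu1: initial common belief (the type prior);
    actions: realized joint actions a_1, ..., a_T."""
    mu = mu1
    beliefs = [mu1]
    prescriptions = []
    for t, a_t in enumerate(actions):
        gamma_t = theta[t](mu)      # sigma*_t(. | history, x^i) = gamma_t^i(. | x^i)
        prescriptions.append(gamma_t)
        mu = F(mu, gamma_t, a_t)    # mu*_{t+1}: the PBE consistency condition
        beliefs.append(mu)
    return prescriptions, beliefs

# Degenerate usage example: a constant theta and an update that ignores actions.
theta = [lambda mu: ("gamma1", "gamma2")] * 3
F = lambda mu, g, a: mu
prescriptions, beliefs = forward_recursion(theta, F, mu1=(0.5, 0.5), actions=[0, 1, 0])
```

Changing the initialization mu1 gives the family of PBEs mentioned on the slide, one per type prior.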

63 Main Result

Theorem. (σ*, μ*) generated by the backward/forward algorithm (whenever it exists) is a PBE, i.e., for all i, t, A_{1:t-1}, X^i, and σ^i,

E^{σ_{t:T}^{*i} σ_{t:T}^{*-i}, μ_t^*} { Σ_{n=t}^T R^i(X, A_n) | A_{1:t-1}, X^i } ≥ E^{σ_{t:T}^i σ_{t:T}^{*-i}, μ_t^*} { Σ_{n=t}^T R^i(X, A_n) | A_{1:t-1}, X^i }

and μ* satisfies the consistency conditions.


65 Sketch of the proof

Independence of types and the specific DP equation are crucial in proving the result, via a modified comparison principle (backward induction):

- The specific DP guarantees that the unperturbed reward-to-go (LHS) at time t is the obtained value function V_t^i = R^i + V_{t+1}^i
- The specific DP guarantees that unilateral deviations with a fixed belief update reduce V_t^i
- The induction step reduces V_{t+1}^i to the (perturbed) reward-to-go at time t+1
- Independence of types guarantees that the resulting expression is exactly the (perturbed) reward-to-go at time t (RHS)


67 Comments on the new per-stage FP equation

- This is not a best-response type of FP equation (due to the presence of γ^i on both the LHS and RHS of the equation)
- Standard tools for existence of a solution (e.g., Brouwer, Kakutani) do not apply (problem with continuity of the V(·) functions)
- Existence can be shown for a special case^6 where R^i(X, A_t) does not depend on the player's own type X^i. In that case the prescriptions Γ_t^i(·|X^i) = Γ_t^i(·) do not depend on the private type X^i and the FP equation reduces to a best response. No signaling! Essentially reduces to the model Π_{t+1} = F(Π_t, A_t)

^6 [Ouyang, Tavafoghi, Teneketzis, 2015]

68 Current/Future work

Model generalizations:
- Types are independent controlled Markov processes (controlled by all actions): P(X_t | X_{1:t-1}, A_{1:t-1}) = ∏_{i∈N} Q^i(X_t^i | X_{t-1}^i, A_{t-1}) ^7
- Dependent types with strategic independence ^8
- Types observed through a noisy channel (even by the same user), Q(Y_t^i | X_t^i). Example: the informational-cascades literature
- Infinite horizon and continuous action spaces

Existence results: prove existence for the simplest non-trivial class of problems. Core issue: the per-stage FP equation is not a best response.

Dynamic mechanism design (indirect mechanisms with message space smaller than the type space).

^7 [Vasal, Subramanian, A, 2015b]
^8 [Battigalli, 1996]

69 Thank you!

70 Extra: FP equations

First attempt:
γ_1 = f^1(γ_2, π)
γ_2 = f^2(γ_1, π)
⇒ γ = f(γ, π) ⇒ γ = θ(π)

Second attempt:
γ_1 = f^{1H}(γ_2, π),  γ_1 = f^{1L}(γ_2, π)
γ_2 = f^{2H}(γ_1, π),  γ_2 = f^{2L}(γ_1, π)
⇒ γ = f^{LL}(γ, π),  γ = f^{LH}(γ, π),  γ = f^{HL}(γ, π),  γ = f^{HH}(γ, π) ⇒ γ = θ(π, x)

71 Extra: FP equations

Third attempt:
γ_1^H = f^{1H}(γ_1^L, γ_2, π)
γ_1^L = f^{1L}(γ_1^H, γ_2, π)
γ_2^H = f^{2H}(γ_2^L, γ_1, π)
γ_2^L = f^{2L}(γ_2^H, γ_1, π)
⇒ γ_1 = f^1(γ_1, γ_2, π),  γ_2 = f^2(γ_1, γ_2, π) ⇒ γ = f(γ, π) ⇒ γ = θ(π)

Correct:
γ_1^H = f^{1H}(γ_1, γ_2, π)
γ_1^L = f^{1L}(γ_1, γ_2, π)
γ_2^H = f^{2H}(γ_2, γ_1, π)
γ_2^L = f^{2L}(γ_2, γ_1, π)
⇒ γ_1 = f^1(γ_1, γ_2, π),  γ_2 = f^2(γ_1, γ_2, π) ⇒ γ = f(γ, π) ⇒ γ = θ(π)
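A purely numeric toy of the "correct" system: stack the per-player components into one vector γ and iterate γ ← f(γ, π). The affine maps below are chosen by hand (contractive, so the iteration converges); they are an illustration of the self-referential structure, not the talk's actual f.

```python
# Illustrative coupled system: each component depends on the FULL vector
# (including itself) and on pi, mirroring
#   gamma_1 = f^1(gamma_1, gamma_2, pi),  gamma_2 = f^2(gamma_1, gamma_2, pi).

def f(gamma, pi):
    g1, g2 = gamma
    return (0.25 * g1 + 0.5 * g2 + 0.1 * pi,
            0.5 * g1 + 0.25 * g2 + 0.2 * pi)

def theta(pi, n_iter=500):
    """Fixed-point iteration gamma <- f(gamma, pi); the limit is theta(pi)."""
    gamma = (0.0, 0.0)
    for _ in range(n_iter):
        gamma = f(gamma, pi)
    return gamma

g1, g2 = theta(1.0)   # converges to the unique fixed point (0.56, 0.64)
```

The self-dependence (γ_1 on both sides) is exactly why this is not a best-response map, as slide 67 notes.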

72 References

Battigalli, P. (1996). Strategic independence and perfect Bayesian equilibria. Journal of Economic Theory, 70(1).
Fudenberg, D. and Tirole, J. (1991). Game Theory. MIT Press, Cambridge, MA.
Gupta, A., Nayyar, A., Langbort, C., and Başar, T. (2014). Common information based Markov perfect equilibria for linear-Gaussian games with asymmetric information. SIAM Journal on Control and Optimization, 52(5).
Nayyar, A., Gupta, A., Langbort, C., and Başar, T. (2014). Common information based Markov perfect equilibria for stochastic games with asymmetric information: Finite games. IEEE Transactions on Automatic Control, 59(3).
Nayyar, A., Mahajan, A., and Teneketzis, D. (2013). Decentralized stochastic control with partial history sharing: A common information approach. IEEE Transactions on Automatic Control, 58(7).

73 Ouyang, Y., Tavafoghi, H., and Teneketzis, D. (2015). Dynamic oligopoly games with private Markovian dynamics. Available at www-personal.umich.edu/~tavaf/oligopolygames.pdf.
Vasal, D., Subramanian, V., and Anastasopoulos, A. (2015a). A systematic process for evaluating structured perfect Bayesian equilibria in dynamic games with asymmetric information. Technical report.
Vasal, D., Subramanian, V., and Anastasopoulos, A. (2015b). A systematic process for evaluating structured perfect Bayesian equilibria in dynamic games with asymmetric information. In American Control Conference. (Accepted for publication/presentation.)


More information

6.254 : Game Theory with Engineering Applications Lecture 13: Extensive Form Games

6.254 : Game Theory with Engineering Applications Lecture 13: Extensive Form Games 6.254 : Game Theory with Engineering Lecture 13: Extensive Form Games Asu Ozdaglar MIT March 18, 2010 1 Introduction Outline Extensive Form Games with Perfect Information One-stage Deviation Principle

More information

On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E.

On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E. Tilburg University On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E. Publication date: 1997 Link to publication General rights Copyright and

More information

A Solution to the Problem of Externalities When Agents Are Well-Informed

A Solution to the Problem of Externalities When Agents Are Well-Informed A Solution to the Problem of Externalities When Agents Are Well-Informed Hal R. Varian. The American Economic Review, Vol. 84, No. 5 (Dec., 1994), pp. 1278-1293 Introduction There is a unilateral externality

More information

Introduction to Artificial Intelligence (AI)

Introduction to Artificial Intelligence (AI) Introduction to Artificial Intelligence (AI) Computer Science cpsc502, Lecture 10 Oct, 13, 2011 CPSC 502, Lecture 10 Slide 1 Today Oct 13 Inference in HMMs More on Robot Localization CPSC 502, Lecture

More information

Hierarchical Bayesian Persuasion

Hierarchical Bayesian Persuasion Hierarchical Bayesian Persuasion Weijie Zhong May 12, 2015 1 Introduction Information transmission between an informed sender and an uninformed decision maker has been extensively studied and applied in

More information

EC319 Economic Theory and Its Applications, Part II: Lecture 7

EC319 Economic Theory and Its Applications, Part II: Lecture 7 EC319 Economic Theory and Its Applications, Part II: Lecture 7 Leonardo Felli NAB.2.14 27 February 2014 Signalling Games Consider the following Bayesian game: Set of players: N = {N, S, }, Nature N strategy

More information

Bayesian Machine Learning - Lecture 7

Bayesian Machine Learning - Lecture 7 Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1

More information

Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning

Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Christos Dimitrakakis Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands

More information

STRUCTURE AND OPTIMALITY OF MYOPIC SENSING FOR OPPORTUNISTIC SPECTRUM ACCESS

STRUCTURE AND OPTIMALITY OF MYOPIC SENSING FOR OPPORTUNISTIC SPECTRUM ACCESS STRUCTURE AND OPTIMALITY OF MYOPIC SENSING FOR OPPORTUNISTIC SPECTRUM ACCESS Qing Zhao University of California Davis, CA 95616 qzhao@ece.ucdavis.edu Bhaskar Krishnamachari University of Southern California

More information

Notes on Coursera s Game Theory

Notes on Coursera s Game Theory Notes on Coursera s Game Theory Manoel Horta Ribeiro Week 01: Introduction and Overview Game theory is about self interested agents interacting within a specific set of rules. Self-Interested Agents have

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Dynamic Programming Marc Toussaint University of Stuttgart Winter 2018/19 Motivation: So far we focussed on tree search-like solvers for decision problems. There is a second important

More information

On Design and Analysis of Cyber-Physical Systems with Strategic Agents

On Design and Analysis of Cyber-Physical Systems with Strategic Agents On Design and Analysis of Cyber-Physical Systems with Strategic Agents by Hamidreza Tavafoghi Jahormi A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

More information

Decentralized stochastic control

Decentralized stochastic control DOI 0.007/s0479-04-652-0 Decentralized stochastic control Aditya Mahajan Mehnaz Mannan Springer Science+Business Media New York 204 Abstract Decentralized stochastic control refers to the multi-stage optimization

More information

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Gabriel Y. Weintraub, Lanier Benkard, and Benjamin Van Roy Stanford University {gweintra,lanierb,bvr}@stanford.edu Abstract

More information

Local Communication in Repeated Games with Local Monitoring

Local Communication in Repeated Games with Local Monitoring Local Communication in Repeated Games with Local Monitoring M Laclau To cite this version: M Laclau. Local Communication in Repeated Games with Local Monitoring. 2012. HAL Id: hal-01285070

More information

Power Allocation over Two Identical Gilbert-Elliott Channels

Power Allocation over Two Identical Gilbert-Elliott Channels Power Allocation over Two Identical Gilbert-Elliott Channels Junhua Tang School of Electronic Information and Electrical Engineering Shanghai Jiao Tong University, China Email: junhuatang@sjtu.edu.cn Parisa

More information

Bandit models: a tutorial

Bandit models: a tutorial Gdt COS, December 3rd, 2015 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions) Bandit game: a each round t, an agent chooses

More information

UC Berkeley Haas School of Business Game Theory (EMBA 296 & EWMBA 211) Summer 2016

UC Berkeley Haas School of Business Game Theory (EMBA 296 & EWMBA 211) Summer 2016 UC Berkeley Haas School of Business Game Theory (EMBA 296 & EWMBA 211) Summer 2016 More on strategic games and extensive games with perfect information Block 2 Jun 12, 2016 Food for thought LUPI Many players

More information

On Equilibria of Distributed Message-Passing Games

On Equilibria of Distributed Message-Passing Games On Equilibria of Distributed Message-Passing Games Concetta Pilotto and K. Mani Chandy California Institute of Technology, Computer Science Department 1200 E. California Blvd. MC 256-80 Pasadena, US {pilotto,mani}@cs.caltech.edu

More information

Economics of Networks Social Learning

Economics of Networks Social Learning Economics of Networks Social Learning Evan Sadler Massachusetts Institute of Technology Evan Sadler Social Learning 1/38 Agenda Recap of rational herding Observational learning in a network DeGroot learning

More information

Learning from Sequential and Time-Series Data

Learning from Sequential and Time-Series Data Learning from Sequential and Time-Series Data Sridhar Mahadevan mahadeva@cs.umass.edu University of Massachusetts Sridhar Mahadevan: CMPSCI 689 p. 1/? Sequential and Time-Series Data Many real-world applications

More information

Optimal Control of Partiality Observable Markov. Processes over a Finite Horizon

Optimal Control of Partiality Observable Markov. Processes over a Finite Horizon Optimal Control of Partiality Observable Markov Processes over a Finite Horizon Report by Jalal Arabneydi 04/11/2012 Taken from Control of Partiality Observable Markov Processes over a finite Horizon by

More information

A Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty

A Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty 2011 IEEE International Conference on Robotics and Automation Shanghai International Conference Center May 9-13, 2011, Shanghai, China A Decentralized Approach to Multi-agent Planning in the Presence of

More information

Learning Equilibrium as a Generalization of Learning to Optimize

Learning Equilibrium as a Generalization of Learning to Optimize Learning Equilibrium as a Generalization of Learning to Optimize Dov Monderer and Moshe Tennenholtz Faculty of Industrial Engineering and Management Technion Israel Institute of Technology Haifa 32000,

More information

9 Forward-backward algorithm, sum-product on factor graphs

9 Forward-backward algorithm, sum-product on factor graphs Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 9 Forward-backward algorithm, sum-product on factor graphs The previous

More information

Deceptive Advertising with Rational Buyers

Deceptive Advertising with Rational Buyers Deceptive Advertising with Rational Buyers September 6, 016 ONLINE APPENDIX In this Appendix we present in full additional results and extensions which are only mentioned in the paper. In the exposition

More information