Asymmetric Information Security Games


Jeff S. Shamma, with Lichun Li & Malachi Jones
IPAM Graduate Summer School: Games and Contracts for Cyber-Physical Security, 7-23 July 2015

Asymmetric information games

Motivation: One player has superior information.
- Attacker knows own skill set.
- Defender knows resource characteristics.

Tension: Exploiting information also reveals information.

Solution: Randomized deception. Deliberately do not utilize best resources.

Illustration: Network interdiction

System: One high capacity resource and several low capacity ones (the attacker does not know which is which).
Attacker: Observes usage (binary) during Phase I; disables a selected resource for Phase II.
Tension: Initial vs. future usage.

Framework: Repeated games

Players: Maximizer "Row" with actions a ∈ A = {1, 2, ..., |A|}; minimizer "Col" with actions b ∈ B = {1, 2, ..., |B|}.

Stages: t = 0, 1, 2, ...

History: h_t = {(a_0, b_0), (a_1, b_1), ..., (a_{t-1}, b_{t-1})} ∈ H_t. Set of finite histories: H.

Repeated games, cont.

Stage payoff: Matrix M = [m_ab], where m_ab is the payoff to Row and the penalty to Col under action pair (a, b).

State: k ∈ K = {1, 2, ..., |K|}, the index of the stage game {M^1, M^2, ..., M^K}.

Asymmetric information: At t = 0, nature selects M ∈ {M^1, M^2, ..., M^K} according to prior probabilities {p^1, p^2, ..., p^K}; Row is informed of the selected game, Col is not.

Repeated games, cont.

Behavioral strategies:
  σ : H × K → Δ(A)  (Row)
  τ : H → Δ(B)  (Col)

Note: Players observe actions but do not measure payoffs!

T-stage game, Γ_T(p) (Γ_∞(p) later):
  γ_T(σ, τ) = E[ (1/T) Σ_{t=0}^{T-1} M^k(a_t, b_t) ]

Extensive literature

Foundations: Aumann & Maschler (1967), Repeated games with incomplete information: A survey of recent results, Report to the US Arms Control and Disarmament Agency.

Surveys:
- Zamir (1992), Repeated games of incomplete information: Zero-sum, Handbook of Game Theory, vol. I.
- Laraki & Sorin (2014), Advances in zero-sum dynamic games, Handbook of Game Theory, vol. IV.

Monographs:
- Aumann & Maschler (1967/1995), Repeated Games with Incomplete Information.
- Mertens, Sorin & Zamir (1994/2015), Repeated Games.
- Sorin (2002), A First Course on Zero-Sum Repeated Games.

Variations: Evolving state, signal monitoring, two-sided incomplete information, ...

Pre-example: Dominant strategies

  Case I:   L  R      Case II:   L  R
        T   4  3             T   4  2
        B   2  1             B   1  3

Case I: T is a dominant strategy, i.e., an oblivious best response.
Case II: No dominant strategy, i.e., a contingent best response.

Example I: Non-revelation

  M^1:  L  R      M^2:  L  R
    T   1  0        T   0  0
    B   0  0        B   0  1

Should Row use the (weakly) dominant strategy?

Dominant strategy payoff stream: 1-or-0, 0, 0, ... (the first action reveals the state, after which Col holds the payoff to 0).

Compare to the perpetually non-revealing strategy: Col faces the averaged game
  ( 1/2   0  )
  (  0   1/2 )
with an expected payoff of 1/4 per stage.

Conclusion: Row is better off ignoring its information.

Example II: Full revelation

  M^1:  L   R      M^2:  L   R
    T  -1   0        T   0   0
    B   0   0        B   0  -1

Dominant strategy (fully revealing) payoff: 0.

The non-revealing averaged game
  ( -1/2    0  )
  (   0   -1/2 )
has expected payoff -1/4.

Conclusion: Row is better off fully revealing.

Example III: Partial revelation

  M^1:  L  C   R      M^2:  L  C   R
    T   4  0   2        T   0  4  -2
    B   4  0  -2        B   0  4   2

Dominant strategy (long run) payoff: 0.

The non-revealing averaged game
  ( 2  2  0 )
  ( 2  2  0 )
has expected payoff 0.

Can Row do better?

Example III: Partial revelation, cont.

State dependent lottery over outcomes (h, t):
  k = 1: probabilities (3/4, 1/4)
  k = 2: probabilities (1/4, 3/4)

Outcome dependent strategy:
  h: Play T forever.
  t: Play B forever.

Example III: Partial revelation, cont.

After stage 0, Col knows the outcome of the lottery:
  Pr[k = 1 | h] = 3/4, leading to the averaged game ( 3 1 1 ; 3 1 -1 ), in which Row plays T.
  Pr[k = 1 | t] = 1/4, leading to the averaged game ( 1 3 -1 ; 1 3 1 ), in which Row plays B.

Long run payoff: 1.

Outline

- Selected basic results
- Computational approaches

Review: Finite zero-sum games

Payoff matrix: M = [m_ab].

Mixed strategies: x, y with x(a) = Pr[a] and y(b) = Pr[b].
  x^T M y = expected payoff to Row = expected penalty to Col under strategies (x, y).

Security levels and strategies:
  v̲ = max_x min_y x^T M y = max_{x,l} l  s.t.  x^T M ≥ l 1^T,  x ∈ Δ(A)
  v̄ = min_y max_x x^T M y = min_{y,l} l  s.t.  M y ≤ l 1,  y ∈ Δ(B)

Value: val[M] = v̲ = v̄.
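
As an aside not on the slides: these security-level LPs solve directly with an off-the-shelf solver. A minimal sketch using scipy.optimize.linprog (the helper name matrix_game_value is mine):

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(M):
    """Value and a Row security strategy of the zero-sum game with payoff M.

    Solves max_{x,l} l  s.t.  x^T M >= l 1^T, 1^T x = 1, x >= 0,
    written as minimizing -l for linprog.
    """
    M = np.asarray(M, dtype=float)
    nA, nB = M.shape
    c = np.zeros(nA + 1)
    c[-1] = -1.0                                 # minimize -l, i.e., maximize l
    # One inequality per Col action b:  l - sum_a x_a M[a, b] <= 0.
    A_ub = np.hstack([-M.T, np.ones((nB, 1))])
    b_ub = np.zeros(nB)
    A_eq = np.hstack([np.ones((1, nA)), np.zeros((1, 1))])   # 1^T x = 1
    b_eq = np.ones(1)
    bounds = [(0, None)] * nA + [(None, None)]   # x >= 0, l free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:nA]

# Matching pennies: value 0, security strategy (1/2, 1/2).
print(matrix_game_value([[1, -1], [-1, 1]]))
```

By LP duality the same number is obtained from Col's side; Col's security strategy can be recovered by applying the same routine to -M^T.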

Analysis: Finite horizon

Pure strategies:
  σ : H × K → A  (Row)
  τ : H → B  (Col)

Strategic form conversion: Enumerate all pure strategies and define M(p) as the associated (large) matrix game for prior p ∈ Δ(K).

Consequences:
- The value v_T(p) = val[M(p)] exists, along with associated (mixed) security strategies.
- Equivalence to behavioral strategies (Kuhn's theorem).

Non-revealing game & u(p)

Averaged game: D(p) = Σ_k p^k M^k and u(p) = val[D(p)].

Claim: v_T(p) ≥ u(p).
Proof: Row ignores its information and plays a security strategy for D(p).

Computing u(p)

  u(p) = max_x min_y x^T ( Σ_k p^k M^k ) y
       = max_{x,l} l  s.t.  x^T ( Σ_k p^k M^k ) ≥ l 1^T,  x ∈ Δ(A)
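
In code, u(p) is just the matrix-game LP applied to the averaged matrix D(p); an illustrative sketch that assumes the matrix_game_value helper from the earlier aside:

```python
import numpy as np
# Assumes matrix_game_value() from the earlier sketch is in scope.

def u(p, matrices):
    """u(p) = val[D(p)] for the non-revealing averaged game D(p) = sum_k p^k M^k."""
    D = sum(pk * np.asarray(Mk, float) for pk, Mk in zip(p, matrices))
    return matrix_game_value(D)[0]

# Example I: u(1/2, 1/2) = val[[[1/2, 0], [0, 1/2]]] = 1/4.
M1 = [[1, 0], [0, 0]]
M2 = [[0, 0], [0, 1]]
print(u([0.5, 0.5], [M1, M2]))
```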

Cav u(p)

Claim: Suppose p = Σ_{l=1}^{L} λ_l p_l. Then there exists a Row strategy such that
  v_T(p) ≥ Σ_{l=1}^{L} λ_l u(p_l).

Implication: By optimally selecting mixtures,
  v_T(p) ≥ Cav u(p),
where Cav u is the pointwise smallest concave function satisfying Cav u(p) ≥ u(p).
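
For K = 2 the belief simplex is an interval, so Cav u can be approximated by sampling u on a grid and taking the upper concave envelope (a monotone-chain upper hull). An illustrative sketch, assuming the u helper above and the Example III matrices as reconstructed earlier:

```python
import numpy as np
# Assumes u() (and matrix_game_value()) from the earlier sketches.

def cav_on_grid(ps, us):
    """Upper concave envelope of the sampled graph (ps ascending)."""
    hull = []
    for pt in zip(ps, us):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Pop the middle point if it lies on or below the chord to pt.
            if (x2 - x1) * (pt[1] - y1) >= (y2 - y1) * (pt[0] - x1):
                hull.pop()
            else:
                break
        hull.append(pt)
    hx, hy = zip(*hull)
    return np.interp(ps, hx, hy)

M1 = [[4, 0, 2], [4, 0, -2]]     # Example III matrices from above
M2 = [[0, 4, -2], [0, 4, 2]]
ps = np.linspace(0.0, 1.0, 101)
us = np.array([u([pk, 1 - pk], [M1, M2]) for pk in ps])
cav = cav_on_grid(ps, us)
print(us[50], cav[50])           # u(1/2) = 0 while Cav u(1/2) = 1
```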

Cav u(p), cont.

[Figure: u(p) and its concavification Cav u(p) over the belief simplex.]

Belief splitting

For p, p_1, ..., p_L, suppose
  p = Σ_{l=1}^{L} λ_l p_l.
Define the joint distribution over {1, 2, ..., L} × {1, 2, ..., K}:
  Q(l, k) = λ_l p_l^k.

Properties:
  Pr[k] = p^k,  Pr[l] = λ_l,  Pr[k | l] = p_l^k,  Pr[l | k] = λ_l p_l^k / p^k.
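
A quick numeric check of these properties, using the splitting from Example III (illustrative only):

```python
import numpy as np

lam = np.array([0.5, 0.5])            # lottery weights λ_l (outcomes h, t)
P = np.array([[0.75, 0.25],           # posterior p_h over k = (1, 2)
              [0.25, 0.75]])          # posterior p_t
Q = lam[:, None] * P                  # Q(l, k) = λ_l p_l^k
print(Q.sum(axis=0))                  # Pr[k] = p^k: [0.5, 0.5], the prior
print(Q.sum(axis=1))                  # Pr[l] = λ_l: [0.5, 0.5]
print(Q / Q.sum(axis=0))              # Pr[l | k] = λ_l p_l^k / p^k, columnwise
```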

Proof of claim

Starting point: p = Σ_{l=1}^{L} λ_l p_l.

Row strategy:
1. Let x_l be a Row optimal strategy for u(p_l).
2. Select l with probability Pr[l | k] = λ_l p_l^k / p^k (a k-dependent lottery).
3. Play the selected x_l thereafter.

Assess Col's response as if Col observed l (letting Col condition on l only lowers the min):
  min_y Σ_k Σ_l p^k Pr[l | k] (x_l)^T M^k y ≥ Σ_l min_{y_l} Σ_k p^k Pr[l | k] (x_l)^T M^k y_l = Σ_l λ_l u(p_l),
using p^k Pr[l | k] = λ_l p_l^k.

Belief updates for Col

Belief splitting: Posterior belief is p_l with probability λ_l.

General setup: Prior belief p over K; k-dependent strategy for Row, X = (x^1, ..., x^K) ∈ Δ(A)^K, with x_a^k = Pr[a | k].

Computations:
  Probability Row plays a:  π(a; X, p) = Σ_k Pr[a | k] Pr[k] = Σ_k x_a^k p^k
  Posterior belief after Row plays a:  B^k(a; X, p) = Pr[a | k] Pr[k] / Pr[a] = x_a^k p^k / π(a; X, p)
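
Both computations are one line of numpy each; a hypothetical sketch (the names pi_a and bayes_update are mine):

```python
import numpy as np

def pi_a(X, p):
    """π(·; X, p): marginal distribution of Row's action, X[k, a] = Pr[a | k]."""
    return p @ X

def bayes_update(a, X, p):
    """B(a; X, p): posterior over states k after observing Row action a."""
    return X[:, a] * p / (p @ X)[a]

X = np.array([[0.9, 0.1],     # state k = 1: Row plays action 0 w.p. 0.9
              [0.2, 0.8]])    # state k = 2: Row plays action 0 w.p. 0.2
p = np.array([0.5, 0.5])
print(pi_a(X, p))             # [0.55, 0.45]
print(bayes_update(0, X, p))  # [0.818..., 0.181...]: action 0 signals k = 1
```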

Naive strategy for Col

Naive defense:
- Define p_t^k = Pr[k | h_t] (requires knowledge of Row's strategy σ).
- Set τ_t(h_t) to be a security strategy for D(p_t).

Claim:
  E[γ_t(σ, τ)] ≤ Cav u(p) + ‖M‖ Σ_k √( p^k (1 - p^k) / t )

Implication:
  Cav u(p) ≤ v_t(p) ≤ Cav u(p) + ‖M‖ Σ_k √( p^k (1 - p^k) / t )
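
Combining the Bayes update with the averaged-game LP yields the naive defense in a few lines; a sketch assuming the earlier helpers (matrix_game_value, bayes_update), with col_security_strategy obtained from the -M^T trick noted earlier:

```python
import numpy as np
# Assumes matrix_game_value() and bayes_update() from the earlier sketches.

def col_security_strategy(M):
    """Col's security strategy in M = Row's security strategy in -M^T."""
    return matrix_game_value(-np.asarray(M, float).T)[1]

def naive_defense_step(p, a_observed, X, matrices):
    """One stage of the naive defense: play a security strategy of D(p_t),
    then update the belief from Row's observed action (X is Row's stage mix)."""
    D = sum(pk * np.asarray(Mk, float) for pk, Mk in zip(p, matrices))
    y = col_security_strategy(D)             # Col's mixed action this stage
    p_next = bayes_update(a_observed, X, p)  # requires knowing Row's strategy
    return y, p_next
```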

Uninformed player: Defend vs guarantee

Defend: React to σ such that γ_t(σ, BR(σ)) ≤ l for all σ, e.g., a belief based reaction.

Guarantee: Find a single τ such that γ_t(σ, τ) ≤ l for all σ.

Example:
      L   R
  T   1  -1
  B  -1   1

Defend: BR(T) = R, BR(B) = L, BR(50-50) = L or R  (l = 0).
Guarantee: Play 50-50 at every stage  (l = 0).

Approachability strategy for Col

Hypothetical payoff vector: Given its observations, Col can compute
  g_t(a_t, b_t) = ( M^1(a_t, b_t), M^2(a_t, b_t), ..., M^K(a_t, b_t) )
as well as its running average
  ḡ_{t+1} = ḡ_t + (1/(t+1)) ( g_t(a_t, b_t) - ḡ_t ).

Challenge: Steer ḡ_t so that
  lim sup_t  p^T ḡ_t ≤ Cav u(p).

Blackwell approachability

  ḡ_{t+1} = ḡ_t + (1/(t+1)) ( g_t(a_t, b_t) - ḡ_t )

Approachability: A closed convex set C is approachable, i.e., Col can force
  Pr[ dist(ḡ_t, C) → 0 ] = 1,
if and only if for every half-space H containing C there exists a y such that
  Co{ Σ_b y(b) g_t(a, b) : a ∈ A } ⊆ H.

Approachability strategy for Col, cont.

Construction:
1. Find a supporting hyperplane v ∈ R^K such that
   v^T p = Cav u(p)  and  u(q) ≤ v^T q for all q.
2. Define C = { x ∈ R^K : x ≤ v }.
3. At stage t, if ḡ_t ∉ C, define
   q = (1/Z) ( ḡ_t - Π(ḡ_t, C) )
   (Π the projection onto C, Z a normalizing constant) and play Col's optimal strategy for D(q).
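
One stage of this construction in code; a sketch assuming col_security_strategy from the naive-defense sketch, and using that the Euclidean projection onto the box C = {x ≤ v} is the componentwise minimum:

```python
import numpy as np
# Assumes col_security_strategy() from the naive-defense sketch.

def approachability_step(g_bar, v, matrices):
    """Col's stage mix steering the running average g_bar toward C = {x <= v}."""
    proj = np.minimum(g_bar, v)          # projection of g_bar onto the box C
    d = g_bar - proj                     # outward direction (zero inside C)
    if d.sum() <= 1e-12:
        q = np.ones(len(v)) / len(v)     # already in C: any belief works
    else:
        q = d / d.sum()                  # normalized direction, a point in Δ(K)
    D = sum(qk * np.asarray(Mk, float) for qk, Mk in zip(q, matrices))
    return col_security_strategy(D)
```

Iterating this step with the running-average update above drives dist(ḡ_t, C) → 0 by Blackwell's theorem.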

Recap

Row belief splitting: Cav u(p) ≤ v_t(p).

Col naive defense: v_t(p) ≤ Cav u(p) + ‖M‖ Σ_k √( p^k (1 - p^k) / t ).

Col approachability: lim sup_t E[γ_t(σ, τ)] ≤ Cav u(p).

Implication: The infinite horizon game Γ_∞(p) has value Cav u(p).

Computations: Face value

Suppose each M^k is S × S. Assume an oblivious Row who ignores Col's actions (this is optimal).

Number of pure strategies:
  Stage 0: state → action: S^K
  Stage 1: (state, action) → action: S^{K·S}
  ...
  Stage T: (state, action, ..., action) → action: S^{K·S^T}
  Total: Π_{t=0}^{T} S^{K·S^t}

Conversion to strategic form is computationally prohibitive.

Recursive structure

  v_{t+1}(p) = (1/(t+1)) max_X min_y [ Σ_k p^k (x^k)^T M^k y + t Σ_a π(a; X, p) v_t(B(a; X, p)) ]

- Row is oblivious to Col.
- Col plays a myopic defense based on current beliefs.

Recursive structure intuition

What is Col's best response to an oblivious Row automaton
  z_{t+1} = Z_t(z_t, a_t),  a_t ∼ X_t(z_t)?

Value iteration: Col plays a myopic defense.
  V_T(z_T, p_T) = min_y E[ M^k(a_T, b_T) ] = min_y Σ_k p_T^k (X^k(z_T))^T M^k y
  V_t(z_t, p_t) = min_y Σ_k p_t^k (X^k(z_t))^T M^k y + E[ V_{t+1}( Z_t(z_t, a_t), B(a_t; X(z_t), p_t) ) ]

Note: Reversed time indexing and neglected normalization.

Recursive structure intuition, cont.

  V_t(z_t, p_t) = min_y Σ_k p_t^k (X^k(z_t))^T M^k y + E[ V_{t+1}( Z_t(z_t, a_t), B(a_t; X(z_t), p_t) ) ]

Row's task as the maximizer:
  v_T(p_T) = max_X min_y Σ_k p_T^k (x^k)^T M^k y
  v_t(p_t) = max_X min_y Σ_k p_t^k (x^k)^T M^k y + E[ v_{t+1}( B(a_t; X, p_t) ) ]

Equivalent problem:
- State space: p_{t+1} = B(p_t, a_t)
- Action space: Δ(A)^K
- Stage reward: min_y Σ_k p^k (x^k)^T M^k y
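
A brute-force rendering of this equivalent problem for K = 2 (Example I), before the LP machinery below: value iteration on a discretized belief interval, following the normalized recursion above, with a coarse grid search standing in for the exact optimization over X ∈ Δ(A)^K. Illustrative only; accuracy is limited by both grids.

```python
import numpy as np

M = [np.array([[1., 0.], [0., 0.]]),   # Example I, state k = 1
     np.array([[0., 0.], [0., 1.]])]   # state k = 2
grid = np.linspace(0.0, 1.0, 101)      # belief p = Pr[k = 1]
mix = np.linspace(0.0, 1.0, 21)        # Pr[T | k], searched per state

def stage_and_continuation(p, x1, x2, v_prev):
    """Stage value (min over Col's actions) and expected continuation value."""
    X = np.array([[x1, 1 - x1], [x2, 1 - x2]])        # X[k, a] = Pr[a | k]
    pi = p * X[0] + (1 - p) * X[1]                    # π(a; X, p)
    stage = min(p * X[0] @ M[0][:, b] + (1 - p) * X[1] @ M[1][:, b]
                for b in range(2))
    cont = sum(pi[a] * np.interp(p * X[0, a] / pi[a], grid, v_prev)
               for a in range(2) if pi[a] > 1e-12)    # Bayes posterior inside
    return stage, cont

def value_iteration(T):
    v = np.zeros_like(grid)                           # v_0 = 0
    for t in range(T):                                # builds v_{t+1} from v_t
        v = np.array([max((s + t * c) / (t + 1)
                          for x1 in mix for x2 in mix
                          for s, c in [stage_and_continuation(p, x1, x2, v)])
                      for p in grid])
    return v

print(value_iteration(2)[50])   # approximate v_2(1/2) for Example I, about 3/8
```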

LP for v_1(p): S·K + 1 variables & S constraints

  v_1(p) = max_X min_y ( Σ_k p^k (x^k)^T M^k ) y
         = max_{X,l} l  s.t.  Σ_k p^k (x^k)^T M^k ≥ l 1^T,  x^k ∈ Δ(A) ∀k

Extended v_1(·):
- Redefine v_1(q) over positive q ∈ R_+^K.
- Positive homogeneity: c v_1(q) = v_1(c q).

LP for v_2(p)?

  v_2(p) = max_X min_y [ θ Σ_k p^k (x^k)^T M^k y + (1-θ) Σ_a π(a; X, p) v_1(B(a; X, p)) ]
         = max_X min_y [ θ Σ_k p^k (x^k)^T M^k y + (1-θ) Σ_a v_1( x_a^1 p^1, ..., x_a^K p^K ) ]

using positive homogeneity: π(a; X, p) v_1(B(a; X, p)) = v_1( x_a^1 p^1, ..., x_a^K p^K ).

  v_2(p) = max_{X, l_0, ..., l_|A|} θ l_0 + (1-θ) Σ_a l_a
           s.t.  Σ_k p^k (x^k)^T M^k ≥ l_0 1^T,  x^k ∈ Δ(A) ∀k
                 v_1( x_a^1 p^1, ..., x_a^K p^K ) ≥ l_a  ∀a

Problem: The constraints on v_1(·) produce product terms in the LP.

LHS vs RHS

Nested LP with LHS dependence: the linking variables w multiply the inner constraint matrix,
  max c^T v  s.t.  A v ≤ b,  v_i ≤ val[ LP with constraints F_i(w) ξ ≤ f_i ].
LP structure lost!

Nested LP with RHS dependence: the linking variables enter only the inner right-hand side,
  max c^T v  s.t.  A v ≤ b,  v_i ≤ val[ LP with constraints F ξ ≤ f_i(w) ].
LP structure preserved!

LP for v_1(p) revisited

Original variables:
  v_1(p) = max_{X,l} l  s.t.  Σ_k p^k (x^k)^T M^k ≥ l 1^T,  x^k ∈ Δ(A) ∀k

Change of variables z^k = p^k x^k ≥ 0:
  v_1(p) = max_{Z,l} l  s.t.  Σ_k (z^k)^T M^k ≥ l 1^T,  1^T z^k = p^k ∀k

The probabilities now enter only on the RHS.
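
The reformulated LP is straightforward to write down; an illustrative scipy sketch (the function name v1 is mine, with variables stacked as (z^1, ..., z^K, l)):

```python
import numpy as np
from scipy.optimize import linprog

def v1(p, matrices):
    """v_1(p) via the RHS form: max l s.t. sum_k (z^k)^T M^k >= l 1^T,
    1^T z^k = p^k, z^k >= 0."""
    Ms = [np.asarray(M, float) for M in matrices]
    K = len(Ms)
    Sa, Sb = Ms[0].shape
    n = K * Sa + 1
    c = np.zeros(n)
    c[-1] = -1.0                                     # maximize l
    # One row per Col action b:  l - sum_k z^k . M^k[:, b] <= 0.
    A_ub = np.hstack([-np.hstack([M.T for M in Ms]), np.ones((Sb, 1))])
    b_ub = np.zeros(Sb)
    A_eq = np.zeros((K, n))                          # 1^T z^k = p^k
    for k in range(K):
        A_eq[k, k * Sa:(k + 1) * Sa] = 1.0
    b_eq = np.asarray(p, float)
    bounds = [(0, None)] * (K * Sa) + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

M1 = [[1, 0], [0, 0]]
M2 = [[0, 0], [0, 1]]
print(v1([0.5, 0.5], [M1, M2]))   # 0.5: in one shot, Row can exploit k freely
```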

LP for v_2(p) revisited

  v_2(p) = max_{X, l_0, ..., l_|A|} θ l_0 + (1-θ) Σ_a l_a
           s.t.  Σ_k p^k (x^k)^T M^k ≥ l_0 1^T,  x^k ∈ Δ(A) ∀k
                 v_1( x_a^1 p^1, ..., x_a^K p^K ) ≥ l_a  ∀a

Each constraint on v_1(·) is itself an LP (S·K variables & S constraints per action a):
  Σ_k ( z^k(a) )^T M^k ≥ l_a 1^T,  1^T z^k(a) = x_a^k p^k  ∀k

These tie-in constraints are linear in the joint variables, so v_2(p) is again a single LP.
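
Because the tie-in constraints involve only the RHS of the inner LPs, and x_a^k p^k equals the stage-0 variable z_a^k, the two-stage problem is one LP. A sketch for θ = 1/2 (all names mine; variables stacked as (z0, l0, z(a) for each a, l_a's)):

```python
import numpy as np
from scipy.optimize import linprog

def v2(p, matrices, theta=0.5):
    """Two-stage value as a single LP; 1^T z^k(a) = z0^k_a links the inner LPs."""
    Ms = [np.asarray(M, float) for M in matrices]
    K = len(Ms)
    S, Sb = Ms[0].shape
    nz = K * S
    n = nz + 1 + S * nz + S
    la0 = nz + 1 + S * nz                       # index of l_{a=0}
    c = np.zeros(n)
    c[nz] = -theta                              # maximize theta*l0 ...
    c[la0:] = -(1.0 - theta)                    # ... + (1-theta)*sum_a l_a
    rows, rhs = [], []
    for b in range(Sb):                         # l0 <= sum_k z0^k . M^k[:, b]
        r = np.zeros(n); r[nz] = 1.0
        for k in range(K):
            r[k * S:(k + 1) * S] = -Ms[k][:, b]
        rows.append(r); rhs.append(0.0)
    for a in range(S):                          # l_a <= sum_k z^k(a) . M^k[:, b]
        for b in range(Sb):
            r = np.zeros(n); r[la0 + a] = 1.0
            for k in range(K):
                s = nz + 1 + a * nz + k * S
                r[s:s + S] = -Ms[k][:, b]
            rows.append(r); rhs.append(0.0)
    eqs, erhs = [], []
    for k in range(K):                          # 1^T z0^k = p^k
        r = np.zeros(n); r[k * S:(k + 1) * S] = 1.0
        eqs.append(r); erhs.append(p[k])
    for a in range(S):                          # 1^T z^k(a) = z0^k_a
        for k in range(K):
            r = np.zeros(n)
            s = nz + 1 + a * nz + k * S
            r[s:s + S] = 1.0
            r[k * S + a] = -1.0
            eqs.append(r); erhs.append(0.0)
    bounds = ([(0, None)] * nz + [(None, None)]
              + [(0, None)] * (S * nz) + [(None, None)] * S)
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
                  A_eq=np.array(eqs), b_eq=np.array(erhs), bounds=bounds)
    return -res.fun

M1 = [[1, 0], [0, 0]]; M2 = [[0, 0], [0, 1]]
print(v2([0.5, 0.5], [M1, M2]))     # 0.375 = 3/8, matching the recursion above
```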

LP computations

Claim:
- Recursive structure: If v_t(p) has RHS LP-dependence on p, then v_{t+1}(p) has RHS LP-dependence on p.
- Growth in LP size: size[v_t] ≈ S · size[v_{t-1}].

Polynomial dependence on the size of the game (but not its length):
  (S + S² + ... + S^T) · K   vs   Π_{t=0}^{T} S^{K·S^t}  (face value)

(cf. the sequence form of Koller, von Stengel, & Megiddo, 1996)

The recursive computation is also applicable to time-varying repeated games (i.e., changing M-matrices).

Illustration: Example I

  M^1:  L  R      M^2:  L  R
    T   1  0        T   0  0
    B   0  0        B   0  1

[Figure: computed values v_t(p) for Example I.]

Illustration: Network interdiction

System: One high capacity resource and several low capacity ones (the attacker does not know which is which).
Attacker: Observes usage (binary) during Phase I; disables a selected resource for Phase II.
Tension: Initial vs. future usage.

Note: Time-varying M-matrices.

Network interdiction, cont.

There is one high capacity channel, and the attacker can block one channel from stage 4 to 11. Initial phase: 3 stages; remaining phase: R stages.

[Figure: value of the game vs. total number of channels; probability of using the high capacity resource by stage (r \ t).]

Concluding remarks

Recap:
- Examples
- Basic results
- Computational approach

Extensions:
- Discounted problems:
    γ_λ(σ, τ) = E[ (1-λ) Σ_{t=0}^{∞} λ^t M^k(a_t, b_t) ]
- Markov chains with an informed controller: k_{t+1} ∼ φ(k_t, a_t); receding horizon implementation.

Lingering issue: Computational policies for the uninformed player.
