Asymmetric Information Security Games
Jeff S. Shamma, with Lichun Li & Malachi Jones
IPAM Graduate Summer School: Games and Contracts for Cyber-Physical Security, 7-23 July 2015
Asymmetric information games
Motivation: One player has superior information
- Attacker knows own skill set
- Defender knows resource characteristics
Tension: Exploiting information also reveals information
Solution: Randomized deception; deliberately do not utilize best resources
Illustration: Network interdiction
System: one high-capacity resource and several low-capacity resources (unknown to attacker)
Attacker: observes usage (binary) during Phase I; disables a selected resource for Phase II
Tension: initial vs. future usage
Framework: Repeated games
Players:
- Maximizer Row with actions a ∈ A = {1, 2, ..., |A|}
- Minimizer Col with actions b ∈ B = {1, 2, ..., |B|}
Stages: t = 0, 1, 2, ...
History: h_t = {(a_0, b_0), (a_1, b_1), ..., (a_{t-1}, b_{t-1})} ∈ H_t; set of finite histories: H
Repeated games, cont.
Stage payoff: matrix M = [m_ab]
- m_ab = payoff to Row under action pair (a, b)
- m_ab = penalty to Col under action pair (a, b)
State: k ∈ K = {1, 2, ..., |K|}, the index of the stage game {M^1, M^2, ..., M^K}
Asymmetric information:
- At t = 0, nature selects M ∈ {M^1, M^2, ..., M^K} with prior probabilities {p^1, p^2, ..., p^K}
- Row is informed of the selected game
Repeated games, cont.
Behavioral strategies:
- σ : H × K → Δ(A) (Row)
- τ : H → Δ(B) (Col)
Note: players do not measure payoffs!
T-stage game, Γ_T(p) (Γ_∞(p) later ...):
γ_T(σ, τ) = E[ (1/T) Σ_{t=0}^{T-1} M^k(a_t, b_t) ]
Extensive literature
Foundations: Aumann & Maschler (1967), Repeated games with incomplete information: A survey of recent results, Report to US Arms Control and Disarmament Agency.
Surveys: Zamir (1992), Repeated games of incomplete information: Zero-sum, Handbook of Game Theory, v. I; Laraki & Sorin (2014), Advances in zero-sum dynamic games, Handbook of Game Theory, v. IV.
Monographs: Aumann & Maschler (1967/1995), Repeated Games with Incomplete Information; Mertens, Sorin & Zamir (1994/2015), Repeated Games; Sorin (2002), A First Course on Zero-Sum Repeated Games.
Variations: evolving state, signal monitoring, two-sided incomplete information, ...
Pre-example: Dominant strategies
Case I:        Case II:
    L  R           L  R
T   4  3       T   4  2
B   2  1       B   1  3
Case I: T is a dominant strategy, i.e., an oblivious best response
Case II: no dominant strategy, i.e., a contingent best response
Example I: Non-revelation
M^1:           M^2:
    L  R           L  R
T   1  0       T   0  0
B   0  0       B   0  1
Should Row use its (weakly) dominant strategy (T if k = 1, B if k = 2)?
- Dominant strategy payoff stream: 1-or-0, 0, 0, ... (the first action reveals k, after which Col holds Row to 0)
- Compare to the perpetual (non-revealing) strategy: Col faces the averaged game
      ( 1/2   0  )
      (  0   1/2 )
  with expected payoff 1/4 at every stage
Conclusion: Row is better off ignoring its information
Example II: Full revelation
M^1:           M^2:
    L  R           L  R
T  -1  0       T   0  0
B   0  0       B   0 -1
Dominant strategy (fully revealing) payoff: 0
Non-revealing game:
      ( -1/2    0  )
      (   0   -1/2 )
has expected payoff -1/4
Conclusion: Row is better off fully revealing
Example III: Partial revelation
M^1:               M^2:
    L  C  R            L  C  R
T   4  0  2        T   0  4 -2
B   4  0 -2        B   0  4  2
Dominant strategy (long-run) payoff: 0
Non-revealing game:
      ( 2  2  0 )
      ( 2  2  0 )
has expected payoff 0
Can Row do better?
Example III: Partial revelation, cont. (M^1, M^2 as above)
State-dependent lottery over outcomes (h, t):
- k = 1: probabilities (3/4, 1/4)
- k = 2: probabilities (1/4, 3/4)
Outcome-dependent strategy:
- h: play T forever
- t: play B forever
Example III: Partial revelation, cont.
After stage 0, Col knows the outcome of the lottery:
- Pr[k = 1 | h] = 3/4, leading to average game
      ( 3  1  1 )
      ( 3  1 -1 )
  in which Row plays T
- Pr[k = 1 | t] = 1/4, leading to average game
      ( 1  3 -1 )
      ( 1  3  1 )
  in which Row plays B
Long-run payoff: 1
Outline
- Selected basic results
- Computational approaches
Review: Finite zero-sum games
Payoff matrix: M = [m_ab]
Mixed strategies: x, y with x(a) = Pr[a] and y(b) = Pr[b]
- x^T M y = expected payoff to Row under strategies (x, y)
- x^T M y = expected penalty to Col under strategies (x, y)
Security levels and strategies:
- Lower value: max_x min_y x^T M y = max_{x,l} { l : x^T M ≥ l 1^T, x ∈ Δ(A) }   (an LP)
- Upper value: min_y max_x x^T M y = min_{y,l} { l : M y ≤ l 1, y ∈ Δ(B) }   (an LP)
Value: val[M] = lower value = upper value
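The security-level LP above can be solved directly; a minimal sketch in Python, assuming SciPy is available (the helper `game_value` is illustrative, not from the slides):

```python
import numpy as np
from scipy.optimize import linprog

def game_value(M):
    """Value and a Row security strategy of the zero-sum matrix game M
    (Row maximizes), via: max l  s.t.  x^T M >= l 1^T,  x in Delta(A)."""
    M = np.asarray(M, dtype=float)
    nA, nB = M.shape
    c = np.zeros(nA + 1)
    c[-1] = -1.0                                  # linprog minimizes, so minimize -l
    A_ub = np.hstack([-M.T, np.ones((nB, 1))])    # l - (x^T M)_b <= 0 for each b
    A_eq = np.zeros((1, nA + 1))
    A_eq[0, :nA] = 1.0                            # sum_a x(a) = 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(nB), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * nA + [(None, None)])
    return res.x[-1], res.x[:nA]

# Example I's averaged game D(1/2): value 1/4, as claimed on the earlier slide.
v, x = game_value([[0.5, 0.0], [0.0, 0.5]])
```

By LP duality, applying the same routine to -M^T recovers the upper value, confirming val[M] for small examples.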
Analysis: Finite horizon
Pure strategies:
- σ : H × K → A (Row)
- τ : H → B (Col)
Strategic form conversion: enumerate all pure strategies; define M(p) as the associated (large) matrix game for p ∈ Δ(K)
Consequences:
- The value v_t(p) = val[M(p)] exists, along with associated (mixed) security strategies
- Equivalence to behavioral strategies (Kuhn's theorem)
Non-revealing game & u(p)
Average game: D(p) = Σ_k p^k M^k, and u(p) = val[D(p)]
Claim: v_t(p) ≥ u(p)
Proof: Row plays the security strategy for D(p) at every stage, ignoring its information
Computing u(p)
u(p) = max_x min_y x^T (Σ_k p^k M^k) y = max_{x,l} { l : x^T (Σ_k p^k M^k) ≥ l 1^T, x ∈ Δ(A) }
Cav u(p)
Claim: Suppose p = Σ_{l=1}^L λ_l p_l. Then there exists a Row strategy such that
v_t(p) ≥ Σ_{l=1}^L λ_l u(p_l)
Implication: By optimally selecting mixtures,
v_T(p) ≥ Cav u(p)
where Cav u is the pointwise smallest concave function satisfying Cav u(p) ≥ u(p)
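For K = 2 the concavification can be computed numerically on a belief grid; a sketch (the helper `concavify` is mine, and the two test functions use the closed forms u(p) = p(1-p) for Example I and u(p) = -p(1-p) for Example II):

```python
import numpy as np

def concavify(grid, u):
    """Pointwise-smallest concave function above u on a 1-D belief grid
    (K = 2, beliefs parameterized by p = Pr[k = 1])."""
    n = len(grid)
    cav = np.array(u, dtype=float)
    for m in range(n):
        for i in range(m + 1):            # chords through (grid[i], u[i]), (grid[j], u[j])
            for j in range(m, n):
                if grid[j] == grid[i]:
                    continue
                lam = (grid[j] - grid[m]) / (grid[j] - grid[i])
                cav[m] = max(cav[m], lam * u[i] + (1.0 - lam) * u[j])
    return cav

p = np.linspace(0.0, 1.0, 41)
cav1 = concavify(p, p * (1 - p))      # Example I: u already concave, so Cav u = u
cav2 = concavify(p, -p * (1 - p))     # Example II: Cav u = 0 (full revelation optimal)
```

The two outputs mirror the earlier examples: non-revelation is optimal exactly where Cav u = u, and full revelation where Cav u is a chord above u.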
Cav u(p), cont.
[figure: u(p) and its concavification Cav u(p)]
Belief splitting
For p, p_1, ..., p_L, suppose
p = Σ_{l=1}^L λ_l p_l
Define the joint distribution over {1, 2, ..., L} × {1, 2, ..., |K|}:
Q(l, k) = λ_l p_l^k
Properties:
- Pr[k] = p^k
- Pr[l] = λ_l
- Pr[k | l] = p_l^k
- Pr[l | k] = λ_l p_l^k / p^k
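The splitting identities are easy to verify numerically; a small sketch using the lottery from Example III (λ = (1/2, 1/2), posteriors p_1 = (3/4, 1/4), p_2 = (1/4, 3/4); the variable names are mine):

```python
import numpy as np

lam = np.array([0.5, 0.5])            # lottery marginal over signals l
P = np.array([[0.75, 0.25],           # p_1: posterior over k given l = 1
              [0.25, 0.75]])          # p_2: posterior over k given l = 2
Q = lam[:, None] * P                  # joint: Q(l, k) = lambda_l * p_l^k
prior = Q.sum(axis=0)                 # marginal over k recovers the prior p
l_given_k = Q / prior[None, :]        # Pr[l | k] = lambda_l * p_l^k / p^k
```

Summing Q over l recovers the prior (1/2, 1/2), and the columns of `l_given_k` give the k-dependent lottery Row actually implements.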
Proof of claim
Starting point: p = Σ_{l=1}^L λ_l p_l
Row strategy:
1. Let x_l be a Row optimal strategy for u(p_l)
2. Select l ~ Pr[l | k] = λ_l p_l^k / p^k (a k-dependent lottery)
3. Play the selected x_l forever
Assess Col's response as if Col observed l:
min_y Σ_l Σ_k p^k Pr[l | k] (x_l)^T M^k y
  ≥ min_{y_l, l=1,...,L} Σ_l Σ_k p^k Pr[l | k] (x_l)^T M^k y_l
  = Σ_l λ_l min_{y_l} Σ_k p_l^k (x_l)^T M^k y_l = Σ_l λ_l u(p_l)
Belief updates for Col
Belief splitting: the posterior belief is p_l with probability λ_l
General setup:
- Prior belief p over K
- k-dependent strategy for Row, X = (x^1, ..., x^K) ∈ Δ(A)^K, with x_a^k = Pr[a | k]
Computations:
- Probability Row plays a: π(a; X, p) = Σ_k Pr[a | k] Pr[k] = Σ_k x_a^k p^k
- Posterior belief after Row plays a: B^k(a; X, p) = Pr[a | k] Pr[k] / Pr[a] = x_a^k p^k / π(a; X, p)
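The two computations translate directly to code; a sketch (the function names are mine, not from the slides):

```python
import numpy as np

def action_prob(X, p):
    """pi(a; X, p): marginal probability that Row plays a, where
    X[k, a] = Pr[a | k] and p is the prior over states."""
    return p @ X

def posterior(a, X, p):
    """B(a; X, p): Col's posterior over k after observing Row play a."""
    return X[:, a] * p / action_prob(X, p)[a]

# Example III's stage-0 behavior: type 1 plays T w.p. 3/4, type 2 w.p. 1/4.
X = np.array([[0.75, 0.25],
              [0.25, 0.75]])
post_T = posterior(0, X, np.array([0.5, 0.5]))   # belief after observing T
```

With the Example III lottery, observing T splits the uniform prior to (3/4, 1/4), exactly the posterior used on the earlier slide.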
Naive strategy for Col
Naive defense:
- Define p_t^k = Pr[k | h_t] (requires Row's strategy σ)
- Set τ_t(h_t) to be the security strategy for D(p_t)
Claim: E[γ_t(σ, τ)] ≤ Cav u(p) + (M_max/√t) Σ_k √(p^k (1 - p^k)), where M_max bounds the stage payoffs
Implication: Cav u(p) ≤ v_t(p) ≤ Cav u(p) + (M_max/√t) Σ_k √(p^k (1 - p^k))
Uninformed player: Defend vs. guarantee
Defend: react to σ such that γ_t(σ, BR(σ)) ≤ l for all σ (e.g., a belief-based reaction)
Guarantee: find a single τ such that γ_t(σ, τ) ≤ l for all σ
Example (matching pennies):
    L  R
T   1 -1
B  -1  1
Defend: BR(T) = R & BR(B) = L & BR(50-50) = L or R   (l = 0)
Guarantee: play the mixed strategy (1/2, 1/2) at every stage   (l = 0)
Approachability strategy for Col
Hypothetical payoff vector: given its observations, Col can compute
g_t(a_t, b_t) = ( M^1(a_t, b_t), M^2(a_t, b_t), ..., M^K(a_t, b_t) )
as well as its running average
ḡ_{t+1} = ḡ_t + (1/(t+1)) (g_t(a_t, b_t) - ḡ_t)
Challenge: steer ḡ_t so that lim sup_t p^T ḡ_t ≤ Cav u(p)
Blackwell approachability
ḡ_{t+1} = ḡ_t + (1/(t+1)) (g_t(a_t, b_t) - ḡ_t)
Approachability: a closed convex set C is approachable, i.e., Col has a strategy such that
Pr[dist(ḡ_t, C) → 0] = 1,
if and only if for every half-space H containing C there exists a y such that
Co{ Σ_b y(b) g(a, b) : a ∈ A } ⊆ H
Approachability strategy for Col, cont.
Construction:
1. Find a supporting hyperplane v ∈ R^K such that v^T p = Cav u(p) and u(q) ≤ v^T q for all q
2. Define C = { x ∈ R^K : x ≤ v }
3. At stage t, if ḡ_t ∉ C, define q = (1/Z) (ḡ_t - Π(ḡ_t, C)) (Π denotes projection, Z normalizes q to a probability vector) and play the optimal strategy for D(q)
Recap
- Row belief splitting: Cav u(p) ≤ v_t(p)
- Col naive defense: v_t(p) ≤ Cav u(p) + (M_max/√t) Σ_k √(p^k (1 - p^k))
- Col approachability: lim sup_t E[γ_t(σ, τ)] ≤ Cav u(p)
Implication: the infinite-horizon game Γ_∞(p) has value Cav u(p)
Computations: Face value
Suppose each M^k is S × S
Assume an oblivious Row who ignores Col's actions (this is optimal)
Number of strategies:
- Stage 0: state → action: S^K
- Stage 1: (state, action) → action: S^{K·S}
- ...
- Stage T: (state, action, ..., action) → action: S^{K·S^T}
Total: Π_{t=0}^T S^{K·S^t}
Conversion to strategic form is computationally prohibitive
Recursive structure
v_{t+1}(p) = (1/(t+1)) max_X min_y [ Σ_k p^k (x^k)^T M^k y + t Σ_a π(a; X, p) v_t(B(a; X, p)) ]
- Row is oblivious to Col
- Col plays a myopic defense based on current beliefs
Recursive structure intuition
What is Col's best response to
z_{t+1} = Z_t(z_t, a_t),  a_t ~ X_t(z_t)?
Value iteration: Col plays a myopic defense
V_T(z_T, p_T) = min_y E[M^k(a_T, b_T)] = min_y Σ_k p_T^k X^k(z_T)^T M^k y
V_t(z_t, p_t) = min_y Σ_k p_t^k X^k(z_t)^T M^k y + E[V_{t+1}(Z_t(z_t, a_t), B(a_t; X(z_t), p_t))]
Note: reversed time indexing and neglected normalization
Recursive structure intuition, cont.
V_t(z_t, p_t) = min_y Σ_k p_t^k X^k(z_t)^T M^k y + E[V_{t+1}(Z_t(z_t, a_t), B(a_t; X(z_t), p_t))]
Row's task as a maximizer:
v_T(p_T) = max_X min_y Σ_k p_T^k (x^k)^T M^k y
v_t(p_t) = max_X min_y Σ_k p_t^k (x^k)^T M^k y + E[v_{t+1}(B(a_t; X, p_t))]
Equivalent problem:
- State space: p_{t+1} = B(p_t, a_t)
- Action space: Δ(A)^K
- Stage reward: min_y Σ_k p^k (x^k)^T M^k y
LP for v_1(p)
S·K + 1 variables & S constraints:
v_1(p) = max_X min_y (Σ_k p^k (x^k)^T M^k) y = max_{X,l} { l : Σ_k p^k (x^k)^T M^k ≥ l 1^T, x^k ∈ Δ(A) ∀k }
Extended v_1(·):
- Redefine v_1(q) over positive q ∈ R_+^K
- Positive homogeneity: c·v_1(q) = v_1(c·q)
LP for v_2(p)?
v_2(p) = max_X min_y θ Σ_k p^k (x^k)^T M^k y + (1-θ) Σ_a π(a; X, p) v_1(B(a; X, p))
       = max_X min_y θ Σ_k p^k (x^k)^T M^k y + (1-θ) Σ_a v_1(x_a^1 p^1, ..., x_a^K p^K)   (using positive homogeneity)
v_2(p) = max_{X, l_0, ..., l_A} θ l_0 + (1-θ) Σ_a l_a
  s.t.  Σ_k p^k (x^k)^T M^k ≥ l_0 1^T,  x^k ∈ Δ(A) ∀k
        v_1(x_a^1 p^1, ..., x_a^K p^K) ≥ l_a  ∀a
Problem: writing v_1(·) in its x-form places the outer variables inside the inner LP, producing product terms
LHS vs RHS
LHS-nested LP (outer variables multiply the inner constraint matrix):
max c^T v  s.t.  Av ≤ b,  v_i F_i w ≤ f_i
LP structure lost!
RHS-nested LP (outer variables enter only the inner right-hand side):
max c^T v  s.t.  Av ≤ b,  F w ≤ Σ_i v_i f_i
LP structure preserved!
LP for v_1(p) revisited
v_1(p) = max_{X,l} { l : Σ_k p^k x^k M^k ≥ l 1^T, x^k ∈ Δ(A) ∀k }
       = max_{Z,l} { l : Σ_k z^k M^k ≥ l 1^T, 1^T z^k = p^k, z^k ≥ 0 ∀k }
Change of variables: z^k = p^k x^k ≥ 0
The probabilities now enter on the RHS
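The z-form LP is straightforward to assemble; a sketch using SciPy (the helper `v1` is illustrative), checked on Example I, for which a direct calculation gives v_1(p) = min(p^1, p^2):

```python
import numpy as np
from scipy.optimize import linprog

def v1(p, M):
    """One-stage value v_1(p) in the z-form LP:
    max l  s.t.  sum_k z^k M^k >= l 1^T,  1^T z^k = p^k,  z^k >= 0,
    with the change of variables z^k = p^k x^k."""
    K, S, nB = len(M), M[0].shape[0], M[0].shape[1]
    n = K * S + 1                                   # variables: all z^k, then l
    c = np.zeros(n)
    c[-1] = -1.0                                    # maximize l
    A_ub = np.zeros((nB, n))
    for b in range(nB):
        for k in range(K):
            A_ub[b, k * S:(k + 1) * S] = -M[k][:, b]
        A_ub[b, -1] = 1.0                           # l <= sum_k (z^k M^k)_b
    A_eq = np.zeros((K, n))
    for k in range(K):
        A_eq[k, k * S:(k + 1) * S] = 1.0            # 1^T z^k = p^k
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(nB), A_eq=A_eq, b_eq=np.asarray(p),
                  bounds=[(0, None)] * (K * S) + [(None, None)])
    return -res.fun

M1 = np.array([[1.0, 0.0], [0.0, 0.0]])
M2 = np.array([[0.0, 0.0], [0.0, 1.0]])
val_half = v1([0.5, 0.5], [M1, M2])    # Example I: v_1(1/2) = 1/2
```

Because the constraints are linear in (p^1, ..., p^K), the same routine evaluates the extended, positively homogeneous v_1(q) for any nonnegative q.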
LP for v_2(p) revisited
v_2(p) = max_{X, l_0, ..., l_A} θ l_0 + (1-θ) Σ_a l_a
  s.t.  Σ_k p^k x^k M^k ≥ l_0 1^T,  x^k ∈ Δ(A) ∀k
        v_1(x_a^1 p^1, ..., x_a^K p^K) ≥ l_a  ∀a
Each constraint on v_1(·) is itself an LP in new variables z^k(a):
Σ_k z^k(a) M^k ≥ l_a 1^T,  1^T z^k(a) = x_a^k p^k = z_a^k
so each such block contributes S·K variables and S constraints
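Flattening the nested LP for T = 2 (θ = 1/2) on Example I gives a single LP in the variables (z^k, l_0, z^k(a), l_a); a sketch assuming SciPy (the index-layout helpers are mine). For Example I a direct calculation gives v_2(1/2) = 3/8, between v_1(1/2) = 1/2 and Cav u(1/2) = 1/4.

```python
import numpy as np
from scipy.optimize import linprog

# Example I stage games.
M = [np.array([[1.0, 0.0], [0.0, 0.0]]),
     np.array([[0.0, 0.0], [0.0, 1.0]])]
K, S = len(M), M[0].shape[0]
p, theta = np.array([0.5, 0.5]), 0.5

# Variable layout: z^1..z^K (K*S), l_0, then per action a: z^1(a)..z^K(a), l_a.
blk = K * S + 1
n = blk * (1 + S)
z0 = lambda k: slice(k * S, (k + 1) * S)                    # z^k
za = lambda a, k: slice(blk * (1 + a) + k * S,
                        blk * (1 + a) + (k + 1) * S)        # z^k(a)
l0, la = K * S, lambda a: blk * (1 + a) + K * S

c = np.zeros(n)
c[l0] = -theta                                              # max theta*l_0 ...
for a in range(S):
    c[la(a)] = -(1 - theta)                                 # ... + (1-theta)*sum_a l_a

rows_ub, rows_eq, b_eq = [], [], []
for b in range(S):                                          # sum_k z^k M^k >= l_0 1^T
    r = np.zeros(n)
    for k in range(K):
        r[z0(k)] = -M[k][:, b]
    r[l0] = 1.0
    rows_ub.append(r)
for a in range(S):                                          # sum_k z^k(a) M^k >= l_a 1^T
    for b in range(S):
        r = np.zeros(n)
        for k in range(K):
            r[za(a, k)] = -M[k][:, b]
        r[la(a)] = 1.0
        rows_ub.append(r)
for k in range(K):                                          # 1^T z^k = p^k
    r = np.zeros(n)
    r[z0(k)] = 1.0
    rows_eq.append(r); b_eq.append(p[k])
for a in range(S):                                          # 1^T z^k(a) = z^k_a
    for k in range(K):
        r = np.zeros(n)
        r[za(a, k)] = 1.0
        r[k * S + a] = -1.0
        rows_eq.append(r); b_eq.append(0.0)

bounds = [(0.0, None)] * n
bounds[l0] = (None, None)                                   # the l's are free
for a in range(S):
    bounds[la(a)] = (None, None)

res = linprog(c, A_ub=np.array(rows_ub), b_ub=np.zeros(len(rows_ub)),
              A_eq=np.array(rows_eq), b_eq=np.array(b_eq), bounds=bounds)
v2 = -res.fun
```

The optimizer puts mass on the non-revealing split at stage 0 and the revealing split at stage 1, reproducing the trade-off from Example I.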
LP computations
Claim:
- Recursive structure: if v_t(p) has RHS LP-dependence on p, then v_{t+1}(p) has RHS LP-dependence on p
- Growth in LP size: size[v_t] ≈ S · size[v_{t-1}]
Polynomial dependence on the size of the game (but not its length):
(S + S^2 + ... + S^T)·K  vs  Π_{t=0}^T S^{K·S^t}  (face value)
(cf. the sequence form of Koller, von Stengel, & Megiddo, 1996)
The recursive computation is also applicable to time-varying repeated games (i.e., changing M-matrices)
Illustration: Example I
M^1:           M^2:
    L  R           L  R
T   1  0       T   0  0
B   0  0       B   0  1
Illustration: Network interdiction
System: one high-capacity resource and several low-capacity resources (unknown to attacker)
Attacker: observes usage (binary) during Phase I; disables a selected resource for Phase II
Tension: initial vs. future usage
Note: time-varying M-matrices
Network interdiction, cont.
Setup: one high-capacity channel; the attacker can block one channel from stage 4 to 11
Initial phase: 3 stages; remaining phase: R stages
[figures: value of the game vs. total number of channels; probability of using the high-capacity resource, r vs. t]
Concluding remarks
Recap: examples; basic results; computational approach
Extensions:
- Discounted problems: γ_λ(σ, τ) = E[ (1-λ) Σ_{t=0}^∞ λ^t M^k(a_t, b_t) ]
- Markov chains with an informed controller: k_{t+1} ~ φ(k_t, a_t), with a receding horizon implementation
Lingering issue: computational policies for the uninformed player
More informationBayesian Congestion Control over a Markovian Network Bandwidth Process
Bayesian Congestion Control over a Markovian Network Bandwidth Process Parisa Mansourifard 1/30 Bayesian Congestion Control over a Markovian Network Bandwidth Process Parisa Mansourifard (USC) Joint work
More informationBargaining, Contracts, and Theories of the Firm. Dr. Margaret Meyer Nuffield College
Bargaining, Contracts, and Theories of the Firm Dr. Margaret Meyer Nuffield College 2015 Course Overview 1. Bargaining 2. Hidden information and self-selection Optimal contracting with hidden information
More informationMean-field equilibrium: An approximation approach for large dynamic games
Mean-field equilibrium: An approximation approach for large dynamic games Ramesh Johari Stanford University Sachin Adlakha, Gabriel Y. Weintraub, Andrea Goldsmith Single agent dynamic control Two agents:
More informationMS&E 246: Lecture 12 Static games of incomplete information. Ramesh Johari
MS&E 246: Lecture 12 Static games of incomplete information Ramesh Johari Incomplete information Complete information means the entire structure of the game is common knowledge Incomplete information means
More informationA Rothschild-Stiglitz approach to Bayesian persuasion
A Rothschild-Stiglitz approach to Bayesian persuasion Matthew Gentzkow and Emir Kamenica Stanford University and University of Chicago September 2015 Abstract Rothschild and Stiglitz (1970) introduce a
More informationCS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 12: Approximate Mechanism Design in Multi-Parameter Bayesian Settings
CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 12: Approximate Mechanism Design in Multi-Parameter Bayesian Settings Instructor: Shaddin Dughmi Administrivia HW1 graded, solutions on website
More informationLearning, Games, and Networks
Learning, Games, and Networks Abhishek Sinha Laboratory for Information and Decision Systems MIT ML Talk Series @CNRG December 12, 2016 1 / 44 Outline 1 Prediction With Experts Advice 2 Application to
More informationReinforcement Learning Active Learning
Reinforcement Learning Active Learning Alan Fern * Based in part on slides by Daniel Weld 1 Active Reinforcement Learning So far, we ve assumed agent has a policy We just learned how good it is Now, suppose
More informationSection Notes 9. Midterm 2 Review. Applied Math / Engineering Sciences 121. Week of December 3, 2018
Section Notes 9 Midterm 2 Review Applied Math / Engineering Sciences 121 Week of December 3, 2018 The following list of topics is an overview of the material that was covered in the lectures and sections
More informationA Review of the E 3 Algorithm: Near-Optimal Reinforcement Learning in Polynomial Time
A Review of the E 3 Algorithm: Near-Optimal Reinforcement Learning in Polynomial Time April 16, 2016 Abstract In this exposition we study the E 3 algorithm proposed by Kearns and Singh for reinforcement
More informationLecture 1. Evolution of Market Concentration
Lecture 1 Evolution of Market Concentration Take a look at : Doraszelski and Pakes, A Framework for Applied Dynamic Analysis in IO, Handbook of I.O. Chapter. (see link at syllabus). Matt Shum s notes are
More informationToday s Outline. Recap: MDPs. Bellman Equations. Q-Value Iteration. Bellman Backup 5/7/2012. CSE 473: Artificial Intelligence Reinforcement Learning
CSE 473: Artificial Intelligence Reinforcement Learning Dan Weld Today s Outline Reinforcement Learning Q-value iteration Q-learning Exploration / exploitation Linear function approximation Many slides
More informationMIT Spring 2016
MIT 18.655 Dr. Kempthorne Spring 2016 1 MIT 18.655 Outline 1 2 MIT 18.655 3 Decision Problem: Basic Components P = {P θ : θ Θ} : parametric model. Θ = {θ}: Parameter space. A{a} : Action space. L(θ, a)
More informationNotes on Iterated Expectations Stephen Morris February 2002
Notes on Iterated Expectations Stephen Morris February 2002 1. Introduction Consider the following sequence of numbers. Individual 1's expectation of random variable X; individual 2's expectation of individual
More informationThe value of Markov Chain Games with incomplete information on both sides.
The value of Markov Chain Games with incomplete information on both sides. Fabien Gensbittel, Jérôme Renault To cite this version: Fabien Gensbittel, Jérôme Renault. The value of Markov Chain Games with
More informationMarkov Decision Processes Infinite Horizon Problems
Markov Decision Processes Infinite Horizon Problems Alan Fern * * Based in part on slides by Craig Boutilier and Daniel Weld 1 What is a solution to an MDP? MDP Planning Problem: Input: an MDP (S,A,R,T)
More informationSome notes on Markov Decision Theory
Some notes on Markov Decision Theory Nikolaos Laoutaris laoutaris@di.uoa.gr January, 2004 1 Markov Decision Theory[1, 2, 3, 4] provides a methodology for the analysis of probabilistic sequential decision
More informationSequential Decision Problems
Sequential Decision Problems Michael A. Goodrich November 10, 2006 If I make changes to these notes after they are posted and if these changes are important (beyond cosmetic), the changes will highlighted
More informationNew Approaches and Recent Advances in Two-Person Zero-Sum Repeated Games
New Approaches and Recent Advances in Two-Person Zero-Sum Repeated Games Sylvain Sorin Laboratoire d Econométrie Ecole Polytechnique 1 rue Descartes 75005 Paris, France and Equipe Combinatoire et Optimisation
More informationChapter 9. Mixed Extensions. 9.1 Mixed strategies
Chapter 9 Mixed Extensions We now study a special case of infinite strategic games that are obtained in a canonic way from the finite games, by allowing mixed strategies. Below [0, 1] stands for the real
More informationGame Theory and Control
Annu. Rev. Control Robot. Auton. Syst. 2018. 1:2.1 2.30 The Annual Review of Control, Robotics, and Autonomous Systems is online at control.annualreviews.org https://doi.org/10.1146/annurev-control-060117-105102
More informationCSC321 Lecture 22: Q-Learning
CSC321 Lecture 22: Q-Learning Roger Grosse Roger Grosse CSC321 Lecture 22: Q-Learning 1 / 21 Overview Second of 3 lectures on reinforcement learning Last time: policy gradient (e.g. REINFORCE) Optimize
More informationEconomics 2010c: Lecture 2 Iterative Methods in Dynamic Programming
Economics 2010c: Lecture 2 Iterative Methods in Dynamic Programming David Laibson 9/04/2014 Outline: 1. Functional operators 2. Iterative solutions for the Bellman Equation 3. Contraction Mapping Theorem
More informationU Logo Use Guidelines
Information Theory Lecture 3: Applications to Machine Learning U Logo Use Guidelines Mark Reid logo is a contemporary n of our heritage. presents our name, d and our motto: arn the nature of things. authenticity
More informationArtificial Intelligence
Artificial Intelligence Dynamic Programming Marc Toussaint University of Stuttgart Winter 2018/19 Motivation: So far we focussed on tree search-like solvers for decision problems. There is a second important
More informationStrategic resource allocation
Strategic resource allocation Patrick Loiseau, EURECOM (Sophia-Antipolis) Graduate Summer School: Games and Contracts for Cyber-Physical Security IPAM, UCLA, July 2015 North American Aerospace Defense
More informationOn the Total Variation Distance of Labelled Markov Chains
On the Total Variation Distance of Labelled Markov Chains Taolue Chen Stefan Kiefer Middlesex University London, UK University of Oxford, UK CSL-LICS, Vienna 4 July 04 Labelled Markov Chains (LMCs) a c
More informationDecomposition Methods for Large Scale LP Decoding
Decomposition Methods for Large Scale LP Decoding Siddharth Barman Joint work with Xishuo Liu, Stark Draper, and Ben Recht Outline Background and Problem Setup LP Decoding Formulation Optimization Framework
More informationCoevolutionary Modeling in Networks 1/39
Coevolutionary Modeling in Networks Jeff S. Shamma joint work with Ibrahim Al-Shyoukh & Georgios Chasparis & IMA Workshop on Analysis and Control of Network Dynamics October 19 23, 2015 Jeff S. Shamma
More informationEquilibrium Refinements
Equilibrium Refinements Mihai Manea MIT Sequential Equilibrium In many games information is imperfect and the only subgame is the original game... subgame perfect equilibrium = Nash equilibrium Play starting
More informationLecture Slides - Part 1
Lecture Slides - Part 1 Bengt Holmstrom MIT February 2, 2016. Bengt Holmstrom (MIT) Lecture Slides - Part 1 February 2, 2016. 1 / 36 Going to raise the level a little because 14.281 is now taught by Juuso
More informationFictitious Self-Play in Extensive-Form Games
Johannes Heinrich, Marc Lanctot, David Silver University College London, Google DeepMind July 9, 05 Problem Learn from self-play in games with imperfect information. Games: Multi-agent decision making
More informationErgodicity and Non-Ergodicity in Economics
Abstract An stochastic system is called ergodic if it tends in probability to a limiting form that is independent of the initial conditions. Breakdown of ergodicity gives rise to path dependence. We illustrate
More informationTheory and Internet Protocols
Game Lecture 2: Linear Programming and Zero Sum Nash Equilibrium Xiaotie Deng AIMS Lab Department of Computer Science Shanghai Jiaotong University September 26, 2016 1 2 3 4 Standard Form (P) Outline
More informationMulti-armed bandit models: a tutorial
Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)
More informationReputations. Larry Samuelson. Yale University. February 13, 2013
Reputations Larry Samuelson Yale University February 13, 2013 I. Introduction I.1 An Example: The Chain Store Game Consider the chain-store game: Out In Acquiesce 5, 0 2, 2 F ight 5,0 1, 1 If played once,
More information1 AUTOCRATIC STRATEGIES
AUTOCRATIC STRATEGIES. ORIGINAL DISCOVERY Recall that the transition matrix M for two interacting players X and Y with memory-one strategies p and q, respectively, is given by p R q R p R ( q R ) ( p R
More informationA Theory of Financing Constraints and Firm Dynamics by Clementi and Hopenhayn - Quarterly Journal of Economics (2006)
A Theory of Financing Constraints and Firm Dynamics by Clementi and Hopenhayn - Quarterly Journal of Economics (2006) A Presentation for Corporate Finance 1 Graduate School of Economics December, 2009
More informationLINEAR PROGRAMMING III
LINEAR PROGRAMMING III ellipsoid algorithm combinatorial optimization matrix games open problems Lecture slides by Kevin Wayne Last updated on 7/25/17 11:09 AM LINEAR PROGRAMMING III ellipsoid algorithm
More informationPre-Bayesian Games. Krzysztof R. Apt. CWI, Amsterdam, the Netherlands, University of Amsterdam. (so not Krzystof and definitely not Krystof)
Pre-Bayesian Games Krzysztof R. Apt (so not Krzystof and definitely not Krystof) CWI, Amsterdam, the Netherlands, University of Amsterdam Pre-Bayesian Games p. 1/1 Pre-Bayesian Games (Hyafil, Boutilier
More informationEconomics 209A Theory and Application of Non-Cooperative Games (Fall 2013) Extensive games with perfect information OR6and7,FT3,4and11
Economics 209A Theory and Application of Non-Cooperative Games (Fall 2013) Extensive games with perfect information OR6and7,FT3,4and11 Perfect information A finite extensive game with perfect information
More informationReview of topics since what was covered in the midterm: Topics that we covered before the midterm (also may be included in final):
Review of topics since what was covered in the midterm: Subgame-perfect eqms in extensive games with perfect information where players choose a number (first-order conditions, boundary conditions, favoring
More informationSEQUENTIAL ESTIMATION OF DYNAMIC DISCRETE GAMES. Victor Aguirregabiria (Boston University) and. Pedro Mira (CEMFI) Applied Micro Workshop at Minnesota
SEQUENTIAL ESTIMATION OF DYNAMIC DISCRETE GAMES Victor Aguirregabiria (Boston University) and Pedro Mira (CEMFI) Applied Micro Workshop at Minnesota February 16, 2006 CONTEXT AND MOTIVATION Many interesting
More informationHybrid Machine Learning Algorithms
Hybrid Machine Learning Algorithms Umar Syed Princeton University Includes joint work with: Rob Schapire (Princeton) Nina Mishra, Alex Slivkins (Microsoft) Common Approaches to Machine Learning!! Supervised
More informationZero-sum Stochastic Games
1/53 Jérôme Renault, TSE Université Toulouse Stochastic Methods in Game Theory, Singapore 2015 2/53 Outline Zero-sum stochastic games 1. The basic model: finitely many states and actions 1.1 Description
More informationLecture notes for Analysis of Algorithms : Markov decision processes
Lecture notes for Analysis of Algorithms : Markov decision processes Lecturer: Thomas Dueholm Hansen June 6, 013 Abstract We give an introduction to infinite-horizon Markov decision processes (MDPs) with
More informationGrundlagen der Künstlichen Intelligenz
Grundlagen der Künstlichen Intelligenz Uncertainty & Probabilities & Bandits Daniel Hennes 16.11.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Uncertainty Probability
More informationSmall Sample of Related Literature
UCLA IPAM July 2015 Learning in (infinitely) repeated games with n players. Prediction and stability in one-shot large (many players) games. Prediction and stability in large repeated games (big games).
More information9 - Markov processes and Burt & Allison 1963 AGEC
This document was generated at 8:37 PM on Saturday, March 17, 2018 Copyright 2018 Richard T. Woodward 9 - Markov processes and Burt & Allison 1963 AGEC 642-2018 I. What is a Markov Chain? A Markov chain
More informationBAYES CORRELATED EQUILIBRIUM AND THE COMPARISON OF INFORMATION STRUCTURES IN GAMES. Dirk Bergemann and Stephen Morris
BAYES CORRELATED EQUILIBRIUM AND THE COMPARISON OF INFORMATION STRUCTURES IN GAMES By Dirk Bergemann and Stephen Morris September 203 Revised April 205 COWLES FOUNDATION DISCUSSION PAPER NO. 909RRR COWLES
More informationSome AI Planning Problems
Course Logistics CS533: Intelligent Agents and Decision Making M, W, F: 1:00 1:50 Instructor: Alan Fern (KEC2071) Office hours: by appointment (see me after class or send email) Emailing me: include CS533
More informationDynamic Games with Asymmetric Information: Common Information Based Perfect Bayesian Equilibria and Sequential Decomposition
Dynamic Games with Asymmetric Information: Common Information Based Perfect Bayesian Equilibria and Sequential Decomposition 1 arxiv:1510.07001v1 [cs.gt] 23 Oct 2015 Yi Ouyang, Hamidreza Tavafoghi and
More informationIntertemporal Risk Aversion, Stationarity, and Discounting
Traeger, CES ifo 10 p. 1 Intertemporal Risk Aversion, Stationarity, and Discounting Christian Traeger Department of Agricultural & Resource Economics, UC Berkeley Introduce a more general preference representation
More informationREPEATED GAMES. Jörgen Weibull. April 13, 2010
REPEATED GAMES Jörgen Weibull April 13, 2010 Q1: Can repetition induce cooperation? Peace and war Oligopolistic collusion Cooperation in the tragedy of the commons Q2: Can a game be repeated? Game protocols
More informationElements of Reinforcement Learning
Elements of Reinforcement Learning Policy: way learning algorithm behaves (mapping from state to action) Reward function: Mapping of state action pair to reward or cost Value function: long term reward,
More information6.254 : Game Theory with Engineering Applications Lecture 13: Extensive Form Games
6.254 : Game Theory with Engineering Lecture 13: Extensive Form Games Asu Ozdaglar MIT March 18, 2010 1 Introduction Outline Extensive Form Games with Perfect Information One-stage Deviation Principle
More informationBayesian Contextual Multi-armed Bandits
Bayesian Contextual Multi-armed Bandits Xiaoting Zhao Joint Work with Peter I. Frazier School of Operations Research and Information Engineering Cornell University October 22, 2012 1 / 33 Outline 1 Motivating
More informationThema Working Paper n Université de Cergy Pontoise, France. Hölder Continuous Implementation. Oury Marion
Thema Working Paper n 2010-06 Université de Cergy Pontoise, France Hölder Continuous Implementation Oury Marion November, 2010 Hölder Continuous Implementation Marion Oury November 2010 Abstract Building
More informationBayes Correlated Equilibrium and Comparing Information Structures
Bayes Correlated Equilibrium and Comparing Information Structures Dirk Bergemann and Stephen Morris Spring 2013: 521 B Introduction game theoretic predictions are very sensitive to "information structure"
More information6.254 : Game Theory with Engineering Applications Lecture 7: Supermodular Games
6.254 : Game Theory with Engineering Applications Lecture 7: Asu Ozdaglar MIT February 25, 2010 1 Introduction Outline Uniqueness of a Pure Nash Equilibrium for Continuous Games Reading: Rosen J.B., Existence
More informationNear-Potential Games: Geometry and Dynamics
Near-Potential Games: Geometry and Dynamics Ozan Candogan, Asuman Ozdaglar and Pablo A. Parrilo January 29, 2012 Abstract Potential games are a special class of games for which many adaptive user dynamics
More informationReinforcement Learning
CS7/CS7 Fall 005 Supervised Learning: Training examples: (x,y) Direct feedback y for each input x Sequence of decisions with eventual feedback No teacher that critiques individual actions Learn to act
More informationQ-Learning in Continuous State Action Spaces
Q-Learning in Continuous State Action Spaces Alex Irpan alexirpan@berkeley.edu December 5, 2015 Contents 1 Introduction 1 2 Background 1 3 Q-Learning 2 4 Q-Learning In Continuous Spaces 4 5 Experimental
More information