Lecture 6 Lecturer: Shaddin Dughmi Scribes: Omkar Thakoor & Umang Gupta
CSCI699: Topics in Learning & Game Theory — Lecture 6
Lecturer: Shaddin Dughmi. Scribes: Omkar Thakoor & Umang Gupta

1 No-regret learning in zero-sum games

Definition 1. A 2-player game of complete information is said to be zero-sum if $U_2(a_1, a_2) = -U_1(a_1, a_2)$ for all action profiles $(a_1, a_2)$.

Such a game can be described by an $n \times m$ matrix $A$, where $n, m$ are the numbers of actions of $P_1, P_2$ respectively, and we have

$$A_{ij} = U_1(i, j) \quad (\Rightarrow U_2(i, j) = -A_{ij}) \tag{1}$$

Example 1. We can represent the well-known Rock-Paper-Scissors game with the following matrix (Table 2 shows the general-sum representation of this game):

        R    P    S
  R     0   -1   +1
  P    +1    0   -1
  S    -1   +1    0

Next, let $x \in \Delta_n$, $y \in \Delta_m$ be mixed strategies of $P_1, P_2$. Then $P_1$ plays his action $i$ with probability $x_i$, and $P_2$ plays his action $j$ with probability $y_j$, and thus $P_1$ gets payoff $A_{ij}$ with probability $x_i y_j$. Hence, his expected payoff can be written as

$$U_1(x, y) = \sum_{i=1}^{n} \sum_{j=1}^{m} x_i y_j A_{ij} = x^\top A y \qquad (\Rightarrow U_2(x, y) = -x^\top A y)$$

Recall that, by definition, $(x, y)$ form a Nash equilibrium iff

$$U_1(x, y) \ge U_1(x', y) \;\; \forall x' \quad \& \quad U_2(x, y) \ge U_2(x, y') \;\; \forall y'$$

i.e. $x^\top A y \ge x'^\top A y$ for all $x'$, and $x^\top A y \le x^\top A y'$ for all $y'$. Equivalently, we can reduce this to

$$x^\top A y \ge (Ay)_i \;\; \forall i \in [n] \quad \& \quad x^\top A y \le (x^\top A)_j \;\; \forall j \in [m]$$
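To make the matrix representation concrete, here is a small Python sketch (an illustration, not part of the lecture) that computes $U_1(x, y) = x^\top A y$ for Rock-Paper-Scissors and checks the Nash condition above for the uniform strategies:

```python
# Rock-Paper-Scissors payoff matrix for P1 (rows: P1's action, columns: P2's).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

def expected_payoff(x, y, A):
    """U_1(x, y) = x^T A y: P1's expected payoff under mixed strategies x, y."""
    return sum(x[i] * A[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

x = y = [1/3, 1/3, 1/3]  # uniform mixed strategies
v = expected_payoff(x, y, A)

# Nash condition: x^T A y >= (A y)_i for all i, and x^T A y <= (x^T A)_j for all j.
Ay = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]
xA = [sum(x[i] * A[i][j] for i in range(3)) for j in range(3)]
assert all(v >= a - 1e-12 for a in Ay)
assert all(v <= c + 1e-12 for c in xA)
print(v)  # 0.0 up to floating point: uniform vs. uniform is the equilibrium
```

Since $A$ is antisymmetric, the uniform pair yields value 0 and satisfies both inequality families with equality.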
Definition 2. The maximin value of $A$ is
$$\max_x \min_y x^\top A y = \max_x \min_{j \in [m]} (x^\top A)_j$$
Further, we call any $x^* \in \arg\max_x \min_y x^\top A y$ a maximin strategy of $P_1$.

Similarly, we also define the following:

Definition 3. The minimax value of $A$ is
$$\min_y \max_x x^\top A y = \min_y \max_{i \in [n]} (Ay)_i$$
Further, we call any $y^* \in \arg\min_y \max_x x^\top A y$ a minimax strategy of $P_2$.

An interesting question that arises is: which is better, moving first or second? To answer this, we first show the following result, also known as the weak 2nd mover's advantage.

Theorem 1. $\mathrm{maximin}(A) \le \mathrm{minimax}(A)$.

Proof.
$$\mathrm{maximin}(A) = \max_x \min_y x^\top A y = \min_y (x^*)^\top A y \quad \text{(by definition of } x^* \text{ (Def. 2))}$$
$$\le (x^*)^\top A y^* \quad \text{(by definition of min)}$$
$$\le \max_x x^\top A y^* \quad \text{(by definition of max)}$$
$$= \min_y \max_x x^\top A y \quad \text{(by definition of } y^* \text{ (Def. 3))}$$
$$= \mathrm{minimax}(A)$$

Next, we prove that the two values are in fact equal.

Theorem 2 (Minimax Thm). $\mathrm{maximin}(A) = \mathrm{minimax}(A)$.

Before proceeding with the proof of the theorem, note the following consequence of the theorem.

Corollary 3. $(x^*, y^*)$ is a Nash equilibrium of the simultaneous game.
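The inequality of Theorem 1 is easy to check computationally. The sketch below (an illustration, not from the notes) restricts both players to pure strategies, where maximin and minimax reduce to row/column scans; the gap it exhibits on a hypothetical matrix is exactly what mixed strategies, and Theorem 2, close:

```python
def pure_maximin(A):
    """max over rows of the row minimum: P1 commits first to a pure strategy."""
    return max(min(row) for row in A)

def pure_minimax(A):
    """min over columns of the column maximum: P2 commits first to a pure strategy."""
    return min(max(A[i][j] for i in range(len(A))) for j in range(len(A[0])))

# A game with no pure-strategy saddle point: the two values differ.
A = [[2, -1],
     [0, 1]]
lo, hi = pure_maximin(A), pure_minimax(A)
assert lo <= hi       # weak 2nd mover's advantage (Theorem 1)
print(lo, hi)         # 0 1: a strict gap, closed only by mixing
```

Over mixed strategies the two values coincide (computing them is a small linear program), which is the content of Theorem 2.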
Proof. Using Theorem 2, it follows that every inequality in the proof of Theorem 1 is, in fact, an equality. Hence
$$(x^*)^\top A y^* = \max_x x^\top A y^* \quad \& \quad (x^*)^\top A y^* = \min_y (x^*)^\top A y$$
Thus $x^*, y^*$ are best responses to each other, making $(x^*, y^*)$ a Nash equilibrium by definition.

Next, we present the proof of Theorem 2.

Proof (Minimax Thm). Consider the following setting.
- Assume that the two players play repeatedly $T$ times, where $T$ is large.
- At each step $t = 1, \ldots, T$, they play simultaneously: $P_1, P_2$ choose $x^t, y^t$ respectively. $P_1$ sees all the previous iterations before choosing $x^t$, but does not see $y^t$. Similarly, $P_2$ sees all the previous iterations before choosing $y^t$, but does not see $x^t$.
- The utility of $P_1$ at time $t$ is $(x^t)^\top A y^t$, which is also the loss of $P_2$ at time $t$.
- Let $U_i^t = (A y^t)_i$ be the utility of action $i$ for $P_1$ at time $t$. Similarly, let $C_j^t = ((x^t)^\top A)_j$ be the cost of action $j$ for $P_2$ at time $t$.
- Each player faces an online learning problem: $P_1$ must select $x^t$ based only on $U_i^1, \ldots, U_i^{t-1}$ for all $i$; similarly, $P_2$ must select $y^t$ based only on $C_j^1, \ldots, C_j^{t-1}$ for all $j$.
- Assume each uses a no-external-regret algorithm (e.g. Multiplicative Weights).
- Let $\delta(T)$ be the bound on external regret.
- Let $v = \frac{1}{T} \sum_{t=1}^{T} (x^t)^\top A y^t$.

Then, we have

$$v \ge \max_i \frac{1}{T} \sum_{t=1}^{T} (A y^t)_i - \delta(T) = \max_i \left( A \left( \frac{1}{T} \sum_{t=1}^{T} y^t \right) \right)_i - \delta(T) \ge \min_y \max_i (Ay)_i - \delta(T) = \mathrm{minimax}(A) - \delta(T)$$

Similarly, from $P_2$'s perspective, we can get
$$v \le \mathrm{maximin}(A) + \delta(T)$$
Combining the two inequalities,
$$\mathrm{minimax}(A) \le \mathrm{maximin}(A) + 2\delta(T) \quad \text{(since both players have no external regret)}$$
$$\Rightarrow \mathrm{minimax}(A) \le \mathrm{maximin}(A) \quad \text{(since } \delta(T) \to 0 \text{ as } T \to \infty\text{)}$$
Together with Theorem 1, this gives $\mathrm{minimax}(A) = \mathrm{maximin}(A)$.
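The proof above is constructive: running any no-external-regret algorithm in self-play approximates the value. Below is a minimal Multiplicative Weights self-play sketch for Rock-Paper-Scissors (the learning rate, horizon, and asymmetric initialization are illustrative choices, not from the lecture); the time-averaged payoff approaches the value 0, and the averaged strategy of $P_2$ becomes an approximate minimax strategy:

```python
import math

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]   # RPS payoffs for P1 (zero-sum)
n = len(A)
T, eta = 5000, 0.05

w1 = [1.0] * n              # P1's Multiplicative Weights
w2 = [1.0, 2.0, 3.0]        # P2's weights (asymmetric start, to break symmetry)
v_sum = 0.0
y_avg = [0.0] * n           # time-averaged strategy of P2

for t in range(T):
    x = [w / sum(w1) for w in w1]
    y = [w / sum(w2) for w in w2]
    v_sum += sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))
    for j in range(n):
        y_avg[j] += y[j] / T
    Ay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]   # P1's action utilities
    xA = [sum(x[i] * A[i][j] for i in range(n)) for j in range(n)]   # P2's action costs
    for i in range(n):
        w1[i] *= math.exp(eta * Ay[i])     # P1 maximizes payoff
        w2[i] *= math.exp(-eta * xA[i])    # P2 minimizes it

v = v_sum / T
exploitability = max(sum(A[i][j] * y_avg[j] for j in range(n)) for i in range(n))
print(round(v, 3), round(exploitability, 3))  # both close to 0, the value of RPS
```

By the regret bounds in the proof, $|v|$ and $\max_i (A \bar{y})_i$ are both within $O(\delta(T))$ of the value.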
As a result of this, the following immediately follows by definition:

Corollary 4. $\left( \frac{1}{T} \sum_{t=1}^{T} x^t, \; \frac{1}{T} \sum_{t=1}^{T} y^t \right)$ is a $\delta$-NE.

2 No-regret learning in general-sum games

In this section, we discuss no-regret learning in general-sum games. We discuss two equilibrium criteria, correlated equilibrium and coarse correlated equilibrium, and their existence in general-sum games. We introduce a new regret measure/benchmark called swap regret, and an algorithm that has vanishing swap regret.

Recall that a game of complete information has:
- $n$ players $1, \ldots, n$
- an action set $A_i$ for each player $i$; $A = A_1 \times A_2 \times \ldots \times A_n$
- a utility function $u_i : A \to [-1, 1]$ for each player $i$; $u_i(a_1, a_2, \ldots, a_n)$ is player $i$'s utility when the players play $a = (a_1, \ldots, a_n)$

Definition 4. A correlated equilibrium is a distribution $\chi$ over $A$ such that for all $i$ and all $j, j' \in A_i$,
$$\mathbb{E}_{a \sim \chi}[u_i(a) \mid a_i = j] \ge \mathbb{E}_{a \sim \chi}[u_i(j', a_{-i}) \mid a_i = j]$$
In other words, if player $i$ is recommended an action $j$, he should choose to play $j$. This is equivalent to saying that for all $i$ and all swap functions $s : A_i \to A_i$,
$$\mathbb{E}_{a \sim \chi}[u_i(a)] \ge \mathbb{E}_{a \sim \chi}[u_i(s(a_i), a_{-i})]$$

Note that, knowing his own recommended action, a player can infer something about the other players' moves, yet he is better off playing the recommended action. For example, consider the chicken-dare game, in which players can choose to chicken out or dare. Payoffs are as given in Table 1. If the players are recommended the action profiles (C,D), (D,C) & (C,C) with equal probability, this is a correlated equilibrium: neither player wants to deviate from the recommended strategy. Consider player 1: if he is recommended to chicken out, he knows the other player is going to dare or chicken out with equal probability, so obeying pays $(-2 + 0)/2 = -1$ while daring pays $(1 - 10)/2 = -4.5$, and he is better off not deviating. If player 1 is recommended dare, he knows the other player is going to chicken out, and hence won't deviate.
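The chicken-dare claim can be verified mechanically. This sketch (payoffs taken from Table 1; the helper function is ours) enumerates the conditional deviation inequalities of Definition 4 for the uniform distribution over {(C,D), (D,C), (C,C)}:

```python
C, D = 0, 1
U1 = [[0, -2], [1, -10]]    # row: player 1's action, column: player 2's (Table 1)
U2 = [[0, 1], [-2, -10]]
support = [(C, D), (D, C), (C, C)]    # recommended with probability 1/3 each

def is_correlated_eq(support, U1, U2, tol=1e-9):
    """Check Definition 4 for a uniform distribution over `support`."""
    for player, U in ((0, U1), (1, U2)):
        for j in (C, D):
            # Profiles consistent with recommendation a_i = j.
            cond = [a for a in support if a[player] == j]
            if not cond:
                continue
            follow = sum(U[a[0]][a[1]] for a in cond) / len(cond)
            for jp in (C, D):   # every possible deviation j'
                swapped = [(jp, a[1]) if player == 0 else (a[0], jp) for a in cond]
                deviate = sum(U[a[0]][a[1]] for a in swapped) / len(swapped)
                if deviate > follow + tol:
                    return False
    return True

print(is_correlated_eq(support, U1, U2))  # True: no profitable conditional deviation
```

Running the same check on, say, the singleton support {(D,D)} returns False, since chickening out beats crashing.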
            Chicken      Dare
Chicken      0, 0       -2, 1
Dare         1, -2     -10, -10

Table 1: Payoffs in the chicken-dare game

Definition 5. A coarse correlated equilibrium is a distribution $\chi$ such that for all $i$ and all $j \in A_i$,
$$\mathbb{E}_{a \sim \chi}[u_i(a)] \ge \mathbb{E}_{a \sim \chi}[u_i(j, a_{-i})]$$
The above definition states that the agent cannot profit by deviating to some other constant/fixed action. Note the subtle difference from the definition of correlated equilibrium: a correlated equilibrium uses a smart swap function. Coarse correlated equilibrium is therefore a weaker equilibrium criterion, since the deviation is not correlated with the recommended action. Another way to look at it: to compute a correlated equilibrium, we assume players have more knowledge and hence can respond better, in which case it becomes more difficult to make them obey.

$$\text{DominantStrategy} \subseteq \text{NashEq} \subseteq \text{CorrelatedEq} \subseteq \text{CoarseCorrelatedEq}$$

For example, consider the rock-paper-scissors game with the payoff matrix given in Table 2. Choosing all the off-diagonal action profiles with equal probability (1/6 each) is a coarse correlated equilibrium. To understand why, suppose player 1 is playing rock: player 2 will respond with either paper or scissors with equal probability, but if he starts responding with paper with higher probability, he will incur more loss when player 1 plays scissors (recall player 1 plays rock as well as scissors with equal probability). Hence, player 2 is not better off playing any other fixed strategy. This is not a correlated equilibrium, however. Suppose player 2 is instructed to play paper. He knows the other player is playing either rock or scissors, and player 2's expected payoff from paper is 0. He can change his move to rock and improve his payoff to 1/2, as against 0 if he plays the recommended action. In this case, player 2 can exploit knowledge of the action recommended to him to improve his payoff.
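The rock-paper-scissors example can likewise be checked numerically. A small sketch (payoffs from Table 2) confirms that the uniform off-diagonal distribution satisfies the coarse condition of Definition 5 but violates the conditional condition of Definition 4:

```python
R, P, S = 0, 1, 2
U = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]   # player 1's payoff; player 2 gets -U
support = [(a, b) for a in (R, P, S) for b in (R, P, S) if a != b]  # prob 1/6 each

# Coarse CE check for player 2: deviating to any *fixed* action gains nothing.
follow2 = sum(-U[a][b] for a, b in support) / len(support)
for fixed in (R, P, S):
    deviate2 = sum(-U[a][fixed] for a, b in support) / len(support)
    assert deviate2 <= follow2 + 1e-9

# CE check fails: conditioned on being told Paper, player 2 prefers Rock.
cond = [(a, b) for a, b in support if b == P]          # opponent is R or S, equally
follow = sum(-U[a][P] for a, b in cond) / len(cond)    # payoff from obeying
deviate = sum(-U[a][R] for a, b in cond) / len(cond)   # payoff from switching to Rock
print(follow, deviate)  # 0.0 0.5 -> a profitable conditional deviation, so not a CE
```

This reproduces the text's numbers: 0 from the recommendation, 1/2 from the informed deviation.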
            Rock     Paper    Scissors
Rock        0, 0     -1, 1     1, -1
Paper       1, -1     0, 0    -1, 1
Scissors   -1, 1      1, -1    0, 0

Table 2: Payoffs in the rock-paper-scissors game

Definition 6. A $\delta$-correlated equilibrium (or approximate correlated equilibrium), for $\delta > 0$, is a distribution $\chi$ over $A$ such that for all $i$ and all $j, j' \in A_i$,
$$\mathbb{E}_{a \sim \chi}[u_i(a) \mid a_i = j] \ge \mathbb{E}_{a \sim \chi}[u_i(j', a_{-i}) \mid a_i = j] - \delta$$
In other words, if player $i$ is recommended action $j$, he can gain only a small quantity $\delta$ by deviating from it. This is equivalent to saying that for all $i$ and all swap functions $s : A_i \to A_i$,
$$\mathbb{E}_{a \sim \chi}[u_i(a)] \ge \mathbb{E}_{a \sim \chi}[u_i(s(a_i), a_{-i})] - \delta$$

Definition 7. A $\delta$-coarse correlated equilibrium (or approximate coarse correlated equilibrium), for $\delta > 0$, is a distribution $\chi$ such that for all $i$ and all $j \in A_i$,
$$\mathbb{E}_{a \sim \chi}[u_i(a)] \ge \mathbb{E}_{a \sim \chi}[u_i(j, a_{-i})] - \delta$$

Note that in both the $\delta$-correlated and $\delta$-coarse correlated equilibrium, the inequalities may be slack by an arbitrary $\delta$ relative to the exact correlated and coarse correlated equilibrium criteria.

2.1 Existence of δ-coarse correlated equilibrium

Theorem 5. Fix a game of complete information, suppose the players play the game repeatedly $T$ times, and each player uses a vanishing-external-regret algorithm. Then the time-averaged mixed strategy profiles form an approximate coarse correlated equilibrium. Formally, let $P_i^t \in \Delta(A_i)$ be player $i$'s mixed strategy at time $t$, let $\chi^t = P_1^t \times \ldots \times P_n^t$, and let $\bar{\chi} = \frac{1}{T} \sum_{t=1}^{T} \chi^t$ be the time-averaged joint action profile distribution. Then $\bar{\chi}$ is a $\delta$-coarse correlated equilibrium, where $\delta(T) \to 0$ as $T \to \infty$.

Note that $\bar{\chi}$ can be realized in two steps: sample a time $t$ uniformly from $\{1, \ldots, T\}$, then sample from $\chi^t$.
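A sketch of Theorem 5 in action (the horizon, learning rate, and rescaling are illustrative choices, not from the lecture): both chicken-dare players run Multiplicative Weights on payoffs rescaled into [0,1], and we check that the time-averaged product distribution $\bar{\chi}$ approximately satisfies the coarse correlated equilibrium inequalities:

```python
import math

U1 = [[0, -2], [1, -10]]          # chicken-dare payoffs (Table 1)
U2 = [[0, 1], [-2, -10]]
T, eta = 4000, 0.02
w1, w2 = [1.0, 1.0], [1.0, 1.0]
chi = [[0.0, 0.0], [0.0, 0.0]]    # time-averaged joint distribution chi-bar

for t in range(T):
    x = [w / sum(w1) for w in w1]
    y = [w / sum(w2) for w in w2]
    for i in range(2):
        for j in range(2):
            chi[i][j] += x[i] * y[j] / T
    # Expected utility of each action; rescale from [-10, 1] to [0, 1] for MW.
    g1 = [sum(U1[i][j] * y[j] for j in range(2)) for i in range(2)]
    g2 = [sum(x[i] * U2[i][j] for i in range(2)) for j in range(2)]
    for i in range(2):
        w1[i] *= math.exp(eta * (g1[i] + 10) / 11)
        w2[i] *= math.exp(eta * (g2[i] + 10) / 11)

def cce_violation(chi, U, player):
    """Largest gain any fixed deviation yields over following chi-bar."""
    follow = sum(chi[i][j] * U[i][j] for i in range(2) for j in range(2))
    best = max(sum(chi[i][j] * (U[k][j] if player == 0 else U[i][k])
                   for i in range(2) for j in range(2))
               for k in range(2))
    return best - follow

delta = max(cce_violation(chi, U1, 0), cce_violation(chi, U2, 1))
print(delta <= 0.5)  # True: chi-bar is a delta-CCE for a small delta
```

The violation for each player is bounded by that player's average external regret, which the MW bound keeps small at this horizon.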
Proof. Fix player $i$ and assume he uses a vanishing-external-regret algorithm to choose his actions. Vanishing external regret implies that switching to a fixed action $j$ in hindsight cannot gain more than $\delta(T)$ per time step on average, where $\delta(T) \to 0$ as $T \to \infty$. Recall that Multiplicative Weights is one such vanishing-external-regret algorithm. For a vanishing-external-regret algorithm, we know that at time $T$, for any $j$,
$$\frac{1}{T} \sum_{t=1}^{T} \mathbb{E}_{a^t \sim \chi^t}[u_i(a^t)] \ge \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}_{a^t \sim \chi^t}[u_i(j, a^t_{-i})] - \delta(T)$$
Since expectations are linear, we can view the above as picking some $t$ uniformly at random and sampling an action profile from $\chi^t$. Therefore the above equation can be written using $\bar{\chi}$ as
$$\mathbb{E}_{a \sim \bar{\chi}}[u_i(a)] \ge \mathbb{E}_{a \sim \bar{\chi}}[u_i(j, a_{-i})] - \delta(T)$$
This is exactly the $\delta$-coarse correlated equilibrium condition, so a $\delta$-coarse correlated equilibrium exists in the above scenario and is achieved whenever the agents use vanishing-external-regret algorithms.

Corollary 6. A $\delta$-coarse correlated equilibrium exists for every $\delta > 0$ and every finite game. Moreover, vanishing-external-regret learning dynamics arrive at such a $\delta$-coarse correlated equilibrium.

2.2 Swap Regret

Definition 8. Fix an online learning environment where an adversary chooses a cost function at each time step, $C^1, \ldots, C^T : A \to [-1, 1]$. An online learning algorithm which plays a mixed strategy $P^t \in \Delta(A)$ at time $t$, drawing action $j^t \sim P^t$, has swap regret
$$\text{swap-regret}_{\text{alg}} = \frac{1}{T} \left( \sum_{t=1}^{T} \mathbb{E}[C^t(j^t)] - \min_{s : A \to A} \sum_{t=1}^{T} \mathbb{E}[C^t(s(j^t))] \right)$$
where $s$ is the swap function. Thus, we say that an algorithm has vanishing swap regret (or no swap regret) if $\text{swap-regret}_{\text{alg}} \to 0$ for all adversaries $C^t$.

Note that the swap-regret criterion is stronger than the external-regret criterion, in that the swap function is smart: it can look at the player's action and choose the swap depending on that action, as opposed to a fixed action in the external-regret criterion. The main idea is that:
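To see why the swap benchmark is stronger, the following sketch (a made-up cost sequence, not from the lecture) computes both regrets by brute force; the swap benchmark, which may pick a different replacement for each played action, is never smaller than the external one:

```python
from itertools import product

# A played sequence of pure actions, and the cost vectors the adversary revealed.
actions = [0, 1, 0, 1, 0, 1]          # the algorithm alternated actions 0 and 1
costs   = [[1, 0], [0, 1]] * 3        # whichever action was played cost 1
n = 2

alg_cost = sum(c[a] for a, c in zip(actions, costs))
best_fixed = min(sum(c[j] for c in costs) for j in range(n))
external_regret = alg_cost - best_fixed

# Swap regret: search over all n^n swap functions s : A -> A.
best_swapped = min(sum(c[s[a]] for a, c in zip(actions, costs))
                   for s in product(range(n), repeat=n))
swap_regret = alg_cost - best_swapped

print(external_regret, swap_regret)  # 3 6: the swap comparator class is richer
```

Constant swap functions recover every fixed action, so swap regret always dominates external regret; here the swap s(0)=1, s(1)=0 drives the comparator's cost to zero.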
- The swap-regret benchmark moves around with the algorithm.
- It imposes some type of local optimality.
- It considers local modifications of the played action.

2.3 Existence of δ-correlated equilibrium

Theorem 7. Fix a game of complete information, suppose the players play the game repeatedly $T$ times, and each player uses a vanishing-swap-regret algorithm. Then the time-averaged mixed strategy profiles form an approximate correlated equilibrium. Formally, let $P_i^t \in \Delta(A_i)$ be player $i$'s mixed strategy at time $t$, let $\chi^t = P_1^t \times \ldots \times P_n^t$, and let $\bar{\chi} = \frac{1}{T} \sum_{t=1}^{T} \chi^t$ be the time-averaged joint action profile distribution. Then $\bar{\chi}$ is a $\delta$-correlated equilibrium, where $\delta(T) \to 0$ as $T \to \infty$.

Note that $\bar{\chi}$ can be realized in two steps by sampling $t$ uniformly from $\{1, \ldots, T\}$ and then sampling from $\chi^t$.

Proof. Fix player $i$ and assume he uses a vanishing-swap-regret algorithm to choose his actions. No swap regret implies that swapping the played actions via any swap function in hindsight cannot gain more than $\delta(T)$ per time step on average, where $\delta(T) \to 0$ as $T \to \infty$. For a vanishing-swap-regret algorithm, we know that at time $T$, for every swap function $s : A_i \to A_i$,
$$\frac{1}{T} \sum_{t=1}^{T} \mathbb{E}_{a^t \sim \chi^t}[u_i(a^t)] \ge \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}_{a^t \sim \chi^t}[u_i(s(a^t_i), a^t_{-i})] - \delta(T)$$
Since expectations are linear, we can view the above as picking some $t$ uniformly at random and sampling an action profile from $\chi^t$. Therefore the above equation can be written using $\bar{\chi}$ as
$$\mathbb{E}_{a \sim \bar{\chi}}[u_i(a)] \ge \mathbb{E}_{a \sim \bar{\chi}}[u_i(s(a_i), a_{-i})] - \delta(T)$$
This is exactly the $\delta$-correlated equilibrium condition, so a $\delta$-correlated equilibrium exists in the above scenario and is achieved whenever the agents use no-swap-regret algorithms.

Corollary 8. A $\delta$-correlated equilibrium exists for every $\delta > 0$ and every finite game. Moreover, no-swap-regret learning dynamics arrive at such a $\delta$-correlated equilibrium.
So far, we have shown that given an algorithm with vanishing swap regret, a $\delta$-correlated equilibrium always exists. Next, we establish the existence of a no-swap-regret algorithm.

2.4 Existence of a no-swap-regret algorithm

Theorem 9. In any online learning setting with finitely many actions $A$ (at most $n$), there is a reduction from a no-swap-regret algorithm to no-external-regret algorithms.

[Figure 1: the no-swap-regret construction. A master algorithm instantiates $n$ no-external-regret algorithms $\mathrm{Alg}_1, \ldots, \mathrm{Alg}_n$. At each time $t$, each $\mathrm{Alg}_j$ outputs a distribution $q_j^t \in \Delta(A)$ and receives the scaled cost vector $p_j^t C^t$; the master computes $p_i^t, q_i^t$, aggregates $q_1^t, \ldots, q_n^t$ into $P^t \in \Delta(A)$, and receives the cost vector $C^t(\cdot)$.]

Proof. The algorithm for the above is as follows:
- Instantiate $n$ copies of a vanishing-external-regret algorithm. Each instance gets some proportion of the cost. Intuitively, the $j$-th instance of the no-external-regret algorithm is in charge of swapping action $j$ to some other action, and similarly for the other instances (see Fig. 1).
- For each time step $t = 1, \ldots, T$:
  - Receive $q_1^t, \ldots, q_n^t \in \Delta(A)$ from $\mathrm{Alg}_1, \ldots, \mathrm{Alg}_n$.
  - Aggregate $q_1^t, \ldots, q_n^t$ to get $P^t$.
  - Play according to $P^t$.
  - Receive $C^t : A \to [-1, 1]$.
  - Compute and give the scaled cost vector $p_j^t C^t$ to each of $\mathrm{Alg}_1, \ldots, \mathrm{Alg}_n$.

To achieve no swap regret we need, for any $s : A \to A$,
$$\sum_{t=1}^{T} \mathbb{E}_{i \sim P^t}[C^t(i)] \le \sum_{t=1}^{T} \mathbb{E}_{i \sim P^t}[C^t(s(i))] + \delta(T) \cdot T$$
i.e.
$$\sum_{t=1}^{T} \sum_i P_i^t \, C^t(i) \le \sum_{t=1}^{T} \sum_i P_i^t \, C^t(s(i)) + \delta(T) \cdot T \tag{2}$$

We know $\mathrm{Alg}_1, \ldots, \mathrm{Alg}_n$ have vanishing external regret. The cost vector handed to $\mathrm{Alg}_j$ at time $t$ is $c^t = p_j^t C^t$, so for any fixed action $k$,
$$\sum_{t=1}^{T} \mathbb{E}_{i \sim q_j^t}[c^t(i)] \le \sum_{t=1}^{T} c^t(k) + \delta'(T) \cdot T$$
i.e.
$$\sum_{t=1}^{T} \sum_i q_{ji}^t \, p_j^t \, C^t(i) \le \sum_{t=1}^{T} p_j^t \, C^t(k) + \delta'(T) \cdot T \qquad j = 1, \ldots, n \tag{3}$$

We need to prove eq. (2). Since eq. (3) holds for any fixed $k$, it holds in particular with $k = s(j)$ for any swap function $s : A \to A$. We can write,
$$\sum_{t=1}^{T} \sum_i q_{ji}^t \, p_j^t \, C^t(i) \le \sum_{t=1}^{T} p_j^t \, C^t(s(j)) + \delta'(T) \cdot T \qquad j = 1, \ldots, n$$
Adding these inequalities over $j = 1, \ldots, n$:
$$\sum_{t=1}^{T} \sum_i \sum_j q_{ji}^t \, p_j^t \, C^t(i) \le \sum_{t=1}^{T} \sum_j p_j^t \, C^t(s(j)) + n \, \delta'(T) \cdot T \tag{4}$$

Comparing eq. (4) with eq. (2), we can conclude that the algorithm should satisfy
$$\sum_j q_{ji}^t \, p_j^t = P_i^t \quad \text{with} \quad p_j^t = P_j^t$$
so that the left side of (4) matches the left side of (2) and the right sides match with $\delta(T) = n \, \delta'(T)$. Viewing $p^t, P^t$ as row vectors and $Q^t = (q_{ji}^t)$ as a matrix of transition probabilities from $j$ to $i$, we want
$$p^t Q^t = P^t = P^t Q^t \tag{5}$$
that is, $P^t$ should be a stationary distribution of the Markov chain $Q^t$. We know that every finite Markov chain $Q$ has a stationary distribution $P$; moreover, such a distribution can be computed efficiently (by solving a linear system over the $n$ states). Therefore eq. (5) has a solution, and we can create a no-swap-regret algorithm from $n$ instances of a no-external-regret algorithm. This proves the existence of a no-swap-regret algorithm, and hence of $\delta$-correlated equilibrium.
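The missing computational piece is solving $P^t Q^t = P^t$. A minimal sketch (the 3-action matrix is illustrative, and power iteration is one of several valid solvers; for a general chain one would solve the linear system directly): build the row-stochastic matrix $Q^t$ from the sub-algorithms' distributions and iterate to the required stationary distribution $P^t$:

```python
def stationary(Q, iters=1000):
    """Left fixed point p = pQ of a row-stochastic matrix, via power iteration."""
    n = len(Q)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(p[j] * Q[j][i] for j in range(n)) for i in range(n)]
    return p

# Row j is Alg_j's distribution q_j^t over the n actions at time t.
Q = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]

p = stationary(Q)
residual = max(abs(p[i] - sum(p[j] * Q[j][i] for j in range(3))) for i in range(3))
print(residual < 1e-9)  # True: p is (numerically) stationary, p = pQ
```

The master algorithm would play this $p$ as $P^t$ and then split the observed cost vector $C^t$ among the sub-algorithms in proportion to $p$.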
More informationComputing Minmax; Dominance
Computing Minmax; Dominance CPSC 532A Lecture 5 Computing Minmax; Dominance CPSC 532A Lecture 5, Slide 1 Lecture Overview 1 Recap 2 Linear Programming 3 Computational Problems Involving Maxmin 4 Domination
More informationFirst Prev Next Last Go Back Full Screen Close Quit. Game Theory. Giorgio Fagiolo
Game Theory Giorgio Fagiolo giorgio.fagiolo@univr.it https://mail.sssup.it/ fagiolo/welcome.html Academic Year 2005-2006 University of Verona Summary 1. Why Game Theory? 2. Cooperative vs. Noncooperative
More information15-780: LinearProgramming
15-780: LinearProgramming J. Zico Kolter February 1-3, 2016 1 Outline Introduction Some linear algebra review Linear programming Simplex algorithm Duality and dual simplex 2 Outline Introduction Some linear
More informationComputational Evolutionary Game Theory and why I m never using PowerPoint for another presentation involving maths ever again
Computational Evolutionary Game Theory and why I m never using PowerPoint for another presentation involving maths ever again Enoch Lau 5 September 2007 Outline What is evolutionary game theory? Why evolutionary
More informationAlgorithms, Games, and Networks January 17, Lecture 2
Algorithms, Games, and Networks January 17, 2013 Lecturer: Avrim Blum Lecture 2 Scribe: Aleksandr Kazachkov 1 Readings for today s lecture Today s topic is online learning, regret minimization, and minimax
More informationGame Theory: Lecture 2
Game Theory: Lecture 2 Tai-Wei Hu June 29, 2011 Outline Two-person zero-sum games normal-form games Minimax theorem Simplex method 1 2-person 0-sum games 1.1 2-Person Normal Form Games A 2-person normal
More information1 Overview. 2 Learning from Experts. 2.1 Defining a meaningful benchmark. AM 221: Advanced Optimization Spring 2016
AM 1: Advanced Optimization Spring 016 Prof. Yaron Singer Lecture 11 March 3rd 1 Overview In this lecture we will introduce the notion of online convex optimization. This is an extremely useful framework
More informationGame Theory Correlated equilibrium 1
Game Theory Correlated equilibrium 1 Christoph Schottmüller University of Copenhagen 1 License: CC Attribution ShareAlike 4.0 1 / 17 Correlated equilibrium I Example (correlated equilibrium 1) L R U 5,1
More informationAdministration. CSCI567 Machine Learning (Fall 2018) Outline. Outline. HW5 is available, due on 11/18. Practice final will also be available soon.
Administration CSCI567 Machine Learning Fall 2018 Prof. Haipeng Luo U of Southern California Nov 7, 2018 HW5 is available, due on 11/18. Practice final will also be available soon. Remaining weeks: 11/14,
More informationOnline Optimization in Dynamic Environments: Improved Regret Rates for Strongly Convex Problems
216 IEEE 55th Conference on Decision and Control (CDC) ARIA Resort & Casino December 12-14, 216, Las Vegas, USA Online Optimization in Dynamic Environments: Improved Regret Rates for Strongly Convex Problems
More informationMathematical Games and Random Walks
Mathematical Games and Random Walks Alexander Engau, Ph.D. Department of Mathematical and Statistical Sciences College of Liberal Arts and Sciences / Intl. College at Beijing University of Colorado Denver
More informationGame Theory, Evolutionary Dynamics, and Multi-Agent Learning. Prof. Nicola Gatti
Game Theory, Evolutionary Dynamics, and Multi-Agent Learning Prof. Nicola Gatti (nicola.gatti@polimi.it) Game theory Game theory: basics Normal form Players Actions Outcomes Utilities Strategies Solutions
More informationLecture 2: Basic Concepts of Statistical Decision Theory
EE378A Statistical Signal Processing Lecture 2-03/31/2016 Lecture 2: Basic Concepts of Statistical Decision Theory Lecturer: Jiantao Jiao, Tsachy Weissman Scribe: John Miller and Aran Nayebi In this lecture
More informationPROBLEMS In each of Problems 1 through 12:
6.5 Impulse Functions 33 which is the formal solution of the given problem. It is also possible to write y in the form 0, t < 5, y = 5 e (t 5/ sin 5 (t 5, t 5. ( The graph of Eq. ( is shown in Figure 6.5.3.
More informationFirst Variation of a Functional
First Variation of a Functional The derivative of a function being zero is a necessary condition for the etremum of that function in ordinary calculus. Let us now consider the equivalent of a derivative
More informationTheory of Auctions. Carlos Hurtado. Jun 23th, Department of Economics University of Illinois at Urbana-Champaign
Theory of Auctions Carlos Hurtado Department of Economics University of Illinois at Urbana-Champaign hrtdmrt2@illinois.edu Jun 23th, 2015 C. Hurtado (UIUC - Economics) Game Theory On the Agenda 1 Formalizing
More informationCalculus in Business. By Frederic A. Palmliden December 7, 1999
Calculus in Business By Frederic A. Palmliden December 7, 999 Optimization Linear Programming Game Theory Optimization The quest for the best Definition of goal equilibrium: The equilibrium state is defined
More informationQuitting games - An Example
Quitting games - An Example E. Solan 1 and N. Vieille 2 January 22, 2001 Abstract Quitting games are n-player sequential games in which, at any stage, each player has the choice between continuing and
More informationMechanism Design. Christoph Schottmüller / 27
Mechanism Design Christoph Schottmüller 2015-02-25 1 / 27 Outline 1 Bayesian implementation and revelation principle 2 Expected externality mechanism 3 Review questions and exercises 2 / 27 Bayesian implementation
More informationApproximate inference, Sampling & Variational inference Fall Cours 9 November 25
Approimate inference, Sampling & Variational inference Fall 2015 Cours 9 November 25 Enseignant: Guillaume Obozinski Scribe: Basile Clément, Nathan de Lara 9.1 Approimate inference with MCMC 9.1.1 Gibbs
More information3. Fundamentals of Lyapunov Theory
Applied Nonlinear Control Nguyen an ien -.. Fundamentals of Lyapunov heory he objective of this chapter is to present Lyapunov stability theorem and illustrate its use in the analysis and the design of
More informationNormal-form games. Vincent Conitzer
Normal-form games Vincent Conitzer conitzer@cs.duke.edu 2/3 of the average game Everyone writes down a number between 0 and 100 Person closest to 2/3 of the average wins Example: A says 50 B says 10 C
More informationIterated Strict Dominance in Pure Strategies
Iterated Strict Dominance in Pure Strategies We know that no rational player ever plays strictly dominated strategies. As each player knows that each player is rational, each player knows that his opponents
More informationC 4,4 0,5 0,0 0,0 B 0,0 0,0 1,1 1 2 T 3,3 0,0. Freund and Schapire: Prisoners Dilemma
On No-Regret Learning, Fictitious Play, and Nash Equilibrium Amir Jafari Department of Mathematics, Brown University, Providence, RI 292 Amy Greenwald David Gondek Department of Computer Science, Brown
More informationExperts in a Markov Decision Process
University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 2004 Experts in a Markov Decision Process Eyal Even-Dar Sham Kakade University of Pennsylvania Yishay Mansour Follow
More informationLecture 8. Instructor: Haipeng Luo
Lecture 8 Instructor: Haipeng Luo Boosting and AdaBoost In this lecture we discuss the connection between boosting and online learning. Boosting is not only one of the most fundamental theories in machine
More informationECE Optimization for wireless networks Final. minimize f o (x) s.t. Ax = b,
ECE 788 - Optimization for wireless networks Final Please provide clear and complete answers. PART I: Questions - Q.. Discuss an iterative algorithm that converges to the solution of the problem minimize
More informationBelief-based Learning
Belief-based Learning Algorithmic Game Theory Marcello Restelli Lecture Outline Introdutcion to multi-agent learning Belief-based learning Cournot adjustment Fictitious play Bayesian learning Equilibrium
More informationTHS Step By Step Calculus Chapter 1
Name: Class Period: Throughout this packet there will be blanks you are epected to fill in prior to coming to class. This packet follows your Larson Tetbook. Do NOT throw away! Keep in 3 ring binder until
More informationWeak Dominance and Never Best Responses
Chapter 4 Weak Dominance and Never Best Responses Let us return now to our analysis of an arbitrary strategic game G := (S 1,...,S n, p 1,...,p n ). Let s i, s i be strategies of player i. We say that
More informationLA lecture 4: linear eq. systems, (inverses,) determinants
LA lecture 4: linear eq. systems, (inverses,) determinants Yesterday: ˆ Linear equation systems Theory Gaussian elimination To follow today: ˆ Gaussian elimination leftovers ˆ A bit about the inverse:
More informationLecture 3: Latent Variables Models and Learning with the EM Algorithm. Sam Roweis. Tuesday July25, 2006 Machine Learning Summer School, Taiwan
Lecture 3: Latent Variables Models and Learning with the EM Algorithm Sam Roweis Tuesday July25, 2006 Machine Learning Summer School, Taiwan Latent Variable Models What to do when a variable z is always
More informationCSCI-6971 Lecture Notes: Probability theory
CSCI-6971 Lecture Notes: Probability theory Kristopher R. Beevers Department of Computer Science Rensselaer Polytechnic Institute beevek@cs.rpi.edu January 31, 2006 1 Properties of probabilities Let, A,
More informationShort course A vademecum of statistical pattern recognition techniques with applications to image and video analysis. Agenda
Short course A vademecum of statistical pattern recognition techniques with applications to image and video analysis Lecture Recalls of probability theory Massimo Piccardi University of Technology, Sydney,
More information1 Lattices and Tarski s Theorem
MS&E 336 Lecture 8: Supermodular games Ramesh Johari April 30, 2007 In this lecture, we develop the theory of supermodular games; key references are the papers of Topkis [7], Vives [8], and Milgrom and
More information