Advanced Machine Learning
- Scott Grant
- 5 years ago
Transcription
1 Advanced Machine Learning: Learning and Games
Mehryar Mohri, Courant Institute & Google Research.
2 Outline
- Normal form games
- Nash equilibrium
- von Neumann's minimax theorem
- Correlated equilibrium
- Internal regret
3 Normal Form Games: Example
Rock-Paper-Scissors (each entry: row player's payoff, column player's payoff).
        R       P       S
R      0,0    -1,1    1,-1
P     1,-1     0,0    -1,1
S     -1,1    1,-1     0,0
4 Be Truly Random
5 Normal Form Games
- $p$ players.
- For each player $k \in [1, p]$:
  - set of actions (or pure strategies) $A_k$.
  - payoff function $u_k\colon \prod_{k=1}^p A_k \to \mathbb{R}$.
- Goal of each player: maximize his payoff in a repeated game.
6 Prisoner's Dilemma
Silence/Betrayal.
- For each player, the best action is B, regardless of the other player's action.
- But with (B, B), both are worse off than with (S, S).
        S      B
S      2,2    0,3
B      3,0    1,1
7 Matching Pennies
Player A wins when the pennies match, player B otherwise.
- Other versions: penalty kick.
- No pure strategy Nash equilibrium.
        H       T
H     1,-1    -1,1
T     -1,1    1,-1
8 Battle of the Sexes
Opera/Football.
- Two pure strategy Nash equilibria.
        O      F
O      3,2    0,0
F      0,0    2,3
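9 Mixed Strategies
Strategies:
- pure strategies: elements of $\prod_{k=1}^p A_k$.
- mixed strategies: elements of $\prod_{k=1}^p \Delta_1(A_k)$, where $\Delta_1(A_k)$ denotes the set of probability distributions over $A_k$.
Payoff: for each player $k \in [1, p]$, when players play mixed strategies $(p_1, \ldots, p_p)$,
$$\mathbb{E}_{a_j \sim p_j}[u_k(a)] = \sum_{a = (a_1, \ldots, a_p)} p_1(a_1) \cdots p_p(a_p)\, u_k(a).$$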
10 Nash Equilibrium
Definition: a mixed strategy $(p_1, \ldots, p_p)$ is a (mixed) Nash equilibrium if for all $k \in [1, p]$ and $q_k \in \Delta_1(A_k)$,
$$u_k(q_k, p_{-k}) \le u_k(p_k, p_{-k}).$$
If, for all $k$, $p_k$ is a pure strategy, then $(p_1, \ldots, p_p)$ is said to be a pure Nash equilibrium.
11 Nash Equilibrium: Examples
- Prisoner's dilemma: (B, B) is a pure Nash equilibrium. Dominant strategy: each player is better off playing B regardless of the other player's action.
- Matching Pennies: no pure Nash equilibrium; clear mixed Nash equilibrium: uniform probability for both.
- Battle of the Sexes: pure Nash equilibria: both (O, O) and (F, F). Mixed Nash equilibrium: ((3/5, 2/5), (2/5, 3/5)) for the payoffs (3, 2) at (O, O) and (2, 3) at (F, F), obtained by making the other player indifferent between O and F. Payoff of 6/5 for both in the mixed case: less than the payoffs in the pure cases!
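The Prisoner's Dilemma and Matching Pennies claims above can be checked numerically. The following is a minimal sketch for two-player games, assuming NumPy is available; the helper name `is_nash` and the matrix names are illustrative and not part of the slides. It relies on the fact that a player's expected payoff is linear in his own mixed strategy, so it suffices to test deviations to pure strategies.

```python
import numpy as np

def is_nash(p, q, U1, U2, tol=1e-9):
    """Check whether (p, q) is a Nash equilibrium of the bimatrix game (U1, U2).

    U1[i, j] and U2[i, j] are the row and column player's payoffs when the
    row player plays action i and the column player plays action j.
    """
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    v1 = p @ U1 @ q  # row player's expected payoff
    v2 = p @ U2 @ q  # column player's expected payoff
    # Best pure deviation of each player against the other's mixed strategy.
    return bool((U1 @ q).max() <= v1 + tol and (p @ U2).max() <= v2 + tol)

# Prisoner's Dilemma, actions ordered (S, B).
PD1 = np.array([[2.0, 0.0], [3.0, 1.0]])
PD2 = np.array([[2.0, 3.0], [0.0, 1.0]])

# Matching Pennies, actions ordered (H, T); zero-sum, so U2 = -U1.
MP1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
MP2 = -MP1

pd_bb = is_nash([0, 1], [0, 1], PD1, PD2)          # (B, B)
pd_ss = is_nash([1, 0], [1, 0], PD1, PD2)          # (S, S)
mp_uniform = is_nash([0.5, 0.5], [0.5, 0.5], MP1, MP2)
```

Here `pd_bb` and `mp_uniform` evaluate the two equilibria named on the slide, while `pd_ss` illustrates that the mutually preferable outcome (S, S) is not an equilibrium.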
12 Nash's Theorem
Theorem: any normal form game with a finite set of players and finite sets of actions admits a (mixed) Nash equilibrium.
13 Proof
Define the function $\Phi\colon \prod_{k=1}^p \Delta_1(A_k) \to \prod_{k=1}^p \Delta_1(A_k)$ by $\Phi(p_1, \ldots, p_p) = (p'_1, \ldots, p'_p)$ with
$$\forall k \in [1, p],\ j \in [1, n_k], \quad p'^{\,j}_k = \frac{p^j_k + c^{j+}_k}{1 + \sum_{j=1}^{n_k} c^{j+}_k},$$
where $c^j_k = u_k(e_j, p_{-k}) - u_k(p_k, p_{-k})$ and $c^{j+}_k = \max(0, c^j_k)$.
$\Phi$ is a continuous function mapping a non-empty compact convex set to itself; thus, by Brouwer's fixed-point theorem, there exists $(p_1, \ldots, p_p)$ such that $\Phi(p_1, \ldots, p_p) = (p_1, \ldots, p_p)$.
14 Proof
Observe that for any $k \in [1, p]$,
$$\sum_{j=1}^{n_k} p^j_k c^j_k = \sum_{j=1}^{n_k} p^j_k\, u_k(e_j, p_{-k}) - u_k(p_k, p_{-k}) = 0.$$
Thus, for any $k \in [1, p]$, there exists at least one $j$ such that $c^j_k \le 0$ with $p^j_k > 0$. For that $j$, $c^{j+}_k = 0$ and
$$p^j_k = \frac{p^j_k}{1 + \sum_{j=1}^{n_k} c^{j+}_k} \;\Rightarrow\; 1 + \sum_{j=1}^{n_k} c^{j+}_k = 1 \;\Rightarrow\; c^{j+}_k = 0,\ \forall j$$
$$\Rightarrow\; u_k(e_j, p_{-k}) \le u_k(p_k, p_{-k}),\ \forall j \;\Rightarrow\; u_k(q_k, p_{-k}) \le u_k(p_k, p_{-k}),\ \forall q_k.$$
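The map used in this proof can be written out directly for a two-player game. The sketch below is an illustration under stated assumptions (NumPy, two players, and the hypothetical name `nash_map`); it only evaluates the map at given points and does not search for a fixed point, since Brouwer's theorem is non-constructive.

```python
import numpy as np

def nash_map(p1, p2, U1, U2):
    """One application of the map from the proof, for two players.

    c1[j] = u_1(e_j, p2) - u_1(p1, p2) is the row player's gain from
    switching to pure action j; c2 is the analogue for the column player.
    """
    p1, p2 = np.asarray(p1, dtype=float), np.asarray(p2, dtype=float)
    c1 = U1 @ p2 - p1 @ U1 @ p2
    c2 = p1 @ U2 - p1 @ U2 @ p2
    g1 = np.maximum(0.0, c1)  # positive parts c^{j+}
    g2 = np.maximum(0.0, c2)
    return (p1 + g1) / (1.0 + g1.sum()), (p2 + g2) / (1.0 + g2.sum())

# Matching Pennies, actions ordered (H, T).
U1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
U2 = -U1

# The uniform profile is a Nash equilibrium, hence a fixed point.
fixed_p1, fixed_p2 = nash_map([0.5, 0.5], [0.5, 0.5], U1, U2)

# The pure profile (H, H) is not an equilibrium: the column player's
# strategy is moved towards T, his profitable deviation.
moved_p1, moved_p2 = nash_map([1.0, 0.0], [1.0, 0.0], U1, U2)
```

At a fixed point all positive parts vanish, which is exactly the condition that no pure deviation is profitable.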
15 Nash Equilibrium: Problems
- Different equilibria: not clear which one will be selected; different payoffs.
- Circular definition.
- Finding any Nash equilibrium is a PPAD-complete (polynomial parity argument on directed graphs) problem (Daskalakis et al., 2009).
- Not a natural model of rationality if computationally hard.
16 Zero-Sum Games: Order of Play
If the row player plays $p$, then the column player plays $q$, solution of
$$\min_{q \in \Delta_1(A_2)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)].$$
Thus, if the row player starts, he plays $p$ to maximize that quantity and the payoff is
$$\max_{p \in \Delta_1(A_1)} \min_{q \in \Delta_1(A_2)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)].$$
Similarly, if the column player plays first, the expected payoff is
$$\min_{q \in \Delta_1(A_2)} \max_{p \in \Delta_1(A_1)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)].$$
17 von Neumann's Theorem
Theorem (von Neumann's minimax theorem): for any two-player zero-sum game with finite action sets,
$$\max_{p \in \Delta_1(A_1)} \min_{q \in \Delta_1(A_2)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)] = \min_{q \in \Delta_1(A_2)} \max_{p \in \Delta_1(A_1)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)].$$
(von Neumann, 1928)
- The common value is called the value of the game.
- Mixed Nash equilibria coincide with maximizing and minimizing pairs, and they all have the same payoff.
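18 Proof
Playing second is never worse:
$$\max_{p \in \Delta_1(A_1)} \min_{q \in \Delta_1(A_2)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)] \le \min_{q \in \Delta_1(A_2)} \max_{p \in \Delta_1(A_1)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)].$$
Straightforward:
$$\forall p \in \Delta_1(A_1),\ \forall q \in \Delta_1(A_2), \quad \min_{q \in \Delta_1(A_2)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)] \le \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)]$$
$$\Rightarrow\ \forall q \in \Delta_1(A_2), \quad \max_{p \in \Delta_1(A_1)} \min_{q \in \Delta_1(A_2)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)] \le \max_{p \in \Delta_1(A_1)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)]$$
$$\Rightarrow\ \max_{p \in \Delta_1(A_1)} \min_{q \in \Delta_1(A_2)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)] \le \min_{q \in \Delta_1(A_2)} \max_{p \in \Delta_1(A_1)} \mathbb{E}_{a_1 \sim p,\, a_2 \sim q}[u_1(a)].$$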
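19 Proof
Set-up: write $u_1(p, q) = p^\top U q$, where $U$ is the row player's payoff matrix. At each round $t$,
- the column player selects $q_t$ using RWM, with regret $R_T$ after $T$ rounds;
- the row player selects $p_t \in \operatorname{argmax}_{p \in \Delta_1(A_1)} p^\top U q_t$.
Thus, letting $T \to +\infty$ in the following completes the proof, since $R_T / T \to 0$:
$$\min_{q \in \Delta_1(A_2)} \max_{p \in \Delta_1(A_1)} p^\top U q \le \max_{p \in \Delta_1(A_1)} p^\top U \Big[\frac{1}{T} \sum_{t=1}^T q_t\Big] = \max_{p \in \Delta_1(A_1)} \frac{1}{T} \sum_{t=1}^T p^\top U q_t$$
$$\le \frac{1}{T} \sum_{t=1}^T \max_{p \in \Delta_1(A_1)} p^\top U q_t = \frac{1}{T} \sum_{t=1}^T p_t^\top U q_t$$
$$= \min_{q} \frac{1}{T} \sum_{t=1}^T p_t^\top U q + \frac{R_T}{T} = \min_{q} \Big[\frac{1}{T} \sum_{t=1}^T p_t\Big]^\top U q + \frac{R_T}{T} \le \max_{p} \min_{q}\, p^\top U q + \frac{R_T}{T}.$$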
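The set-up of this proof can be simulated. The sketch below is only an illustration under several assumptions that are not in the slides: it uses the exponential weights update and tracks the distribution $q_t$ itself rather than sampling an action (the expectation version of RWM), it fixes the learning rate to $\sqrt{8 \log N / T}$, and it rescales payoffs to $[0, 1]$ before the update. The function name `play_zero_sum` is hypothetical.

```python
import numpy as np

def play_zero_sum(U, T=2000):
    """Column player: exponential weights; row player: best response.

    U[i, j] is the row player's payoff, which the column player minimizes.
    Returns the average payoff and the two average strategies.
    """
    U = np.asarray(U, dtype=float)
    n_rows, n_cols = U.shape
    span = U.max() - U.min()
    span = span if span > 0 else 1.0
    eta = np.sqrt(8.0 * np.log(n_cols) / T)  # assumed learning rate
    cum_loss = np.zeros(n_cols)              # column player's rescaled losses
    avg_p, avg_q, total = np.zeros(n_rows), np.zeros(n_cols), 0.0
    for _ in range(T):
        w = np.exp(-eta * (cum_loss - cum_loss.min()))
        q = w / w.sum()                      # q_t
        i = int(np.argmax(U @ q))            # p_t: pure best response to q_t
        total += U[i] @ q                    # p_t^T U q_t
        cum_loss += (U[i] - U.min()) / span  # losses rescaled to [0, 1]
        avg_p[i] += 1.0
        avg_q += q
    return total / T, avg_p / T, avg_q / T

# Rock-Paper-Scissors: the value of the game is 0.
RPS = np.array([[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]])
avg_payoff, avg_p, avg_q = play_zero_sum(RPS)
```

By the chain of inequalities above, `avg_payoff` lies between the value of the game and the value plus $R_T / T$, and the averaged strategies `avg_p` and `avg_q` are approximately optimal for the two players.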
20 Proof
Let
$$p^* \in \operatorname{argmax}_{p \in \Delta_1(A_1)} \min_{q \in \Delta_1(A_2)} u_1(p, q) \quad \text{and} \quad q^* \in \operatorname{argmin}_{q \in \Delta_1(A_2)} \max_{p \in \Delta_1(A_1)} u_1(p, q).$$
$p^*$ and $q^*$ exist by the continuity of $u_1$ and the compactness of the simplices.
By definition of $p^*$ and $q^*$ and the minmax theorem:
$$v = \min_q u_1(p^*, q) \le u_1(p^*, q^*) \le \max_p u_1(p, q^*) = v.$$
Thus, $(p^*, q^*)$ is a Nash equilibrium.
21 Proof
Conversely, assume that $(p^*, q^*)$ is a Nash equilibrium. Then,
$$u_1(p^*, q^*) = \max_p u_1(p, q^*) \ge \min_q \max_p u_1(p, q) = v,$$
$$u_1(p^*, q^*) = \min_q u_1(p^*, q) \le \max_p \min_q u_1(p, q) = v.$$
This implies equalities and
$$u_1(p^*, q^*) = \max_p \min_q u_1(p, q) = \min_q \max_p u_1(p, q).$$
22 Notes
- Unique value: all Nash equilibria have the same payoff (less problematic than the general case).
- Potentially several equilibria, but no need to cooperate.
- Computationally efficient: convergence in $O\big(\sqrt{\log N / T}\big)$.
- Plausible explanation of how an equilibrium is reached; note that both players can play RWM.
- In general non-zero-sum games, (external) regret minimization does not lead to a Nash equilibrium.
23 Yao's Lemma
Theorem: for any two-player zero-sum game with finite action sets,
$$\max_{p \in \Delta_1(A_1)} \min_{a_2 \in A_2} \mathbb{E}_{a_1 \sim p}[u_1(a)] = \min_{q \in \Delta_1(A_2)} \max_{a_1 \in A_1} \mathbb{E}_{a_2 \sim q}[u_1(a)].$$
(Yao, 1977)
- Consequence: for any distribution D over the inputs, the worst-case expected cost of a randomized algorithm is lower bounded by the minimum D-average cost of a deterministic algorithm.
24 General Finite Games
- Regret notion not relevant: (external) regret minimization may not lead to a Nash equilibrium.
- Notion of equilibrium: several issues related to Nash equilibria.
- New notion of equilibrium, new notion of regret.
25 Correlated Equilibrium - Tale
- There is an authority or a correlation device.
- The authority defines a probability distribution $p$ over the $p$-tuples of the players' actions.
- The authority draws $(a_1, \ldots, a_p) \sim p$ and reveals to each player $k$ only his action $a_k$.
- The authority's distribution is a correlated equilibrium if player $k$ has no incentive to deviate from the recommended action: the expected utility of any other action is no higher than that of $a_k$, conditioned on the fact that he was told $a_k$.
26 Correlated Equilibrium
Definition: consider a normal form game with $p < +\infty$ players and finite action sets $A_k$, $k \in [1, p]$. Then, a probability distribution $p$ over $\prod_{k=1}^p A_k$ is a correlated equilibrium if for all $k \in [1, p]$, for all $a_k \in A_k$ with positive probability, and all $a'_k \in A_k$,
$$\sum_{a_{-k} \in A_{-k}} p(a_k, a_{-k})\, u_k(a_k, a_{-k}) \ge \sum_{a_{-k} \in A_{-k}} p(a_k, a_{-k})\, u_k(a'_k, a_{-k}),$$
or, equivalently,
$$\sum_{a_{-k} \in A_{-k}} p(a_{-k} \mid a_k)\, u_k(a_k, a_{-k}) \ge \sum_{a_{-k} \in A_{-k}} p(a_{-k} \mid a_k)\, u_k(a'_k, a_{-k}).$$
(Aumann, 1974)
27 Notes
- Think of the distribution as a correlation device.
- The set of all correlated equilibria is a convex set (it is a polyhedron): it is defined by a system of linear inequalities, including the simplex constraints.
- Solution via solving an LP problem.
- The set of Nash equilibria in general is not convex. It is defined by the intersection of the polyhedron of correlated equilibria and the constraints $p(a) = p_1(a_1) \cdots p_p(a_p)$.
28 Traffic Lights
Stop/Go.
        S      G
S      4,4    1,5
G      5,1    0,0
- Pure Nash equilibria: (S, G), (G, S).
- Mixed Nash equilibrium: ((1/2, 1/2), (1/2, 1/2)).
- Correlated equilibria (joint distributions over action pairs, rows and columns ordered S, G):
        S      G                S      G
S       0     1/2        S     1/3    1/3
G      1/2     0         G     1/3     0
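The two joint distributions listed for this game can be checked against the definition of a correlated equilibrium. The following is a minimal two-player sketch assuming NumPy; the function name `is_correlated_equilibrium` and the matrix names are illustrative. It tests the unnormalized form of the inequalities, which avoids dividing by the probability of a recommendation.

```python
import numpy as np

def is_correlated_equilibrium(P, U1, U2, tol=1e-9):
    """Check the correlated equilibrium inequalities for a two-player game.

    P[i, j] is the probability that the device recommends action i to the
    row player and action j to the column player.
    """
    P, U1, U2 = (np.asarray(M, dtype=float) for M in (P, U1, U2))
    n_rows, n_cols = P.shape
    # Row player: told a, considers switching to b.
    for a in range(n_rows):
        for b in range(n_rows):
            if P[a] @ (U1[b] - U1[a]) > tol:
                return False
    # Column player: told a, considers switching to b.
    for a in range(n_cols):
        for b in range(n_cols):
            if P[:, a] @ (U2[:, b] - U2[:, a]) > tol:
                return False
    return True

# Traffic Lights, actions ordered (S, G).
TL1 = np.array([[4.0, 1.0], [5.0, 0.0]])  # row player's payoffs
TL2 = np.array([[4.0, 5.0], [1.0, 0.0]])  # column player's payoffs

ce_alternate = is_correlated_equilibrium([[0, 0.5], [0.5, 0]], TL1, TL2)
ce_thirds = is_correlated_equilibrium([[1/3, 1/3], [1/3, 0]], TL1, TL2)
ce_both_go = is_correlated_equilibrium([[0, 0], [0, 1]], TL1, TL2)
```

The first two calls correspond to the distributions on the slide; the third, which always recommends (G, G), is included as a distribution that violates the inequalities.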
29 Internal Regret
Definition: internal regret, with $C_{a,b}$ the family of functions $f\colon A \to A$ leaving all actions unchanged but $a$, which is switched to $b$:
$$R_T = \sum_{t=1}^T \mathbb{E}_{a_t \sim p_t}[l(a_t)] - \min_{f \in C_{a,b}} \sum_{t=1}^T \mathbb{E}_{a_t \sim p_t}[l(f(a_t))].$$
Definition: swap regret, with $C$ the family of all functions $f\colon A \to A$:
$$R_T = \sum_{t=1}^T \mathbb{E}_{a_t \sim p_t}[l(a_t)] - \min_{f \in C} \sum_{t=1}^T \mathbb{E}_{a_t \sim p_t}[l(f(a_t))].$$
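These definitions can be evaluated from a sequence of distributions and loss vectors. The sketch below assumes NumPy, takes the internal regret to be the largest regret over all pairs $(a, b)$, and also reports the external regret for comparison; the function name `regrets` is hypothetical.

```python
import numpy as np

def regrets(P, L):
    """External, internal, and swap regret of a sequence of plays.

    P[t] is the distribution p_t over actions at round t and L[t] is the
    loss vector l_t at round t.
    """
    P, L = np.asarray(P, dtype=float), np.asarray(L, dtype=float)
    M = P.T @ L          # M[i, j] = sum_t p_{t,i} * l_{t,j}
    diag = np.diag(M)    # loss incurred while playing each action i
    # Best single fixed action in hindsight.
    external = diag.sum() - L.sum(axis=0).min()
    # Best single switch a -> b (the entry a = b contributes 0).
    internal = (diag[:, None] - M).max()
    # Best function f: every action i is remapped independently.
    swap = (diag - M.min(axis=1)).sum()
    return external, internal, swap

# Two rounds, two actions: the player always picks the action with loss 1.
plays = [[1.0, 0.0], [0.0, 1.0]]
losses = [[1.0, 0.0], [0.0, 1.0]]
external, internal, swap = regrets(plays, losses)
```

In this small example, swapping both actions removes the entire loss, so the swap regret exceeds both the internal regret and the external regret.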
30 Swap Regret and Correlated Eq.
Theorem: consider a finite normal form game played repeatedly. Assume that each player follows a swap regret minimizing strategy. Then, the empirical distribution of all plays converges to the set of correlated equilibria.
31 Alternative Proof
Let $C$ be the convex set of correlated equilibria. If the sequence of empirical distributions $(\hat p_t)_{t \in \mathbb{N}}$ does not converge to $C$, it admits a subsequence in $C_\epsilon = \{p \colon d(p, C) \ge \epsilon\}$ (a compact set) for some $\epsilon > 0$; thus, it admits a subsequence $(\hat p_{\tau(t)})_t$ converging to some $\hat p \notin C$.
Thus, there exist $\gamma > 0$, $k \in [1, p]$, and $a_k, a'_k \in A_k$ such that
$$\sum_{a_{-k} \in A_{-k}} \hat p(a)\,[u_k(a'_k, a_{-k}) - u_k(a_k, a_{-k})] = \gamma.$$
Therefore, for $t$ sufficiently large,
$$\sum_{a_{-k} \in A_{-k}} \hat p_{\tau(t)}(a)\,[u_k(a'_k, a_{-k}) - u_k(a_k, a_{-k})] \ge \frac{\gamma}{2}.$$
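32 Alternative Proof
Since $\hat p_{\tau(t)}(a) = \frac{1}{\tau(t)} \sum_{s=1}^{\tau(t)} 1_{a_{k,(s)} = a_k}\, 1_{a_{-k,(s)} = a_{-k}}$, where $a_{(s)}$ denotes the joint action played at round $s$,
$$\frac{1}{\tau(t)} \sum_{s=1}^{\tau(t)} 1_{a_{k,(s)} = a_k} \big[u_k(a'_k, a_{-k,(s)}) - u_k(a_k, a_{-k,(s)})\big] \ge \frac{\gamma}{2},$$
where $\gamma > 0$ is the utility gap of the limit distribution $\hat p$.
Thus, the per-round internal regret of player $k$ for switching $a_k$ to $a'_k$ is lower bounded by $\gamma / 2$ at time $\tau(t)$ and later, which implies that the player is not following a swap regret minimization strategy.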
33 Proof
Define the instantaneous regret of player $k$ at time $t$ as
$$\hat r_{k,t,j,j'} = 1_{a_{k,t} = j}\,[l_k(j, a_{-k,t}) - l_k(j', a_{-k,t})], \quad \text{and} \quad r_{k,t,j,j'} = p_{k,t,j}\,[l_k(j, a_{-k,t}) - l_k(j', a_{-k,t})].$$
Then,
$$\mathbb{E}[\hat r_{k,t,j,j'} \mid \text{past} \wedge \text{other players' actions}] = r_{k,t,j,j'}.$$
Thus, for any $(j, j')$, $(r_{k,t,j,j'} - \hat r_{k,t,j,j'})_t$ is a bounded martingale difference sequence. By Azuma's inequality and the Borel-Cantelli lemma, for all $k$ and $(j, j')$,
$$\lim_{T \to +\infty} \frac{1}{T} \sum_{t=1}^T \big(\hat r_{k,t,j,j'} - r_{k,t,j,j'}\big) = 0 \quad \text{(a.s.)}.$$
Therefore,
$$\forall k, \quad \limsup_{T \to +\infty} \max_{j,j'} \frac{1}{T} \sum_{t=1}^T \hat r_{k,t,j,j'} \le 0 \quad \text{(a.s.)}.$$
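34 Swap Regret Algorithm
Theorem: there exists an algorithm with swap regret $O(\sqrt{NT \log N})$. (Blum and Mansour, 2007)
[Figure: a master algorithm $R$ combines $N$ sub-algorithms $R_1, R_2, \ldots, R_N$; at round $t$, $R_i$ outputs the distribution $q_{t,i}$ and receives the scaled loss vector $p_{t,i}\, l_t$. The $R_i$s are external regret minimization algorithms.]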
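35 Proof
Define for all $t \in [1, T]$ the stochastic matrix
$$Q_t = (q_{t,i,j})_{(i,j) \in [1,N]^2} = \begin{bmatrix} q_{t,1}^\top \\ \vdots \\ q_{t,N}^\top \end{bmatrix}.$$
Since $Q_t$ is stochastic, it admits a stationary distribution $p_t$:
$$p_t^\top = p_t^\top Q_t \iff \forall j \in [1, N],\ p_{t,j} = \sum_{i=1}^N p_{t,i}\, q_{t,i,j}.$$
Thus,
$$\sum_{t=1}^T p_t \cdot l_t = \sum_{t=1}^T \sum_{j=1}^N p_{t,j}\, l_{t,j} = \sum_{t=1}^T \sum_{j=1}^N \sum_{i=1}^N p_{t,i}\, q_{t,i,j}\, l_{t,j} = \sum_{i=1}^N \sum_{t=1}^T q_{t,i} \cdot (p_{t,i}\, l_t) \le \sum_{i=1}^N \Big[\min_j \sum_{t=1}^T p_{t,i}\, l_{t,j} + R_{T,i}\Big].$$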
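36 Proof
Thus, for any $f\colon A \to A$,
$$\sum_{t=1}^T p_t \cdot l_t \le \sum_{i=1}^N \Big[\sum_{t=1}^T p_{t,i}\, l_{t,f(i)} + R_{T,i}\Big].$$
For RWM, $R_{T,i} = O\big(\sqrt{L_{\min,i} \log N}\big)$, where $L_{\min,i} = \min_j \sum_{t=1}^T p_{t,i}\, l_{t,j}$ is attained at some action $j_i$. For losses in $[0, 1]$,
$$\sum_{i=1}^N L_{\min,i} = \sum_{t=1}^T \sum_{i=1}^N p_{t,i}\, l_{t,j_i} \le T.$$
Thus, by Jensen's inequality,
$$\sum_{i=1}^N R_{T,i} = N \cdot \frac{1}{N} \sum_{i=1}^N R_{T,i} \le N \cdot O\bigg(\sqrt{\frac{1}{N} \sum_{i=1}^N L_{\min,i} \log N}\bigg) \le O\bigg(N \sqrt{\frac{1}{N}\, T \log N}\bigg) = O\big(\sqrt{NT \log N}\big).$$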
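The construction in this proof can be sketched in code. The version below is an illustration under assumptions that go beyond the slides: each sub-algorithm $R_i$ is an exponential weights update run on expected losses (rather than a sampled RWM), the learning rate is fixed to $\sqrt{8 \log N / T}$, the losses lie in $[0, 1]$, and the stationary distribution is obtained by least squares. The names `stationary_distribution` and `swap_regret_run` are hypothetical.

```python
import numpy as np

def stationary_distribution(Q):
    """Solve p^T = p^T Q with sum(p) = 1 for a row-stochastic matrix Q."""
    n = Q.shape[0]
    A = np.vstack([Q.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    p = np.clip(p, 0.0, None)  # guard against tiny negative round-off
    return p / p.sum()

def swap_regret_run(losses):
    """Run the reduction on a (T, N) array of losses in [0, 1].

    Returns the swap regret of the resulting sequence p_1, ..., p_T.
    """
    T, N = losses.shape
    eta = np.sqrt(8.0 * np.log(N) / T)  # assumed learning rate
    cum = np.zeros((N, N))              # cum[i, j] = sum_t p_{t,i} * l_{t,j}
    alg_loss = 0.0
    for l in losses:
        # Row i of Q is q_{t,i}, the distribution proposed by R_i.
        W = np.exp(-eta * (cum - cum.min(axis=1, keepdims=True)))
        Q = W / W.sum(axis=1, keepdims=True)
        p = stationary_distribution(Q)  # p_t
        alg_loss += p @ l
        cum += np.outer(p, l)           # R_i receives the loss p_{t,i} * l_t
    # Best remapping f in hindsight picks the best column for each row.
    return alg_loss - cum.min(axis=1).sum()

rng = np.random.default_rng(0)
T, N = 2000, 3
swap_regret = swap_regret_run(rng.random((T, N)))
bound = N * np.sqrt(T * np.log(N) / 2.0)  # sum of the N external regret bounds
```

With this learning rate, each sub-algorithm's external regret is at most $\sqrt{T \log N / 2}$, so the sum `bound` grows as $N \sqrt{T \log N}$; the sharper $\sqrt{NT \log N}$ rate in the theorem relies on the loss-dependent RWM bound used in the proof.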
37 Notes
- Surprising result: no explicit joint distribution in the game! The correlation is induced by the empirical sequence of plays by the players.
- Game matrix: no need to know the full matrix (which could be huge with a lot of players); only need to know the loss or payoff for the actions taken.
38 Conclusion
- Zero-sum finite games: external regret minimization algorithms (e.g., RWM); Nash equilibrium, value of the game reached.
- General finite games: internal/swap regret minimization algorithms; correlated equilibrium, can be learned.
- Questions: Nash equilibria; extensions, e.g., time selection functions (Blum and Mansour, 2007), conditional correlated equilibrium (Mohri and Yang, 2014).
39 References
- Robert Aumann. Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics, 1:67-96, 1974.
- Avrim Blum and Yishay Mansour. From external to internal regret. Journal of Machine Learning Research, 8, 2007.
- Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press.
- Constantinos Daskalakis, Paul W. Goldberg, and Christos H. Papadimitriou. The complexity of computing a Nash equilibrium. Communications of the ACM, 52(2), 2009.
- Dean P. Foster and Rakesh V. Vohra. Calibrated learning and correlated equilibrium. Games and Economic Behavior, 21(1-2):40-55.
- Yoav Freund and Robert Schapire. Large margin classification using the perceptron algorithm. In Proceedings of COLT. ACM Press.
40 References
- Ehud Lehrer. Correlated equilibria in two-player repeated games with nonobservable actions. Mathematics of Operations Research, 17(1).
- Nick Littlestone. From on-line to batch learning. COLT 1989.
- Nick Littlestone and Manfred K. Warmuth. The weighted majority algorithm. FOCS 1989.
- Mehryar Mohri and Scott Yang. Conditional swap regret and conditional correlated equilibrium. In NIPS, 2014.
- Noam Nisan, Tim Roughgarden, Éva Tardos, and Vijay V. Vazirani. Algorithmic Game Theory. Cambridge University Press, New York, NY, USA.
- John von Neumann. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100(1), 1928.
41 References
- Gilles Stoltz and Gábor Lugosi. Learning correlated equilibria in games with compact sets of strategies. Games and Economic Behavior, 59(1).
- Andrew Yao. Probabilistic computations: Toward a unified measure of complexity. In FOCS, 1977.
More informationOnline Learning Class 12, 20 March 2006 Andrea Caponnetto, Sanmay Das
Online Learning 9.520 Class 12, 20 March 2006 Andrea Caponnetto, Sanmay Das About this class Goal To introduce the general setting of online learning. To describe an online version of the RLS algorithm
More informationConsistent Beliefs in Extensive Form Games
Games 2010, 1, 415-421; doi:10.3390/g1040415 OPEN ACCESS games ISSN 2073-4336 www.mdpi.com/journal/games Article Consistent Beliefs in Extensive Form Games Paulo Barelli 1,2 1 Department of Economics,
More informationGame Theory, On-line Prediction and Boosting
roceedings of the Ninth Annual Conference on Computational Learning heory, 996. Game heory, On-line rediction and Boosting Yoav Freund Robert E. Schapire A& Laboratories 600 Mountain Avenue Murray Hill,