Advanced Machine Learning


1 Advanced Machine Learning: Online Convex Optimization
Mehryar Mohri, Courant Institute & Google Research.

2 Outline
Online projected subgradient descent.
Exponentiated Gradient (EG).
Mirror descent.
Dual averaging.

3 Set-Up
Convex set $C$. For $t = 1$ to $T$: predict $w_t \in C$; receive convex loss function $f_t \colon C \to \mathbb{R}$; incur loss $f_t(w_t)$.
Regret of algorithm $A$:
\[
  R_T(A) = \sum_{t=1}^T f_t(w_t) - \inf_{w \in C} \sum_{t=1}^T f_t(w).
\]
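To make the protocol concrete, here is a minimal Python sketch of one pass of the interaction and of the regret computation; the `learner` interface (`predict`/`update`) and the fixed `comparator` argument are illustrative assumptions of this sketch, not notation from the slides.

```python
def run_oco(learner, losses, comparator):
    """One pass of the online convex optimization protocol.
    `losses` is a list of (f_t, grad_f_t) pairs; `comparator` is a fixed
    point of C used in place of the infimum when estimating the regret."""
    cum_loss = 0.0
    cum_comp = 0.0
    for f, grad in losses:
        w_t = learner.predict()          # predict w_t in C
        cum_loss += f(w_t)               # incur loss f_t(w_t)
        learner.update(grad(w_t))        # receive (a subgradient of) f_t
        cum_comp += f(comparator)
    return cum_loss - cum_comp           # regret against this comparator
```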

4 Online Projected Subgradient Descent
Algorithm: $w_1 \in C$ arbitrary;
\[
  w_{t+1} = \Pi_C\big[w_t - \eta\, \nabla f_t(w_t)\big],
\]
where $\Pi_C$ is the projection onto $C$, $\eta > 0$ is a parameter, and $\nabla f_t(w_t)$ is a subgradient of $f_t$ at $w_t$.
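A minimal NumPy sketch of the update, taking $C$ to be the Euclidean unit ball for illustration so that the projection has a simple closed form; the function names and the `subgradients` callback are assumptions of this sketch, not part of the slides.

```python
import numpy as np

def project_l2_ball(w, radius=1.0):
    """Euclidean projection onto the L2 ball of the given radius,
    standing in for the generic projection Pi_C onto a convex set C."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def psgd(subgradients, dim, eta, T, project=project_l2_ball):
    """Online projected subgradient descent: w_{t+1} = Pi_C[w_t - eta * g_t].
    `subgradients[t](w)` should return a subgradient of f_t at w."""
    w = np.zeros(dim)              # w_1 in C (arbitrary)
    iterates = [w.copy()]
    for t in range(T):
        g = subgradients[t](w)     # subgradient of f_t at w_t
        w = project(w - eta * g)   # gradient step followed by projection
        iterates.append(w.copy())
    return iterates
```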

5 Analysis
Assumptions: $\|w_1 - w^*\| \le R$ where $w^* \in \operatorname{argmin}_{w \in C} \sum_{t=1}^T f_t(w)$, and $\|\nabla f_t(w_t)\| \le G$.
Theorem (Zinkevich, 2003): the regret of online projected subgradient descent (PSGD) is bounded as follows:
\[
  R_T(\mathrm{PSGD}) \le \frac{R^2}{2\eta} + \frac{\eta\, G^2 T}{2}.
\]
Choosing $\eta$ to minimize the bound gives $R_T(\mathrm{PSGD}) \le R\, G\, \sqrt{T}$.
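For completeness, the step-size choice behind the $RG\sqrt{T}$ bound is the following one-line minimization over $\eta$ (standard calculus, not spelled out on the slide):
\[
  \frac{d}{d\eta}\Big(\frac{R^2}{2\eta} + \frac{\eta G^2 T}{2}\Big)
  = -\frac{R^2}{2\eta^2} + \frac{G^2 T}{2} = 0
  \;\Longrightarrow\; \eta^* = \frac{R}{G\sqrt{T}},
  \qquad
  \frac{R^2}{2\eta^*} + \frac{\eta^* G^2 T}{2} = R\,G\,\sqrt{T}.
\]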

6 Proof
The proof uses the definition of the subgradient and the contraction property of the projection:
\[
\begin{aligned}
R_T(\mathrm{PSGD})
&= \sum_{t=1}^T f_t(w_t) - f_t(w^*)\\
&\le \sum_{t=1}^T \nabla f_t(w_t) \cdot (w_t - w^*) &&\text{(def. of subgradient)}\\
&= \sum_{t=1}^T \frac{1}{2\eta}\Big[\|w_t - w^*\|^2 + \eta^2 \|\nabla f_t(w_t)\|^2 - \|w_t - \eta\, \nabla f_t(w_t) - w^*\|^2\Big]\\
&\le \sum_{t=1}^T \frac{1}{2\eta}\Big[\|w_t - w^*\|^2 + \eta^2 G^2 - \|w_{t+1} - w^*\|^2\Big] &&\text{(prop. of projection)}\\
&= \frac{1}{2\eta}\Big[\|w_1 - w^*\|^2 - \|w_{T+1} - w^*\|^2\Big] + \frac{\eta\, G^2 T}{2}\\
&\le \frac{1}{2\eta}\,\|w_1 - w^*\|^2 + \frac{\eta\, G^2 T}{2}
\le \frac{R^2}{2\eta} + \frac{\eta\, G^2 T}{2}.
\end{aligned}
\]

7 Strong Convexity
Definition: a convex function $f$ defined over a convex set $C$ is $\lambda$-strongly convex with respect to a norm $\|\cdot\|$ if the function $w \mapsto f(w) - \frac{\lambda}{2}\|w\|^2$ is convex or, equivalently, if for all $w, w'$ in $C$,
\[
  f(w') \ge f(w) + \langle \nabla f(w),\, w' - w\rangle + \frac{\lambda}{2}\|w' - w\|^2.
\]
(Figure: $f$ lies above its first-order approximation $f(w) + \langle\nabla f(w), w' - w\rangle$ by at least the quadratic term.)

8 Strongly Convex Objectives
Theorem (Hazan et al., 2007): assume that the functions $f_t$ are $\lambda$-strongly convex and that $\|\nabla f_t(w)\| \le G$ for all $w$ and $t$. Then, the regret of online projected subgradient descent (PSGD) with parameter $\eta_{t+1} = \frac{1}{\lambda t}$ is bounded as follows:
\[
  R_T(\mathrm{PSGD}) \le \frac{G^2}{2\lambda}\,(1 + \log T).
\]
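A sketch of the corresponding variant with the decaying step size, again using the unit Euclidean ball as a stand-in for $C$; the interface mirrors the earlier PSGD sketch and is an assumption of this sketch, not part of the slides.

```python
import numpy as np

def psgd_strongly_convex(subgradients, dim, lam, T):
    """PSGD with the decaying step size eta_{t+1} = 1 / (lam * t) used in the
    logarithmic-regret theorem for lam-strongly convex losses. The projection
    onto the unit L2 ball stands in for the generic projection Pi_C."""
    def project(w):
        n = np.linalg.norm(w)
        return w if n <= 1.0 else w / n

    w = np.zeros(dim)                       # w_1 in C (arbitrary)
    for t in range(1, T + 1):
        g = subgradients[t - 1](w)          # subgradient of f_t at w_t
        w = project(w - g / (lam * t))      # step size 1/(lam * t)
    return w
```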

9 Proof
\[
\begin{aligned}
R_T(\mathrm{PSGD})
&= \sum_{t=1}^T f_t(w_t) - f_t(w^*)\\
&\le \sum_{t=1}^T \Big[\nabla f_t(w_t)\cdot(w_t - w^*) - \tfrac{\lambda}{2}\|w_t - w^*\|^2\Big] &&\text{(strong convexity)}\\
&= \sum_{t=1}^T \Big[\tfrac{1}{2\eta_{t+1}}\big(\|w_t - w^*\|^2 + \eta_{t+1}^2\|\nabla f_t(w_t)\|^2 - \|w_t - \eta_{t+1}\nabla f_t(w_t) - w^*\|^2\big) - \tfrac{\lambda}{2}\|w_t - w^*\|^2\Big]\\
&\le \sum_{t=1}^T \Big[\tfrac{1}{2\eta_{t+1}}\big(\|w_t - w^*\|^2 + \eta_{t+1}^2 G^2 - \|w_{t+1} - w^*\|^2\big) - \tfrac{\lambda}{2}\|w_t - w^*\|^2\Big] &&\text{(prop. of proj.)}\\
&= \frac{\lambda}{2}\sum_{t=1}^T \Big[(t-1)\|w_t - w^*\|^2 - t\,\|w_{t+1} - w^*\|^2\Big] + \frac{G^2}{2\lambda}\sum_{t=1}^T \frac{1}{t} &&\text{(def. of } \eta_{t+1})\\
&= -\frac{\lambda}{2}\,T\,\|w_{T+1} - w^*\|^2 + \frac{G^2}{2\lambda}\sum_{t=1}^T \frac{1}{t}
\le \frac{G^2}{2\lambda}\sum_{t=1}^T \frac{1}{t}
\le \frac{G^2}{2\lambda}\,(1 + \log T).
\end{aligned}
\]

10 Smoothness
Definition: a continuously differentiable function $f$ is $\beta$-smooth if its gradient is $\beta$-Lipschitz: $\|\nabla f(w') - \nabla f(w)\| \le \beta\,\|w' - w\|$ for all $w, w'$.
Property: if $f$ is convex and $\beta$-smooth, then, for all $w, w'$,
\[
  0 \le f(w) - f(w') - \nabla f(w')\cdot(w - w') \le \frac{\beta}{2}\,\|w - w'\|^2.
\]

11 Exponentiated Gradient (EG)
Convex set: the simplex $C = \{w \in \mathbb{R}^N \colon w \ge 0 \,\wedge\, \|w\|_1 = 1\}$.
Algorithm (Kivinen and Warmuth, 1997): $w_1 = \big(\tfrac{1}{N}, \ldots, \tfrac{1}{N}\big)$ and
\[
  w_{t+1,i} = \frac{w_{t,i}\, \exp\big(-\eta\, [\nabla f_t(w_t)]_i\big)}{Z_t},
  \qquad
  Z_t = \sum_{i=1}^N w_{t,i}\, e^{-\eta\, [\nabla f_t(w_t)]_i}.
\]
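A minimal NumPy sketch of the multiplicative update; the `gradients` callback and the function names are assumptions of this sketch, not notation from the slides.

```python
import numpy as np

def eg_update(w, grad, eta):
    """One Exponentiated Gradient step: w_{t+1,i} proportional to
    w_{t,i} * exp(-eta * [grad f_t(w_t)]_i), renormalized by Z_t."""
    v = w * np.exp(-eta * grad)
    return v / v.sum()

def eg(gradients, N, eta, T):
    """Run EG from the uniform point w_1 = (1/N, ..., 1/N) on the simplex;
    `gradients[t](w)` should return the gradient of f_t at w."""
    w = np.full(N, 1.0 / N)
    for t in range(T):
        w = eg_update(w, gradients[t](w), eta)
    return w
```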

12 Analysis
Assumption: $\|\nabla f_t(w_t)\|_\infty \le G_\infty$.
Theorem: the regret of the Exponentiated Gradient (EG) algorithm is bounded as follows:
\[
  R_T(\mathrm{EG}) \le \frac{\log N}{\eta} + \frac{\eta\, G_\infty^2\, T}{2}.
\]
Choosing $\eta$ to minimize the bound gives $R_T(\mathrm{EG}) \le G_\infty \sqrt{2\, T \log N}$.

13 Proof
Potential: $\Phi_t = D(w^* \,\|\, w_t) = \sum_{i=1}^N w_i^* \log \frac{w_i^*}{w_{t,i}}$ (relative entropy). Then
\[
\begin{aligned}
\Phi_{t+1} - \Phi_t
&= \sum_{i=1}^N w_i^* \log \frac{w_{t,i}}{w_{t+1,i}}
 = \sum_{i=1}^N w_i^* \Big[\log Z_t + \eta\, [\nabla f_t(w_t)]_i\Big]
 = \log Z_t + \eta\, w^* \cdot \nabla f_t(w_t),\\
\log Z_t
&= \log \sum_{i=1}^N w_{t,i}\, e^{-\eta [\nabla f_t(w_t)]_i}
 = \log \operatorname*{E}_{i\sim w_t}\big[e^{-\eta [\nabla f_t(w_t)]_i}\big]\\
&\le -\eta \operatorname*{E}_{i\sim w_t}\big[[\nabla f_t(w_t)]_i\big] + \frac{\eta^2\, 4 G_\infty^2}{8}
 = -\eta\, w_t \cdot \nabla f_t(w_t) + \frac{\eta^2 G_\infty^2}{2}
 &&\text{(Hoeffding's ineq.)}
\end{aligned}
\]

14 Proof
Combining the equality and the inequality:
\[
\begin{aligned}
&\Phi_{t+1} - \Phi_t \le \frac{\eta^2 G_\infty^2}{2} + \eta\, (w^* - w_t) \cdot \nabla f_t(w_t)\\
\Rightarrow\;& \eta\, (w_t - w^*) \cdot \nabla f_t(w_t) \le \frac{\eta^2 G_\infty^2}{2} + (\Phi_t - \Phi_{t+1})\\
\Rightarrow\;& \eta \sum_{t=1}^T (w_t - w^*) \cdot \nabla f_t(w_t) \le \frac{\eta^2 G_\infty^2\, T}{2} + \Phi_1 - \Phi_{T+1}\\
\Rightarrow\;& R_T(\mathrm{EG}) = \sum_{t=1}^T f_t(w_t) - f_t(w^*)
 \le \sum_{t=1}^T \nabla f_t(w_t) \cdot (w_t - w^*)\\
&\qquad \le \frac{\eta\, G_\infty^2\, T}{2} + \frac{\Phi_1 - \Phi_{T+1}}{\eta}
 \le \frac{\eta\, G_\infty^2\, T}{2} + \frac{\Phi_1}{\eta} &&\text{(rel. entropy non-neg.)}\\
&\qquad = \frac{\eta\, G_\infty^2\, T}{2} + \frac{D(w^* \,\|\, w_1)}{\eta}
 \le \frac{\eta\, G_\infty^2\, T}{2} + \frac{\log N}{\eta}.
\end{aligned}
\]

15 Convex Optimization
Application: fixed loss function $f_t = f$, objective $\min_{w \in C} f(w)$.
Guarantee for the average weight vector $\bar{w}_T = \frac{1}{T}\sum_{t=1}^T w_t$ (using convexity of $f$ for the first inequality):
\[
  f(\bar{w}_T) - f(w^*)
  \le \frac{1}{T}\sum_{t=1}^T \big[f(w_t) - f(w^*)\big]
  = \frac{R_T(A)}{T}
  = O\Big(\frac{1}{\sqrt{T}}\Big).
\]
Thus, convergence to an $\epsilon$-accurate solution in $O(1/\epsilon^2)$ iterations.
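A hedged sketch of this online-to-batch use: the `online_step(w, g)` callable is a placeholder for one step of any no-regret algorithm (for example, a projected-subgradient step), not a specific method from the slides.

```python
import numpy as np

def minimize_fixed_f(online_step, grad_f, w1, T):
    """Online-to-batch conversion: run a no-regret online algorithm on the
    fixed loss f_t = f for T rounds and return the average iterate w_bar;
    by convexity, f(w_bar) - f(w*) <= (1/T) * sum_t [f(w_t) - f(w*)] = R_T / T."""
    w = w1.copy()
    avg = np.zeros_like(w1)
    for _ in range(T):
        avg += w                        # accumulate w_1, ..., w_T
        w = online_step(w, grad_f(w))   # one step of the online algorithm
    return avg / T
```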

16 Generalization
PSGD and EG are both special instances of a more general algorithm: Mirror Descent. Mirror Descent is based on a Bregman divergence:
PSGD: $B(w \,\|\, w') = \frac{1}{2}\|w - w'\|_2^2$.
EG: unnormalized relative entropy, $B(w \,\|\, w') = \sum_{i=1}^N \big[w_i \log\frac{w_i}{w'_i} - w_i + w'_i\big]$.
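The two divergences can be written out directly; a small sketch (function names are mine, not from the slides):

```python
import numpy as np

def bregman_sq_euclidean(w, w0):
    """B(w || w0) for Phi(w) = 0.5 * ||w||_2^2: the divergence underlying PSGD."""
    return 0.5 * np.sum((w - w0) ** 2)

def bregman_unnorm_rel_entropy(w, w0):
    """Unnormalized relative entropy, the divergence underlying EG
    (entries are assumed strictly positive)."""
    return np.sum(w * np.log(w / w0) - w + w0)
```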

17 Bregman Divergence
Definition: let $\Phi$ be a convex differentiable function over an open convex set $C$. The Bregman divergence associated to $\Phi$ is defined by
\[
  B_\Phi(w \,\|\, w') = \Phi(w) - \Phi(w') - \langle \nabla \Phi(w'),\, w - w'\rangle.
\]
(Figure: $B_\Phi(w \,\|\, w')$ is the gap between $\Phi(w)$ and the first-order approximation $\Phi(w') + \langle\nabla\Phi(w'), w - w'\rangle$.)

18 Properties
Proposition: the following properties hold for a Bregman divergence.
Non-negativity: $\forall w, w' \in C$, $B_\Phi(w \,\|\, w') \ge 0$.
Linearity: $B_{\Phi + \Psi} = B_\Phi + B_\Psi$.
Projection: for any closed convex set $K \subseteq C$, the $\Phi$-projection of $w'$ onto $K$, $P_K(w') = \operatorname{argmin}_{w \in K} B_\Phi(w \,\|\, w')$, is unique.
Triangular identity: $\big(\nabla\Phi(w) - \nabla\Phi(v)\big) \cdot (w - u) = B_\Phi(u \,\|\, w) + B_\Phi(w \,\|\, v) - B_\Phi(u \,\|\, v)$.
Pythagorean theorem: for $w \in K$, $B_\Phi(w \,\|\, w') \ge B_\Phi(w \,\|\, P_K(w')) + B_\Phi(P_K(w') \,\|\, w')$.

19 Pythagorean Theorem
(Figure: $w \in K$, $w'$ outside $K$, and its projection $P_K(w')$.)
\[
  B_\Phi(w \,\|\, w') \ge B_\Phi(w \,\|\, P_K(w')) + B_\Phi(P_K(w') \,\|\, w').
\]

20 Legendre-Type Functions
Definition (Rockafellar, 1970): a real-valued function $\Phi$ defined over a non-empty open convex set $C$ is said to be of Legendre type if it is proper, closed, convex, and differentiable over $C$, and if one of the following equivalent conditions holds:
$\lim_{w \to \partial C} \|\nabla \Phi(w)\| = +\infty$;
$\nabla\Phi$ is a one-to-one mapping from $C$ to $\nabla\Phi(C)$.

21 Mirror Descent
(Nemirovski and Yudin, 1983)
(Figure: the mirror map $\nabla\Phi$ sends $w_t \in C$ to the dual space, where the step $\nabla\Phi(w_t) - \eta\,\nabla f_t(w_t) = \nabla\Phi(v_{t+1})$ is taken; $v_{t+1} = [\nabla\Phi]^{-1}(\cdot)$ is mapped back to $C$ and Bregman-projected onto $K$ to give $w_{t+1}$.)

22 Mirror Descent
Mirror-Descent()
1  $w_1 \leftarrow \operatorname{argmin}_{w \in K \cap C} \Phi(w)$
2  for $t \leftarrow 1$ to $T$ do
3      $v_{t+1} \leftarrow [\nabla\Phi]^{-1}\big(\nabla\Phi(w_t) - \eta\, \nabla f_t(w_t)\big)$
4      $w_{t+1} \leftarrow \operatorname{argmin}_{w \in K \cap C} B(w \,\|\, v_{t+1})$
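A minimal sketch of the generic scheme instantiated with the negative-entropy mirror map on the simplex, where every step has a closed form; with this choice the update reduces to the EG update, which makes it a convenient instance to check the scheme against. The names and the `gradients` callback are assumptions of this sketch.

```python
import numpy as np

def mirror_descent_simplex(gradients, N, eta, T):
    """Mirror descent with the negative-entropy mirror map
    Phi(w) = sum_i w_i log w_i on the probability simplex."""
    w = np.full(N, 1.0 / N)            # w_1 = argmin of Phi over the simplex
    for t in range(T):
        g = gradients[t](w)            # (sub)gradient of f_t at w_t
        theta = np.log(w) - eta * g    # dual step: grad Phi(w_t) - eta*g (up to a constant)
        v = np.exp(theta)              # map back with [grad Phi]^{-1} (up to a constant)
        w = v / v.sum()                # Bregman (KL) projection onto the simplex
    return w
```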

23 MD Guarantee
Theorem: let $C$ be a non-empty open convex set and $K$ a compact convex set. Assume that $\Phi \colon C \to \mathbb{R}$ is of Legendre type and $\sigma$-strongly convex with respect to $\|\cdot\|$, and that each $f_t$ is convex and $G$-Lipschitz with respect to $\|\cdot\|$. Then, the regret of Mirror Descent can be bounded as follows:
\[
  R_T(\mathrm{MD}) \le \frac{B(w^* \,\|\, w_1)}{\eta} + \frac{\eta\, G^2 T}{2\sigma}.
\]
Choosing $\eta$ to minimize the bound gives $R_T(\mathrm{MD}) \le D\, G\, \sqrt{\frac{2T}{\sigma}}$, with $B(w^* \,\|\, w_1) \le D^2$.

24 Proof
\[
\begin{aligned}
R_T(\mathrm{MD})
&= \sum_{t=1}^T f_t(w_t) - f_t(w^*)\\
&\le \sum_{t=1}^T \nabla f_t(w_t)\cdot(w_t - w^*) &&\text{(def. of subgrad.)}\\
&= \frac{1}{\eta}\sum_{t=1}^T \big[\nabla\Phi(w_t) - \nabla\Phi(v_{t+1})\big]\cdot(w_t - w^*) &&\text{(def. of } v_{t+1})\\
&= \frac{1}{\eta}\sum_{t=1}^T \Big[B(w^* \,\|\, w_t) - B(w^* \,\|\, v_{t+1}) + B(w_t \,\|\, v_{t+1})\Big] &&\text{(Breg. div. identity)}\\
&\le \frac{1}{\eta}\sum_{t=1}^T \Big[B(w^* \,\|\, w_t) - B(w^* \,\|\, w_{t+1}) - B(w_{t+1} \,\|\, v_{t+1}) + B(w_t \,\|\, v_{t+1})\Big] &&\text{(Pythagorean ineq.)}\\
&= \frac{1}{\eta}\Big[B(w^* \,\|\, w_1) - B(w^* \,\|\, w_{T+1})\Big] + \frac{1}{\eta}\sum_{t=1}^T \Big[B(w_t \,\|\, v_{t+1}) - B(w_{t+1} \,\|\, v_{t+1})\Big]\\
&\le \frac{1}{\eta}\, B(w^* \,\|\, w_1) + \frac{1}{\eta}\sum_{t=1}^T \Big[B(w_t \,\|\, v_{t+1}) - B(w_{t+1} \,\|\, v_{t+1})\Big].
\end{aligned}
\]

25 Proof
\[
\begin{aligned}
B(w_t \,\|\, v_{t+1}) - B(w_{t+1} \,\|\, v_{t+1})
&= \Phi(w_t) - \Phi(w_{t+1}) - \nabla\Phi(v_{t+1})\cdot(w_t - w_{t+1})\\
&\le \big[\nabla\Phi(w_t) - \nabla\Phi(v_{t+1})\big]\cdot(w_t - w_{t+1}) - \frac{\sigma}{2}\|w_t - w_{t+1}\|^2 &&\text{($\sigma$-strong convexity)}\\
&= \eta\, \nabla f_t(w_t)\cdot(w_t - w_{t+1}) - \frac{\sigma}{2}\|w_t - w_{t+1}\|^2 &&\text{(def. of } v_{t+1})\\
&\le \eta\, G\, \|w_t - w_{t+1}\| - \frac{\sigma}{2}\|w_t - w_{t+1}\|^2 &&\text{($G$-Lipschitzness)}\\
&\le \frac{(\eta G)^2}{2\sigma}. &&\text{(max. of 2nd-degree polynomial)}
\end{aligned}
\]

26 Equivalent Description
Mirror-Descent()
1  $w_1 \leftarrow \operatorname{argmin}_{w \in K \cap C} \Phi(w)$
2  for $t \leftarrow 1$ to $T - 1$ do
3      $w_{t+1} \leftarrow \operatorname{argmin}_{w \in K \cap C}\; \nabla f_t(w_t) \cdot w + \frac{1}{\eta}\, B(w \,\|\, w_t)$
(linearization of $f_t$ plus Bregman regularization)
Proof:
\[
\begin{aligned}
w_{t+1}
&= \operatorname*{argmin}_{w \in K \cap C} B(w \,\|\, v_{t+1})\\
&= \operatorname*{argmin}_{w \in K \cap C} \Phi(w) - \nabla\Phi(v_{t+1}) \cdot w &&\text{(def. of Breg. div.)}\\
&= \operatorname*{argmin}_{w \in K \cap C} \Phi(w) - \big[\nabla\Phi(w_t) - \eta\,\nabla f_t(w_t)\big]\cdot w &&\text{(def. of } v_{t+1})\\
&= \operatorname*{argmin}_{w \in K \cap C} \eta\,\nabla f_t(w_t)\cdot w + B(w \,\|\, w_t). &&\text{(def. of Breg. div.)}
\end{aligned}
\]

27 Dual Averaging
(Iouditski and Nesterov, 2010)
Dual-Averaging()
1  $v_1 \leftarrow \operatorname{argmin}_{w \in C} \Phi(w)$  (so that $\nabla\Phi(v_1) = 0$)
2  $w_1 \leftarrow \operatorname{argmin}_{w \in K \cap C} B(w \,\|\, v_1)$
3  for $t \leftarrow 1$ to $T$ do
4      $v_{t+1} \leftarrow [\nabla\Phi]^{-1}\big(\nabla\Phi(v_t) - \eta\, \nabla f_t(w_t)\big)$
5      $w_{t+1} \leftarrow \operatorname{argmin}_{w \in K \cap C} B(w \,\|\, v_{t+1})$
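A matching sketch of Dual Averaging with the same entropic mirror map on the simplex as in the Mirror Descent sketch; the only change is that the dual iterate accumulates the past gradients instead of being recomputed from $w_t$. Names and interface are assumptions of this sketch.

```python
import numpy as np

def dual_averaging_simplex(gradients, N, eta, T):
    """Dual averaging with the negative-entropy mirror map on the simplex.
    `theta` plays the role of grad Phi(v_t) up to an additive constant
    (the constant cancels when we normalize), so the dual iterate simply
    accumulates -eta times the past gradients; w_{t+1} is the Bregman (KL)
    projection of v_{t+1} onto the simplex, i.e. a normalization."""
    theta = np.zeros(N)
    w = np.full(N, 1.0 / N)        # w_1: projection of v_1 onto the simplex
    for t in range(T):
        g = gradients[t](w)        # gradient of f_t at w_t
        theta -= eta * g           # dual-space update (uses v_t, not w_t)
        v = np.exp(theta)
        w = v / v.sum()
    return w
```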

28 Equivalent Description
Equivalent form:
\[
\begin{aligned}
w_{t+1}
&= \operatorname*{argmin}_{w \in K \cap C} B(w \,\|\, v_{t+1})\\
&= \operatorname*{argmin}_{w \in K \cap C} \Phi(w) - \nabla\Phi(v_{t+1}) \cdot w &&\text{(def. of Breg. div.)}\\
&= \operatorname*{argmin}_{w \in K \cap C} \Phi(w) - \big[\nabla\Phi(v_t) - \eta\,\nabla f_t(w_t)\big]\cdot w &&\text{(def. of } v_{t+1})\\
&= \operatorname*{argmin}_{w \in K \cap C} \eta \sum_{s=1}^{t} \nabla f_s(w_s)\cdot w + \Phi(w). &&\text{(recurrence)}
\end{aligned}
\]
In particular, for linear losses $f_t(w) = a_t \cdot w$, Dual Averaging coincides with regularized Follow-the-Leader (FTRL):
\[
  w_{t+1} = \operatorname*{argmin}_{w \in K \cap C} \sum_{s=1}^{t} a_s \cdot w + \frac{1}{\eta}\,\Phi(w).
\]
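As a concrete instance (not worked out on the slide): with the negative-entropy regularizer $\Phi(w) = \sum_i w_i \log w_i$ on the simplex, the regularized-FTL iterate above has the closed form
\[
  w_{t+1,i} = \frac{\exp\big(-\eta \sum_{s=1}^{t} a_{s,i}\big)}{\sum_{j=1}^{N} \exp\big(-\eta \sum_{s=1}^{t} a_{s,j}\big)},
\]
which matches the entropic dual-averaging sketch given earlier.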

29 DA Guarantee
Theorem: under the same assumptions as for MD, the following holds for the regret of Dual Averaging:
\[
  R_T(\mathrm{DA}) \le \frac{\Phi(w^*) - \Phi(w_1)}{\eta} + \frac{2\,\eta\, G^2 T}{\sigma}.
\]
Choosing $\eta$ to minimize the bound gives $R_T(\mathrm{DA}) \le 2 D\, G\, \sqrt{\frac{2T}{\sigma}}$, with $\Phi(w^*) - \Phi(w_1) \le D^2$.

30 References
Abraham Flaxman, Adam Tauman Kalai, and H. Brendan McMahan. Online convex optimization in the bandit setting: gradient descent without a gradient. In SODA. SIAM, 2005.
Baruch Awerbuch and Robert Kleinberg. Online linear optimization and adaptive routing. J. Comput. Syst. Sci., 74(1):97-114, 2008.
Amir Beck and Marc Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters, 31(3), 2003.
Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
A. J. Grove, N. Littlestone, and D. Schuurmans. General convergence results for linear discriminant updates. Machine Learning, 43(3), 2001.

31 References
Elad Hazan, Amit Agarwal, and Satyen Kale. Logarithmic regret algorithms for online convex optimization. Machine Learning, 69(2-3), 2007.
Anatoli Iouditski and Yuri Nesterov. Primal-dual subgradient methods for minimizing uniformly convex functions. HAL preprint, 2010.
Adam T. Kalai and Santosh Vempala. Efficient algorithms for online decision problems. J. Comput. Syst. Sci., 71(3), 2005.
Jyrki Kivinen and Manfred K. Warmuth. Additive versus exponentiated gradient updates for linear prediction. Information and Computation, 132(1):1-64, 1997.
Jyrki Kivinen and Manfred K. Warmuth. Relative loss bounds for multidimensional regression problems. Machine Learning, 45(3), 2001.
Yurii Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Academic Publishers, 2004.
R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press, 1970.

32 References
Arkadii Semenovich Nemirovski and David Berkovich Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley, New York, 1983.
Eiji Takimoto and Manfred K. Warmuth. Path kernels and multiplicative updates. JMLR, 4, 2003.
Martin Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In ICML, 2003.
