Tutorial: PART 2. Online Convex Optimization, A Game-Theoretic Approach to Learning
1 Tutorial: PART 2: Online Convex Optimization, A Game-Theoretic Approach to Learning. Elad Hazan (Princeton University), Satyen Kale (Yahoo Research)
2 Exploiting curvature: logarithmic regret
3 Logarithmic regret
Some natural problems admit better regret: least-squares linear regression, soft-margin SVM, portfolio selection.
$\alpha$-strong convexity (e.g. $f(x) = \|x\|^2$): $\nabla^2 f(x) \succeq \alpha I$.
$\alpha$-exp-concavity (e.g. $f(x) = -\log\langle a, x\rangle$): $\nabla^2 f(x) \succeq \alpha\, \nabla f(x) \nabla f(x)^\top$.
4 Online Soft-Margin SVM
Features $a \in \mathbb{R}^d$, labels $b \in \{-1, 1\}$.
Hypothesis class $H$ is a set of linear predictors; parameters: weight vectors $x \in \mathbb{R}^d$. Prediction of weight vector $x$ for feature vector $a$ is $a^\top x$.
Loss: $f_t(x) = \max\{0,\, 1 - b_t \langle a_t, x\rangle\} + \lambda \|x\|^2$. Convex, non-smooth, strongly convex.
Thm [Hazan, Agarwal, Kale 2007]: OGD with step sizes $\eta_t \propto 1/t$ has $O(\log T)$ regret.
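To make the $1/t$ step-size schedule concrete, here is a minimal OGD sketch for these hinge-plus-regularization losses. The function name is hypothetical, the constant $1/(\lambda t)$ in the step size is an assumption, and projection onto a bounded set is omitted.

```python
import numpy as np

def ogd_soft_margin_svm(examples, lam=0.1):
    """Sketch of OGD on f_t(x) = max(0, 1 - b_t*<a_t,x>) + lam*||x||^2,
    using step sizes eta_t = 1/(lam*t) (the 1/t schedule; the exact
    constant is an assumption, and projection onto K is omitted)."""
    d = examples[0][0].shape[0]
    x = np.zeros(d)
    losses = []
    for t, (a, b) in enumerate(examples, start=1):
        losses.append(max(0.0, 1.0 - b * a.dot(x)) + lam * x.dot(x))
        # subgradient: -b*a on the hinge part if the margin is violated, plus 2*lam*x
        g = (-b * a if b * a.dot(x) < 1.0 else np.zeros(d)) + 2.0 * lam * x
        x = x - (1.0 / (lam * t)) * g
    return x, losses
```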
5 Universal Portfolio Selection
Loss function $f_t(x) = -\log\langle r_t, x\rangle$. Convex, but no strongly convex component.
But exp-concave: $\exp(-f_t(x)) = \langle r_t, x\rangle$ (a concave function: actually, linear).
Definition: $f$ is called $\alpha$-exp-concave if $\exp(-\alpha f(x))$ is concave.
Theorem [Hazan, Agarwal, Kale 2007]: for exp-concave losses, there is an algorithm, Online Newton Step, that obtains $O(\log T)$ regret.
6 Online Newton Step algorithm
Use $x_1 =$ arbitrary point in $K$ in round 1. For $t > 1$, use $x_t$ defined by the following equations:
$\nabla_{t-1} = \nabla f_{t-1}(x_{t-1})$
$C_{t-1} = \sum_{\tau=1}^{t-1} \nabla_\tau \nabla_\tau^\top$ (cumulative Hessian)
$g_{t-1} = \sum_{\tau=1}^{t-1} \big(\nabla_\tau \nabla_\tau^\top x_\tau - c\, \nabla_\tau\big)$ (cumulative gradient step; $c$ is a constant depending on $\alpha$ and other bounds)
$y_t = C_{t-1}^{-1}\, g_{t-1}$ (Newton update)
$x_t = \arg\min_{x \in K} (x - y_t)^\top C_{t-1} (x - y_t)$ (projection in the norm induced by $C_{t-1}$)
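A minimal sketch of Online Newton Step in its incremental form, which is assumed here to match the FTL-style description above up to the projection step. The parameters `gamma` and `eps` stand in for the constant $c$ and are assumptions to be set from $\alpha$ and the usual diameter/gradient bounds; the generalized projection onto $K$ is omitted.

```python
import numpy as np

def online_newton_step(grad_fns, x1, gamma=1.0, eps=1.0):
    """Sketch of ONS. grad_fns[t] maps x to grad f_t(x); gamma and eps
    are assumed stand-ins for the constant 'c' on the slide."""
    d = x1.shape[0]
    x = x1.astype(float).copy()
    A = eps * np.eye(d)                  # running approximation of C_t
    iterates = []
    for grad in grad_fns:
        iterates.append(x.copy())
        g = grad(x)
        A += np.outer(g, g)              # C_t = sum_tau grad_tau grad_tau^T
        # Newton-style step; the projection onto K in the A-norm is omitted here
        x = x - (1.0 / gamma) * np.linalg.solve(A, g)
    return iterates
```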
7 Regularization: follow the regularized leader, online mirror descent
8 Follow-the-leader
Most natural algorithm: $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} f_i(x)$; Regret $= \sum_t f_t(x_t) - \min_{x \in K} \sum_t f_t(x)$.
The be-the-leader algorithm $x'_t = \arg\min_{x \in K} \sum_{i=1}^{t} f_i(x) = x_{t+1}$ has zero regret [Kalai-Vempala, 2005].
So if $f_t(x_t) \approx f_t(x_{t+1})$, we get a regret bound.
Instability: $f_t(x_t) - f_t(x_{t+1})$ can be large!
9 Fixing FTL: Follow-The-Regularized-Leader (FTRL)
First, it suffices to replace $f_t$ by a linear function, $\langle \nabla f_t(x_t), x\rangle$.
Next, add regularization: $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} \nabla f_i(x_i)^\top x + \frac{1}{\eta} R(x)$.
$R(x)$ is a strongly convex function, $\eta$ is a regularization constant.
Under certain conditions, this ensures stability: $\nabla f_t(x_t)^\top x_t - \nabla f_t(x_t)^\top x_{t+1} \le O(\eta)$.
10 Specific instances of FTRL
$R(x) = \|x\|^2$: $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} \nabla f_i(x_i)^\top x + \frac{1}{\eta} R(x) = \Pi_K\!\big(-\eta \sum_{i=1}^{t-1} \nabla f_i(x_i)\big)$.
Here, $\Pi_K(y) = \arg\min_{x \in K} \|y - x\|$.
Almost like OGD: starting with $y_1 = 0$, for $t = 1, 2, \dots$: $x_t = \Pi_K(y_t)$, $y_{t+1} = y_t - \eta \nabla f_t(x_t)$.
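A minimal sketch of this "lazy OGD" form of FTRL. For illustration only, $K$ is taken to be a Euclidean ball so that the projection $\Pi_K$ is a simple rescaling; the function name and interface are assumptions.

```python
import numpy as np

def lazy_ogd_ball(grad_fns, d, eta=0.1, radius=1.0):
    """Sketch of FTRL with R(x) = ||x||^2: accumulate negative gradients
    in y and project onto K (a Euclidean ball here) before playing."""
    y = np.zeros(d)
    iterates = []
    for grad in grad_fns:
        norm = np.linalg.norm(y)
        x = y if norm <= radius else (radius / norm) * y   # x_t = Pi_K(y_t)
        iterates.append(x)
        y = y - eta * grad(x)                              # y_{t+1} = y_t - eta * grad
    return iterates
```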
11 Specific instances of FTRL
Experts setting: $K = \Delta_n$, distributions over experts; $f_t(x) = \langle c_t, x\rangle$, where $c_t$ is the vector of losses.
$R(x) = \sum_i x_i \log x_i$: negative entropy.
$x_t = \arg\min_{x \in \Delta_n} \sum_{i=1}^{t-1} \nabla f_i(x_i)^\top x + \frac{1}{\eta} R(x) = \exp\!\big(-\eta \sum_{i=1}^{t-1} c_i\big) / Z_t$ (entrywise exponential; $Z_t$ is a normalization constant).
Exactly RWM! Starting with $y_1 =$ uniform distribution, for $t = 1, 2, \dots$: $x_t = \tilde\Pi_K(y_t)$, $y_{t+1} = y_t \odot \exp(-\eta c_t)$ (entrywise product with entrywise exponential; projection = normalization).
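A short sketch of this entropic-FTRL / RWM update: multiply the weights entrywise by $\exp(-\eta c_t)$ and renormalize. The function name and the choice of $\eta$ are illustrative assumptions.

```python
import numpy as np

def rwm(loss_vectors, eta=0.1):
    """Sketch of Randomized Weighted Majority (entropic FTRL over the
    simplex): multiplicative weights update, projection = normalization."""
    n = loss_vectors[0].shape[0]
    y = np.ones(n) / n                   # y_1 = uniform distribution
    plays = []
    for c in loss_vectors:
        x = y / y.sum()                  # x_t = normalized weights
        plays.append(x)
        y = y * np.exp(-eta * c)         # entrywise multiplicative update
    return plays
```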
12 FTRL ≡ Online Mirror Descent
FTRL: $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} \nabla f_i(x_i)^\top x + \frac{1}{\eta} R(x)$.
Online Mirror Descent: $y_{t+1} = (\nabla R)^{-1}\big(\nabla R(y_t) - \eta \nabla f_t(x_t)\big)$, $x_t = \Pi^R_K(y_t)$.
Bregman projection: $\Pi^R_K(y) = \arg\min_{x \in K} B_R(x \,\|\, y)$, where $B_R(x \,\|\, y) := R(x) - R(y) - \nabla R(y)^\top (x - y)$.
13 Advanced regularization: follow the perturbed leader, AdaGrad
14 Randomized regularization: Follow-The-Perturbed-Leader
Special case of OCO, Online Linear Optimization: $f_t(x) = \langle c_t, x\rangle$.
Follow-The-Perturbed-Leader [Kalai-Vempala 2005]: $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} c_i^\top x + p^\top x$.
Regularization $p^\top x$: vector $p$ with entries drawn u.a.r. from $[-\sqrt{T}, \sqrt{T}]$.
Expected regret is the same if $p$ is chosen afresh every round.
Re-randomizing every round: $x_t = x_{t+1}$ w.p. at least $1 - 1/\sqrt{T}$, giving regret $= O(\sqrt{T})$.
15 Follow-The-Perturbed-Leader
Algorithm: $p$ is a random vector with entries in $[-\sqrt{T}, \sqrt{T}]$; $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} c_i^\top x + p^\top x$.
Implementation requires only linear optimization over $K$, easier than the projections needed by OMD.
The $O(\sqrt{T})$ regret bound holds for arbitrary sets $K$: convexity unnecessary!
Can be extended to convex losses using the gradient trick + using the expected perturbed leader in each round.
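A minimal FTPL sketch for linear losses, assuming access to a linear optimization oracle over $K$. The exact perturbation distribution used here (uniform entries on the $\sqrt{T}$ scale) and the oracle interface are assumptions.

```python
import numpy as np

def ftpl(loss_vectors, linear_opt_oracle, T, rng=None):
    """Sketch of Follow-The-Perturbed-Leader: one linear optimization
    call per round on the perturbed cumulative losses."""
    rng = rng or np.random.default_rng(0)
    d = loss_vectors[0].shape[0]
    p = rng.uniform(-np.sqrt(T), np.sqrt(T), size=d)   # drawn once, reused each round
    cum = np.zeros(d)
    plays = []
    for c in loss_vectors:
        plays.append(linear_opt_oracle(cum + p))       # x_t = argmin_{x in K} <cum + p, x>
        cum += c
    return plays

# Example oracle when K is the probability simplex: return the best vertex e_i.
def simplex_oracle(v):
    x = np.zeros(len(v))
    x[np.argmin(v)] = 1.0
    return x
```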
16 Adaptive Regularization: AdaGrad
Consider the use of OGD in learning a generalized linear model.
For parameter $x \in \mathbb{R}^d$ and example $(a, b) \in \mathbb{R}^d \times \mathbb{R}$, the prediction is some function of $\langle a, x\rangle$. Thus, $f_t(x) = \ell(\langle a_t, x\rangle, b_t)$.
OGD update: $x_{t+1} = x_t - \eta \nabla f_t(x_t) = x_t - \eta\, \ell'(\langle a_t, x_t\rangle, b_t)\, a_t$.
Thus, features are treated equally in updating the parameter vector. In typical text classification tasks, feature vectors $a_t$ are very sparse. Slow learning!
Adaptive regularization: implements per-feature learning rates.
17 AdaGrad (diagonal form) [Duchi, Hazan, Singer 10]
Set $x_1 \in K$ arbitrarily. For $t = 1, 2, \dots$:
1. use $x_t$, obtain $f_t$;
2. compute $x_{t+1}$ as follows:
$G_t = \mathrm{diag}\big(\sum_{i=1}^{t} \nabla f_i(x_i) \nabla f_i(x_i)^\top\big)$
$y_{t+1} = x_t - \eta\, G_t^{-1/2} \nabla f_t(x_t)$
$x_{t+1} = \arg\min_{x \in K} (y_{t+1} - x)^\top G_t\, (y_{t+1} - x)$
Regret bound: $O\big(\sum_i \sqrt{\sum_t \nabla f_t(x_t)_i^2}\big)$.
Infrequently occurring, or small-scale, features have small influence on regret (and therefore on convergence to the optimal parameter).
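A minimal sketch of the diagonal AdaGrad update above. The projection onto $K$ in the $G_t$ norm is omitted, and the small stabilizer `eps` is an assumption added for numerical safety.

```python
import numpy as np

def adagrad_diagonal(grad_fns, x1, eta=1.0, eps=1e-8):
    """Sketch of diagonal AdaGrad: per-coordinate step sizes scaled by
    inverse square roots of accumulated squared gradients."""
    x = x1.astype(float).copy()
    G = np.zeros_like(x)                 # diagonal of sum_i grad_i grad_i^T
    iterates = []
    for grad in grad_fns:
        iterates.append(x.copy())
        g = grad(x)
        G += g * g
        x = x - eta * g / (np.sqrt(G) + eps)   # per-feature learning rates
    return iterates
```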
18 Bandit Convex Optimization
19 Bandit Optimization
Motivating example: ad selection on websites. Simplest form: in each round $t$, choose one ad $i_t$ out of $n$; simultaneously, the user chooses a loss vector $c_t$; observe only the loss of the chosen ad, $c_t(i_t)$.
$c_t(i) = -$revenue if ad $i$ would be clicked if shown, $0$ otherwise.
Regret $= \sum_{t=1}^{T} c_t(i_t) - \min_i \sum_{t=1}^{T} c_t(i)$.
Essentially, the experts problem with partial feedback. Modeled as online linear optimization over the $n$-simplex of distributions.
Available choices are usually called arms (from multi-armed bandits). Randomization is necessary, so consider expected regret.
20 Constructing cost estimators
Exploration: choose arm $i_t$ from $\{1, 2, \dots, n\}$ u.a.r.; observe $c_t(i_t)$; set $\hat c_t = n\, c_t(i_t)\, e_{i_t}$. Then $\mathbb{E}[\hat c_t] = c_t$.
Exploitation: $x_t = \arg\min_{x \in \Delta_n} \sum_{i=1}^{t-1} \hat c_i^\top x + \frac{1}{\eta} R(x)$.
21 Separate explore-exploit template
Algorithm: 1. w.p. $q$ explore (as before), creating an estimator with $\mathbb{E}[\hat c_t] = c_t$; 2. otherwise exploit: $x_t = \arg\min_{x} \sum_{i=1}^{t-1} \hat c_i^\top x + \frac{1}{\eta} R(x)$.
Regret is essentially (ignoring the dependence on $n$) bounded by $qT + (1-q)\,\mathrm{Regret}_{\mathrm{FTRL}} = O\big(qT + \sqrt{T/q}\big) = O(T^{2/3})$.
Can we do better?
22 Simultaneous exploration-exploitation [Auer, Cesa-Bianchi, Freund, Schapire 2002]
Exploration+Exploitation: choose arm $i_t$ from the distribution $x_t$; observe $c_t(i_t)$; set $\hat c_t = \frac{c_t(i_t)}{x_t(i_t)}\, e_{i_t}$. Then $\mathbb{E}[\hat c_t] = c_t$.
$x_t = \arg\min_{x \in \Delta_n} \sum_{i=1}^{t-1} \hat c_i^\top x + \frac{1}{\eta} R(x)$.
With $R =$ negative entropy, this algorithm (called EXP3) has $O(\sqrt{T})$ regret.
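A minimal sketch of this simultaneous exploration-exploitation scheme. It keeps only the multiplicative-weights update on importance-weighted estimates as on the slide; the extra uniform-exploration mixing used in some presentations of EXP3 is intentionally omitted, and the interface is an assumption.

```python
import numpy as np

def exp3(loss_matrix, eta=0.05, rng=None):
    """Sketch of EXP3-style bandit learning: sample an arm from x_t,
    form the importance-weighted estimate c_t(i_t)/x_t(i_t) on the
    chosen coordinate, and apply the multiplicative-weights update.
    loss_matrix[t, i] is the loss of arm i in round t."""
    rng = rng or np.random.default_rng(0)
    T, n = loss_matrix.shape
    w = np.ones(n)
    total_loss = 0.0
    for t in range(T):
        x = w / w.sum()
        i = rng.choice(n, p=x)                       # exploration = exploitation
        total_loss += loss_matrix[t, i]
        c_hat = np.zeros(n)
        c_hat[i] = loss_matrix[t, i] / x[i]          # unbiased estimate of c_t
        w = w * np.exp(-eta * c_hat)
    return total_loss
```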
23 Bandit Linear Optimization
Generalize from multi-armed bandits. $K$: convex set in $\mathbb{R}^d$; linear cost functions $f_t(x) = \langle c_t, x\rangle$.
Approach: 1. (Exploration) construct unbiased estimators with $\mathbb{E}[\hat c_t] = c_t$; 2. (Exploitation) plug into FTRL: $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} \hat c_i^\top x + \frac{1}{\eta} R(x)$.
24 Bandit Linear Optimization
$K$: convex set in $\mathbb{R}^d$; linear cost functions $f_t(x) = \langle c_t, x\rangle$. Assume $K$ contains the exploration basis $e_1, e_2, \dots, e_n$.
In round $t$:
1. W.p. $q$, choose $i_t \in \{1, 2, \dots, n\}$ u.a.r. and use $e_{i_t}$; observe $c_t(i_t)$ and set $\hat c_t = \frac{c_t(i_t)}{q/n}\, e_{i_t}$.
2. W.p. $1-q$, use $x_t$ obtained from FTRL, $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} \hat c_i^\top x + \frac{1}{\eta} R(x)$, and set $\hat c_t = 0$.
Then $\mathbb{E}[\hat c_t] = c_t$. Gives a regret bound of $O(T^{2/3})$.
25 Geometric regularization via barriers [Abernethy, Hazan, Rakhlin 2008]
$R =$ self-concordant barrier for $K$. The Dikin ellipsoid at any point $x$ lies in $K$: $\{y : (y - x)^\top \nabla^2 R(x)\, (y - x) \le 1\}$.
Algorithm: use a random endpoint of the axes of the ellipsoid at $x_t$, where $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} \hat c_i^\top x + \frac{1}{\eta} R(x)$.
This plays $x_t$ in expectation (exploitation); $2n$ endpoints suffice to construct $\hat c_t$ (exploration).
Thm: Regret $= O(\sqrt{T})$.
26 Bandit Online Convex Optimization [Flaxman, Kalai, McMahan 2005]
$f_t$: arbitrary convex functions.
Approach: 1. construct an estimator $\hat c_t$ of $\nabla f_t(x_t)$ via the one-point spherical estimate $\frac{n}{\delta}\, \mathbb{E}_{u \sim S}[f(x + \delta u)\, u]$; 2. plug into FTRL: $x_t = \arg\min_{x \in K} \sum_{i=1}^{t-1} \hat c_i^\top x + \frac{1}{\eta} R(x)$.
Near the boundary, the estimators have large variance; hence $x_t$ and $x_{t+1}$ can be very different. Regret bound $= O(T^{3/4})$.
Recent developments: $O(\sqrt{T})$ regret possible!
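A minimal sketch of the one-point gradient estimator used here: query $f$ at a single random point on a small sphere around $x$ and rescale. The function name and parameter defaults are assumptions; in expectation the estimate equals the gradient of a smoothed version of $f$.

```python
import numpy as np

def spherical_gradient_estimate(f, x, delta=0.1, rng=None):
    """Sketch of the one-point estimator (n/delta) * f(x + delta*u) * u
    for a uniformly random unit vector u."""
    rng = rng or np.random.default_rng(0)
    n = x.shape[0]
    u = rng.normal(size=n)
    u /= np.linalg.norm(u)               # uniform direction on the unit sphere
    return (n / delta) * f(x + delta * u) * u
```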
27 Projection-free algorithms
28 Matrix completion
In round $t$: obtain a (user, movie) pair, and output a recommended rating. Then observe the true rating, and suffer the square loss.
Comparison class: all matrices with bounded trace norm.
29 Bounded trace norm matrices
Trace norm of a matrix = sum of its singular values. $K = \{X : X$ is a matrix with trace norm at most $D\}$.
Optimizing over $K$ promotes low-rank solutions. The Online Matrix Completion problem is an OCO problem over $K$.
FTRL, OMD, etc. require computing projections onto $K$. Expensive! Requires computing an eigendecomposition of the matrix to be projected: an $O(n^3)$ operation.
But: linear optimization over $K$ is easier. Requires computing only the top singular vector; can typically be done in $O(\text{sparsity})$ time.
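To illustrate why linear optimization over $K$ only needs a leading singular vector, here is a small sketch: $\min_{\|X\|_{\mathrm{tr}} \le D} \langle G, X\rangle$ is attained at $-D\, u v^\top$ for a top singular pair $(u, v)$ of $G$. A full SVD is used below for clarity; in practice an iterative method exploits sparsity.

```python
import numpy as np

def trace_norm_linear_opt(G, D):
    """Sketch of linear optimization over the trace-norm ball of radius D:
    return the rank-one minimizer -D * u v^T built from the top singular
    pair of G."""
    U, s, Vt = np.linalg.svd(G)
    return -D * np.outer(U[:, 0], Vt[0, :])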
30 Projections → linear optimization
1. Matrix completion ($K$ = bounded trace norm matrices): eigendecomposition → largest singular vector computation.
2. Online routing ($K$ = flow polytope): conic optimization over the flow polytope → shortest path computation.
3. Rotations ($K$ = rotation matrices): conic optimization over the rotations set → Wahba's algorithm (linear time).
4. Matroids ($K$ = matroid polytope): convex optimization via the ellipsoid method → greedy algorithm (nearly linear time!).
31 Projection-Free Online Learning
Is online learning possible given access to a linear optimization oracle rather than a projection oracle?
Yes: projection can be implemented via polynomially many linear optimizations. But for efficiency, we would like only a few (1 or 2) calls to the linear optimization oracle per round.
Follow-The-Perturbed-Leader does the job for linear loss functions. What about general convex loss functions?
Is OCO possible with only 1 or 2 calls to a linear optimization oracle per round?
32 Conditional Gradient algorithm [Frank, Wolfe 56]
Convex optimization problem: $\min_{x \in K} f(x)$. Assume: $f$ is smooth and convex; linear optimization over $K$ is easy.
$v_t = \arg\min_{x \in K} \nabla f(x_t)^\top x$
$x_{t+1} = x_t + \eta_t (v_t - x_t)$
1. At iteration $t$: $x_t$ is a convex combination of at most $t$ vertices (sparsity).
2. No learning rate tuning: $\eta_t \approx 1/t$ (independent of diameter, gradients, etc.).
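A minimal offline Frank-Wolfe sketch, assuming a linear optimization oracle over $K$. The $2/(t+1)$ schedule stands in for the $1/t$-type step size on the slide.

```python
import numpy as np

def frank_wolfe(grad_f, linear_opt_oracle, x0, num_steps=100):
    """Sketch of the conditional gradient method: one linear optimization
    call per iteration, moving toward the oracle's output, so x stays a
    convex combination of at most t vertices of K."""
    x = x0.astype(float).copy()
    for t in range(1, num_steps + 1):
        v = linear_opt_oracle(grad_f(x))   # v_t = argmin_{x in K} <grad f(x_t), x>
        eta = 2.0 / (t + 1)                # 1/t-type step size, no tuning
        x = x + eta * (v - x)
    return x
```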
33 Online Conditional Gradient [Hazan, Kale 12]
Set $x_1 \in K$ arbitrarily. For $t = 1, 2, \dots$:
1. Use $x_t$, obtain $f_t$.
2. Compute $x_{t+1}$ as follows:
$v_t = \arg\min_{x \in K} \big\langle \sum_{i=1}^{t} \nabla f_i(x_i) + \lambda_t x_t,\; x \big\rangle$
$x_{t+1} \leftarrow (1 - \sigma_t)\, x_t + \sigma_t v_t$
Theorem: Regret $= O(T^{3/4})$.
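A minimal sketch of this online variant: one linear optimization call per round on the accumulated gradients plus a regularization term anchored at $x_t$, followed by a convex-combination step. The schedules `lam_t` and `sigma` below are assumptions standing in for the parameter choices of [Hazan, Kale 12].

```python
import numpy as np

def online_conditional_gradient(grad_fns, linear_opt_oracle, x1):
    """Sketch of online conditional gradient with one oracle call per round.
    The lambda_t and sigma_t schedules are illustrative assumptions."""
    x = x1.astype(float).copy()
    cum_grad = np.zeros_like(x)
    iterates = []
    for t, grad in enumerate(grad_fns, start=1):
        iterates.append(x.copy())
        cum_grad += grad(x)
        lam_t = np.sqrt(t)                              # regularization weight (assumed)
        v = linear_opt_oracle(cum_grad + lam_t * x)     # single linear optimization call
        sigma = min(1.0, 1.0 / np.sqrt(t))              # step size (assumed)
        x = (1.0 - sigma) * x + sigma * v
    return iterates
```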
34 Linearly-convergent CG [Garber, Hazan 2013]
Theorem: for polytopes, strongly-convex and smooth losses:
1. Offline: the optimality gap after $t$ steps is $O(\exp(-ct))$.
2. Online: Regret $= O(\sqrt{T})$.
35 Directions for future research
1. Efficient algorithms & optimal dependency on dimension for bandit OCO
2. Faster online matrix algorithms
3. Optimal regret / running time tradeoffs
4. Optimal-regret and projection-free algorithms for general convex sets
5. Time series / dependence / adaptivity (vs. static regret)
6. Adding state / online reinforcement learning / MDPs / stochastic games [lots of existing work]
7. Tensor/spectral learning in the online setting
8. Game-theoretic aspects: stronger notions, approachability, ...
36 Bibliography & more information: see http://www.cs.princeton.edu/~ehazan/tutorial/tutorial.htm. Thank you!