Full-information Online Learning


Lijun Zhang, LAMDA, Nanjing University, China. June 2, 2017.

Outline
1 Introduction: Definitions; Regret
2 Prediction with Expert Advice
3 Online Convex Optimization: Convex Functions; Strongly Convex Functions; Exponentially Concave Functions

What Happens in an Internet Minute? http://www.dailyinfographic.com/what-happens-in-an-internet-minute-infographic

Online Algorithm vs Online Learning

[Karp, 1992] An online algorithm is one that receives a sequence of requests and performs an immediate action in response to each request. (Studied in Theoretical Computer Science and Computer Networks.) Example, the ski rental problem: on the t-th day, rent ($1) or buy ($10)?

[Shalev-Shwartz, 2011] Online learning is the process of answering a sequence of questions given (maybe partial) knowledge of answers to previous questions and possibly additional information. (Studied in Machine Learning, Game Theory, and Information Theory, with applications in Computer Vision and Data Mining.) Example, online classification: in the t-th round, receive x_t \in \mathbb{R}^d, predict \hat{y}_t, then observe y_t.

Full-information vs Bandit

[Shalev-Shwartz, 2011] Online learning is the process of answering a sequence of questions given (maybe partial) knowledge of answers to previous questions and possibly additional information. The feedback may be complete (the full-information setting) or partial (the bandit setting).

Formal Definitions

1: for t = 1, 2, ..., T do
2:   Learner picks a decision w_t \in W; Adversary chooses a function f_t(\cdot)
3:   Learner suffers loss f_t(w_t) and updates w_t
4: end for

(In online classification, for instance, the learner plays a classifier, the adversary supplies an example, and the learner suffers a loss and updates.)

Cumulative Loss

\text{Cumulative Loss} = \sum_{t=1}^T f_t(w_t)

Full-information: the learner observes the function f_t(\cdot) (online first-order optimization).
Bandit: the learner only observes the value f_t(w_t) (online zeroth-order optimization).
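A minimal sketch of this protocol in Python, assuming a one-dimensional decision set W = [-1, 1] and adversarially chosen linear losses; all names and the specific adversary are illustrative, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
w = 0.0                      # learner's initial decision in W = [-1, 1]
cumulative_loss = 0.0

for t in range(1, T + 1):
    g = rng.choice([-1.0, 1.0])                  # adversary picks f_t(w) = g * w
    cumulative_loss += g * w                     # learner suffers f_t(w_t)
    w = np.clip(w - g / np.sqrt(t), -1.0, 1.0)   # gradient step + projection

print(f"cumulative loss after {T} rounds: {cumulative_loss:.3f}")
```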

Regret

Cumulative Loss

\text{Cumulative Loss} = \sum_{t=1}^T f_t(w_t)

Regret

\text{Regret} = \underbrace{\sum_{t=1}^T f_t(w_t)}_{\text{Cumulative Loss of Online Learner}} - \underbrace{\min_{w \in W} \sum_{t=1}^T f_t(w)}_{\text{Minimal Loss of Batch Learner}}

Hannan Consistency

\limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^T f_t(w_t) - \min_{w \in W} \sum_{t=1}^T f_t(w) \Big) = 0, with probability 1,

equivalently, \sum_{t=1}^T f_t(w_t) - \min_{w \in W} \sum_{t=1}^T f_t(w) = o(T), with probability 1.
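A small illustrative helper (all names hypothetical) that computes regret against the best fixed decision, approximated by searching a finite grid over W:

```python
def regret(decisions, loss_fns, candidate_ws):
    """Cumulative loss of the plays minus the loss of the best fixed w,
    with the best fixed w searched over a finite grid approximating W."""
    cumulative = sum(f(w_t) for f, w_t in zip(loss_fns, decisions))
    best_fixed = min(sum(f(w) for f in loss_fns) for w in candidate_ws)
    return cumulative - best_fixed

# Hannan consistency asks that regret(...) / T -> 0 as T -> infinity.
```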

2 Prediction with Expert Advice

The Prediction Protocol

1: for t = 1, ..., T do
2:   The environment chooses the next outcome y_t and the expert advice \{f_{i,t} : i \in [N]\}
3:   The expert advice is revealed to the forecaster
4:   The forecaster chooses the prediction \hat{p}_t
5:   The environment reveals the next outcome y_t
6:   The forecaster incurs loss \ell(\hat{p}_t, y_t) and each expert i incurs loss \ell(f_{i,t}, y_t)
7: end for

Regret with respect to expert i

R_{i,T} = \sum_{t=1}^T \big( \ell(\hat{p}_t, y_t) - \ell(f_{i,t}, y_t) \big) = \hat{L}_T - L_{i,T}

Online Algorithms

Weighted Average Prediction

\hat{p}_t = \frac{\sum_{i=1}^N w_{i,t-1} f_{i,t}}{\sum_{j=1}^N w_{j,t-1}}

where w_{i,t-1} is the weight assigned to expert i.

Polynomially Weighted Average Forecaster

w_{i,t-1} = \frac{2 (R_{i,t-1})_+^{p-1}}{\|(R_{t-1})_+\|_p^{p-2}} \quad\text{and}\quad \hat{p}_t = \frac{\sum_{i=1}^N \big( \sum_{s=1}^{t-1} ( \ell(\hat{p}_s, y_s) - \ell(f_{i,s}, y_s) ) \big)_+^{p-1} f_{i,t}}{\sum_{j=1}^N \big( \sum_{s=1}^{t-1} ( \ell(\hat{p}_s, y_s) - \ell(f_{j,s}, y_s) ) \big)_+^{p-1}}
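A sketch of the polynomially weighted average forecaster, assuming the absolute loss \ell(p, y) = |p - y| on [0, 1]; the function name and data layout are illustrative. The normalization in the weights cancels in the prediction, so only the positive parts of the regrets are needed.

```python
import numpy as np

def poly_forecaster(expert_preds, outcomes, p=2.0):
    """expert_preds: (T, N) array of advice; outcomes: (T,) values in [0, 1]."""
    T, N = expert_preds.shape
    regrets = np.zeros(N)                # R_{i,t-1} for each expert i
    preds = np.zeros(T)
    for t in range(T):
        w = np.maximum(regrets, 0.0) ** (p - 1)        # (R_{i,t-1})_+^{p-1}
        w = np.full(N, 1.0 / N) if w.sum() == 0 else w / w.sum()
        preds[t] = w @ expert_preds[t]                 # weighted average
        loss_f = abs(preds[t] - outcomes[t])
        regrets += loss_f - np.abs(expert_preds[t] - outcomes[t])
    return preds
```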

Exponentially Weighted Average Forecaster

w_{i,t-1} = \frac{e^{\eta R_{i,t-1}}}{\sum_{j=1}^N e^{\eta R_{j,t-1}}} \quad\text{and}\quad \hat{p}_t = \frac{\sum_{i=1}^N \exp\big( \eta (\hat{L}_{t-1} - L_{i,t-1}) \big) f_{i,t}}{\sum_{j=1}^N \exp\big( \eta (\hat{L}_{t-1} - L_{j,t-1}) \big)} = \frac{\sum_{i=1}^N e^{-\eta L_{i,t-1}} f_{i,t}}{\sum_{j=1}^N e^{-\eta L_{j,t-1}}}
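A sketch of the exponentially weighted average forecaster under the same absolute-loss assumption; the second form above shows the weights depend only on the experts' cumulative losses, and shifting by the minimum keeps the exponentials numerically stable. Names are illustrative.

```python
import numpy as np

def ewa_forecaster(expert_preds, outcomes, eta):
    """expert_preds: (T, N) array of advice; outcomes: (T,) values in [0, 1]."""
    T, N = expert_preds.shape
    cum_loss = np.zeros(N)               # L_{i,t-1} for each expert i
    preds = np.zeros(T)
    for t in range(T):
        logits = -eta * (cum_loss - cum_loss.min())    # stable shift
        w = np.exp(logits)
        w /= w.sum()                                   # normalize weights
        preds[t] = w @ expert_preds[t]
        cum_loss += np.abs(expert_preds[t] - outcomes[t])
    return preds
```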

Regret Bounds I

Corollary 2.1 of [Cesa-Bianchi and Lugosi, 2006]. Assume that the loss function \ell is convex in its first argument and that it takes values in [0, 1]. Then, for any sequence y_1, y_2, ... of outcomes and for any T \ge 1, the regret of the polynomially weighted average forecaster satisfies

\hat{L}_T - \min_{i=1,...,N} L_{i,T} \le \sqrt{T (p-1) N^{2/p}}

When p = 2, we have \hat{L}_T - \min_{i=1,...,N} L_{i,T} \le \sqrt{TN}. When p = 2 \ln N, we have \hat{L}_T - \min_{i=1,...,N} L_{i,T} \le \sqrt{Te(2 \ln N - 1)}.

Regret Bounds II

Corollary 2.2 of [Cesa-Bianchi and Lugosi, 2006]. Assume that the loss function \ell is convex in its first argument and that it takes values in [0, 1]. Then, for any sequence y_1, y_2, ... of outcomes and for any T \ge 1, the regret of the exponentially weighted average forecaster satisfies

\hat{L}_T - \min_{i=1,...,N} L_{i,T} \le \frac{\ln N}{\eta} + \frac{\eta T}{2}

When \eta = \sqrt{2 \ln N / T}, we have \hat{L}_T - \min_{i=1,...,N} L_{i,T} \le \sqrt{2 T \ln N}.
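A usage sketch of the tuned step size, reusing the ewa_forecaster helper above: on random data the empirical regret should sit below the \sqrt{2 T \ln N} bound (an illustrative check, not a proof; the data here is made up).

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 1000, 8
expert_preds = rng.random((T, N))                   # advice in [0, 1]
outcomes = rng.integers(0, 2, size=T).astype(float)
eta = np.sqrt(2 * np.log(N) / T)                    # tuned learning rate

preds = ewa_forecaster(expert_preds, outcomes, eta)
L_hat = np.abs(preds - outcomes).sum()
L_best = np.abs(expert_preds - outcomes[:, None]).sum(axis=0).min()
print(L_hat - L_best, "vs bound", np.sqrt(2 * T * np.log(N)))
```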

Extensions

Tighter Regret Bounds

Small losses: \hat{L}_T - \min_{i=1,...,N} L_{i,T} \le \sqrt{2 L_T^* \ln N} + \ln N, where L_T^* = \min_{i=1,...,N} L_{i,T}.

Exp-concave losses: \hat{L}_T - \min_{i=1,...,N} L_{i,T} \le \frac{\ln N}{\eta}.

Tracking Regret

R(i_1, ..., i_T) = \sum_{t=1}^T \big( \ell(\hat{p}_t, y_t) - \ell(f_{i_t,t}, y_t) \big)

3 Online Convex Optimization

Definitions

Online Convex Optimization

1: for t = 1, 2, ..., T do
2:   Learner picks a decision w_t \in W; Adversary chooses a function f_t(\cdot)
3:   Learner suffers loss f_t(w_t) and updates w_t
4: end for

where W and the f_t's are convex.

Online Gradient Descent

w_{t+1} = \Pi_W\big( w_t - \eta_t \nabla f_t(w_t) \big)

where \Pi_W(\cdot) is the projection operator onto W.
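A minimal sketch of projected online gradient descent, assuming W is an l2-ball of radius D (so the projection has a closed form) and that gradients come from a user-supplied callback; all names are illustrative.

```python
import numpy as np

def project_l2_ball(w, D):
    """Euclidean projection onto {w : ||w||_2 <= D}."""
    norm = np.linalg.norm(w)
    return w if norm <= D else w * (D / norm)

def ogd(grad_fn, d, T, D, step_sizes):
    """Run OGD for T rounds; grad_fn(t, w) returns the gradient of f_t at w,
    and step_sizes(t) returns eta_t. Returns the list of iterates w_1..w_T."""
    w = np.zeros(d)
    iterates = []
    for t in range(1, T + 1):
        iterates.append(w.copy())
        w = project_l2_ball(w - step_sizes(t) * grad_fn(t, w), D)
    return iterates
```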

Convex Functions

Analysis I

Define w'_{t+1} = w_t - \eta_t \nabla f_t(w_t). For any w \in W, we have

f_t(w_t) - f_t(w) \le \langle \nabla f_t(w_t), w_t - w \rangle = \frac{1}{\eta_t} \langle w_t - w'_{t+1}, w_t - w \rangle
= \frac{1}{2\eta_t} \big( \|w_t - w\|_2^2 - \|w'_{t+1} - w\|_2^2 \big) + \frac{\eta_t}{2} \|\nabla f_t(w_t)\|_2^2
\le \frac{1}{2\eta_t} \big( \|w_t - w\|_2^2 - \|w_{t+1} - w\|_2^2 \big) + \frac{\eta_t}{2} \|\nabla f_t(w_t)\|_2^2

where the last step uses the non-expansiveness of the projection. By adding the inequalities of all iterations, we have

\sum_{t=1}^T \big( f_t(w_t) - f_t(w) \big) \le \frac{1}{2\eta_1} \|w_1 - w\|_2^2 - \frac{1}{2\eta_T} \|w_{T+1} - w\|_2^2 + \frac{1}{2} \sum_{t=2}^T \Big( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} \Big) \|w_t - w\|_2^2 + \frac{1}{2} \sum_{t=1}^T \eta_t \|\nabla f_t(w_t)\|_2^2

Analysis II

Assuming \eta_t \le \eta_{t-1}, \|w_t - w\|_2^2 \le D^2, and \|\nabla f_t(w_t)\|_2^2 \le G^2, we have

\sum_{t=1}^T \big( f_t(w_t) - f_t(w) \big) \le \frac{D^2}{2\eta_1} + \frac{D^2}{2} \sum_{t=2}^T \Big( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} \Big) + \frac{G^2}{2} \sum_{t=1}^T \eta_t = \frac{D^2}{2\eta_T} + \frac{G^2}{2} \sum_{t=1}^T \eta_t

By setting \eta_t = 1/\sqrt{t}, we have

\sum_{t=1}^T \big( f_t(w_t) - f_t(w) \big) \le \sqrt{T} \Big( \frac{D^2}{2} + G^2 \Big)
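A toy run of the ogd sketch above with eta_t = 1/sqrt(t), on made-up quadratic losses f_t(w) = ||w - w*||^2 / 2: the regret divided by sqrt(T) should stay bounded, consistent with the rate just derived (illustrative, not a verification of the constants).

```python
import numpy as np

d, D, T = 5, 2.0, 10000
w_star = np.ones(d) / np.sqrt(d)
grad_fn = lambda t, w: w - w_star                 # gradient of f_t at w
iterates = ogd(grad_fn, d, T, D, step_sizes=lambda t: 1.0 / np.sqrt(t))

f = lambda w: 0.5 * np.linalg.norm(w - w_star) ** 2
regret = sum(f(w) for w in iterates) - T * f(w_star)
print(regret / np.sqrt(T))    # roughly of order D^2/2 + G^2
```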

Strongly Convex Functions

Analysis I

Define w'_{t+1} = w_t - \eta_t \nabla f_t(w_t). For any w \in W, \lambda-strong convexity gives

f_t(w_t) - f_t(w) \le \langle \nabla f_t(w_t), w_t - w \rangle - \frac{\lambda}{2} \|w_t - w\|_2^2
\le \frac{1}{2\eta_t} \big( \|w_t - w\|_2^2 - \|w_{t+1} - w\|_2^2 \big) + \frac{\eta_t}{2} \|\nabla f_t(w_t)\|_2^2 - \frac{\lambda}{2} \|w_t - w\|_2^2

By adding the inequalities of all iterations, we have

\sum_{t=1}^T \big( f_t(w_t) - f_t(w) \big) \le \Big( \frac{1}{2\eta_1} - \frac{\lambda}{2} \Big) \|w_1 - w\|_2^2 - \frac{1}{2\eta_T} \|w_{T+1} - w\|_2^2 + \frac{1}{2} \sum_{t=2}^T \Big( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} - \lambda \Big) \|w_t - w\|_2^2 + \frac{1}{2} \sum_{t=1}^T \eta_t \|\nabla f_t(w_t)\|_2^2

Analysis II

Assuming \|\nabla f_t(w_t)\|_2^2 \le G^2, we have

\sum_{t=1}^T \big( f_t(w_t) - f_t(w) \big) \le \Big( \frac{1}{2\eta_1} - \frac{\lambda}{2} \Big) \|w_1 - w\|_2^2 + \frac{1}{2} \sum_{t=2}^T \Big( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} - \lambda \Big) \|w_t - w\|_2^2 + \frac{G^2}{2} \sum_{t=1}^T \eta_t

By setting \eta_t = 1/(\lambda t), the first two terms vanish, and we have

\sum_{t=1}^T \big( f_t(w_t) - f_t(w) \big) \le \frac{G^2}{2\lambda} \sum_{t=1}^T \frac{1}{t} \le \frac{G^2 (\log T + 1)}{2\lambda}
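The same ogd sketch with the strongly convex step size eta_t = 1/(lambda * t), on the lambda-strongly convex losses f_t(w) = lambda/2 ||w - w*||^2 (again made-up data): the regret should now grow only logarithmically in T.

```python
import numpy as np

lam, d, D, T = 1.0, 5, 2.0, 10000
w_star = np.ones(d) / np.sqrt(d)
grad_fn = lambda t, w: lam * (w - w_star)         # gradient of f_t at w
iterates = ogd(grad_fn, d, T, D, step_sizes=lambda t: 1.0 / (lam * t))

f = lambda w: 0.5 * lam * np.linalg.norm(w - w_star) ** 2
regret = sum(f(w) for w in iterates) - T * f(w_star)
print(regret / np.log(T + 1))   # roughly of order G^2 / (2 * lam)
```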

Exponentially Concave Functions

Definitions

Exponential Concavity

A function f(\cdot): W \mapsto \mathbb{R} is \alpha-exp-concave if \exp(-\alpha f(\cdot)) is concave over W.

Examples:
Logistic loss: f(w) = \log\big( 1 + \exp(-y x^\top w) \big)
Square loss: f(w) = (x^\top w - y)^2
Negative logarithm loss: f(w) = -\log(x^\top w)
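The admissible alpha depends on the domain; a quick numeric spot-check of midpoint concavity of exp(-alpha f) for the square loss, with concrete made-up numbers (a sanity check on one chord, not a proof):

```python
import numpy as np

x, y, alpha = np.array([1.0, -1.0, 0.5]), 0.5, 0.1
f = lambda w: (x @ w - y) ** 2                    # square loss

w1 = np.array([0.2, 0.1, -0.3])
w2 = np.array([-0.4, 0.3, 0.2])
mid = np.exp(-alpha * f((w1 + w2) / 2))           # value at the midpoint
avg = 0.5 * (np.exp(-alpha * f(w1)) + np.exp(-alpha * f(w2)))
print(mid >= avg)   # True: concavity holds along this chord for this alpha
```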

Online Newton Step

The Algorithm
1: A_1 = \frac{1}{\beta^2 D^2} I
2: for t = 1, 2, ..., T do
3:   A_{t+1} = A_t + \nabla f_t(w_t) [\nabla f_t(w_t)]^\top
4:   w_{t+1} = \Pi_W^{A_{t+1}}\big( w'_{t+1} \big) = \arg\min_{w \in W} (w - w'_{t+1})^\top A_{t+1} (w - w'_{t+1}), where w'_{t+1} = w_t - \frac{1}{\beta} A_{t+1}^{-1} \nabla f_t(w_t)
5: end for
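A sketch of Online Newton Step assuming an unconstrained domain W = R^d, so the generalized projection is the identity; for a bounded W one would additionally solve the argmin in step 4. Names are illustrative.

```python
import numpy as np

def online_newton_step(grad_fn, d, T, beta, D):
    """Run ONS for T rounds; grad_fn(t, w) returns the gradient of f_t at w.
    Assumes W = R^d (no projection). Returns the list of iterates."""
    A = (1.0 / (beta ** 2 * D ** 2)) * np.eye(d)   # A_1
    w = np.zeros(d)
    iterates = []
    for t in range(1, T + 1):
        iterates.append(w.copy())
        g = grad_fn(t, w)
        A += np.outer(g, g)                        # A_{t+1} = A_t + g g^T
        w = w - (1.0 / beta) * np.linalg.solve(A, g)   # Newton-style step
    return iterates
```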

Regret Bound

Theorem 2 of [Hazan et al., 2007]. Assume that for all t, the loss function f_t: W \subseteq \mathbb{R}^d \mapsto \mathbb{R} is \alpha-exp-concave and has the property that

\|\nabla f_t(w)\|_2 \le G, \forall w \in W, t \in [T]

Then Online Newton Step with \beta = \frac{1}{2} \min\big\{ \frac{1}{4GD}, \alpha \big\} has the following regret bound:

\sum_{t=1}^T f_t(w_t) - \min_{w \in W} \sum_{t=1}^T f_t(w) \le 5 \Big( \frac{1}{\alpha} + GD \Big) d \log T

Online-to-batch Conversion

Statistical Assumption: f_1, f_2, ... are sampled independently from P. Define F(w) = E_{f \sim P}[f(w)].

Regret Bound: \sum_{t=1}^T f_t(w_t) - \sum_{t=1}^T f_t(w) \le C. Taking expectation over both sides, we have

E\Big[ \sum_{t=1}^T F(w_t) \Big] - T F(w) \le C

Risk Bound: since F is convex, Jensen's inequality yields

E[F(\bar{w})] - F(w) \le \frac{C}{T}, \quad\text{where } \bar{w} = \frac{1}{T} \sum_{t=1}^T w_t
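A sketch of the conversion, reusing the ogd helper above on made-up i.i.d. stochastic losses f(w) = 0.5 ||w - w* + z||^2 with Gaussian noise z: averaging the iterates yields a single point whose excess risk inherits regret/T.

```python
import numpy as np

rng = np.random.default_rng(3)
d, D, T = 5, 2.0, 5000
w_star = np.ones(d) / np.sqrt(d)
# Stochastic gradient of F(w) = E_f[f(w)], f(w) = 0.5 ||w - w* + z||^2:
grad_fn = lambda t, w: w - w_star + 0.1 * rng.normal(size=d)
iterates = ogd(grad_fn, d, T, D, step_sizes=lambda t: 1.0 / np.sqrt(t))

w_bar = np.mean(iterates, axis=0)                 # w_bar = (1/T) sum_t w_t
excess_risk = 0.5 * np.linalg.norm(w_bar - w_star) ** 2   # F(w_bar) - F(w*)
print(excess_risk)    # of order (regret bound)/T, i.e. roughly 1/sqrt(T)
```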

Examples of Conversion

Convex functions: E[F(\bar{w})] - F(w) \le \frac{1}{\sqrt{T}} \Big( \frac{D^2}{2} + G^2 \Big). High-probability bound: [Cesa-Bianchi et al., 2002].

Strongly convex functions: E[F(\bar{w})] - F(w) \le \frac{G^2 (\log T + 1)}{2\lambda T}. High-probability bound: [Kakade and Tewari, 2009].

Exponentially concave functions: E[F(\bar{w})] - F(w) \le 5 \Big( \frac{1}{\alpha} + GD \Big) \frac{d \log T}{T}. High-probability bound: [Mahdavi et al., 2015].

Extensions

Faster Rates [Srebro et al., 2010]

\sum_{t=1}^T f_t(w_t) - \min_{w \in W} \sum_{t=1}^T f_t(w) \le O\big( \sqrt{T f^*} \big), \quad\text{where } f^* = \min_{w \in W} \frac{1}{T} \sum_{t=1}^T f_t(w)

Dynamic Regret [Zinkevich, 2003, Yang et al., 2016]

R(u_1, ..., u_T) = \sum_{t=1}^T f_t(w_t) - \sum_{t=1}^T f_t(u_t)

Adaptive Regret [Hazan and Seshadhri, 2007, Daniely et al., 2015]

R(T, \tau) = \max_{[s, s+\tau-1] \subseteq [T]} \Big( \sum_{t=s}^{s+\tau-1} f_t(w_t) - \min_{w \in W} \sum_{t=s}^{s+\tau-1} f_t(w) \Big)
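An illustrative evaluator for adaptive regret (all names hypothetical): the worst regret over all length-tau intervals, with the interval-wise comparator approximated by the best point from a finite grid over W.

```python
import numpy as np

def adaptive_regret(loss_matrix, played_losses, tau):
    """loss_matrix: (T, K) losses of K grid points approximating W;
    played_losses: (T,) losses of the learner's plays. O(T * K) time."""
    T = len(played_losses)
    worst = -np.inf
    for s in range(T - tau + 1):
        window = slice(s, s + tau)
        learner = played_losses[window].sum()
        best = loss_matrix[window].sum(axis=0).min()   # best fixed grid point
        worst = max(worst, learner - best)
    return worst
```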

Reference I

Cesa-Bianchi, N., Conconi, A., and Gentile, C. (2002). On the generalization ability of on-line learning algorithms. In Advances in Neural Information Processing Systems 14, pages 359-366.
Cesa-Bianchi, N. and Lugosi, G. (2006). Prediction, Learning, and Games. Cambridge University Press.
Daniely, A., Gonen, A., and Shalev-Shwartz, S. (2015). Strongly adaptive online learning. In Proceedings of the 32nd International Conference on Machine Learning.
Hazan, E., Agarwal, A., and Kale, S. (2007). Logarithmic regret algorithms for online convex optimization. Machine Learning, 69(2-3):169-192.
Hazan, E. and Seshadhri, C. (2007). Adaptive algorithms for online decision problems. Electronic Colloquium on Computational Complexity, 88.
Kakade, S. M. and Tewari, A. (2009). On the generalization ability of online strongly convex programming algorithms. In Advances in Neural Information Processing Systems 21, pages 801-808.

Reference II

Karp, R. M. (1992). On-line algorithms versus off-line algorithms: How much is it worth to know the future? In Proceedings of the IFIP 12th World Computer Congress on Algorithms, Software, Architecture - Information Processing, pages 416-429.
Mahdavi, M., Zhang, L., and Jin, R. (2015). Lower and upper bounds on the generalization of stochastic exponentially concave optimization. In Proceedings of the 28th Conference on Learning Theory.
Shalev-Shwartz, S. (2011). Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2):107-194.
Srebro, N., Sridharan, K., and Tewari, A. (2010). Optimistic rates for learning with a smooth loss. ArXiv e-prints, arXiv:1009.3896.
Yang, T., Zhang, L., Jin, R., and Yi, J. (2016). Tracking slowly moving clairvoyant: Optimal dynamic regret of online learning with true and noisy gradient. In Proceedings of the 33rd International Conference on Machine Learning.

Reference III

Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning, pages 928-936.