Full-information Online Learning


1 Full-information Online Learning. Lijun Zhang, LAMDA Group, Nanjing University, China. June 2, 2017

2 Outline
1. Introduction: Definitions; Regret
2. Prediction with Expert Advice
3. Online Convex Optimization: Convex Functions; Strongly Convex Functions; Exponentially Concave Functions


4 What Happens in an Internet Minute? [motivating figure]


10 Online Algorithm vs Online Learning

[Karp, 1992] An online algorithm is one that receives a sequence of requests and performs an immediate action in response to each request.
Studied in theoretical computer science and computer networks.
Example, the ski-rental problem: on the t-th day, rent ($1) or buy ($10)?

[Shalev-Shwartz, 2011] Online learning is the process of answering a sequence of questions given (maybe partial) knowledge of answers to previous questions and possibly additional information.
Studied in machine learning, game theory, and information theory, with applications in computer vision and data mining.
Example, online classification: in the t-th round, receive $x_t \in \mathbb{R}^d$, predict $\hat{y}_t$, observe $y_t$.

13 Full-information vs Bandit

[Shalev-Shwartz, 2011] Online learning is the process of answering a sequence of questions given (maybe partial) knowledge of answers to previous questions and possibly additional information.

Full-information: the loss function itself is revealed to the learner after each decision.
Bandit: only the loss value of the chosen decision is revealed.

18 Formal Definitions

1: for t = 1, 2, ..., T do
2:   Learner picks a decision $w_t \in W$; Adversary chooses a function $f_t(\cdot)$
3:   Learner suffers loss $f_t(w_t)$ and updates $w_t$
4: end for

(Diagram: the Learner plays a classifier, the Adversary supplies an example, and the Learner suffers a loss and updates.)

Cumulative Loss $= \sum_{t=1}^{T} f_t(w_t)$

Full-information: the learner observes the function $f_t(\cdot)$ (online first-order optimization).
Bandit: the learner only observes the value $f_t(w_t)$ (online zero-order optimization).


24 Regret

Cumulative Loss $= \sum_{t=1}^{T} f_t(w_t)$

$$\text{Regret} = \underbrace{\sum_{t=1}^{T} f_t(w_t)}_{\text{Cumulative Loss of Online Learner}} - \underbrace{\min_{w \in W} \sum_{t=1}^{T} f_t(w)}_{\text{Minimal Loss of Batch Learner}}$$

Hannan Consistent:
$$\limsup_{T \to \infty} \frac{1}{T}\left(\sum_{t=1}^{T} f_t(w_t) - \min_{w \in W} \sum_{t=1}^{T} f_t(w)\right) = 0, \quad \text{with probability } 1$$
equivalently,
$$\sum_{t=1}^{T} f_t(w_t) - \min_{w \in W} \sum_{t=1}^{T} f_t(w) = o(T), \quad \text{with probability } 1$$


32 The Prediction Protocol

1: for t = 1, ..., T do
2:   The environment chooses the next outcome $y_t$ and the expert advice $\{f_{i,t} : i \in [N]\}$
3:   The expert advice is revealed to the forecaster
4:   The forecaster chooses the prediction $\hat{p}_t$
5:   The environment reveals the outcome $y_t$
6:   The forecaster incurs loss $l(\hat{p}_t, y_t)$ and each expert $i$ incurs loss $l(f_{i,t}, y_t)$
7: end for

Regret with respect to expert $i$:
$$R_{i,T} = \sum_{t=1}^{T} \bigl( l(\hat{p}_t, y_t) - l(f_{i,t}, y_t) \bigr) = \hat{L}_T - L_{i,T}$$

34 Online Algorithms: Weighted Average Prediction

$$\hat{p}_t = \frac{\sum_{i=1}^{N} w_{i,t-1} f_{i,t}}{\sum_{j=1}^{N} w_{j,t-1}}$$
where $w_{i,t-1}$ is the weight assigned to expert $i$.

Polynomially Weighted Average Forecaster:
$$w_{i,t-1} = \frac{2\bigl(R_{i,t-1}\bigr)_+^{p-1}}{\bigl\|(R_{t-1})_+\bigr\|_p^{p-2}}
\quad\text{and}\quad
\hat{p}_t = \frac{\sum_{i=1}^{N} \Bigl( \sum_{s=1}^{t-1} \bigl( l(\hat{p}_s, y_s) - l(f_{i,s}, y_s) \bigr) \Bigr)_+^{p-1} f_{i,t}}{\sum_{j=1}^{N} \Bigl( \sum_{s=1}^{t-1} \bigl( l(\hat{p}_s, y_s) - l(f_{j,s}, y_s) \bigr) \Bigr)_+^{p-1}}$$

35 Exponentially Weighted Average Forecaster

$$w_{i,t-1} = \frac{e^{\eta R_{i,t-1}}}{\sum_{j=1}^{N} e^{\eta R_{j,t-1}}}
\quad\text{and}\quad
\hat{p}_t = \frac{\sum_{i=1}^{N} \exp\bigl(\eta(\hat{L}_{t-1} - L_{i,t-1})\bigr) f_{i,t}}{\sum_{j=1}^{N} \exp\bigl(\eta(\hat{L}_{t-1} - L_{j,t-1})\bigr)} = \frac{\sum_{i=1}^{N} e^{-\eta L_{i,t-1}} f_{i,t}}{\sum_{j=1}^{N} e^{-\eta L_{j,t-1}}}$$
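To make the exponential-weights update concrete, here is a minimal Python sketch (a toy illustration, not from the slides; the absolute loss, the random advice, and the name ewa_predict are assumptions made for the example):

import numpy as np

def ewa_predict(advice, cum_losses, eta):
    # Exponentially weighted average forecaster: one prediction.
    # advice:     (N,) expert predictions f_{i,t} for the current round
    # cum_losses: (N,) cumulative expert losses L_{i,t-1}
    # eta:        learning rate > 0
    # Shifting by the minimum loss leaves the weighted average unchanged
    # (the shift cancels in the normalization) and avoids overflow.
    w = np.exp(-eta * (cum_losses - cum_losses.min()))
    return float(np.dot(w, advice) / w.sum())

# Toy run with the absolute loss l(p, y) = |p - y| and binary outcomes.
rng = np.random.default_rng(0)
N, T = 5, 1000
eta = np.sqrt(2 * np.log(N) / T)  # the tuning from Corollary 2.2 below
cum_losses, forecaster_loss = np.zeros(N), 0.0
for t in range(T):
    advice = rng.random(N)        # expert advice f_{i,t} in [0, 1]
    y = rng.integers(0, 2)        # outcome y_t
    p = ewa_predict(advice, cum_losses, eta)
    forecaster_loss += abs(p - y)
    cum_losses += np.abs(advice - y)
print(forecaster_loss - cum_losses.min())  # empirical regret vs. the best expert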

37 Regret Bounds I

Corollary 2.1 of [Cesa-Bianchi and Lugosi, 2006]. Assume that the loss function $l$ is convex in its first argument and that it takes values in $[0,1]$. Then, for any sequence $y_1, y_2, \ldots$ of outcomes and for any $T \ge 1$, the regret of the polynomially weighted average forecaster satisfies
$$\hat{L}_T - \min_{i=1,\ldots,N} L_{i,T} \le \sqrt{T(p-1)N^{2/p}}$$

When $p = 2$, we have $\hat{L}_T - \min_{i} L_{i,T} \le \sqrt{TN}$.
When $p = 2\ln N$, we have $\hat{L}_T - \min_{i} L_{i,T} \le \sqrt{Te(2\ln N - 1)}$.

39 Regret Bounds II

Corollary 2.2 of [Cesa-Bianchi and Lugosi, 2006]. Under the same assumptions, the regret of the exponentially weighted average forecaster satisfies
$$\hat{L}_T - \min_{i=1,\ldots,N} L_{i,T} \le \frac{\ln N}{\eta} + \frac{\eta T}{2}$$

When $\eta = \sqrt{2\ln N / T}$, we have $\hat{L}_T - \min_{i} L_{i,T} \le \sqrt{2T\ln N}$.
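The choice of $\eta$ comes from balancing the two terms of the bound; a one-line calculus check (my derivation, consistent with the stated constants):
$$-\frac{\ln N}{\eta^2} + \frac{T}{2} = 0 \;\Longrightarrow\; \eta^* = \sqrt{\frac{2\ln N}{T}}, \qquad \frac{\ln N}{\eta^*} + \frac{\eta^* T}{2} = \sqrt{\frac{T\ln N}{2}} + \sqrt{\frac{T\ln N}{2}} = \sqrt{2T\ln N}$$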

41 Extensions

Tighter Regret Bounds.
Small losses:
$$\hat{L}_T - \min_{i=1,\ldots,N} L_{i,T} \le \sqrt{2 L_T^* \ln N} + \ln N, \quad \text{where } L_T^* = \min_{i=1,\ldots,N} L_{i,T}$$
Exp-concave losses:
$$\hat{L}_T - \min_{i=1,\ldots,N} L_{i,T} \le \frac{\ln N}{\eta}$$

Tracking Regret:
$$R(i_1, \ldots, i_T) = \sum_{t=1}^{T} \bigl( l(\hat{p}_t, y_t) - l(f_{i_t,t}, y_t) \bigr)$$


44 Online Convex Optimization

1: for t = 1, 2, ..., T do
2:   Learner picks a decision $w_t \in W$; Adversary chooses a function $f_t(\cdot)$
3:   Learner suffers loss $f_t(w_t)$ and updates $w_t$
4: end for
where $W$ and the $f_t$'s are convex.

Online Gradient Descent:
$$w_{t+1} = \Pi_W\bigl(w_t - \eta_t \nabla f_t(w_t)\bigr)$$
where $\Pi_W(\cdot)$ is the projection operator onto $W$.
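A minimal Python sketch of projected online gradient descent; the callables grad, project, and step_size are placeholders for the problem at hand, and the ball projection is just one common choice of $W$:

import numpy as np

def ogd(grad, project, w0, step_size, T):
    # Projected online gradient descent.
    # grad(t, w):   (sub)gradient of f_t at w, revealed after playing w
    # project(w):   Euclidean projection onto the feasible set W
    # step_size(t): eta_t, e.g. 1/sqrt(t) for convex losses (slide 47)
    #               or 1/(lam * t) for lam-strongly-convex losses (slide 50)
    w, iterates = np.asarray(w0, dtype=float), []
    for t in range(1, T + 1):
        iterates.append(w)
        w = project(w - step_size(t) * grad(t, w))
    return iterates

def project_ball(w, D=1.0):
    # Euclidean projection onto the ball of radius D centered at the origin.
    norm = np.linalg.norm(w)
    return w if norm <= D else (D / norm) * w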


46 Analysis I

Define $w'_{t+1} = w_t - \eta_t \nabla f_t(w_t)$, so that $w_{t+1} = \Pi_W(w'_{t+1})$. For any $w \in W$, we have
$$\begin{aligned}
f_t(w_t) - f_t(w) &\le \langle \nabla f_t(w_t),\, w_t - w \rangle = \frac{1}{\eta_t}\langle w_t - w'_{t+1},\, w_t - w \rangle \\
&= \frac{1}{2\eta_t}\Bigl( \|w_t - w\|_2^2 - \|w'_{t+1} - w\|_2^2 \Bigr) + \frac{\eta_t}{2}\|\nabla f_t(w_t)\|_2^2 \\
&\le \frac{1}{2\eta_t}\Bigl( \|w_t - w\|_2^2 - \|w_{t+1} - w\|_2^2 \Bigr) + \frac{\eta_t}{2}\|\nabla f_t(w_t)\|_2^2
\end{aligned}$$
where the last step uses the nonexpansiveness of the projection. By adding the inequalities of all iterations, we have
$$\sum_{t=1}^{T} \bigl( f_t(w_t) - f_t(w) \bigr) \le \frac{1}{2\eta_1}\|w_1 - w\|_2^2 - \frac{1}{2\eta_T}\|w_{T+1} - w\|_2^2 + \frac{1}{2}\sum_{t=2}^{T} \Bigl( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} \Bigr) \|w_t - w\|_2^2 + \sum_{t=1}^{T} \frac{\eta_t}{2}\|\nabla f_t(w_t)\|_2^2$$

47 Analysis II

Assuming $\eta_t \le \eta_{t-1}$, $\|w_t - w\|_2^2 \le D^2$, and $\|\nabla f_t(w_t)\|_2^2 \le G^2$, we have
$$\sum_{t=1}^{T} \bigl( f_t(w_t) - f_t(w) \bigr) \le \frac{D^2}{2\eta_1} + \frac{D^2}{2}\sum_{t=2}^{T}\Bigl( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} \Bigr) + \frac{G^2}{2}\sum_{t=1}^{T}\eta_t = \frac{D^2}{2\eta_T} + \frac{G^2}{2}\sum_{t=1}^{T}\eta_t$$
By setting $\eta_t = 1/\sqrt{t}$ and using $\sum_{t=1}^{T} 1/\sqrt{t} \le 2\sqrt{T}$, we have
$$\sum_{t=1}^{T} \bigl( f_t(w_t) - f_t(w) \bigr) \le \Bigl( \frac{D^2}{2} + G^2 \Bigr)\sqrt{T}$$


49 Analysis I

Define $w'_{t+1} = w_t - \eta_t \nabla f_t(w_t)$. For any $w \in W$, strong convexity gives
$$\begin{aligned}
f_t(w_t) - f_t(w) &\le \langle \nabla f_t(w_t),\, w_t - w \rangle - \frac{\lambda}{2}\|w_t - w\|_2^2 \\
&\le \frac{1}{2\eta_t}\Bigl( \|w_t - w\|_2^2 - \|w_{t+1} - w\|_2^2 \Bigr) + \frac{\eta_t}{2}\|\nabla f_t(w_t)\|_2^2 - \frac{\lambda}{2}\|w_t - w\|_2^2
\end{aligned}$$
By adding the inequalities of all iterations, we have
$$\sum_{t=1}^{T}\bigl( f_t(w_t) - f_t(w) \bigr) \le \Bigl( \frac{1}{2\eta_1} - \frac{\lambda}{2} \Bigr)\|w_1 - w\|_2^2 - \frac{1}{2\eta_T}\|w_{T+1} - w\|_2^2 + \frac{1}{2}\sum_{t=2}^{T}\Bigl( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} - \lambda \Bigr)\|w_t - w\|_2^2 + \sum_{t=1}^{T}\frac{\eta_t}{2}\|\nabla f_t(w_t)\|_2^2$$

50 Analysis II

Assuming $\|\nabla f_t(w_t)\|_2^2 \le G^2$ and setting $\eta_t = 1/(\lambda t)$, the coefficients of all $\|w_t - w\|_2^2$ terms vanish (since $1/\eta_t - 1/\eta_{t-1} = \lambda$), and we have
$$\sum_{t=1}^{T}\bigl( f_t(w_t) - f_t(w) \bigr) \le \frac{G^2}{2\lambda}\sum_{t=1}^{T}\frac{1}{t} \le \frac{G^2}{2\lambda}\bigl( \log T + 1 \bigr)$$
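The final inequality uses the standard integral bound on the harmonic sum:
$$\sum_{t=1}^{T}\frac{1}{t} \le 1 + \int_{1}^{T}\frac{dx}{x} = 1 + \log T$$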


53 Definitions: Exponential Concavity

A function $f(\cdot): W \to \mathbb{R}$ is $\alpha$-exp-concave if $\exp(-\alpha f(\cdot))$ is concave over $W$.

Examples:
Logistic loss: $f(w) = \log\bigl( 1 + \exp(-y\, x^\top w) \bigr)$
Square loss: $f(w) = (x^\top w - y)^2$
Negative logarithm loss: $f(w) = -\log(x^\top w)$
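For twice-differentiable losses there is a convenient second-order test (a standard fact, stated here for reference rather than taken from the slides): $\exp(-\alpha f)$ is concave if and only if
$$\nabla^2 f(w) \succeq \alpha\, \nabla f(w)\, \nabla f(w)^\top \quad \text{for all } w \in W$$
For the square loss, $\nabla f(w) = 2(x^\top w - y)x$ and $\nabla^2 f(w) = 2xx^\top$, so the condition holds with $\alpha = 1/(2C^2)$ whenever $|x^\top w - y| \le C$ on $W$.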

54 Online Newton Step

1: $A_1 = \frac{1}{\beta^2 D^2} I$
2: for t = 1, 2, ..., T do
3:   $A_{t+1} = A_t + \nabla f_t(w_t) [\nabla f_t(w_t)]^\top$
4:   $w_{t+1} = \Pi_W^{A_{t+1}}(\tilde{w}_{t+1}) = \operatorname{argmin}_{w \in W} (w - \tilde{w}_{t+1})^\top A_{t+1} (w - \tilde{w}_{t+1})$, where $\tilde{w}_{t+1} = w_t - \frac{1}{\beta} A_{t+1}^{-1} \nabla f_t(w_t)$
5: end for
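A minimal Python sketch of the loop above; grad and the generalized projection project_gen (the $A$-weighted argmin in step 4) are placeholders, and for $W = \mathbb{R}^d$ the projection is the identity:

import numpy as np

def ons(grad, project_gen, w0, D, G, alpha, T):
    # Online Newton Step, following the loop above.
    # grad(t, w):        gradient of f_t at w
    # project_gen(v, A): argmin over w in W of (w - v)^T A (w - v)
    # D, G, alpha:       diameter, gradient bound, exp-concavity parameter
    w = np.asarray(w0, dtype=float)
    beta = 0.5 * min(1.0 / (4.0 * G * D), alpha)  # beta from Theorem 2 below
    A = np.eye(w.size) / (beta ** 2 * D ** 2)     # A_1
    iterates = []
    for t in range(1, T + 1):
        iterates.append(w)
        g = grad(t, w)
        A = A + np.outer(g, g)                    # A_{t+1}
        v = w - np.linalg.solve(A, g) / beta      # w_t - (1/beta) A_{t+1}^{-1} g
        w = project_gen(v, A)                     # generalized projection
    return iterates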

55 Regret Bound

Theorem 2 of [Hazan et al., 2007]. Assume that for all $t$, the loss function $f_t: W \subseteq \mathbb{R}^d \to \mathbb{R}$ is $\alpha$-exp-concave and has the property that
$$\|\nabla f_t(w)\|_2 \le G, \quad \forall w \in W,\ t \in [T]$$
Then Online Newton Step with $\beta = \frac{1}{2}\min\bigl\{ \frac{1}{4GD},\ \alpha \bigr\}$ has the following regret bound:
$$\sum_{t=1}^{T} f_t(w_t) - \min_{w \in W} \sum_{t=1}^{T} f_t(w) \le 5\Bigl( \frac{1}{\alpha} + GD \Bigr) d \log T$$

58 Online-to-batch Conversion

Statistical assumption: $f_1, f_2, \ldots$ are sampled independently from a distribution $P$. Define $F(w) = \mathbb{E}_{f \sim P}[f(w)]$.

Regret bound:
$$\sum_{t=1}^{T} f_t(w_t) - \sum_{t=1}^{T} f_t(w) \le C$$
Taking expectation over both sides, we have
$$\mathbb{E}\Bigl[ \sum_{t=1}^{T} \bigl( F(w_t) - F(w) \bigr) \Bigr] \le C$$

Risk bound:
$$\mathbb{E}[F(\bar{w})] - F(w) \le \frac{C}{T}, \quad \text{where } \bar{w} = \frac{1}{T}\sum_{t=1}^{T} w_t$$
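The risk bound follows from Jensen's inequality (for convex $F$) together with the observation that $w_t$ depends only on $f_1, \ldots, f_{t-1}$ and is therefore independent of $f_t$:
$$\mathbb{E}[F(\bar{w})] \le \frac{1}{T}\sum_{t=1}^{T}\mathbb{E}[F(w_t)] = \frac{1}{T}\sum_{t=1}^{T}\mathbb{E}[f_t(w_t)] \le F(w) + \frac{C}{T}$$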

61 Examples of Conversion

Convex functions:
$$\mathbb{E}[F(\bar{w})] - F(w) \le \frac{1}{\sqrt{T}}\Bigl( \frac{D^2}{2} + G^2 \Bigr)$$
High-probability bound: [Cesa-Bianchi et al., 2002]

Strongly convex functions:
$$\mathbb{E}[F(\bar{w})] - F(w) \le \frac{G^2(\log T + 1)}{2\lambda T}$$
High-probability bound: [Kakade and Tewari, 2009]

Exponentially concave functions:
$$\mathbb{E}[F(\bar{w})] - F(w) \le 5\Bigl( \frac{1}{\alpha} + GD \Bigr)\frac{d \log T}{T}$$
High-probability bound: [Mahdavi et al., 2015]

64 Extensions

Faster rates [Srebro et al., 2010]:
$$\sum_{t=1}^{T} f_t(w_t) - \min_{w \in W} \sum_{t=1}^{T} f_t(w) \le O\bigl( \sqrt{T f_*} \bigr), \quad \text{where } f_* = \min_{w \in W} \frac{1}{T}\sum_{t=1}^{T} f_t(w)$$

Dynamic regret [Zinkevich, 2003; Yang et al., 2016]:
$$R(u_1, \ldots, u_T) = \sum_{t=1}^{T} f_t(w_t) - \sum_{t=1}^{T} f_t(u_t)$$

Adaptive regret [Hazan and Seshadhri, 2007; Daniely et al., 2015]:
$$R(T, \tau) = \max_{[s,\, s+\tau-1] \subseteq [T]} \Bigl( \sum_{t=s}^{s+\tau-1} f_t(w_t) - \min_{w \in W} \sum_{t=s}^{s+\tau-1} f_t(w) \Bigr)$$

65 References I

Cesa-Bianchi, N., Conconi, A., and Gentile, C. (2002). On the generalization ability of on-line learning algorithms. In Advances in Neural Information Processing Systems 14.
Cesa-Bianchi, N. and Lugosi, G. (2006). Prediction, Learning, and Games. Cambridge University Press.
Daniely, A., Gonen, A., and Shalev-Shwartz, S. (2015). Strongly adaptive online learning. In Proceedings of the 32nd International Conference on Machine Learning.
Hazan, E., Agarwal, A., and Kale, S. (2007). Logarithmic regret algorithms for online convex optimization. Machine Learning, 69(2-3).
Hazan, E. and Seshadhri, C. (2007). Adaptive algorithms for online decision problems. Electronic Colloquium on Computational Complexity, 88.
Kakade, S. M. and Tewari, A. (2009). On the generalization ability of online strongly convex programming algorithms. In Advances in Neural Information Processing Systems 21.

66 References II

Karp, R. M. (1992). On-line algorithms versus off-line algorithms: How much is it worth to know the future? In Proceedings of the IFIP 12th World Computer Congress on Algorithms, Software, Architecture - Information Processing.
Mahdavi, M., Zhang, L., and Jin, R. (2015). Lower and upper bounds on the generalization of stochastic exponentially concave optimization. In Proceedings of the 28th Conference on Learning Theory.
Shalev-Shwartz, S. (2011). Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2).
Srebro, N., Sridharan, K., and Tewari, A. (2010). Optimistic rates for learning with a smooth loss. arXiv e-prints.
Yang, T., Zhang, L., Jin, R., and Yi, J. (2016). Tracking slowly moving clairvoyant: Optimal dynamic regret of online learning with true and noisy gradient. In Proceedings of the 33rd International Conference on Machine Learning.

67 References III

Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning.
