Advanced Machine Learning
- Shavonne Campbell
- 5 years ago
Transcription
1 Advanced Machine Learning: Follow-the-Perturbed Leader. MEHRYAR MOHRI, COURANT INSTITUTE & GOOGLE RESEARCH.
2 General Ideas
Linear loss: decomposition as a sum along substructures.
- sum of edge losses in a tree.
- includes expert setting.
- sum of edge losses along a path.
- sum of other substructures' losses in a discrete problem.
3 FPL
General linear decision problem (Kalai and Vempala, 2004):
- player selects $w_t \in W \subseteq \mathbb{R}^N$.
- player receives $x_t \in X \subseteq \mathbb{R}^N$.
- player incurs loss $w_t \cdot x_t$.
Objective: minimize cumulative loss or regret.
Notation:
- $M(x) = \operatorname{argmin}_{w \in W} w \cdot x$.
- $X \subseteq \{x : \|x\|_1 \le X_1\}$.
- $\sup_{w \in W, x \in X} w \cdot x \le R$.
- $\ell_1\text{-diam}(W) \le W_1$.
4 FL
Follow the Leader (FL): use $M(x_{1:t-1})$ at every round (aka fictitious play).
FL problem: suppose $N = 2$ and consider a sequence starting with $(1/2, 0)^\top$ and then alternating $(0, 1)^\top$ and $(1, 0)^\top$. Then,
- FL incurs loss 1 at every round after the first, about $T$ overall.
- any single expert incurs loss about $T/2$ overall.
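The failure of FL on the alternating sequence can be checked numerically. The sketch below assumes the expert setting with $N = 2$ and breaks ties toward the lowest index; the function names are illustrative and not from the slides.

```python
import numpy as np

def follow_the_leader_loss(losses):
    """Cumulative loss of FL over experts; ties go to the lowest index (an assumption)."""
    cumulative = np.zeros(losses.shape[1])
    total = 0.0
    for x in losses:
        leader = int(np.argmin(cumulative))  # M(x_{1:t-1}) in the expert setting
        total += x[leader]
        cumulative += x
    return total

def alternating_sequence(T):
    """(1/2, 0) first, then (0, 1), (1, 0), (0, 1), ... for T rounds in total."""
    rows = [(0.5, 0.0)]
    for t in range(1, T):
        rows.append((0.0, 1.0) if t % 2 == 1 else (1.0, 0.0))
    return np.array(rows)

T = 1000
losses = alternating_sequence(T)
fl_loss = follow_the_leader_loss(losses)
best_expert_loss = losses.sum(axis=0).min()
print(fl_loss, best_expert_loss)
```

With this tie-breaking rule FL always picks the expert that is about to be charged, so its loss grows like $T$ while the best fixed expert pays about $T/2$.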
5 FPL Algorithms (Hannan 1957; Kalai and Vempala, 2004)
Additive bound, Follow the Perturbed Leader (FPL):
- $p_t \sim U([0, 1/\epsilon]^N)$.
- $w_t = \operatorname{argmin}_{w \in W} \sum_{s=1}^{t-1} w \cdot x_s + w \cdot p_t = M(x_{1:t-1} + p_t)$, where $x_{1:t-1} = \sum_{s=1}^{t-1} x_s$.
Multiplicative bound, Follow the Perturbed Leader (FPL*):
- $p_t$ Laplacian with density $f(x) = (\epsilon/2)^N e^{-\epsilon \|x\|_1}$.
- $w_t = \operatorname{argmin}_{w \in W} \sum_{s=1}^{t-1} w \cdot x_s + w \cdot p_t = M(x_{1:t-1} + p_t)$.
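A minimal sketch of both variants, assuming the expert setting (so that $M(x)$ reduces to an argmin over coordinates) and a fresh perturbation at every round. The function name `fpl_experts` and its parameters are hypothetical, chosen only for this illustration.

```python
import numpy as np

def fpl_experts(losses, epsilon, perturbation="uniform", seed=0):
    """Run FPL over N experts and return the chosen expert index at each round.

    perturbation="uniform": p_t ~ U([0, 1/epsilon]^N)              (additive FPL)
    perturbation="laplace": density proportional to exp(-epsilon * ||x||_1)  (FPL*)
    """
    rng = np.random.default_rng(seed)
    T, N = losses.shape
    cumulative = np.zeros(N)  # x_{1:t-1}
    choices = []
    for t in range(T):
        if perturbation == "uniform":
            p = rng.uniform(0.0, 1.0 / epsilon, size=N)
        elif perturbation == "laplace":
            p = rng.laplace(loc=0.0, scale=1.0 / epsilon, size=N)
        else:
            raise ValueError("perturbation must be 'uniform' or 'laplace'")
        choices.append(int(np.argmin(cumulative + p)))  # M(x_{1:t-1} + p_t)
        cumulative += losses[t]
    return np.array(choices)

# Expert 0 never incurs loss, expert 1 always incurs loss 1.
losses = np.column_stack([np.zeros(50), np.ones(50)])
uniform_choices = fpl_experts(losses, epsilon=1.0, perturbation="uniform", seed=1)
laplace_choices = fpl_experts(losses, epsilon=1.0, perturbation="laplace", seed=1)
```

For a structured decision set $W$ (paths, trees), the `np.argmin` call would be replaced by whatever offline optimization oracle computes $M$.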
6 FPL - Bound
Theorem: fix $\epsilon > 0$. Then, the expected cumulative loss of additive FPL($\epsilon$) is bounded as follows:
$$E[L_T] \le L_T^{\min} + \epsilon R X_1 T + \frac{W_1}{\epsilon}.$$
For $\epsilon = \sqrt{\frac{W_1}{R X_1 T}}$,
$$E[L_T] \le L_T^{\min} + 2\sqrt{X_1 W_1 R T}.$$
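The choice of $\epsilon$ in the additive bound comes from balancing the two $\epsilon$-dependent terms $\epsilon R X_1 T$ and $W_1/\epsilon$. A short worked derivation, where $g$ is a name introduced here only for the sum of those two terms:

```latex
\begin{aligned}
g(\epsilon) &= \epsilon R X_1 T + \frac{W_1}{\epsilon}, \\
g'(\epsilon) &= R X_1 T - \frac{W_1}{\epsilon^2} = 0
  \;\Longrightarrow\; \epsilon^* = \sqrt{\frac{W_1}{R X_1 T}}, \\
g(\epsilon^*) &= \sqrt{W_1 R X_1 T} + \sqrt{W_1 R X_1 T} = 2\sqrt{X_1 W_1 R T}.
\end{aligned}
```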
7 FPL* - Bound
Theorem: fix $\epsilon > 0$ and assume that $W, X \subseteq \mathbb{R}^N_+$. Then, the expected cumulative loss of (multiplicative) FPL*($\epsilon/2X_1$) is bounded as follows:
$$E[L_T] \le (1 + \epsilon) L_T^{\min} + \frac{2 X_1 W_1 (1 + \log N)}{\epsilon}.$$
For $\epsilon = \min\left\{ \frac{1}{2X_1}, \sqrt{\frac{W_1 (1 + \log N)}{X_1 L_T^{\min}}} \right\}$,
$$E[L_T] \le L_T^{\min} + 4\sqrt{L_T^{\min} X_1 W_1 (1 + \log N)} + 4 X_1 W_1 (1 + \log N).$$
8 Proof Outline
Be the perturbed leader (BPL): $w_t = M(x_{1:t} + p_t)$.
1. Bound on regret of BPL: $E[R_T(\mathrm{BPL})] \le W_1/\epsilon$.
2. Bound on difference of regrets of FPL and BPL: $E[M(x_{1:t-1} + p_1) \cdot x_t] - E[M(x_{1:t} + p_1) \cdot x_t]$.
3. Difference of expectations small because similar distributions.
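9 Proof: BTL Regret
Lemma 1: $\sum_{t=1}^{T} M(x_{1:t}) \cdot x_t \le M(x_{1:T}) \cdot x_{1:T}$.
Proof: case $T = 1$ is clear. By induction,
$$\begin{aligned}
\sum_{t=1}^{T+1} M(x_{1:t}) \cdot x_t
&\le M(x_{1:T}) \cdot x_{1:T} + M(x_{1:T+1}) \cdot x_{T+1} && \text{(induction)} \\
&\le M(x_{1:T+1}) \cdot x_{1:T} + M(x_{1:T+1}) \cdot x_{T+1} && \text{(def. of } M(x_{1:T}) \text{ as minimizer)} \\
&= M(x_{1:T+1}) \cdot x_{1:T+1}.
\end{aligned}$$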
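Lemma 1 says that the (unrealizable) strategy that plays the leader including the current round never does worse than the best fixed decision in hindsight. A small numerical check, assuming the expert setting and random losses in $[0, 1]$; the function name is illustrative.

```python
import numpy as np

def be_the_leader_gap(losses):
    """Return (BTL loss, best fixed expert loss); BTL plays M(x_{1:t}) at round t."""
    cumulative = np.zeros(losses.shape[1])
    btl = 0.0
    for x in losses:
        cumulative += x                      # x_{1:t}, including the current round
        leader = int(np.argmin(cumulative))  # M(x_{1:t})
        btl += x[leader]
    return btl, float(cumulative.min())

rng = np.random.default_rng(0)
losses = rng.uniform(0.0, 1.0, size=(200, 5))
btl_loss, best_loss = be_the_leader_gap(losses)
print(btl_loss <= best_loss + 1e-9)
```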
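10 Proof: BPL Regret
Lemma 2: let $p_0 = 0$. Then, the following holds:
$$\sum_{t=1}^{T} M(x_{1:t} + p_t) \cdot x_t \le M(x_{1:T}) \cdot x_{1:T} + W_1 \sum_{t=1}^{T} \|p_t - p_{t-1}\|_\infty.$$
Proof: use Lemma 1 with $x'_t = x_t + p_t - p_{t-1}$, so that $x'_{1:t} = x_{1:t} + p_t$; then
$$\begin{aligned}
\sum_{t=1}^{T} M(x_{1:t} + p_t) \cdot (x_t + p_t - p_{t-1})
&\le M(x_{1:T} + p_T) \cdot (x_{1:T} + p_T) \\
&\le M(x_{1:T}) \cdot (x_{1:T} + p_T) \\
&= M(x_{1:T}) \cdot x_{1:T} + \sum_{t=1}^{T} M(x_{1:T}) \cdot (p_t - p_{t-1}).
\end{aligned}$$
Thus,
$$\begin{aligned}
\sum_{t=1}^{T} M(x_{1:t} + p_t) \cdot x_t
&\le M(x_{1:T}) \cdot x_{1:T} + \sum_{t=1}^{T} \big(M(x_{1:T}) - M(x_{1:t} + p_t)\big) \cdot (p_t - p_{t-1}) \\
&\le M(x_{1:T}) \cdot x_{1:T} + W_1 \sum_{t=1}^{T} \|p_t - p_{t-1}\|_\infty.
\end{aligned}$$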
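11 Proof: FPL vs. BPL Regrets
Proof: for the expected loss, we can just choose $p_t = p_1$ for all $t > 0$, which yields:
$$\sum_{t=1}^{T} M(x_{1:t} + p_1) \cdot x_t \le M(x_{1:T}) \cdot x_{1:T} + W_1 \|p_1\|_\infty.$$
Thus,
$$\begin{aligned}
\sum_{t=1}^{T} E[M(x_{1:t-1} + p_1) \cdot x_t]
&= \sum_{t=1}^{T} \Big( E[M(x_{1:t-1} + p_1) \cdot x_t] - E[M(x_{1:t} + p_1) \cdot x_t] \Big) + \sum_{t=1}^{T} E[M(x_{1:t} + p_1) \cdot x_t] \\
&\le \sum_{t=1}^{T} \Big[ E[M(x_{1:t-1} + p_1) \cdot x_t] - E[M(x_{1:t} + p_1) \cdot x_t] \Big] + L_T^{\min} + W_1 E[\|p_1\|_\infty].
\end{aligned}$$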
12 Proof: FPL
By definition of the perturbation, $\|p_1\|_\infty \le 1/\epsilon$.
Now, $x_{1:t} + p_1$ and $x_{1:t-1} + p_1$ both follow a uniform distribution over a cube. Thus,
$$E[M(x_{1:t-1} + p_1) \cdot x_t] - E[M(x_{1:t} + p_1) \cdot x_t] \le R\,(1 - \text{fraction of overlap}).$$
Two cubes $[0, 1/\epsilon]^N$ and $v + [0, 1/\epsilon]^N$ overlap over at least the fraction $(1 - \epsilon \|v\|_1)$: if $x \in [0, 1/\epsilon]^N$ but $x \notin v + [0, 1/\epsilon]^N$, then for at least one $i$, $x_i \notin v_i + [0, 1/\epsilon]$, which has probability at most $\epsilon |v_i|$.
13 Proof: FPL
Thus,
$$E[M(x_{1:t-1} + p_1) \cdot x_t] - E[M(x_{1:t} + p_1) \cdot x_t] \le \epsilon R \|x_t\|_1 \le \epsilon R X_1.$$
And,
$$E[R_T] \le \epsilon R X_1 T + \frac{W_1}{\epsilon}.$$
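The overlap argument for two shifted cubes can be checked directly: the exact overlap fraction is a product of per-axis overlaps, and the union bound gives the lower bound $1 - \epsilon \|v\|_1$. A small sketch, where the shift `v` and the value of `epsilon` are arbitrary illustrative choices:

```python
import numpy as np

def overlap_fraction(v, epsilon):
    """Fraction of the cube [0, 1/eps]^N that also lies in v + [0, 1/eps]^N."""
    side = 1.0 / epsilon
    per_axis = np.clip(side - np.abs(v), 0.0, None) / side  # 1 - eps*|v_i|, floored at 0
    return float(np.prod(per_axis))

v = np.array([0.3, -0.1, 0.0, 0.2])
epsilon = 0.5
exact = overlap_fraction(v, epsilon)
lower_bound = 1.0 - epsilon * np.abs(v).sum()
print(exact, lower_bound)
```

The product form is exact for axis-aligned cubes; the slides only need the weaker linear bound, since it is what turns into the $\epsilon R \|x_t\|_1$ term.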
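14 Proof: FPL*
Lemma 3: $E[M(x_{1:t-1} + p_1) \cdot x_t] \le e^{\epsilon X_1} E[M(x_{1:t} + p_1) \cdot x_t]$.
Proof:
$$\begin{aligned}
E[M(x_{1:t-1} + p_1) \cdot x_t]
&= \int_{\mathbb{R}^N} M(x_{1:t-1} + u) \cdot x_t \, d\mu(u) \\
&= \int_{\mathbb{R}^N} M(x_{1:t} + v) \cdot x_t \, d\mu(x_t + v) && \text{(change of var. } u = v + x_t\text{)} \\
&= \int_{\mathbb{R}^N} M(x_{1:t} + v) \cdot x_t \, \underbrace{e^{-\epsilon(\|x_t + v\|_1 - \|v\|_1)}}_{\le\, e^{\epsilon X_1}} \, d\mu(v) \\
&\le e^{\epsilon X_1} E[M(x_{1:t} + p_1) \cdot x_t].
\end{aligned}$$
The last step uses $M(\cdot) \cdot x_t \ge 0$, which holds since $W, X \subseteq \mathbb{R}^N_+$.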
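15 Proof: FPL*
For $\epsilon \le 1/X_1$, $e^{\epsilon X_1} \le 1 + 2\epsilon X_1$, thus,
$$\sum_{t=1}^{T} E[M(x_{1:t-1} + p_1) \cdot x_t] \le (1 + 2\epsilon X_1) \sum_{t=1}^{T} E[M(x_{1:t} + p_1) \cdot x_t] \le (1 + 2\epsilon X_1)\big(L_T^{\min} + W_1 E[\|p_1\|_\infty]\big).$$
Now,
$$\begin{aligned}
E[\|p_1\|_\infty] = E\Big[\max_{i \in [1,N]} |p_{1,i}|\Big]
&= \int_0^{+\infty} \Pr\Big[\max_{i \in [1,N]} |p_{1,i}| > t\Big] dt \\
&\le 2 \int_0^{+\infty} \Pr\Big[\max_{i \in [1,N]} p_{1,i} > t\Big] dt \\
&= 2 \int_0^{u} \Pr\Big[\max_{i \in [1,N]} p_{1,i} > t\Big] dt + 2 \int_u^{+\infty} \Pr\Big[\max_{i \in [1,N]} p_{1,i} > t\Big] dt \\
&\le 2u + 2N \int_u^{+\infty} \Pr[p_{1,1} > t] \, dt \\
&= 2u + \frac{N}{\epsilon} e^{-\epsilon u} \\
&\le \frac{2(1 + \log N)}{\epsilon} && \text{(best choice of } u\text{)}.
\end{aligned}$$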
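The bound on $E[\|p_1\|_\infty]$ can be compared with the exact value. Assuming the coordinates of $p_1$ are i.i.d. Laplace with density $(\epsilon/2) e^{-\epsilon |x|}$, each $|p_{1,i}|$ is exponential with rate $\epsilon$, and the expected maximum of $N$ such variables is $H_N/\epsilon$ with $H_N$ the $N$-th harmonic number. The sketch below uses that closed form; the function names are illustrative.

```python
import math

def expected_max_abs_laplace(N, epsilon):
    """Exact E[max_i |p_i|] for N i.i.d. Laplace coordinates with rate epsilon: H_N / epsilon."""
    harmonic = sum(1.0 / k for k in range(1, N + 1))
    return harmonic / epsilon

def perturbation_norm_bound(N, epsilon):
    """Upper bound 2 (1 + log N) / epsilon from the proof."""
    return 2.0 * (1.0 + math.log(N)) / epsilon

for N in (1, 2, 10, 1000):
    print(N, expected_max_abs_laplace(N, 0.1), perturbation_norm_bound(N, 0.1))
```

Since $H_N \le 1 + \log N$, the exact value sits below the bound by roughly a factor of two, which is consistent with the factor 2 introduced by the symmetrization step.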
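16 Expert Setting
$W_1 = 1$, $X_1 = N$, and $R = 1$; for FPL*($\epsilon$),
$$E[L_T] \le (1 + 2N\epsilon) L_T^{\min} + \frac{2(1 + \log N)}{\epsilon}.$$
More favorable bound:
- $x_t \to x_{t,1} e_1, \ldots, x_{t,N} e_N$.
- new $L_{NT}^{\min}$ = old $L_T^{\min}$.
- $E[L_T^{\text{old}}] \le E[L_{NT}^{\text{new}}]$.
- new guarantee: for FPL*($\epsilon$), $E[L_T] \le (1 + 2\epsilon) L_T^{\min} + \frac{2(1 + \log N)}{\epsilon}$.
- $E[R_T] \le 2\sqrt{2 L_T^{\min} (1 + \log N)}$.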
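17 RWM = FPL
Let FPL($\eta$) be an instance of the general FPL algorithm with a perturbation defined by
$$p_1 = \left[ \frac{\log(-\log(u_1))}{\eta}, \ldots, \frac{\log(-\log(u_N))}{\eta} \right]^\top,$$
where $u_j$ is drawn according to the uniform distribution over $[0, 1]$. Then, FPL($\eta$) and RWM($\eta$) coincide, with $\eta$ the parameter of the exponential weights $w_i \propto e^{-\eta L_i}$ used by RWM.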
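The equivalence with RWM can be illustrated by simulation: with the double-logarithm perturbation, the perturbed leader is distributed according to exponential weights over the cumulative losses. This is a sketch under the assumption that RWM samples expert $i$ with probability proportional to $e^{-\eta L_i}$; the name `eta`, the function names and the loss vector are illustrative choices.

```python
import numpy as np

def fpl_choice_frequencies(cumulative_losses, eta, n_samples, seed=0):
    """Empirical distribution of argmin_i (L_i + log(-log(u_i)) / eta), u_i ~ U[0, 1]."""
    rng = np.random.default_rng(seed)
    N = len(cumulative_losses)
    u = rng.uniform(size=(n_samples, N))
    p = np.log(-np.log(u)) / eta  # perturbation from the slide
    choices = np.argmin(cumulative_losses + p, axis=1)
    return np.bincount(choices, minlength=N) / n_samples

def exponential_weights(cumulative_losses, eta):
    """Sampling distribution assumed for RWM: proportional to exp(-eta * L_i)."""
    w = np.exp(-eta * (cumulative_losses - cumulative_losses.min()))
    return w / w.sum()

L = np.array([0.0, 0.5, 1.0, 2.0])
empirical = fpl_choice_frequencies(L, eta=1.0, n_samples=200_000)
target = exponential_weights(L, eta=1.0)
print(np.abs(empirical - target).max())
```

The match follows from the fact that $-\log(-\log(u))$ has a Gumbel distribution, so the argmin above is the usual Gumbel-max sampling of a softmax distribution.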
18 References
- Nicolò Cesa-Bianchi, Alex Conconi, Claudio Gentile. On the Generalization Ability of On-Line Learning Algorithms. IEEE Transactions on Information Theory 50(9).
- Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press.
- Yoav Freund and Robert Schapire. Large margin classification using the perceptron algorithm. In Proceedings of COLT. ACM Press.
- Adam T. Kalai, Santosh Vempala. Efficient algorithms for online decision problems. J. Comput. Syst. Sci. 71(3).
- Nick Littlestone. From On-Line to Batch Learning. COLT 1989.
- Nick Littlestone. Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm. Machine Learning (2).
- Nick Littlestone, Manfred K. Warmuth. The Weighted Majority Algorithm. FOCS 1989.
- Tom Mitchell. Machine Learning. McGraw Hill.
- A. B. Novikoff (1962). On convergence proofs on perceptrons. Symposium on the Mathematical Theory of Automata, 12. Polytechnic Institute of Brooklyn.