Foundations of Machine Learning: Boosting. Mehryar Mohri, Courant Institute and Google Research

1 Foundations of Machine Learning: Boosting. Mehryar Mohri, Courant Institute and Google Research

2 Weak Learning (Kearns and Valiant, 1994)

Definition: a concept class $C$ is weakly PAC-learnable if there exists a (weak) learning algorithm $L$ and a fixed $\gamma > 0$ such that, for all $\delta > 0$, for all $c \in C$ and all distributions $D$, and for samples $S$ of size $m = \mathrm{poly}(1/\delta)$,
$$\Pr_{S \sim D^m}\Big[ R(h_S) \le \frac{1}{2} - \gamma \Big] \ge 1 - \delta.$$

3 Boosting Ideas

Finding simple, relatively accurate base classifiers is often not hard: weak learner.

Main ideas:
- use the weak learner to create a strong learner.
- combine the base classifiers returned by the weak learner (ensemble method).

But, how should the base classifiers be combined?

4 AdaBoost (Freund and Schapire, 1997)

$H \subseteq \{-1,+1\}^X$.

AdaBoost($S = ((x_1, y_1), \ldots, (x_m, y_m))$)
 1  for $i \leftarrow 1$ to $m$ do
 2      $D_1(i) \leftarrow \frac{1}{m}$
 3  for $t \leftarrow 1$ to $T$ do
 4      $h_t \leftarrow$ base classifier in $H$ with small error $\epsilon_t = \Pr_{i \sim D_t}[h_t(x_i) \ne y_i]$
 5      $\alpha_t \leftarrow \frac{1}{2}\log\frac{1-\epsilon_t}{\epsilon_t}$
 6      $Z_t \leftarrow 2[\epsilon_t(1-\epsilon_t)]^{\frac{1}{2}}$  (normalization factor)
 7      for $i \leftarrow 1$ to $m$ do
 8          $D_{t+1}(i) \leftarrow \frac{D_t(i)\exp(-\alpha_t y_i h_t(x_i))}{Z_t}$
 9  $f \leftarrow \sum_{s=1}^{T} \alpha_s h_s$
10  return $h = \mathrm{sgn}(f)$
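To make the pseudocode concrete, here is a minimal NumPy sketch of the AdaBoost loop (not from the slides; the names `adaboost`, `predict`, and the `weak_learner` interface are illustrative assumptions). Any routine returning a base classifier with weighted error below 1/2 can be plugged in, for instance the stump search sketched after the "Standard Use in Practice" slide.

```python
import numpy as np

def adaboost(X, y, weak_learner, T=100):
    """Sketch of AdaBoost (Freund and Schapire, 1997).

    X: (m, N) array; y: (m,) array with labels in {-1, +1}.
    weak_learner(X, y, D) must return a callable h with h(X) in {-1, +1}^m.
    Returns the list of (alpha_t, h_t) pairs defining f = sum_t alpha_t h_t.
    """
    m = len(y)
    D = np.full(m, 1.0 / m)                            # D_1(i) = 1/m
    ensemble = []
    for t in range(T):
        h = weak_learner(X, y, D)
        pred = h(X)
        eps = np.clip(D[pred != y].sum(), 1e-12, None) # eps_t = Pr_{i~D_t}[h_t(x_i) != y_i]
        if eps >= 0.5:                                 # no edge left: stop early
            break
        alpha = 0.5 * np.log((1 - eps) / eps)          # alpha_t
        D *= np.exp(-alpha * y * pred)
        D /= D.sum()                                   # division by Z_t
        ensemble.append((alpha, h))
    return ensemble

def predict(ensemble, X):
    """h = sgn(f) with f = sum_t alpha_t h_t."""
    f = sum(alpha * h(X) for alpha, h in ensemble)
    return np.sign(f)
```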

5 Notes

Distributions $D_t$ over the training sample:
- originally uniform.
- at each round, the weight of a misclassified example is increased.
- observation: $D_{t+1}(i) = \dfrac{e^{-y_i f_t(x_i)}}{m \prod_{s=1}^{t} Z_s}$, since
$$D_{t+1}(i) = \frac{D_t(i)\, e^{-\alpha_t y_i h_t(x_i)}}{Z_t} = \frac{D_{t-1}(i)\, e^{-\alpha_{t-1} y_i h_{t-1}(x_i)}\, e^{-\alpha_t y_i h_t(x_i)}}{Z_{t-1} Z_t} = \frac{1}{m}\,\frac{e^{-y_i \sum_{s=1}^{t}\alpha_s h_s(x_i)}}{\prod_{s=1}^{t} Z_s}.$$

Weight assigned to base classifier $h_t$: $\alpha_t$ directly depends on the accuracy of $h_t$ at round $t$.

6 Illustration [figures: rounds t = 1 and t = 2]

7 Illustration (continued) [figure: round t = ...]

8 Illustration (continued) [figure: final classifier, weighted combination $\alpha_1 h_1 + \alpha_2 h_2 + \alpha_3 h_3$]

9 Bound on Empirical Error (Freund and Schapire, 1997)

Theorem: the empirical error of the classifier output by AdaBoost verifies
$$\hat R(h) \le \exp\Big(-2\sum_{t=1}^{T}\Big(\frac{1}{2} - \epsilon_t\Big)^2\Big).$$
If further, for all $t \in [1,T]$, $\gamma \le (\frac{1}{2} - \epsilon_t)$, then $\hat R(h) \le \exp(-2\gamma^2 T)$.

$\gamma$ does not need to be known in advance: adaptive boosting.

10 Proof: since, as we saw, $D_{t+1}(i) = \frac{e^{-y_i f_t(x_i)}}{m\prod_{s=1}^t Z_s}$,
$$\hat R(h) = \frac{1}{m} \sum_{i=1}^m 1_{y_i f(x_i) \le 0} \le \frac{1}{m} \sum_{i=1}^m e^{-y_i f(x_i)} = \frac{1}{m} \sum_{i=1}^m \Big[m \prod_{t=1}^T Z_t\Big] D_{T+1}(i) = \prod_{t=1}^T Z_t.$$
Now, since $Z_t$ is a normalization factor,
$$Z_t = \sum_{i=1}^m D_t(i)\,e^{-\alpha_t y_i h_t(x_i)} = \sum_{i:\,y_i h_t(x_i)\ge 0} D_t(i)\,e^{-\alpha_t} + \sum_{i:\,y_i h_t(x_i)<0} D_t(i)\,e^{\alpha_t} = (1-\epsilon_t)e^{-\alpha_t} + \epsilon_t e^{\alpha_t} = (1-\epsilon_t)\sqrt{\tfrac{\epsilon_t}{1-\epsilon_t}} + \epsilon_t\sqrt{\tfrac{1-\epsilon_t}{\epsilon_t}} = 2\sqrt{\epsilon_t(1-\epsilon_t)}.$$

11 Thus,
$$\prod_{t=1}^T Z_t = \prod_{t=1}^T 2\sqrt{\epsilon_t(1-\epsilon_t)} = \prod_{t=1}^T \sqrt{1 - 4\big(\tfrac{1}{2}-\epsilon_t\big)^2} \le \prod_{t=1}^T \exp\Big(-2\big(\tfrac{1}{2}-\epsilon_t\big)^2\Big) = \exp\Big(-2\sum_{t=1}^T\big(\tfrac{1}{2}-\epsilon_t\big)^2\Big).$$

Notes:
- $\alpha_t$ is the minimizer of $\alpha \mapsto (1-\epsilon_t)e^{-\alpha} + \epsilon_t e^{\alpha}$.
- since $(1-\epsilon_t)e^{-\alpha_t} = \epsilon_t e^{\alpha_t}$, at each round AdaBoost assigns the same probability mass to correctly classified and misclassified instances.
- for base classifiers $x \mapsto [-1,+1]$, $\alpha_t$ can be similarly chosen to minimize $Z_t$.
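A quick numeric check of the two facts used in this step (an illustrative snippet, not part of the slides): $2\sqrt{\epsilon(1-\epsilon)} = \sqrt{1-4(\frac{1}{2}-\epsilon)^2}$, and, since $1-u\le e^{-u}$, this is at most $\exp(-2(\frac{1}{2}-\epsilon)^2)$.

```python
import numpy as np

eps = np.linspace(0.01, 0.99, 99)
z = 2 * np.sqrt(eps * (1 - eps))                 # Z_t for error eps_t
identity = np.sqrt(1 - 4 * (0.5 - eps) ** 2)     # same quantity, rewritten
bound = np.exp(-2 * (0.5 - eps) ** 2)            # exponential upper bound
assert np.allclose(z, identity)
assert np.all(z <= bound + 1e-12)
```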

12 AdaBoost = Coordinate Descent

Objective function: convex and differentiable,
$$F(\bar\alpha) = \frac{1}{m}\sum_{i=1}^m e^{-y_i f(x_i)} = \frac{1}{m}\sum_{i=1}^m e^{-y_i\sum_{j=1}^N \bar\alpha_j h_j(x_i)}.$$
[figure: $e^{-x}$ upper-bounds the 0-1 loss]

13 Direction: unit vector $e_k$ with best directional derivative:
$$F'(\bar\alpha_{t-1}, e_k) = \lim_{\eta\to 0}\frac{F(\bar\alpha_{t-1}+\eta e_k) - F(\bar\alpha_{t-1})}{\eta}.$$
Since $F(\bar\alpha_{t-1}+\eta e_k) = \frac{1}{m}\sum_{i=1}^m e^{-y_i\sum_{j=1}^N\bar\alpha_{t-1,j}h_j(x_i) - \eta y_i h_k(x_i)}$,
$$F'(\bar\alpha_{t-1}, e_k) = -\frac{1}{m}\sum_{i=1}^m y_i h_k(x_i)\,e^{-y_i\sum_{j=1}^N\bar\alpha_{t-1,j}h_j(x_i)} = -\sum_{i=1}^m y_i h_k(x_i)\,D_t(i)\,\bar Z_t,$$
where $\bar Z_t = \prod_{s=1}^{t-1} Z_s$. Thus,
$$F'(\bar\alpha_{t-1}, e_k) = -\Big[\sum_{i=1}^m D_t(i)\,1_{y_i h_k(x_i)=+1} - \sum_{i=1}^m D_t(i)\,1_{y_i h_k(x_i)=-1}\Big]\bar Z_t = -\big[(1-\epsilon_{t,k}) - \epsilon_{t,k}\big]\bar Z_t = \big[2\epsilon_{t,k}-1\big]\bar Z_t.$$
Thus, the best direction corresponds to the base classifier with the smallest error.

14 Step size: $\eta$ chosen to minimize $F(\bar\alpha_{t-1}+\eta e_k)$:
$$\frac{dF(\bar\alpha_{t-1}+\eta e_k)}{d\eta} = 0 \iff -\sum_{i=1}^m y_i h_k(x_i)\,e^{-y_i\sum_{j=1}^N\bar\alpha_{t-1,j}h_j(x_i)}\,e^{-\eta y_i h_k(x_i)} = 0$$
$$\iff -\sum_{i=1}^m y_i h_k(x_i)\,D_t(i)\,\bar Z_t\,e^{-\eta y_i h_k(x_i)} = 0 \iff -\sum_{i=1}^m y_i h_k(x_i)\,D_t(i)\,e^{-\eta y_i h_k(x_i)} = 0$$
$$\iff -\big[(1-\epsilon_{t,k})e^{-\eta} - \epsilon_{t,k}e^{\eta}\big] = 0 \iff \eta = \frac{1}{2}\log\frac{1-\epsilon_{t,k}}{\epsilon_{t,k}}.$$
Thus, the step size matches the base classifier weight of AdaBoost.
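A small numeric check of the closed-form step size (an illustrative sketch; the random data, `f_prev`, and `h_k` are assumptions): minimizing the restricted exponential loss $\eta \mapsto F(\bar\alpha_{t-1}+\eta e_k)$ numerically recovers $\frac{1}{2}\log\frac{1-\epsilon_{t,k}}{\epsilon_{t,k}}$.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
m = 200
y = rng.choice([-1, 1], size=m)
h_k = rng.choice([-1, 1], size=m)        # predictions of the k-th base classifier
f_prev = rng.normal(size=m)              # arbitrary values f_{t-1}(x_i)

D = np.exp(-y * f_prev)
D /= D.sum()                             # distribution D_t
eps = D[h_k != y].sum()                  # weighted error eps_{t,k}

F = lambda eta: np.mean(np.exp(-y * (f_prev + eta * h_k)))   # loss along direction e_k
eta_star = minimize_scalar(F).x
print(eta_star, 0.5 * np.log((1 - eps) / eps))   # the two values agree up to solver tolerance
```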

15 Alternative Loss Functions

- boosting loss: $x \mapsto e^{-x}$
- logistic loss: $x \mapsto \log_2(1+e^{-x})$
- square loss: $x \mapsto (1-x)^2\,1_{x\le 1}$
- hinge loss: $x \mapsto \max(1-x, 0)$
- zero-one loss: $x \mapsto 1_{x<0}$
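The same losses written as functions of the margin $u = y f(x)$ (a direct transcription into NumPy; the multiplication by the boolean arrays reproduces the indicators $1_{u\le 1}$ and $1_{u<0}$):

```python
import numpy as np

boosting_loss = lambda u: np.exp(-u)
logistic_loss = lambda u: np.log2(1 + np.exp(-u))
square_loss   = lambda u: (1 - u) ** 2 * (u <= 1)
hinge_loss    = lambda u: np.maximum(1 - u, 0)
zero_one_loss = lambda u: (u < 0).astype(float)
```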

16 Standard Use in Practice

Base learners: decision trees, quite often just decision stumps (trees of depth one).

Boosting stumps (a code sketch follows below):
- data in $\mathbb R^N$, e.g., $N = 2$, (height(x), weight(x)).
- associate a stump to each component.
- pre-sort each component: $O(mN\log m)$.
- total complexity: $O((m\log m)N + mNT)$.
- at each round, find the best component and threshold.
- stumps are not always weak learners: think of the XOR example!
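A sketch of the stump search described above (illustrative; the name `best_stump` and the tie handling are my assumptions): each component is sorted, then a single pass over the sorted values finds the best (component, threshold, sign) under the current distribution D. Wrapped as `lambda X, y, D: best_stump(X, y, D)[1]`, it can serve as the `weak_learner` of the earlier AdaBoost sketch.

```python
import numpy as np

def best_stump(X, y, D):
    """Best threshold stump h(x) = s * sign(x_j - theta) under distribution D.

    X: (m, N) array, y in {-1, +1}, D a probability vector of length m.
    Sorting each component costs O(m log m); the scan itself is O(m) per component.
    Ties between equal feature values are ignored for simplicity.
    """
    m, N = X.shape
    best = (1.0, None)                               # (weighted error, parameters)
    for j in range(N):
        order = np.argsort(X[:, j])
        xs, ys, ws = X[order, j], y[order], D[order]
        pos = np.cumsum(ws * (ys == 1))              # positive weight at or below each split
        neg = np.cumsum(ws * (ys == -1))             # negative weight at or below each split
        # split after position k: points 0..k-1 predicted -1, the rest +1 (for s = +1)
        errs_plus = np.concatenate(([neg[-1]], pos + (neg[-1] - neg)))
        for s, errs in ((+1, errs_plus), (-1, 1.0 - errs_plus)):
            k = int(np.argmin(errs))
            if errs[k] < best[0]:
                theta = -np.inf if k == 0 else xs[k - 1]
                best = (errs[k], (j, theta, s))
    err, (j, theta, s) = best
    h = lambda Z, j=j, theta=theta, s=s: s * np.where(Z[:, j] > theta, 1, -1)
    return err, h
```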

17 Overfitting?

Assume that $\mathrm{VCdim}(H) = d$ and, for a fixed $T$, define
$$\mathcal F_T = \Big\{\mathrm{sgn}\Big(\sum_{t=1}^T \alpha_t h_t + b\Big) : \alpha_t, b\in\mathbb R,\ h_t\in H\Big\}.$$
$\mathcal F_T$ can form a very rich family of classifiers. It can be shown (Freund and Schapire, 1997) that
$$\mathrm{VCdim}(\mathcal F_T) \le 2(d+1)(T+1)\log_2((T+1)e).$$
This suggests that AdaBoost could overfit for large values of $T$, and that is in fact observed in some cases, but in various others it is not!

18 Empirical Observations

Several empirical observations (not all): AdaBoost does not seem to overfit; furthermore, the test error often keeps decreasing after the training error has reached zero.
[figure: training and test error vs. number of rounds, C4.5 decision trees (Schapire et al., 1998)]

19 Rademacher Complexity of Convex Hulls

Theorem: let $H$ be a set of functions mapping from $X$ to $\mathbb R$, and let the convex hull of $H$ be defined as
$$\mathrm{conv}(H) = \Big\{\sum_{k=1}^p \mu_k h_k : p\ge 1,\ \mu_k\ge 0,\ \sum_{k=1}^p\mu_k\le 1,\ h_k\in H\Big\}.$$
Then, for any sample $S$, $\widehat{\mathfrak R}_S(\mathrm{conv}(H)) = \widehat{\mathfrak R}_S(H)$.

Proof:
$$\widehat{\mathfrak R}_S(\mathrm{conv}(H)) = \frac{1}{m}\,\mathbb E_{\sigma}\Big[\sup_{h_k\in H,\,\mu\ge 0,\,\|\mu\|_1\le 1}\ \sum_{i=1}^m\sigma_i\sum_{k=1}^p\mu_k h_k(x_i)\Big] = \frac{1}{m}\,\mathbb E_{\sigma}\Big[\sup_{h_k\in H}\ \sup_{\mu\ge 0,\,\|\mu\|_1\le 1}\ \sum_{k=1}^p\mu_k\sum_{i=1}^m\sigma_i h_k(x_i)\Big]$$
$$= \frac{1}{m}\,\mathbb E_{\sigma}\Big[\sup_{h_k\in H}\ \max_{k\in[1,p]}\ \sum_{i=1}^m\sigma_i h_k(x_i)\Big] = \frac{1}{m}\,\mathbb E_{\sigma}\Big[\sup_{h\in H}\ \sum_{i=1}^m\sigma_i h(x_i)\Big] = \widehat{\mathfrak R}_S(H).$$

20 Margin Bound - Ensemble Methods (Koltchinskii and Panchenko, 2002)

Corollary: let $H$ be a set of real-valued functions. Fix $\rho > 0$. For any $\delta > 0$, with probability at least $1-\delta$, the following holds for all $h\in\mathrm{conv}(H)$:
$$R(h) \le \hat R_\rho(h) + \frac{2}{\rho}\,\mathfrak R_m(H) + \sqrt{\frac{\log\frac{1}{\delta}}{2m}}$$
$$R(h) \le \hat R_\rho(h) + \frac{2}{\rho}\,\widehat{\mathfrak R}_S(H) + 3\sqrt{\frac{\log\frac{2}{\delta}}{2m}}.$$
Proof: direct consequence of the margin bound of Lecture 4 and $\widehat{\mathfrak R}_S(\mathrm{conv}(H)) = \widehat{\mathfrak R}_S(H)$.

21 Margin Bound - Ensemble Methods (Koltchinskii and Panchenko, 2002); see also (Schapire et al., 1998)

Corollary: let $H$ be a family of functions taking values in $\{-1,+1\}$ with VC dimension $d$. Fix $\rho > 0$. For any $\delta > 0$, with probability at least $1-\delta$, the following holds for all $h\in\mathrm{conv}(H)$:
$$R(h) \le \hat R_\rho(h) + \frac{2}{\rho}\sqrt{\frac{2d\log\frac{em}{d}}{m}} + \sqrt{\frac{\log\frac{1}{\delta}}{2m}}.$$
Proof: follows directly from the previous corollary and the VC-dimension bound on the Rademacher complexity (see Lecture 3).

22 Notes

- All of these bounds can be generalized to hold uniformly for all $\rho\in(0,1)$, at the cost of an additional term $\sqrt{\frac{\log\log_2\frac{2}{\rho}}{m}}$ and other minor constant factor changes (Koltchinskii and Panchenko, 2002).
- For AdaBoost, the bound applies to the functions
$$x \mapsto \frac{f(x)}{\|\bar\alpha\|_1} = \frac{\sum_{t=1}^T\alpha_t h_t(x)}{\|\bar\alpha\|_1} \in \mathrm{conv}(H).$$
- Note that $T$ does not appear in the bound.

23 Margin Distribution

Theorem: for any $\rho > 0$, the following holds:
$$\widehat{\Pr}\Big[\frac{y f(x)}{\|\bar\alpha\|_1}\le\rho\Big] \le 2^T\prod_{t=1}^T\sqrt{\epsilon_t^{1-\rho}(1-\epsilon_t)^{1+\rho}}.$$
Proof: using the identity $D_{T+1}(i) = \frac{e^{-y_i f(x_i)}}{m\prod_{t=1}^T Z_t}$,
$$\frac{1}{m}\sum_{i=1}^m 1_{y_i f(x_i)\le\rho\|\bar\alpha\|_1} \le \frac{1}{m}\sum_{i=1}^m\exp\big(-y_i f(x_i) + \rho\|\bar\alpha\|_1\big) = \frac{e^{\rho\|\bar\alpha\|_1}}{m}\sum_{i=1}^m\Big[m\prod_{t=1}^T Z_t\Big]D_{T+1}(i)$$
$$= e^{\rho\|\bar\alpha\|_1}\prod_{t=1}^T Z_t = \prod_{t=1}^T e^{\rho\alpha_t}\,2\sqrt{\epsilon_t(1-\epsilon_t)} = 2^T\prod_{t=1}^T\Big(\frac{1-\epsilon_t}{\epsilon_t}\Big)^{\rho/2}\sqrt{\epsilon_t(1-\epsilon_t)} = 2^T\prod_{t=1}^T\sqrt{\epsilon_t^{1-\rho}(1-\epsilon_t)^{1+\rho}}.$$

24 Notes

- If, for all $t\in[1,T]$, $\gamma\le(\frac{1}{2}-\epsilon_t)$, then the upper bound can be bounded by
$$\widehat{\Pr}\Big[\frac{y f(x)}{\|\bar\alpha\|_1}\le\rho\Big] \le \Big[(1-2\gamma)^{1-\rho}(1+2\gamma)^{1+\rho}\Big]^{T/2}.$$
- For $\rho < \gamma$, $(1-2\gamma)^{1-\rho}(1+2\gamma)^{1+\rho} < 1$ and the bound decreases exponentially in $T$ (see the numeric sketch below).
- For the bound to be convergent: $\rho \gg O(1/\sqrt m)$, thus $\gamma \gg O(1/\sqrt m)$ is roughly the condition on the edge value.
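A tiny helper (the name `margin_bound` is mine, not from the slides) evaluating the bound above; with $\rho < \gamma$ the base is below 1, so the bound on the fraction of small-margin points vanishes as $T$ grows.

```python
import numpy as np

def margin_bound(rho, gamma, T):
    """Bound on the fraction of training points with L1-margin <= rho,
    assuming every round has edge at least gamma (i.e., eps_t <= 1/2 - gamma)."""
    base = (1 - 2 * gamma) ** (1 - rho) * (1 + 2 * gamma) ** (1 + rho)
    return base ** (T / 2)

for T in (10, 50, 100):
    print(T, margin_bound(rho=0.05, gamma=0.1, T=T))   # decreases exponentially in T
```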

25 Outliers

AdaBoost assigns larger weights to harder examples.

Application: detecting mislabeled examples.

Dealing with noisy data: regularization based on the average weight assigned to a point (soft-margin idea for boosting) (Meir and Rätsch, 2003).

26 L1-Geometric Margin

Definition: the $L_1$-margin $\rho_f(x)$ of a linear function $f = \sum_{t=1}^T\alpha_t h_t$ with $\bar\alpha\ne 0$ at a point $x\in X$ is defined by
$$\rho_f(x) = \frac{|f(x)|}{\|\bar\alpha\|_1} = \frac{\big|\sum_{t=1}^T\alpha_t h_t(x)\big|}{\|\bar\alpha\|_1} = \frac{|\bar\alpha\cdot\mathbf h(x)|}{\|\bar\alpha\|_1}.$$
The $L_1$-margin of $f$ over a sample $S = (x_1,\ldots,x_m)$ is its minimum margin at points in that sample:
$$\rho_f = \min_{i\in[1,m]}\rho_f(x_i) = \min_{i\in[1,m]}\frac{|\bar\alpha\cdot\mathbf h(x_i)|}{\|\bar\alpha\|_1}.$$

27 SVM vs AdaBoost

- features or base hypotheses: SVM: $\Phi(x) = (\Phi_1(x),\ldots,\Phi_N(x))^\top$; AdaBoost: $\mathbf h(x) = (h_1(x),\ldots,h_N(x))^\top$.
- predictor: SVM: $x\mapsto w\cdot\Phi(x)$; AdaBoost: $x\mapsto\bar\alpha\cdot\mathbf h(x)$.
- geometric margin: SVM: $\frac{|w\cdot\Phi(x)|}{\|w\|_2} = d_2(\Phi(x),\text{hyperpl.})$; AdaBoost: $\frac{|\bar\alpha\cdot\mathbf h(x)|}{\|\bar\alpha\|_1} = d_\infty(\mathbf h(x),\text{hyperpl.})$.
- confidence margin: SVM: $y\,(w\cdot\Phi(x))$; AdaBoost: $y\,(\bar\alpha\cdot\mathbf h(x))$.
- regularization: SVM: $\|w\|_2$; AdaBoost: $\|\bar\alpha\|_1$ (L1-AB).

28 Maximum-Margin Solutions [figures: maximum-margin solutions under the two norms]

29 But, Does AdaBoost Maximize the Margin?

No: AdaBoost may converge to a margin that is significantly below the maximum margin (Rudin et al., 2004) (e.g., 1/3 instead of 3/8)!

Lower bound: AdaBoost can asymptotically achieve a margin that is at least $\rho_{\max}/2$ if the data is separable and some conditions on the base learners hold (Rätsch and Warmuth, 2002).

Several boosting-type margin-maximization algorithms exist, but their performance in practice is not clear or not reported.

30 AdaBoost's Weak Learning Condition

Definition: the edge of a base classifier $h_t$ for a distribution $D$ over the training sample is
$$\gamma(t) = \frac{1}{2} - \epsilon_t = \frac{1}{2}\sum_{i=1}^m y_i h_t(x_i)\,D(i).$$
Condition: there exists $\gamma > 0$ such that, for any distribution $D$ over the training sample and any base classifier $h_t$ returned by the weak learner, $\gamma(t) \ge \gamma$.

31 Zero-Sum Games

Definition: payoff matrix $M = (M_{ij}) \in \mathbb R^{m\times n}$:
- $m$ possible actions (pure strategies) for the row player.
- $n$ possible actions for the column player.
- $M_{ij}$: payoff for the row player (= loss for the column player) when row plays $i$ and column plays $j$.

Example (rock-paper-scissors, payoff to the row player):

              rock   paper   scissors
    rock        0     -1       +1
    paper      +1      0       -1
    scissors   -1     +1        0

32 Mixed Strategies

Definition: the row player selects a distribution $p$ over the rows, the column player a distribution $q$ over the columns. The expected payoff for the row player is
$$\mathbb E_{i\sim p,\,j\sim q}[M_{ij}] = \sum_{i=1}^m\sum_{j=1}^n p_i M_{ij} q_j = p^\top M q.$$
von Neumann's minimax theorem (von Neumann, 1928):
$$\max_p\min_q p^\top M q = \min_q\max_p p^\top M q,$$
or equivalently:
$$\max_p\min_{j\in[1,n]} p^\top M e_j = \min_q\max_{i\in[1,m]} e_i^\top M q.$$
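The equivalent form $\max_p \min_{j} p^\top M e_j$ is a linear program. As an illustration (not from the slides; scipy.optimize.linprog is used here only as a convenient LP solver), the rock-paper-scissors game above yields the uniform strategy and value 0:

```python
import numpy as np
from scipy.optimize import linprog

# rock-paper-scissors payoff matrix for the row player
M = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]])
m, n = M.shape

# variables (p_1, ..., p_m, v); maximize v  <=>  minimize -v
c = np.r_[np.zeros(m), -1.0]
A_ub = np.c_[-M.T, np.ones(n)]                 # v <= (p^T M)_j for every column j
b_ub = np.zeros(n)
A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)   # sum_i p_i = 1
b_eq = [1.0]
bounds = [(0, None)] * m + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
p, v = res.x[:m], res.x[m]
print(p, v)                                    # (1/3, 1/3, 1/3) and game value 0
```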

33 John von Neumann (1903-1957) [photo]

34 AdaBoost and Game Theory

Game:
- Player A: selects point $x_i$, $i\in[1,m]$.
- Player B: selects base learner $h_t$, $t\in[1,T]$.
- Payoff matrix $M\in\{-1,+1\}^{m\times T}$: $M_{it} = y_i h_t(x_i)$.

von Neumann's theorem (assume a finite $H$):
$$2\gamma = \min_D\max_{h\in H}\sum_{i=1}^m D(i)\,y_i h(x_i) = \max_{\bar\alpha}\min_{i\in[1,m]}\frac{y_i\sum_{t=1}^T\alpha_t h_t(x_i)}{\|\bar\alpha\|_1} = \rho.$$

35 Consequences

- Weak learning condition = non-zero margin; thus, it is possible to search for a non-zero margin.
- AdaBoost = (suboptimal) search for the corresponding $\bar\alpha$; it achieves at least half of the maximum margin.
- Weak learning condition = strong condition: the condition implies linear separability with margin $2\gamma > 0$.

36 Linear Programming Problem

Maximizing the margin:
$$\rho = \max_{\bar\alpha}\min_{i\in[1,m]}\frac{y_i\,(\bar\alpha\cdot x_i)}{\|\bar\alpha\|_1}.$$
This is equivalent to the following convex optimization LP problem (a solver sketch follows below):
$$\max_{\bar\alpha,\,\rho}\ \rho \quad \text{subject to: } y_i\,(\bar\alpha\cdot x_i)\ge\rho,\ i\in[1,m];\quad \|\bar\alpha\|_1 = 1.$$
Note that
$$\frac{|\bar\alpha\cdot x|}{\|\bar\alpha\|_1} = d_\infty(x, H), \quad\text{with } H = \{x\colon \bar\alpha\cdot x = 0\}.$$
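A sketch of this LP with a generic solver (illustrative; the split $\bar\alpha = \alpha^+ - \alpha^-$ with $\alpha^\pm \ge 0$ turns $\|\bar\alpha\|_1 = 1$ into the linear constraint $\sum_j(\alpha_j^+ + \alpha_j^-) = 1$, which does not change the optimum since the margin is scale-invariant). Here the rows of X play the role of the vectors $x_i$, e.g., the base-classifier predictions $\mathbf h(x_i)$.

```python
import numpy as np
from scipy.optimize import linprog

def max_l1_margin(X, y):
    """Maximum L1-margin hyperplane through the origin (LP of this slide).

    X: (m, N) array, y in {-1, +1}. Returns (rho, alpha);
    rho <= 0 indicates that the data is not linearly separable.
    """
    m, N = X.shape
    # variables (alpha_plus (N), alpha_minus (N), rho); maximize rho
    c = np.r_[np.zeros(2 * N), -1.0]
    # constraints: rho - y_i ((alpha_plus - alpha_minus) . x_i) <= 0
    A_ub = np.c_[-(y[:, None] * X), y[:, None] * X, np.ones(m)]
    b_ub = np.zeros(m)
    A_eq = np.r_[np.ones(2 * N), 0.0].reshape(1, -1)   # ||alpha||_1 = 1
    b_eq = [1.0]
    bounds = [(0, None)] * (2 * N) + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:N] - res.x[N:2 * N]
```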

37 Advantages of AdaBoost

- Simple: straightforward implementation.
- Efficient: complexity $O(mNT)$ for stumps; when $m$ and $N$ are not too large, the algorithm is quite fast.
- Theoretical guarantees: but still many questions. AdaBoost is not designed to maximize the margin; hence regularized versions of AdaBoost.

38 Weaker Aspects

- Parameters: need to determine $T$, the number of rounds of boosting (stopping criterion); need to determine the base learners: risk of overfitting or of low margins.
- Noise: severely damages the accuracy of AdaBoost (Dietterich, 2000).

39 Other Boosting Algorithms

- arc-gv (Breiman, 1996): designed to maximize the margin, but outperformed by AdaBoost in experiments (Reyzin and Schapire, 2006).
- L1-regularized AdaBoost (Rätsch et al., 2001): outperforms AdaBoost in experiments (Cortes et al., 2014).
- DeepBoost (Cortes et al., 2014): more favorable learning guarantees; outperforms both AdaBoost and L1-regularized AdaBoost in experiments.

40 References

- Corinna Cortes, Mehryar Mohri, and Umar Syed. Deep boosting. In ICML, 2014.
- Leo Breiman. Bagging predictors. Machine Learning, 24(2), 1996.
- Thomas G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Machine Learning, 40(2), 2000.
- Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 1997.
- G. Lebanon and J. Lafferty. Boosting and maximum likelihood for exponential models. In NIPS, 2001.
- Ron Meir and Gunnar Rätsch. An introduction to boosting and leveraging. In Advanced Lectures on Machine Learning (LNAI 2600), 2003.
- J. von Neumann. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100, 1928.

41 References

- Cynthia Rudin, Ingrid Daubechies, and Robert E. Schapire. The dynamics of AdaBoost: cyclic behavior and convergence of margins. Journal of Machine Learning Research, 5, 2004.
- Gunnar Rätsch and Manfred K. Warmuth. Maximizing the margin with boosting. In Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002), Sydney, Australia, 2002.
- Lev Reyzin and Robert E. Schapire. How boosting the margin can also boost classifier complexity. In ICML, 2006.
- Robert E. Schapire. The boosting approach to machine learning: an overview. In D. D. Denison, M. H. Hansen, C. Holmes, B. Mallick, and B. Yu, editors, Nonlinear Estimation and Classification. Springer, 2003.
- Robert E. Schapire and Yoav Freund. Boosting: Foundations and Algorithms. The MIT Press, 2012.
- Robert E. Schapire, Yoav Freund, Peter Bartlett, and Wee Sun Lee. Boosting the margin: a new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5), 1998.
