Machine Learning Lecture 10

Neural Networks (Transcription)

Machine Learning Lecture 10: Neural Networks
26.11.2018
Bastian Leibe, RWTH Aachen
http://www.vision.rwth-aachen.de, leibe@vision.rwth-aachen.de

Today's Topic: Deep Learning

Course Outline
Fundamentals: Bayes Decision Theory, Probability Density Estimation
Classification Approaches: Linear Discriminants, Support Vector Machines
Ensemble Methods & Boosting: Random Forests
Deep Learning: Foundations, Convolutional Neural Networks, Recurrent Neural Networks

Recap: AdaBoost ("Adaptive Boosting")
Main idea [Freund & Schapire, 1996]: Iteratively select an ensemble of component classifiers. After each iteration, reweight misclassified training examples: either increase their chance of being selected in a sampled training set, or increase their misclassification cost when training on the full set.
Components h_m(x): "weak" or base classifiers. Condition: less than 50% training error over any distribution.
H(x): "strong" or final classifier.
AdaBoost constructs the strong classifier as a thresholded linear combination of the weighted weak classifiers:
H(x) = \mathrm{sign}\left( \sum_{m=1}^{M} \alpha_m h_m(x) \right)
(A small code illustration of this combination rule appears at the end of this recap.)

Recap: AdaBoost Algorithm
1. Initialization: Set w_n^{(1)} = 1/N for n = 1, ..., N.
2. For m = 1, ..., M iterations:
   a) Train a new weak classifier h_m(x) using the current weighting coefficients W^{(m)} by minimizing the weighted error function
      J_m = \sum_{n=1}^{N} w_n^{(m)} I(h_m(x_n) \neq t_n)
   b) Estimate the weighted error of this classifier on X:
      \epsilon_m = \frac{\sum_n w_n^{(m)} I(h_m(x_n) \neq t_n)}{\sum_n w_n^{(m)}}
   c) Calculate a weighting coefficient for h_m(x): \alpha_m = ?
   d) Update the weighting coefficients: w_n^{(m+1)} = ?
How should we do this exactly?

Recap: Minimizing Exponential Error
The original algorithm used an exponential error function
E = \sum_{n=1}^{N} \exp\{ -t_n f_m(x_n) \}
where f_m(x) is a classifier defined as a linear combination of base classifiers h_l(x):
f_m(x) = \frac{1}{2} \sum_{l=1}^{m} \alpha_l h_l(x)
Goal: Minimize E with respect to both the weighting coefficients \alpha_l and the parameters of the base classifiers h_l(x).
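To make the combination rule H(x) = sign(sum_m alpha_m h_m(x)) concrete, here is a minimal Python sketch that evaluates a strong classifier from a list of weak classifiers and their coefficients. The function name and the toy stumps are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def strong_classifier(x, weak_learners, alphas):
    """Evaluate H(x) = sign(sum_m alpha_m * h_m(x)) for a single input x.

    weak_learners: list of callables h_m mapping x -> {-1, +1}
    alphas:        list of the corresponding coefficients alpha_m
    """
    score = sum(a * h(x) for a, h in zip(alphas, weak_learners))
    return np.sign(score)

# Toy usage: three hand-made stumps on a 1-D input.
weak_learners = [lambda x: 1 if x > 0.2 else -1,
                 lambda x: 1 if x > 0.5 else -1,
                 lambda x: -1 if x > 0.8 else 1]
alphas = [0.7, 1.2, 0.4]
print(strong_classifier(0.6, weak_learners, alphas))  # -> 1.0
```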

Recap: Minimizing Exponential Error
Sequential minimization (continuation from last lecture): only minimize with respect to \alpha_m and h_m(x).
E = \sum_{n=1}^{N} \exp\{ -t_n f_m(x_n) \}   with   f_m(x) = \frac{1}{2} \sum_{l=1}^{m} \alpha_l h_l(x)
  = \sum_{n=1}^{N} \exp\left\{ -t_n f_{m-1}(x_n) - \frac{1}{2} t_n \alpha_m h_m(x_n) \right\}
  = \sum_{n=1}^{N} w_n^{(m)} \exp\left\{ -\frac{1}{2} t_n \alpha_m h_m(x_n) \right\}
where w_n^{(m)} = \exp\{ -t_n f_{m-1}(x_n) \} is constant in this step.

AdaBoost - Minimizing Exponential Error
Separating correctly and incorrectly classified points, this can be rewritten as
E = \left( e^{\alpha_m/2} - e^{-\alpha_m/2} \right) \sum_{n=1}^{N} w_n^{(m)} I(h_m(x_n) \neq t_n) + e^{-\alpha_m/2} \sum_{n=1}^{N} w_n^{(m)}
Minimize with respect to h_m(x): since the second term does not depend on h_m(x), this is equivalent to minimizing
J_m = \sum_{n=1}^{N} w_n^{(m)} I(h_m(x_n) \neq t_n)
(our weighted error function from step 2a) of the algorithm). We're on the right track. Let's continue...

AdaBoost - Minimizing Exponential Error
Minimize with respect to \alpha_m: setting \partial E / \partial \alpha_m = 0 gives
\frac{1}{2} \left( e^{\alpha_m/2} + e^{-\alpha_m/2} \right) \sum_n w_n^{(m)} I(h_m(x_n) \neq t_n) = \frac{1}{2} e^{-\alpha_m/2} \sum_n w_n^{(m)}
With the weighted error
\epsilon_m := \frac{\sum_n w_n^{(m)} I(h_m(x_n) \neq t_n)}{\sum_n w_n^{(m)}}
this yields (1 - \epsilon_m)/\epsilon_m = e^{\alpha_m}, i.e. the update for the coefficients:
\alpha_m = \ln \left\{ \frac{1 - \epsilon_m}{\epsilon_m} \right\}

AdaBoost - Minimizing Exponential Error
Remaining step: update the weights. Recall that
E = \sum_{n=1}^{N} w_n^{(m)} \exp\left\{ -\frac{1}{2} t_n \alpha_m h_m(x_n) \right\}
Therefore
w_n^{(m+1)} = w_n^{(m)} \exp\left\{ -\frac{1}{2} t_n \alpha_m h_m(x_n) \right\} = ... = w_n^{(m)} \exp\{ \alpha_m I(h_m(x_n) \neq t_n) \}
(up to a factor that is independent of n). This becomes w_n^{(m+1)} in the next iteration.

AdaBoost - Final Algorithm
1. Initialization: Set w_n^{(1)} = 1/N for n = 1, ..., N.
2. For m = 1, ..., M iterations:
   a) Train a new weak classifier h_m(x) using the current weighting coefficients W^{(m)} by minimizing the weighted error function
      J_m = \sum_{n=1}^{N} w_n^{(m)} I(h_m(x_n) \neq t_n)
   b) Estimate the weighted error of this classifier on X:
      \epsilon_m = \frac{\sum_n w_n^{(m)} I(h_m(x_n) \neq t_n)}{\sum_n w_n^{(m)}}
   c) Calculate a weighting coefficient for h_m(x):
      \alpha_m = \ln \left\{ \frac{1 - \epsilon_m}{\epsilon_m} \right\}
   d) Update the weighting coefficients:
      w_n^{(m+1)} = w_n^{(m)} \exp\{ \alpha_m I(h_m(x_n) \neq t_n) \}
(The complete loop is sketched in code at the end of this page.)

AdaBoost - Analysis
Result of this derivation: We now know that AdaBoost minimizes an exponential error function in a sequential fashion. This allows us to analyze AdaBoost's behavior in more detail. In particular, we can see how robust it is to outlier data points.
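The final algorithm translates into a short NumPy sketch. Decision stumps are used as the weak classifiers h_m(x) purely for illustration (the lecture does not prescribe a particular weak learner), and the helper names fit_stump and adaboost are made up; steps a) to d) follow the formulas derived above.

```python
import numpy as np

def fit_stump(X, t, w):
    """Illustrative weak learner: exhaustively pick the decision stump
    (feature, threshold, polarity) that minimizes the weighted error J_m."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) > 0, 1, -1)
                err = np.sum(w * (pred != t))
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    _, j, thr, pol = best
    return lambda X: np.where(pol * (X[:, j] - thr) > 0, 1, -1)

def adaboost(X, t, M=10):
    """Sequential AdaBoost as derived above; labels t must be in {-1, +1}."""
    N = len(t)
    w = np.full(N, 1.0 / N)                    # 1. initialization: w_n^(1) = 1/N
    learners, alphas = [], []
    for m in range(M):                         # 2. for m = 1, ..., M
        h = fit_stump(X, t, w)                 # a) train weak classifier on w^(m)
        miss = (h(X) != t).astype(float)
        eps = np.sum(w * miss) / np.sum(w)     # b) weighted error eps_m
        eps = np.clip(eps, 1e-10, 1 - 1e-10)   # guard against eps = 0 or 1
        alpha = np.log((1 - eps) / eps)        # c) alpha_m = ln((1 - eps_m)/eps_m)
        w = w * np.exp(alpha * miss)           # d) w_n^(m+1) = w_n^(m) exp{alpha_m I(...)}
        learners.append(h)
        alphas.append(alpha)
    # Strong classifier H(x) = sign(sum_m alpha_m h_m(x))
    return lambda X: np.sign(sum(a * h(X) for a, h in zip(alphas, learners)))

# Toy usage on a 1-D problem that a single stump cannot solve.
X = np.array([[0.1], [0.3], [0.4], [0.6], [0.8], [0.9]])
t = np.array([-1, -1, 1, 1, -1, -1])
H = adaboost(X, t, M=5)
print(H(X))   # boosted prediction for the training points
```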

Recap: Error Functions
Ideal misclassification error function (shown in black in the lecture plots, plotted against z_n = t_n y(x_n)): this is what we want to approximate. Unfortunately, it is not differentiable, and its gradient is zero for misclassified points, so we cannot minimize it by gradient descent.

Recap: Error Functions
Squared error, used in least-squares classification: very popular and leads to closed-form solutions. However, it is sensitive to outliers due to the squared penalty, and it penalizes "too correct" data points (points with large positive z_n). It generally does not lead to good classifiers.

Recap: Error Functions
Hinge error, used in SVMs: zero error for points outside the margin (z_n > 1), which favors sparse solutions; a linear penalty for misclassified points (z_n < 1), which gives robustness to outliers. It is not differentiable around z_n = 1 and therefore cannot be optimized directly.

Discussion: AdaBoost Error Function
Exponential error, used in AdaBoost: a continuous approximation to the ideal misclassification function. Sequential minimization leads to the simple AdaBoost scheme. Properties?

Discussion: AdaBoost Error Function
Exponential error, used in AdaBoost: no penalty for "too correct" data points, fast convergence. Disadvantage: an exponential penalty for large negative values of z_n, which makes it less robust to outliers or misclassified data points.

Discussion: Other Possible Error Functions
Cross-entropy error:
E = - \sum_{n} \{ t_n \ln y_n + (1 - t_n) \ln(1 - y_n) \}
Cross-entropy error is used in logistic regression. It is similar to the exponential error for z_n > 0, but only grows linearly with large negative values of z_n. Making AdaBoost more robust by switching to this error function leads to "GentleBoost".
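The error functions compared above can all be written as functions of the margin quantity z_n = t_n y(x_n). The short sketch below is not from the lecture; the function names, the (1 - z)^2 form of the squared error (which assumes targets in {-1, +1}), and the rescaled logistic form of the cross-entropy error are my own choices. It evaluates each error so that their behavior for strongly misclassified points (large negative z) can be compared numerically.

```python
import numpy as np

def ideal_error(z):          # ideal misclassification error: 1 on the wrong side, else 0
    return (z <= 0).astype(float)

def squared_error(z):        # squared error as a function of z = t*y: (1 - z)^2 for t in {-1, +1}
    return (1 - z) ** 2

def hinge_error(z):          # hinge error used in SVMs: max(0, 1 - z)
    return np.maximum(0, 1 - z)

def exponential_error(z):    # exponential error used in AdaBoost: exp(-z)
    return np.exp(-z)

def cross_entropy_error(z):  # cross-entropy / logistic error, rescaled form: ln(1 + exp(-z))
    return np.log1p(np.exp(-z))

z = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
for name, f in [("ideal", ideal_error), ("squared", squared_error),
                ("hinge", hinge_error), ("exponential", exponential_error),
                ("cross-entropy", cross_entropy_error)]:
    print(f"{name:13s}", np.round(f(z), 3))
```

Note how the exponential error explodes for z = -3 while the hinge and cross-entropy errors grow only linearly, which is exactly the robustness argument made on the slides.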

Summary: AdaBoost
Properties: A simple combination of multiple classifiers that is easy to implement. It can be used with many different types of classifiers, and none of them needs to be too good on its own; in fact, they only have to be slightly better than chance. It is commonly used in many areas and shows empirically good generalization capabilities.
Limitations: The original AdaBoost is sensitive to misclassified training data points because of the exponential error function; GentleBoost improves on this. Single-class classifier; multiclass extensions are available.

Today's Topic: Deep Learning

Topics of This Lecture
Perceptrons: definition, loss functions, regularization
Multi-Layer Perceptrons: definition, learning with hidden units
Obtaining the Gradients: naive analytical differentiation, numerical differentiation, backpropagation

A Brief History of Neural Networks
1957: Rosenblatt invents the Perceptron, and a cool learning algorithm: "Perceptron Learning". Hardware implementation: the "Mark I Perceptron" for pixel image analysis. "The embryo of an electronic computer that [...] will be able to walk, talk, see, write, reproduce itself and be conscious of its existence." (Image source: Wikipedia, clipartpanda.com)

1957: Rosenblatt invents the Perceptron.
1969: Minsky & Papert show that (single-layer) Perceptrons cannot solve all problems. This was misunderstood by many to mean that Perceptrons were worthless. ("Neural Networks don't work!")

1957: Rosenblatt invents the Perceptron.
1969: Minsky & Papert.
1980s: Resurgence of Neural Networks, with some notable successes with multi-layer perceptrons and the backpropagation learning algorithm. ("OMG! They work like the human brain!" "Oh no! Killer robots will achieve world domination!") (Image sources: colourbox.de, thinkstock, clipartpanda.com, cliparts.co)

1957: Rosenblatt invents the Perceptron.
1969: Minsky & Papert.
1980s: Resurgence of Neural Networks. Some notable successes with multi-layer perceptrons and the backpropagation learning algorithm. But they are hard to train, tend to overfit, and have unintuitive parameters. So, the excitement fades again... ("sigh!")

Interest shifts to other learning methods, notably Support Vector Machines. Machine Learning becomes a discipline of its own. ("I can do science, me!") (Image source: clipartof.com, colourbox.de)

The general public and the press still love Neural Networks. ("I'm doing Machine Learning." "So, you're using Neural Networks?" "Actually..." "Are you using Neural Networks?" "Come on. Get real!") (Image source: clipartof.com)

Gradual progress: better understanding of how to successfully train deep networks, and the availability of large datasets and powerful GPUs. Still largely under the radar for many disciplines applying ML.

2012: Breakthrough results. In the ImageNet Large Scale Visual Recognition Challenge, a ConvNet halves the error rate of dedicated vision approaches. Deep Learning is widely adopted. ("It works!") (Image source: clipartpanda.com, clipartof.com)

Topics of This Lecture
Perceptrons: definition, loss functions, regularization
Multi-Layer Perceptrons: definition, learning with hidden units
Obtaining the Gradients: naive analytical differentiation, numerical differentiation, backpropagation

Perceptrons (Rosenblatt 1957)
Standard Perceptron: an input layer of hand-designed features (based on common sense), a layer of weights, and an output layer. The outputs can be linear outputs or logistic outputs. Learning = determining the weights w.

Extension: Multi-Class Networks
One output node per class, again with linear or logistic outputs. This can be used to do multidimensional linear regression or multiclass classification. (Slide adapted from Stefan Roth)

Extension: Non-Linear Basis Functions
Straightforward generalization: input layer, a fixed mapping to a feature layer, weights W_kd, and an output layer with linear or logistic outputs.

Extension: Non-Linear Basis Functions
Remarks: Perceptrons are generalized linear discriminants! Everything we know about the latter can also be applied here. Note: the feature functions \phi(x) are kept fixed, not learned! (A small code sketch of this view follows below.)
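To illustrate the "perceptron = generalized linear discriminant" remark, the sketch below computes multi-class outputs y_k(x) = sum_d W_kd phi_d(x) with a fixed, hand-designed feature mapping phi. The particular basis functions, array shapes, and names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def phi(x):
    """Fixed (not learned) feature mapping phi(x): a hand-designed example
    consisting of a bias term, the raw inputs, and their pairwise products."""
    x = np.asarray(x, dtype=float)
    pairwise = np.outer(x, x)[np.triu_indices(len(x))]
    return np.concatenate(([1.0], x, pairwise))

def perceptron_outputs(x, W, logistic=False):
    """Multi-class perceptron outputs y_k(x) = sum_d W_kd * phi_d(x),
    optionally passed through a logistic non-linearity."""
    a = W @ phi(x)
    return 1.0 / (1.0 + np.exp(-a)) if logistic else a

# Toy usage: 3 classes, 2-D input -> phi(x) has 1 + 2 + 3 = 6 dimensions.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 6))
x = np.array([0.5, -1.0])
print(perceptron_outputs(x, W))             # linear outputs
print(np.argmax(perceptron_outputs(x, W)))  # predicted class
```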

Perceptron Learning
A very simple algorithm: process the training cases in some permutation.
If the output unit is correct, leave the weights alone.
If the output unit incorrectly outputs a zero, add the input vector to the weight vector.
If the output unit incorrectly outputs a one, subtract the input vector from the weight vector.
This is guaranteed to converge to a correct solution if such a solution exists. (Slide adapted from Geoff Hinton)

Perceptron Learning
Let's analyze this algorithm... Translation into a weight update:
w_{kj}^{(\tau+1)} = w_{kj}^{(\tau)} - \eta \, (y_k(x_n; w) - t_{kn}) \, \phi_j(x_n)
This is the Delta rule, a.k.a. the LMS rule! Perceptron Learning therefore corresponds to 1st-order (stochastic) gradient descent (e.g., of a quadratic error function). (Slide adapted from Geoff Hinton; a code sketch of this rule follows at the end of this page.)

Loss Functions
We can now also apply other loss functions:
L2 loss: least-squares regression
L1 loss: median regression
Cross-entropy loss: logistic regression
Hinge loss: SVM classification
Softmax loss: multi-class probabilistic classification,
L(t, y(x)) = - \sum_n \sum_k I(t_n = k) \ln \frac{\exp(y_k(x_n))}{\sum_j \exp(y_j(x_n))}

Regularization
In addition, we can apply regularizers, e.g. an L2 regularizer. This is known as "weight decay" in Neural Networks. We can also apply other regularizers, e.g. L1, which encourages sparsity. Since Neural Networks often have many parameters, regularization becomes very important in practice. We will see more complex regularization techniques later on...

Limitations of Perceptrons
What makes the task difficult? Perceptrons with fixed, hand-coded input features can model any separable function perfectly... given the right input features. For some tasks this requires an exponential number of input features, e.g., by enumerating all possible binary input vectors as separate feature units (similar to a look-up table). But this approach won't generalize to unseen test cases! It is the feature design that solves the task. Once the hand-coded features have been determined, there are very strong limitations on what a perceptron can learn. Classic example: the XOR function.

Wait... Didn't we just say that...
Perceptrons correspond to generalized linear discriminants, and Perceptrons are very limited... Doesn't this mean that what we have been doing so far in this lecture has the same problems? Yes, this is the case. A linear classifier cannot solve certain problems (e.g., XOR). However, with a non-linear classifier based on the right kind of features, the problem becomes solvable. So far, we have solved such problems by hand-designing good features \phi and kernels \phi^T \phi. Can we also learn such feature representations?

Topics of This Lecture
Perceptrons: definition, loss functions, regularization
Multi-Layer Perceptrons: definition, learning with hidden units
Obtaining the Gradients: naive analytical differentiation, numerical differentiation, backpropagation
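The sketch below implements the delta-rule translation of Perceptron Learning from the slides above, w_kj <- w_kj - eta (y_k(x_n; w) - t_kn) phi_j(x_n), with binary threshold outputs so that the update literally adds or subtracts the feature vector for wrongly classified cases. The toy data, learning rate, and helper names are assumptions for this example.

```python
import numpy as np

def perceptron_epoch(X, T, W, phi, eta=1.0):
    """One pass of the perceptron rule in its delta-rule form:
    w_kj <- w_kj - eta * (y_k(x_n; w) - t_kn) * phi_j(x_n).
    With binary threshold outputs this adds phi(x_n) to the weights when a
    unit wrongly outputs 0 and subtracts it when a unit wrongly outputs 1."""
    for x_n, t_n in zip(X, T):
        f = phi(x_n)                      # fixed features phi(x_n)
        y = (W @ f > 0).astype(float)     # thresholded outputs y_k(x_n; w)
        W -= eta * np.outer(y - t_n, f)   # no change where y_k == t_kn
    return W

# Toy usage: two output units with one-hot 0/1 targets, bias + raw inputs as features.
phi = lambda x: np.concatenate(([1.0], x))
X = np.array([[0.0, 0.2], [0.1, 0.9], [0.9, 0.1], [1.0, 1.0]])
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
W = np.zeros((2, 3))
for _ in range(20):
    W = perceptron_epoch(X, T, W, phi)
print([int(np.argmax(W @ phi(x))) for x in X])  # should converge to [0, 0, 1, 1]
```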

Multi-Layer Perceptrons
Adding more layers: input layer, a learned mapping to a hidden layer, and an output layer, with activation functions g^{(k)}. For example: g^{(1)}(a) = a, g^{(2)}(a) = \sigma(a). The hidden layer can have an arbitrary number of nodes, and there can also be multiple hidden layers. (Slide adapted from Stefan Roth)

Multi-Layer Perceptrons
Universal approximators: a 2-layer network (1 hidden layer) can approximate any continuous function of a compact domain arbitrarily well, assuming a sufficient number of hidden nodes. (Slide credit: Stefan Roth)

Learning with Hidden Units
Networks without hidden units are very limited in what they can learn. More layers of linear units do not help: the result is still linear. Fixed output non-linearities are not enough; we need multiple layers of adaptive non-linear hidden units. But how can we train such nets? We need an efficient way of adapting all weights, not just the last layer. Learning the weights to the hidden units = learning features. This is difficult, because nobody tells us what the hidden units should do. This is the main challenge in deep learning. (Slide adapted from Geoff Hinton)

Learning with Hidden Units
How can we train multi-layer networks efficiently? We need an efficient way of adapting all weights, not just the last layer. Idea: Gradient Descent. Set up an error function with a loss L(.) and a regularizer \Omega(.), e.g., an L2 loss with an L2 regularizer ("weight decay"), and update each weight in the direction of the gradient. (A minimal code sketch of such a network and error function follows at the end of this page.)

Gradient Descent
Two main steps:
1. Computing the gradients for each weight (today)
2. Adjusting the weights in the direction of the gradient (next lecture)

Topics of This Lecture
Perceptrons: definition, loss functions, regularization
Multi-Layer Perceptrons: definition, learning with hidden units
Obtaining the Gradients: naive analytical differentiation, numerical differentiation, backpropagation
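A minimal sketch of the two-layer MLP and error function just described: a learned hidden mapping, an output non-linearity, and an L2 loss plus an L2 weight-decay regularizer. Layer sizes, the tanh hidden activation (the slide's example uses g^(1)(a) = a and g^(2)(a) = sigma(a)), and all variable names are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(x, W1, W2, g1=np.tanh, g2=sigmoid):
    """Two-layer MLP (one hidden layer); the hidden mapping is learned.
    z1 = W1 x,  y1 = g1(z1)   (hidden layer)
    z2 = W2 y1, y2 = g2(z2)   (output layer)"""
    y1 = g1(W1 @ x)
    y2 = g2(W2 @ y1)
    return y1, y2

def error(X, T, W1, W2, lam=1e-3):
    """L2 loss plus an L2 regularizer ('weight decay'):
    E(W) = sum_n ||y(x_n; W) - t_n||^2 + lam * (||W1||^2 + ||W2||^2)."""
    loss = sum(np.sum((mlp_forward(x, W1, W2)[1] - t) ** 2) for x, t in zip(X, T))
    return loss + lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))

# Toy usage: 2 inputs -> 5 hidden units -> 1 output.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(5, 2))
W2 = rng.normal(scale=0.5, size=(1, 5))
X = rng.normal(size=(10, 2))
T = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)
print(error(X, T, W1, W2))
```

Gradient descent then needs the derivatives of this error with respect to W1 and W2, which is exactly the topic of the following slides.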

Obtaining the Gradients
Approach 1: Naive Analytical Differentiation. Compute the gradients for each variable analytically. What is the problem when doing this?

Excursion: Chain Rule of Differentiation
One-dimensional case: scalar functions.

Excursion: Chain Rule of Differentiation
Multi-dimensional case: total derivative. We need to sum over all paths that lead to the target variable x.

Obtaining the Gradients
Approach 1: Naive Analytical Differentiation. Compute the gradients for each variable analytically. The problem: with increasing depth, there will be exponentially many paths! It is infeasible to compute the gradients this way.

Topics of This Lecture
Perceptrons: definition, loss functions, regularization
Multi-Layer Perceptrons: definition, learning with hidden units
Obtaining the Gradients: naive analytical differentiation, numerical differentiation, backpropagation

Obtaining the Gradients
Approach 2: Numerical Differentiation. Given the current state W^{(\tau)}, we can evaluate E(W^{(\tau)}). Idea: make small changes to W^{(\tau)} and accept those that improve E(W^{(\tau)}). This is horribly inefficient! It needs several forward passes for each weight, and each forward pass is one run over the entire dataset. (A finite-difference sketch of this idea follows below.)
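Approach 2 can be written down directly as central finite differences: perturb each weight by plus or minus a small epsilon, re-evaluate the error, and divide. This needs two complete passes over the data per weight, which is why it is hopeless for training, though it remains useful as a sanity check for analytically computed gradients. The function name and step size below are assumptions.

```python
import numpy as np

def numerical_gradient(error_fn, W, eps=1e-6):
    """Estimate dE/dW by central differences: for every single weight,
    perturb it by +/- eps and re-evaluate the full error function.
    Cost: 2 * (number of weights) complete forward passes over the data."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        orig = W[idx]
        W[idx] = orig + eps
        e_plus = error_fn(W)
        W[idx] = orig - eps
        e_minus = error_fn(W)
        W[idx] = orig                           # restore the weight
        grad[idx] = (e_plus - e_minus) / (2 * eps)
    return grad

# Toy usage: quadratic "error" E(W) = ||W||^2, whose true gradient is 2W.
W = np.array([[1.0, -2.0], [0.5, 3.0]])
print(numerical_gradient(lambda W: np.sum(W ** 2), W))   # approx. 2 * W
```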

Obtaining the Gradients
Approach 3: Incremental Analytical Differentiation. Idea: compute the gradients layer by layer. Each layer below builds upon the results of the layer above; the gradient is propagated backwards through the layers. This is the backpropagation algorithm.

Topics of This Lecture
Perceptrons: definition, loss functions, regularization
Multi-Layer Perceptrons: definition, learning with hidden units
Obtaining the Gradients: naive analytical differentiation, numerical differentiation, backpropagation

Backpropagation Algorithm
Core steps:
1. Convert the discrepancy between each output and its target value into an error derivative.
2. Compute error derivatives in each hidden layer from error derivatives in the layer above.
3. Use error derivatives w.r.t. activities to get error derivatives w.r.t. the incoming weights.
(Slide adapted from Geoff Hinton)

Backpropagation Algorithm
Notation: y_j^{(k)} is the output of layer k, z_j^{(k)} is the input of layer k. Connections:
z_j = \sum_i w_{ji} \, y_i,    y_j = g(z_j)
The error derivatives are then propagated backwards as follows:
\frac{\partial E}{\partial z_j} = \frac{\partial y_j}{\partial z_j} \frac{\partial E}{\partial y_j} = g'(z_j) \frac{\partial E}{\partial y_j}
\frac{\partial E}{\partial y_i} = \sum_j \frac{\partial z_j}{\partial y_i} \frac{\partial E}{\partial z_j} = \sum_j w_{ji} \frac{\partial E}{\partial z_j}
\frac{\partial E}{\partial w_{ji}} = \frac{\partial z_j}{\partial w_{ji}} \frac{\partial E}{\partial z_j} = y_i \frac{\partial E}{\partial z_j}
(Slide adapted from Geoff Hinton)
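The three derivative equations above translate almost one-to-one into code. The sketch below runs a forward pass that stores z^(k) and y^(k), then propagates dE/dy, dE/dz, and dE/dw backwards through a stack of fully connected layers for a single training example and an L2 loss. The sigmoid activation, layer sizes, and names are assumptions made for this example, not prescribed by the lecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights):
    """Forward pass: store z^(k) and y^(k) for every layer (needed for backprop)."""
    ys, zs = [x], []
    for W in weights:
        z = W @ ys[-1]          # z_j = sum_i w_ji * y_i
        zs.append(z)
        ys.append(sigmoid(z))   # y_j = g(z_j)
    return ys, zs

def backward(ys, zs, weights, t):
    """Backward pass for an L2 loss E = 1/2 * ||y^(L) - t||^2."""
    grads = [None] * len(weights)
    dE_dy = ys[-1] - t                             # dE/dy at the output layer
    for k in reversed(range(len(weights))):
        g_prime = sigmoid(zs[k]) * (1 - sigmoid(zs[k]))
        dE_dz = g_prime * dE_dy                    # dE/dz_j = g'(z_j) * dE/dy_j
        grads[k] = np.outer(dE_dz, ys[k])          # dE/dw_ji = y_i * dE/dz_j
        dE_dy = weights[k].T @ dE_dz               # dE/dy_i = sum_j w_ji * dE/dz_j
    return grads

# Toy usage: 3 -> 4 -> 2 network, one training example.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
x, t = rng.normal(size=3), np.array([0.0, 1.0])
ys, zs = forward(x, weights)
grads = backward(ys, zs, weights, t)
print([g.shape for g in grads])   # [(4, 3), (2, 4)]: one gradient per weight matrix
```

The gradients returned here could be checked against the finite-difference sketch from the numerical-differentiation slide.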

Backpropagation Algorithm
Efficient propagation scheme: y_i is already known from the forward pass (dynamic programming). Propagate the gradient back from layer k and multiply with y_i. (Slide adapted from Geoff Hinton)

Summary: MLP Backpropagation
Forward Pass: for k = 1, ..., l, compute the activations of layer k from those of layer k-1.
Backward Pass: for k = l, l-1, ..., 1, propagate the error derivatives back through layer k and accumulate the weight gradients.
Notes: For efficiency, an entire batch of data X is processed at once. In the batched formulation, the symbol \odot denotes the element-wise product.

Analysis: Backpropagation
Backpropagation is the key to making deep NNs tractable. However, the Backprop algorithm given here is specific to MLPs: it does not work with more complex architectures, e.g. skip connections or recurrent networks! Whenever a new connection function induces a different functional form of the chain rule, you have to derive a new Backprop algorithm for it. Tedious... Let's analyze Backprop in more detail; this will lead us to a more flexible algorithm formulation in the next lecture.

References and Further Reading
More information on Neural Networks can be found in Chapters 6 and 7 of the Goodfellow & Bengio book:
I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016.
