DISTINCTIVE FEATURE DETECTION USING SUPPORT VECTOR MACHINES. Partha Niyogi, Chris Burges, and Padma Ramesh. Bell Labs, Lucent Technologies, USA.

ABSTRACT

An important aspect of distinctive feature based approaches to automatic speech recognition is the formulation of a framework for robust detection of these features. We discuss the application of the support vector machines (SVMs) that arise when the structural risk minimization principle is applied to such feature detection problems. In particular, we describe the problem of detecting stop consonants in continuous speech and discuss an SVM framework for detecting these sounds. In this paper we use both linear and nonlinear SVMs for stop detection and present experimental results showing that they perform better than a cepstral-feature based hidden Markov model (HMM) system on the same task.

1. INTRODUCTION

We are pursuing an approach to speech recognition that develops detectors for various distinctive features from their acoustic correlates in the speech signal. Crucial to the success of such an approach are the following: (1) determination of the acoustic signatures for different sound classes and the development of signal representations in which that acoustic signature best expresses itself; (2) the construction of statistical learning paradigms that operate on the above representations and separate the positive instances of the distinctive feature from the negative instances of the same feature. Traditionally, (1) is regarded as the front end and (2) as the back end of a speech recognition system. In most traditional speech recognition designs, the same representation is used for all sound classes (i.e., cepstra or filterbanks, computed with the same analysis window and stepping rate). The front end is thus a vector time series. The back end is typically an HMM of one form or another. In contrast, we consider using different representations for different sound classes. An important question that arises in this context is: given a particular representation, what sort of statistical learning machine should be used to optimally separate the positive examples of each sound from the negative?

In this paper, we investigate the application of support vector machines (SVMs) to the kinds of detection problems that naturally arise in our feature based approach to speech recognition. Here, we address the issue of detecting stop consonants in continuous speech. We first provide a description of the detection problem and then discuss support vector machines. An important issue that needs to be addressed is the ability of the machine to generalize from its training set to its test set. We use the Vapnik-Chervonenkis theory [7] that gives rise to SVMs to address this issue. We examine a variety of support vector machines with different architectures and capacity control mechanisms and show their effect on the successful utilization of SVMs for speech recognition. Finally, we present experimental results on using SVMs for the detection of stop consonants.

[Figure 1: Portion of the speech waveform s(n) (top panel), the associated three-dimensional feature vector x(n) (middle panel), and the desired output y(n) (bottom panel) marking the times of the closure-burst transition; the x axis is time in ms.]

2. STOP DETECTION

Stop consonants are produced by causing a complete closure of the vocal tract followed by a sudden release. Hence they are signalled in continuous speech by a period of extremely low energy (corresponding to the period of closure) followed by a sharp, broadband signal (corresponding to the release).
As a result, stop consonants are highly transient (dynamic) sounds with durations lasting anywhere from 5 to 100 ms. In American English, the class of stops consists of the sounds {p, t, k, b, d, g}. In order to build a detector for stop consonants in running speech, the speech signal s(t) is characterized by a vector time series with three dimensions: (i) log(total energy); (ii) log(energy above 3 kHz); (iii) a spectral flatness measure based on the Wiener entropy, defined as

$\int \log(S(f,t))\, df \;-\; \log\!\left( \int S(f,t)\, df \right)$

where S(f,t) is the spectral energy at frequency f and time t. All quantities are computed using 5 ms windows moved every 1 ms. Thus we have x(n) = [x_1(n), x_2(n), x_3(n)], where n represents time (discretized in units of milliseconds) and x_1 through x_3 are the three acoustic quantities measured every 1 ms. Energies at 1 ms intervals potentially allow us to track rapid transitions that would otherwise be smoothed out by a coarser temporal resolution. For more on stop consonants, their acoustic-phonetic features, and common confusions, see [4].

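To make this front end concrete, the following is a minimal sketch of the three measurements in Python/NumPy. It is not the authors' code: the Hamming window, the FFT-based spectral estimate, and the small floor constant eps are our own assumptions.

```python
import numpy as np

def stop_features(signal, fs, win_ms=5.0, hop_ms=1.0, eps=1e-10):
    """Per 1 ms frame: log total energy, log energy above 3 kHz, and the
    Wiener-entropy flatness (mean log spectrum minus log mean spectrum)."""
    win = int(fs * win_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    hi_band = freqs >= 3000.0
    feats = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = signal[start:start + win] * np.hamming(win)
        spec = np.abs(np.fft.rfft(frame)) ** 2          # spectral energy S(f, t)
        flatness = np.mean(np.log(spec + eps)) - np.log(np.mean(spec) + eps)
        feats.append([np.log(spec.sum() + eps),
                      np.log(spec[hi_band].sum() + eps),
                      flatness])
    return np.array(feats)                              # shape: (n_frames, 3)
```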
We need to find an operator on the feature vector time series that will return a one-dimensional time series taking on large values around the times that stops occur and small values otherwise. The most natural points in time to mark the presence of stops are the transitions from closure to burst release. Shown in Fig. 1 is an example of a speech waveform s(n), the associated feature vector time series x(n), and a desired output y(n).

The technical goal is to find an operator g on the time series x(n) that produces an output y_g(n) = g(x)(n) with values in {0, 1}, such that ||y - y_g|| is small in some sense (norm). Specifically, we choose the optimal operator (from some class G of operators) according to the criterion

$g_{opt} = \arg\min_{g \in G} R(g) = \arg\min_{g \in G} E[(y - y_g)^2] \qquad (1)$

Since both y and y_g take values in {0, 1}, this is a two-class pattern classification problem at each point in time n. Here we consider operators given by y_h(n) = h(x(n - W), ..., x(n + W)), where h is a function from R^{3(2W+1)} to {0, 1}. In our experiments W = 5.

Finally, it is important to emphasize that the speech recognition problem can be decomposed into a collection of feature detection problems that have a structure very similar to that of the stop detection problem described above. For example, a vowel detector might be built with a representation x consisting of dimensions like the degree of periodicity in the signal, the ratio of high-frequency to low-frequency energy, and so on. The same is true of a nasal detector, a fricative detector, and so on. In all of these detection problems, the support vector machine framework might provide the basis for constructing optimal detectors that are learned from the data.
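As an illustration of how the frame-level classification problem is assembled, here is a sketch that stacks the 2W + 1 context frames per time step and attaches 0/1 labels. The feats array is assumed to come from a front end like the one sketched above; labeling only the exact transition frames as positive is our simplification of the paper's (unspecified) labeling scheme.

```python
import numpy as np

def context_windows(feats, transition_frames, W=5):
    """Build x(n - W), ..., x(n + W) context vectors in R^{3(2W+1)} and the
    0/1 targets y(n) marking closure-burst transition frames."""
    n_frames = feats.shape[0]
    y = np.zeros(n_frames, dtype=int)
    y[np.asarray(transition_frames, dtype=int)] = 1   # frames of known bursts
    X, labels = [], []
    for n in range(W, n_frames - W):
        X.append(feats[n - W:n + W + 1].ravel())      # flatten the window
        labels.append(y[n])
    return np.array(X), np.array(labels)
```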
3. SUPPORT VECTOR MACHINES

For a general introduction to using SVMs for the pattern recognition task, see [1]. Consider a two-class pattern recognition problem with labelled examples, i.e., (x_i, y_i) pairs drawn according to some unknown distribution P(x, y) on the space X x Y. The goal is to construct a function h (drawn from some class H of functions from X to Y; w.l.o.g. let Y = {-1, 1}, as this is convenient for our discussion) that is able to classify unknown patterns x into the appropriate class with minimum misclassification error. A suitable h is usually picked by minimizing the empirical risk, i.e.,

$\hat{h}_l = \arg\min_{f \in H} R_{emp}(f) = \arg\min_{f \in H} \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2$

This is the straightforward approach commonly pursued in statistical pattern recognition, and the hope is that the empirically chosen function $\hat{h}_l$ will generalize successfully to novel unlabelled examples.

3.1. Structural Risk Minimization

The straightforward approach of minimizing the empirical risk turns out not to guarantee a small actual risk on the test set, particularly if the number l of training examples is limited. As a result of the work of Vapnik and Chervonenkis [7] (see also [5, 6]), a new induction principle has emerged: the principle of structural risk minimization (SRM). It is based on the fact that the true goal of the learner should be to minimize the expected risk, i.e.,

$h_{opt} = \arg\min_{f \in H} R(f) = \arg\min_{f \in H} E[(y - f(x))^2]$

where the expectation is taken according to the true distribution P. Since P is unknown, one approximates the above functional by the following large-deviation bound, which holds with probability greater than $1 - \delta$:

$R(f) \leq R_{emp}(f) + \Phi\!\left(\frac{d}{l}, \frac{\log \delta}{l}\right) \qquad (2)$

where $\Phi$ is defined as

$\Phi\!\left(\frac{d}{l}, \frac{\log \delta}{l}\right) = \sqrt{\frac{d\left(\log\frac{2l}{d} + 1\right) - \log(\delta/4)}{l}}$

The parameter d is called the VC dimension of the set of functions H [7] and is a measure of its complexity. Some aspects of eq. 2 are worth highlighting. First, to guarantee minimization of the true risk R(f), one has to ensure that both terms, $R_{emp}(f)$ and $\Phi(d/l, \log\delta/l)$, are made sufficiently small; significantly, minimizing $R_{emp}$ alone is not enough. Second, for a fixed amount of training data l, the two terms represent a complexity regularization trade-off: as the class H becomes larger, the minimum empirical risk becomes smaller but the VC term becomes larger. Generalization is controlled by controlling each of these two terms. Conceptually this is done by imposing a structure $H_1 \subset H_2 \subset H_3 \subset \ldots \subset H$ of nested subsets of H. One then searches within this hierarchy for the class $H_j$ that minimizes the bound, i.e.,

$\hat{h} = \arg\min_{H_j} \left( R_{emp}(f) + \Phi(\cdot) \right) \qquad (3)$

3.2. Hyperplanes

Here we discuss the application of the SRM principle to the case where the class H is the class of linear hyperplanes in X, i.e.,

$H = \{ h : X \to \{-1, 1\} \mid h(x) = \theta(w \cdot x + b) \}$

where $\theta(z) = 1$ if $z \geq 0$ and $\theta(z) = -1$ otherwise. Thus each pair (w, b) corresponds to a unique hyperplane (after removing an arbitrary scale factor in w and b by requiring that $\min_i |w \cdot x_i + b| = 1$, where the $(x_i, y_i)$ are the training examples). Now we use the following:

Theorem 1. Let $B_{x_1, \ldots, x_r} = \{x \in X : \|x - a\| < R\}$ ($a \in X$) be the smallest ball containing the points $x_1, \ldots, x_r$, and let

$H_A = \{ f_{w,b} = \theta((w \cdot x) + b) \mid \|w\| \leq A \} \qquad (4)$

be a subclass of hyperplanes in canonical form with respect to $\{x_1, \ldots, x_r\}$. Then $H_A$ has a VC-dimension h satisfying

$h \leq R^2 A^2 \qquad (5)$

This theorem suggests a natural structure on the set of hyperplanes in canonical form. Clearly, $H = \cup_{A>0} H_A$. It is possible to show that utilizing such a structure transforms the problem posed in eq. 3 into $\hat{h} = \arg\min_{w,b} (R_{emp}(w, b) + w \cdot w)$, which trades off the fit to the data against model complexity.

Separable Data

For the case where the data is separable by hyperplanes, the structural risk minimization principle attempts to minimize the VC-dimension of $H_A$ while keeping the empirical risk $R_{emp}$ fixed at 0. This is equivalent to

$\min \; \frac{1}{2} w \cdot w \qquad (6)$

subject to

$y_i (w \cdot x_i + b) \geq 1 \qquad (7)$

The i-th constraint is satisfied only if the i-th data point is correctly classified by the hyperplane classifier. Introducing Lagrange multipliers $\lambda_i \geq 0$ for each of the constraints, the Lagrangian is formed as

$L(w, b, \{\lambda_i\}) = \frac{1}{2} w \cdot w - \sum_i \lambda_i \left( y_i (w \cdot x_i + b) - 1 \right)$

The optimal solution lies at the saddle point of the Lagrangian and satisfies the following conditions: (1) differentiating with respect to w and setting to 0 yields $w = \sum_i \lambda_i y_i x_i$, i.e., the optimal hyperplane is a linear combination of the training vectors; (2) differentiating with respect to b and setting to 0 yields $\sum_i \lambda_i y_i = 0$; (3) the first-order Kuhn-Tucker conditions yield $\lambda_i (y_i (w \cdot x_i + b) - 1) = 0$. From this we see that all points $x_i$ for which $\lambda_i$ is non-zero lie on the margin hyperplanes $w \cdot x + b = 1$ or $w \cdot x + b = -1$. Such points are called support vectors. All other points are exterior points and do not enter into the expansion of the optimal hyperplane w, since they have $\lambda_i = 0$. As a result of these properties, the learning machine is called the support vector machine. Substituting for w in the Lagrangian yields the quadratic programming problem

$\max_{\lambda} \; \sum_i \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j (x_i \cdot x_j) \quad \text{subject to} \quad \sum_i \lambda_i y_i = 0$

Non-separable Case

In many practical applications, a perfectly separating hyperplane does not exist. To allow for the possibility of examples violating (7), [2] introduce slack variables

$\xi_i \geq 0, \quad i = 1, \ldots, l \qquad (8)$

to get

$y_i ((w \cdot x_i) + b) \geq 1 - \xi_i, \quad i = 1, \ldots, l \qquad (9)$

The structural risk minimization approach to minimizing the guaranteed risk bound (2) then consists of

$\min_{w, \xi} \; w \cdot w + U \sum_{i=1}^{l} (\xi_i)^\sigma \qquad (10)$

subject to the constraints (8) and (9). In eq. 10, the first term corresponds to the VC-dimension of the learning machine as before. The term $\sum_{i=1}^{l} (\xi_i)^\sigma$ (for small $\sigma$) is equivalent to the number of misclassifications on the training set and is therefore a measure of empirical risk. The constant U thus controls the trade-off between the empirical risk $\sum_{i=1}^{l} (\xi_i)^\sigma$ and the VC-dimension of the learning machine. In actual practice, one has to make choices both for U and $\sigma$. For computational reasons, $\sigma$ is chosen to be 1, because that translates the optimization problem of eq. 10 into a quadratic programming problem like that of the previous section (in fact, $\sigma = 2$ does also). Introducing Lagrange multipliers for each of the constraints in eqs. 8 (the $\mu_j$) and 9 (the $\lambda_i$), we form the Lagrangian $L(w, \{\xi_i\}, b, \{\lambda_i\}, \{\mu_j\})$ as before:

$L = \frac{1}{2} w \cdot w + U \sum_{i=1}^{l} \xi_i - \sum_{j=1}^{l} \mu_j \xi_j - \sum_{i=1}^{l} \lambda_i \left( y_i ((w \cdot x_i) + b) - 1 + \xi_i \right)$

First-order conditions show that the optimal hyperplane is again given by $w = \sum_i \lambda_i y_i x_i$. Additionally, taking the Kuhn-Tucker conditions into account, eq. 10 is transformed into

$\min_{\lambda} \; \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j (x_i \cdot x_j) - \sum_i \lambda_i \quad \text{subject to} \quad \sum_i \lambda_i y_i = 0, \quad 0 \leq \lambda_i \leq U$

Once the $\lambda_i$ are obtained by solving the above problem, the optimal hyperplane is easily found by substitution.
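The paper does not say how this QP was solved; as an illustrative stand-in, the sketch below hands the dual above to the off-the-shelf cvxopt QP solver. The kernel argument, the numerical tolerance, and the bias recovery are our own choices.

```python
import numpy as np
from cvxopt import matrix, solvers

def svm_dual(X, y, U=1.0, kernel=lambda a, b: a @ b):
    """X: (l, d) array; y: length-l array of +/-1 labels.
    Solves min_lam 1/2 sum_ij lam_i lam_j y_i y_j K(x_i, x_j) - sum_i lam_i
    subject to sum_i lam_i y_i = 0 and 0 <= lam_i <= U."""
    l = len(y)
    K = np.array([[kernel(X[i], X[j]) for j in range(l)] for i in range(l)])
    P = matrix(np.outer(y, y) * K)                  # Hessian of the dual
    q = matrix(-np.ones(l))
    G = matrix(np.vstack([-np.eye(l), np.eye(l)]))  # -lam <= 0 and lam <= U
    h = matrix(np.hstack([np.zeros(l), U * np.ones(l)]))
    A = matrix(y.reshape(1, -1).astype(float))      # sum_i lam_i y_i = 0
    b = matrix(0.0)
    solvers.options['show_progress'] = False
    lam = np.ravel(solvers.qp(P, q, G, h, A, b)['x'])
    # Recover the bias from a margin support vector; this sketch assumes one
    # exists with 0 < lam_i < U (i.e., xi_i = 0 and the constraint active).
    i = int(np.argmax((lam > 1e-6) & (lam < U - 1e-6)))
    bias = y[i] - np.sum(lam * y * K[:, i])
    return lam, bias
```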
3.3. Non-linear Extensions: Kernels

In order to get non-linear decision boundaries, we transform the data by a fixed non-linear transformation and use linear techniques in the transformed space. Consider a transformation $\Psi : X \to Z$ mapping points $x \in X$ to corresponding points $z \in Z$. Constructing hyperplanes in Z according to the SRM principle ultimately reduces to solving optimization problems of the sort described earlier (eq. 10), with the $x_i$ replaced by $z_i$. Significantly, we note that the only form in which the $z_i$ appear in the optimization problem is inner products, i.e., $(z_i \cdot z_j)$. Therefore, it is enough to know the inner product between pairs of them. Consequently, we characterize the transformation by the inner product it imposes on the space Z. Specifically, we consider mappings $\Psi_K : X \to Z$ such that for any $z_1, z_2 \in Z$ we have $(z_1 \cdot z_2) = K(x_1, x_2)$, where K is the kernel of a positive Hilbert-Schmidt operator and, according to Mercer's theorem of functional analysis ([3]), corresponds to a dot product in some other space. Each different choice of the kernel K defines a different choice of the transformation $\Psi_K$. The dimensionality of Z depends upon the number of nonzero eigenvalues of the kernel K and is potentially infinite. If we set up the optimization problem appropriately, we see that the optimal hyperplane has the form $w = \sum_i \lambda_i y_i z_i$. The decision rule for this is

$h(x) = \theta\!\left( \sum_i \lambda_i y_i K(x, x_i) + b \right)$

We see that the form of this decision rule is like a feed-forward neural network, or a kernel-based non-parametric scheme, with two significant differences: (a) the parameters are estimated by the principle of structural risk minimization; (b) the number of hidden nodes or basis functions is chosen automatically by the procedure. Thus an important problem of model selection is resolved.
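Given multipliers and a bias from a solver like the sketch above, the kernel decision rule can be evaluated directly. The RBF kernel below anticipates Section 4.2; treating sigma = 5 as a default is our assumption based on the value reported there.

```python
import numpy as np

def rbf(a, b, sigma=5.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2))

def decision_score(x, sv_x, sv_y, sv_lam, bias, kernel=rbf):
    """d(x) = sum_i lam_i y_i K(x, x_i) + b; the class is theta(d(x)), and
    varying the acceptance threshold on d(x) traces out an ROC curve."""
    return sum(lam * yi * kernel(x, xi)
               for lam, yi, xi in zip(sv_lam, sv_y, sv_x)) + bias
```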

4. EXPERIMENTAL RESULTS

As formulated in Section 2, stop detection is a two-class pattern classification problem that can be solved using SVMs. We discuss in this section the experiments conducted with these SVMs. Detection experiments were conducted on dialect region 4 of the TIMIT test set, which consists of 32 speakers, 16 male and 16 female, uttering 10 sentences each, making for a total of 320 sentences in all. Training was performed on 10 sentences each from 40 randomly chosen speakers from the TIMIT training set from different dialect regions, yielding 1330 positive examples and 17600 negative examples.

4.1. Linear Hyperplanes

In this section we consider results obtained when the class of linear hyperplanes is used as a decision boundary between stops and non-stops (non-separable formulation). On 18930 training data points, with U = 1, 1630 support vectors were generated (1.5% of training vectors). The errors on the training set consisted of 25 false positives and 37 false negatives. Shown in Fig. 2 is a distribution (normalized histograms) of d(x) = w · x + b for the positive and negative training examples. As d(x)/||w|| is the distance of each data point x to the separating hyperplane, d(x) is proportional to this distance. Positive examples that have d < 0 and negative examples that have d > 0 are misclassified. The region -1 ≤ d ≤ 1 corresponds to the margin, i.e., points that lie within the strip given by the hyperplanes w · x + b = 1 and w · x + b = -1.

[Figure 2: Histogram of d(x) on the training set for the stop and non-stop classes.]

As we discussed in Section 2, the problem is really one of accurate detection of stops. Consequently, by changing the threshold (currently set at d = 0) one can obtain a trade-off between type I and type II errors for the stop detection problem. Shown in Fig. 3 are the ROC curves generated by varying such a threshold of acceptance for a range of values of U; U controls the trade-off between empirical fit to the data and capacity of the learning machine. Performance on the test set (a measure of generalization) gets progressively better as U decreases over this range. The best performance was obtained for U = 0.5.

[Figure 3: Performance (false acceptance vs. false rejection) of linear support vector machines for different values of U.]

4.2. RBF Kernels

Another important choice in the adoption of the support vector framework for problems such as these involves the choice of kernel K. In the experiments below, we considered radial basis function kernels of the form K(x, y) = exp(-||x - y||² / (2σ²)). Experiments were conducted with a range of values of σ; performance improved marginally from σ = 1 to σ = 5. Shown in Fig. 4 is the ROC curve for an RBF network with σ = 5 (labeled SVM). Also shown are ROC curves for the best linear hyperplane (labeled LINEAR) and one obtained using a derivative operator on total and high-frequency energies (labeled DERIV). As a point of comparison, the performance of an HMM based system running with a free grammar on the test sentences is shown by the * in Fig. 4. The HMM based system consisted of 47 phones with 3-state left-to-right HMMs per phone. The output probabilities were mixtures of Gaussians (16 mixtures per state), and the front end was a 39-dimensional vector time series obtained from the first 12 cepstral coefficients and total energy together with their first and second differences (delta and delta-delta). A stop in a test sentence was considered to be correctly identified if the closure-burst transition was anywhere within the segment postulated as a stop by the HMM recognizer. If a particular segment postulated by the HMM recognizer as a stop did not contain a closure-burst transition, it was deemed a false accept.

[Figure 4: Performance of linear and non-linear support vector machines against derivative operators and HMMs.]
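To illustrate the threshold sweep behind Figs. 3 and 4, here is a frame-level sketch, assuming an array of decision scores d(x(n)) and 0/1 frame labels; the paper's actual scoring is event-level (matching closure-burst transitions against postulated segments), which this simplification does not capture.

```python
import numpy as np

def roc_points(scores, labels, n_thresholds=200):
    """Sweep the acceptance threshold on d(x) (the default decision uses 0)
    to trade false acceptance against false rejection."""
    pos, neg = labels == 1, labels != 1
    pts = []
    for t in np.linspace(scores.min(), scores.max(), n_thresholds):
        accept = scores >= t
        false_acceptance = np.mean(accept[neg])   # type I error rate
        false_rejection = np.mean(~accept[pos])   # type II error rate
        pts.append((false_acceptance, false_rejection))
    return pts
```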
5. CONCLUSIONS

Stop detection belongs to a class of feature detection problems that naturally arise in a feature based approach to speech recognition. It is particularly interesting because stops present a transient signal with a period of rapid change that is often poorly characterized by standard cepstral representations. We have discussed how the problem can be cast as trying to discriminate between positive and negative examples using functions drawn from a class H of the appropriate complexity. We have introduced the principle of structural risk minimization, which provides a framework for doing this. We have discussed the two major issues, the choice of an appropriate U and kernel K, that affect the successful deployment of this technology for detection problems. We have also shown a steady improvement from linear to non-linear SVMs and shown that they perform better than an HMM using cepstral features.

6. REFERENCES

[1] Burges, C. J. C., "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, Vol. 2, No. 2, pp. 1-47, 1998.
[2] Cortes, C. and Vapnik, V. N., "Support Vector Networks," Machine Learning, Vol. 20, pp. 273-297, 1995.
[3] Courant, R. and Hilbert, D., Methods of Mathematical Physics, J. Wiley, New York, 1953.
[4] Niyogi, P., Mitra, P., and Sondhi, M., "A Detection Framework for Locating Phonetic Events," Proceedings of ICSLP-98, Sydney, Australia, 1998.
[5] Shawe-Taylor, J., Bartlett, P. L., Williamson, R. C., and Anthony, M., "A Framework for Structural Risk Minimization," Proceedings, 9th Annual Conference on Computational Learning Theory, pp. 68-76, 1996.
[6] Shawe-Taylor, J., Bartlett, P. L., Williamson, R. C., and Anthony, M., "Structural Risk Minimization over Data-Dependent Hierarchies," NeuroCOLT Technical Report NC-TR-96-053, 1996.
[7] Vapnik, V. N., The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
