Support Vector Machines and Flexible Discriminants
12 Support Vector Machines and Flexible Discriminants

12.1 Introduction

In this chapter we describe generalizations of linear decision boundaries for classification. Optimal separating hyperplanes are introduced in Chapter 4 for the case when two classes are linearly separable. Here we cover extensions to the nonseparable case, where the classes overlap. These techniques are then generalized to what is known as the support vector machine, which produces nonlinear boundaries by constructing a linear boundary in a large, transformed version of the feature space. The second set of methods generalize Fisher's linear discriminant analysis (LDA). The generalizations include flexible discriminant analysis, which facilitates construction of nonlinear boundaries in a manner very similar to the support vector machines; penalized discriminant analysis, for problems such as signal and image classification where a large number of features are highly correlated; and mixture discriminant analysis for irregularly shaped classes.

12.2 The Support Vector Classifier

In Chapter 4 we discussed a technique for constructing an optimal separating hyperplane between two perfectly separated classes. We review this and generalize to the nonseparable case, where the classes may not be separable by a linear boundary.
FIGURE 12.1. Support vector classifiers. The left panel shows the separable case. The decision boundary is the solid line, while broken lines bound the shaded maximal margin of width 2M = 2/||β||. The right panel shows the nonseparable (overlap) case. The points labeled ξ*_j are on the wrong side of their margin by an amount ξ*_j = M ξ_j; points on the correct side have ξ*_j = 0. The margin is maximized subject to a total budget Σ ξ_i ≤ constant. Hence Σ ξ*_j is the total distance of points on the wrong side of their margin.

Our training data consists of N pairs (x_1, y_1), (x_2, y_2), ..., (x_N, y_N), with x_i ∈ R^p and y_i ∈ {−1, 1}. Define a hyperplane by

    {x : f(x) = x^T β + β_0 = 0},    (12.1)

where β is a unit vector: ||β|| = 1. A classification rule induced by f(x) is

    G(x) = sign[x^T β + β_0].    (12.2)

The geometry of hyperplanes is reviewed in Section 4.5, where we show that f(x) in (12.1) gives the signed distance from a point x to the hyperplane f(x) = x^T β + β_0 = 0. Since the classes are separable, we can find a function f(x) = x^T β + β_0 with y_i f(x_i) > 0 for all i. Hence we are able to find the hyperplane that creates the biggest margin between the training points for class 1 and −1 (see Figure 12.1). The optimization problem

    max_{β, β_0, ||β||=1} M  subject to  y_i (x_i^T β + β_0) ≥ M, i = 1, ..., N,    (12.3)

captures this concept. The band in the figure is M units away from the hyperplane on either side, and hence 2M units wide. It is called the margin. We showed that this problem can be more conveniently rephrased as

    min_{β, β_0} ||β||  subject to  y_i (x_i^T β + β_0) ≥ 1, i = 1, ..., N,    (12.4)
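As a small illustration (not from the text), the signed-distance property of (12.1)–(12.2) can be checked numerically: with ||β|| = 1, f(x) is the distance from x to the hyperplane, and the induced classifier is its sign. The particular β, β_0, and test point below are arbitrary choices for the sketch.

```python
import numpy as np

# A hyperplane {x : f(x) = x^T beta + beta_0 = 0} with ||beta|| = 1,
# so f(x) is the signed distance from x to the hyperplane (Section 4.5).
beta = np.array([3.0, 4.0])
beta /= np.linalg.norm(beta)   # enforce the unit-norm convention of (12.1)
beta_0 = -1.0

def f(x):
    # signed distance from x to the hyperplane
    return x @ beta + beta_0

def G(x):
    # classification rule (12.2) induced by f
    return np.sign(f(x))

x = np.array([2.0, 1.0])
print(f(x), G(x))   # distance 1.0, classified as +1
```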
where we have dropped the norm constraint on β. Note that M = 1/||β||. Expression (12.4) is the usual way of writing the support vector criterion for separated data. This is a convex optimization problem (quadratic criterion, linear inequality constraints), and the solution is characterized in Section 12.2.1.

Suppose now that the classes overlap in feature space. One way to deal with the overlap is to still maximize M, but allow for some points to be on the wrong side of the margin. Define the slack variables ξ = (ξ_1, ξ_2, ..., ξ_N). There are two natural ways to modify the constraint in (12.3):

    y_i (x_i^T β + β_0) ≥ M − ξ_i,    (12.5)

or

    y_i (x_i^T β + β_0) ≥ M(1 − ξ_i),    (12.6)

for all i, with ξ_i ≥ 0 and Σ_{i=1}^N ξ_i ≤ constant. The two choices lead to different solutions. The first choice seems more natural, since it measures overlap in actual distance from the margin; the second choice measures the overlap in relative distance, which changes with the width of the margin M. However, the first choice results in a nonconvex optimization problem, while the second is convex; thus (12.6) leads to the "standard" support vector classifier, which we use from here on.

Here is the idea of the formulation. The value ξ_i in the constraint y_i (x_i^T β + β_0) ≥ M(1 − ξ_i) is the proportional amount by which the prediction f(x_i) = x_i^T β + β_0 is on the wrong side of its margin. Hence by bounding the sum Σ ξ_i, we bound the total proportional amount by which predictions fall on the wrong side of their margin. Misclassifications occur when ξ_i > 1, so bounding Σ ξ_i at a value K, say, bounds the total number of training misclassifications at K.

As in (4.48) in Section 4.5.2, we can drop the norm constraint on β, define M = 1/||β||, and write (12.4) in the equivalent form

    min ||β||  subject to  y_i (x_i^T β + β_0) ≥ 1 − ξ_i for all i,  ξ_i ≥ 0,  Σ ξ_i ≤ constant.    (12.7)

This is the usual way the support vector classifier is defined for the nonseparable case. However, we find confusing the presence of the fixed scale "1" in the constraint y_i (x_i^T β + β_0) ≥ 1 − ξ_i, and prefer to start with (12.6). The right panel of Figure 12.1 illustrates this overlapping case.
By the nature of the criterion (12.7), we see that points well inside their class boundary do not play a big role in shaping the boundary. This seems like an attractive property, and one that differentiates it from linear discriminant analysis (Section 4.3). In LDA, the decision boundary is determined by the covariance of the class distributions and the positions of the class centroids. We will see in Section 12.3.2 that logistic regression is more similar to the support vector classifier in this regard.
12.2.1 Computing the Support Vector Classifier

The problem (12.7) is quadratic with linear inequality constraints, hence it is a convex optimization problem. We describe a quadratic programming solution using Lagrange multipliers. Computationally it is convenient to re-express (12.7) in the equivalent form

    min_{β, β_0} (1/2) ||β||^2 + C Σ_{i=1}^N ξ_i
    subject to ξ_i ≥ 0, y_i (x_i^T β + β_0) ≥ 1 − ξ_i for all i,    (12.8)

where the "cost" parameter C replaces the constant in (12.7); the separable case corresponds to C = ∞.

The Lagrange (primal) function is

    L_P = (1/2) ||β||^2 + C Σ_{i=1}^N ξ_i − Σ_{i=1}^N α_i [y_i (x_i^T β + β_0) − (1 − ξ_i)] − Σ_{i=1}^N μ_i ξ_i,    (12.9)

which we minimize w.r.t. β, β_0 and ξ_i. Setting the respective derivatives to zero, we get

    β = Σ_{i=1}^N α_i y_i x_i,    (12.10)
    0 = Σ_{i=1}^N α_i y_i,    (12.11)
    α_i = C − μ_i, for all i,    (12.12)

as well as the positivity constraints α_i, μ_i, ξ_i ≥ 0 for all i. By substituting (12.10)–(12.12) into (12.9), we obtain the Lagrangian (Wolfe) dual objective function

    L_D = Σ_{i=1}^N α_i − (1/2) Σ_{i=1}^N Σ_{i'=1}^N α_i α_{i'} y_i y_{i'} x_i^T x_{i'},    (12.13)

which gives a lower bound on the objective function (12.8) for any feasible point. We maximize L_D subject to 0 ≤ α_i ≤ C and Σ_{i=1}^N α_i y_i = 0. In addition to (12.10)–(12.12), the Karush–Kuhn–Tucker conditions include the constraints

    α_i [y_i (x_i^T β + β_0) − (1 − ξ_i)] = 0,    (12.14)
    μ_i ξ_i = 0,    (12.15)
    y_i (x_i^T β + β_0) − (1 − ξ_i) ≥ 0,    (12.16)

for i = 1, ..., N. Together these equations (12.10)–(12.16) uniquely characterize the solution to the primal and dual problem.
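As an illustration (not from the text), scikit-learn's `SVC` solves exactly this dual problem, and exposes the quantities above: `dual_coef_` holds the nonzero products α_i y_i and `support_vectors_` the corresponding x_i, so the stationarity condition (12.10), β = Σ α_i y_i x_i, can be verified directly. The simulated two-Gaussian data is an arbitrary choice for the sketch.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# two overlapping Gaussian classes, labels in {-1, +1}
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(1.5, 1.0, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# (12.10): beta = sum_i alpha_i y_i x_i, with alpha_i nonzero only for
# support vectors; sklearn stores alpha_i * y_i for those points in dual_coef_.
beta_from_dual = clf.dual_coef_ @ clf.support_vectors_
print(np.allclose(beta_from_dual, clf.coef_))   # the two agree
```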
From (12.10) we see that the solution for β has the form

    β̂ = Σ_{i=1}^N α̂_i y_i x_i,    (12.17)

with nonzero coefficients α̂_i only for those observations i for which the constraints in (12.16) are exactly met (due to (12.14)). These observations are called the support vectors, since β̂ is represented in terms of them alone. Among these support points, some will lie on the edge of the margin (ξ̂_i = 0), and hence from (12.15) and (12.12) will be characterized by 0 < α̂_i < C; the remainder (ξ̂_i > 0) have α̂_i = C. From (12.14) we can see that any of these margin points (0 < α̂_i, ξ̂_i = 0) can be used to solve for β_0, and we typically use an average of all the solutions for numerical stability.

Maximizing the dual (12.13) is a simpler convex quadratic programming problem than the primal (12.9), and can be solved with standard techniques (Murray et al., 1981, for example).

Given the solutions β̂_0 and β̂, the decision function can be written as

    Ĝ(x) = sign[f̂(x)] = sign[x^T β̂ + β̂_0].    (12.18)

The tuning parameter of this procedure is the cost parameter C.

12.2.2 Mixture Example (Continued)

Figure 12.2 shows the support vector boundary for the mixture example of Figure 2.5 on page 21, with two overlapping classes, for two different values of the cost parameter C. The classifiers are rather similar in their performance. Points on the wrong side of the boundary are support vectors. In addition, points on the correct side of the boundary but close to it (in the margin) are also support vectors. The margin is larger for C = 0.01 than it is for C = 10,000. Hence larger values of C focus attention more on (correctly classified) points near the decision boundary, while smaller values involve data further away. Either way, misclassified points are given weight, no matter how far away. In this example the procedure is not very sensitive to choices of C, because of the rigidity of a linear boundary.

The optimal value for C can be estimated by cross-validation, as discussed in Chapter 7.
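The effect of C described above can be sketched numerically (an illustration, not from the text): since M = 1/||β̂||, a smaller C typically yields a smaller ||β̂||, hence a wider margin, and more points fall inside it and become support vectors. The simulated data and the two C values (chosen to echo Figure 12.2) are arbitrary.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(1.5, 1.0, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

results = {}
for C in (0.01, 10_000):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 1.0 / np.linalg.norm(clf.coef_)   # M = 1 / ||beta_hat||
    results[C] = (margin, len(clf.support_))
    print(C, results[C])   # (margin width, number of support points)
```

On data like this, the C = 0.01 fit has the wider margin and the larger set of support points, matching the discussion of Figure 12.2.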
Interestingly, the leave-one-out cross-validation error can be bounded above by the proportion of support points in the data. The reason is that leaving out an observation that is not a support vector will not change the solution. Hence these observations, being classified correctly by the original boundary, will be classified correctly in the cross-validation process. However, this bound tends to be too high, and not generally useful for choosing C (62% and 85%, respectively, in our examples).
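The leave-one-out bound can be checked empirically (an illustration, not from the text): compute the actual leave-one-out error and compare it with the fraction of support points in the full fit. The simulated data is an arbitrary choice for the sketch.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(2.0, 1.0, (40, 2))])
y = np.array([-1] * 40 + [1] * 40)

clf = SVC(kernel="linear", C=1.0)
# actual leave-one-out misclassification rate
loo_error = 1.0 - cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
# proportion of support points in the full-data fit: the upper bound
sv_fraction = len(clf.fit(X, y).support_) / len(y)
print(loo_error, sv_fraction)   # the bound loo_error <= sv_fraction holds
```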
FIGURE 12.2. The linear support vector boundary for the mixture data example with two overlapping classes, for two different values of C. The broken lines indicate the margins, where f(x) = ±1. The support points (α_i > 0) are all the points on the wrong side of their margin. The black solid dots are those support points falling exactly on the margin (ξ_i = 0, α_i > 0). In the upper panel 62% of the observations are support points, while in the lower panel 85% are. The broken purple curve in the background is the Bayes decision boundary. (For the C = 0.01 panel: training error 0.26, test error 0.30, Bayes error 0.21.)
12.3 Support Vector Machines and Kernels

The support vector classifier described so far finds linear boundaries in the input feature space. As with other linear methods, we can make the procedure more flexible by enlarging the feature space using basis expansions such as polynomials or splines (Chapter 5). Generally linear boundaries in the enlarged space achieve better training-class separation, and translate to nonlinear boundaries in the original space. Once the basis functions h_m(x), m = 1, ..., M are selected, the procedure is the same as before. We fit the SV classifier using input features h(x_i) = (h_1(x_i), h_2(x_i), ..., h_M(x_i)), i = 1, ..., N, and produce the (nonlinear) function f̂(x) = h(x)^T β̂ + β̂_0. The classifier is Ĝ(x) = sign(f̂(x)) as before.

The support vector machine classifier is an extension of this idea, where the dimension of the enlarged space is allowed to get very large, infinite in some cases. It might seem that the computations would become prohibitive. It would also seem that with sufficient basis functions, the data would be separable, and overfitting would occur. We first show how the SVM technology deals with these issues. We then see that in fact the SVM classifier is solving a function-fitting problem using a particular criterion and form of regularization, and is part of a much bigger class of problems that includes the smoothing splines of Chapter 5. The reader may wish to consult Section 5.8, which provides background material and overlaps somewhat with the next two sections.

12.3.1 Computing the SVM for Classification

We can represent the optimization problem (12.9) and its solution in a special way that only involves the input features via inner products. We do this directly for the transformed feature vectors h(x_i). We then see that for particular choices of h, these inner products can be computed very cheaply.

The Lagrange dual function (12.13) has the form

    L_D = Σ_{i=1}^N α_i − (1/2) Σ_{i=1}^N Σ_{i'=1}^N α_i α_{i'} y_i y_{i'} ⟨h(x_i), h(x_{i'})⟩.
    (12.19)

From (12.10) we see that the solution function f(x) can be written

    f(x) = h(x)^T β + β_0 = Σ_{i=1}^N α_i y_i ⟨h(x), h(x_i)⟩ + β_0.    (12.20)

As before, given α_i, β_0 can be determined by solving y_i f(x_i) = 1 in (12.20) for any (or all) x_i for which 0 < α_i < C.
So both (12.19) and (12.20) involve h(x) only through inner products. In fact, we need not specify the transformation h(x) at all, but require only knowledge of the kernel function

    K(x, x') = ⟨h(x), h(x')⟩    (12.21)

that computes inner products in the transformed space. K should be a symmetric positive (semi-) definite function; see Section 5.8.1.

Three popular choices for K in the SVM literature are

    dth-degree polynomial: K(x, x') = (1 + ⟨x, x'⟩)^d,
    Radial basis: K(x, x') = exp(−γ ||x − x'||^2),
    Neural network: K(x, x') = tanh(κ_1 ⟨x, x'⟩ + κ_2).    (12.22)

Consider for example a feature space with two inputs X_1 and X_2, and a polynomial kernel of degree 2. Then

    K(X, X') = (1 + ⟨X, X'⟩)^2
             = (1 + X_1 X'_1 + X_2 X'_2)^2
             = 1 + 2 X_1 X'_1 + 2 X_2 X'_2 + (X_1 X'_1)^2 + (X_2 X'_2)^2 + 2 X_1 X'_1 X_2 X'_2.    (12.23)

Then M = 6, and if we choose h_1(X) = 1, h_2(X) = √2 X_1, h_3(X) = √2 X_2, h_4(X) = X_1^2, h_5(X) = X_2^2, and h_6(X) = √2 X_1 X_2, then K(X, X') = ⟨h(X), h(X')⟩. From (12.20) we see that the solution can be written

    f̂(x) = Σ_{i=1}^N α̂_i y_i K(x, x_i) + β̂_0.    (12.24)

The role of the parameter C is clearer in an enlarged feature space, since perfect separation is often achievable there. A large value of C will discourage any positive ξ_i, and lead to an overfit wiggly boundary in the original feature space; a small value of C will encourage a small value of ||β||, which in turn causes f(x), and hence the boundary, to be smoother. Figure 12.3 shows two nonlinear support vector machines applied to the mixture example of Chapter 2. The regularization parameter was chosen in both cases to achieve good test error. The radial basis kernel produces a boundary quite similar to the Bayes optimal boundary for this example; compare Figure 2.5.

In the early literature on support vectors, there were claims that the kernel property of the support vector machine is unique to it and allows one to finesse the curse of dimensionality. Neither of these claims is true, and we go into both of these issues in the next three subsections.
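The degree-2 expansion in (12.23) can be verified numerically (an illustration, not from the text): the kernel evaluation (1 + ⟨x, x'⟩)² agrees with the explicit inner product of the six basis functions h_1, ..., h_6, so the transformation h never needs to be formed when fitting. The two test points are arbitrary choices for the sketch.

```python
import numpy as np

def K(x, xp):
    # degree-2 polynomial kernel from (12.22)
    return (1.0 + x @ xp) ** 2

def h(x):
    # explicit basis expansion from (12.23), M = 6
    x1, x2 = x
    return np.array([1.0, np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 ** 2, x2 ** 2, np.sqrt(2) * x1 * x2])

x, xp = np.array([0.3, -1.2]), np.array([2.0, 0.5])
print(np.isclose(K(x, xp), h(x) @ h(xp)))   # the inner products agree
```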
FIGURE 12.3. Two nonlinear SVMs for the mixture data. The upper plot ("SVM – Degree-4 Polynomial in Feature Space") uses a 4th-degree polynomial kernel, the lower ("SVM – Radial Kernel in Feature Space") a radial basis kernel (with γ = 1). In each case C was tuned to approximately achieve the best test error performance, and C = 1 worked well in both cases. The radial basis kernel performs the best (close to Bayes optimal), as might be expected given the data arise from mixtures of Gaussians. The broken purple curve in the background is the Bayes decision boundary.
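The kernel representation (12.24) can also be checked against a fitted radial-kernel SVM (an illustration, not from the text): summing α̂_i y_i K(x, x_i) over the support vectors and adding β̂_0 reproduces the fitted decision function exactly. The simulated data and test point are arbitrary choices for the sketch.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 1.0, (30, 2)), rng.normal(2.0, 1.0, (30, 2))])
y = np.array([-1] * 30 + [1] * 30)

gamma = 1.0
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

def f_hat(x):
    # (12.24): f(x) = sum_i alpha_i y_i K(x, x_i) + beta_0, over support
    # vectors only; sklearn stores alpha_i * y_i in dual_coef_.
    k = np.exp(-gamma * np.sum((clf.support_vectors_ - x) ** 2, axis=1))
    return clf.dual_coef_[0] @ k + clf.intercept_[0]

x_new = np.array([1.0, 1.0])
print(np.isclose(f_hat(x_new), clf.decision_function(x_new.reshape(1, -1))[0]))
```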
FIGURE 12.4. The support vector loss function (hinge loss), compared to the negative log-likelihood loss (binomial deviance) for logistic regression, squared-error loss, and a "Huberized" version of the squared hinge loss. All are shown as a function of yf rather than f, because of the symmetry between the y = +1 and y = −1 case. The deviance and Huber have the same asymptotes as the SVM loss, but are rounded in the interior. All are scaled to have the limiting left-tail slope of −1.

12.3.2 The SVM as a Penalization Method

With f(x) = h(x)^T β + β_0, consider the optimization problem

    min_{β_0, β} Σ_{i=1}^N [1 − y_i f(x_i)]_+ + (λ/2) ||β||^2,    (12.25)

where the subscript "+" indicates positive part. This has the form loss + penalty, which is a familiar paradigm in function estimation. It is easy to show (Exercise 12.1) that the solution to (12.25), with λ = 1/C, is the same as that for (12.8).

Examination of the "hinge" loss function L(y, f) = [1 − yf]_+ shows that it is reasonable for two-class classification, when compared to other more traditional loss functions. Figure 12.4 compares it to the log-likelihood loss for logistic regression, as well as squared-error loss and a variant thereof. The (negative) log-likelihood or binomial deviance has similar tails as the SVM loss, giving zero penalty to points well inside their margin, and a
TABLE 12.1. The population minimizers for the different loss functions in Figure 12.4. Logistic regression uses the binomial log-likelihood or deviance. Linear discriminant analysis (Exercise 4.2) uses squared-error loss. The SVM hinge loss estimates the mode of the posterior class probabilities, whereas the others estimate a linear transformation of these probabilities.

    Loss Function                | L[y, f(x)]                                            | Minimizing Function
    Binomial Deviance            | log[1 + e^{−yf(x)}]                                   | f(x) = log [Pr(Y = +1|x) / Pr(Y = −1|x)]
    SVM Hinge Loss               | [1 − yf(x)]_+                                         | f(x) = sign[Pr(Y = +1|x) − 1/2]
    Squared Error                | [y − f(x)]^2 = [1 − yf(x)]^2                          | f(x) = 2 Pr(Y = +1|x) − 1
    Huberized Square Hinge Loss  | −4yf(x) if yf(x) < −1; [1 − yf(x)]_+^2 otherwise      | f(x) = 2 Pr(Y = +1|x) − 1

linear penalty to points on the wrong side and far away. Squared-error, on the other hand, gives a quadratic penalty, and points well inside their own margin have a strong influence on the model as well. The squared hinge loss L(y, f) = [1 − yf]_+^2 is like the quadratic, except it is zero for points inside their margin. It still rises quadratically in the left tail, and will be less robust than hinge or deviance to misclassified observations. Recently Rosset and Zhu (2007) proposed a "Huberized" version of the squared hinge loss, which converts smoothly to a linear loss at yf = −1.

We can characterize these loss functions in terms of what they are estimating at the population level. We consider minimizing E L(Y, f(X)). Table 12.1 summarizes the results. Whereas the hinge loss estimates the classifier G(x) itself, all the others estimate a transformation of the class posterior probabilities. The "Huberized" square hinge loss shares attractive properties of logistic regression (smooth loss function, estimates probabilities), as well as the SVM hinge loss (support points).

Formulation (12.25) casts the SVM as a regularized function estimation problem, where the coefficients of the linear expansion f(x) = β_0 + h(x)^T β are shrunk toward zero (excluding the constant). If h(x) represents a hierarchical basis having some ordered structure (such as ordered in roughness),
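Two rows of Table 12.1 can be checked numerically (an illustration, not from the text): for a point with an assumed posterior p = Pr(Y = +1|x), minimizing E L(Y, f) over a grid of f values recovers sign[p − 1/2] for the hinge loss and 2p − 1 for squared error. The value p = 0.7 and the grid are arbitrary choices for the sketch.

```python
import numpy as np

def hinge(yf):
    # SVM hinge loss [1 - yf]_+
    return np.maximum(0.0, 1.0 - yf)

def squared(yf):
    # squared-error loss [y - f]^2 = [1 - yf]^2
    return (1.0 - yf) ** 2

p = 0.7                         # assumed Pr(Y = +1 | x)
f_grid = np.linspace(-3, 3, 6001)

def expected_loss(loss):
    # E L(Y, f) = p L(f) + (1 - p) L(-f), with L written in terms of yf
    return p * loss(f_grid) + (1 - p) * loss(-f_grid)

f_hinge = f_grid[np.argmin(expected_loss(hinge))]
f_sq = f_grid[np.argmin(expected_loss(squared))]
print(round(f_hinge, 3), round(f_sq, 3))   # sign[p - 1/2] = 1, and 2p - 1 = 0.4
```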
then the uniform shrinkage makes more sense if the rougher elements h_j in the vector h have smaller norm.

All the loss functions in Table 12.1 except squared-error are so-called "margin maximizing loss functions" (Rosset et al., 2004b). This means that if the data are separable, then the limit of β̂_λ in (12.25) as λ → 0 defines the optimal separating hyperplane.¹

12.3.3 Function Estimation and Reproducing Kernels

Here we describe SVMs in terms of function estimation in reproducing kernel Hilbert spaces, where the kernel property abounds. This material is discussed in some detail in Section 5.8. This provides another view of the support vector classifier, and helps to clarify how it works.

Suppose the basis h arises from the (possibly finite) eigen-expansion of a positive definite kernel K,

    K(x, x') = Σ_{m=1}^∞ φ_m(x) φ_m(x') δ_m,    (12.26)

and h_m(x) = √δ_m φ_m(x). Then with θ_m = √δ_m β_m, we can write (12.25) as

    min_{β_0, θ} Σ_{i=1}^N [1 − y_i (β_0 + Σ_{m=1}^∞ θ_m φ_m(x_i))]_+ + (λ/2) Σ_{m=1}^∞ θ_m^2 / δ_m.    (12.27)

Now (12.27) is identical in form to (5.49) on page 169 in Section 5.8, and the theory of reproducing kernel Hilbert spaces described there guarantees a finite-dimensional solution of the form

    f(x) = β_0 + Σ_{i=1}^N α_i K(x, x_i).    (12.28)

In particular we see there an equivalent version of the optimization criterion (12.25) [Equation (5.67) in Section 5.8.2; see also Wahba et al. (2000)],

    min_{β_0, α} Σ_{i=1}^N (1 − y_i f(x_i))_+ + (λ/2) α^T K α,    (12.29)

where K is the N × N matrix of kernel evaluations for all pairs of training features (Exercise 12.2). These models are quite general, and include, for example, the entire family of smoothing splines, additive and interaction spline models discussed

¹ For logistic regression with separable data, β̂_λ diverges, but β̂_λ / ||β̂_λ|| converges to the optimal separating direction.
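The penalty α^T K α in (12.29) only makes sense because the Gram matrix of a valid kernel is symmetric positive semi-definite, as required of K in (12.21). A quick numerical sketch (an illustration, not from the text; the radial basis kernel and simulated points are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 2))

def rbf(x, xp, gamma=1.0):
    # radial basis kernel from (12.22)
    return np.exp(-gamma * np.sum((x - xp) ** 2))

# the N x N matrix of kernel evaluations used in the penalty of (12.29)
K = np.array([[rbf(xi, xj) for xj in X] for xi in X])

# a valid kernel yields a symmetric positive semi-definite Gram matrix,
# so alpha^T K alpha >= 0 for every alpha
print(np.allclose(K, K.T), np.linalg.eigvalsh(K).min() >= -1e-10)
```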
Mdule 3: Gaussian Prcess Parameter Estimatin, Predictin Uncertainty, and Diagnstics Jerme Sacks and William J Welch Natinal Institute f Statistical Sciences and University f British Clumbia Adapted frm
More informationk-nearest Neighbor How to choose k Average of k points more reliable when: Large k: noise in attributes +o o noise in class labels
Mtivating Example Memry-Based Learning Instance-Based Learning K-earest eighbr Inductive Assumptin Similar inputs map t similar utputs If nt true => learning is impssible If true => learning reduces t
More informationLecture 8: Multiclass Classification (I)
Bayes Rule fr Multiclass Prblems Traditinal Methds fr Multiclass Prblems Linear Regressin Mdels Lecture 8: Multiclass Classificatin (I) Ha Helen Zhang Fall 07 Ha Helen Zhang Lecture 8: Multiclass Classificatin
More informationSequential Allocation with Minimal Switching
In Cmputing Science and Statistics 28 (1996), pp. 567 572 Sequential Allcatin with Minimal Switching Quentin F. Stut 1 Janis Hardwick 1 EECS Dept., University f Michigan Statistics Dept., Purdue University
More informationInternal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.
Sectin 7 Mdel Assessment This sectin is based n Stck and Watsn s Chapter 9. Internal vs. external validity Internal validity refers t whether the analysis is valid fr the ppulatin and sample being studied.
More informationA Matrix Representation of Panel Data
web Extensin 6 Appendix 6.A A Matrix Representatin f Panel Data Panel data mdels cme in tw brad varieties, distinct intercept DGPs and errr cmpnent DGPs. his appendix presents matrix algebra representatins
More information7 TH GRADE MATH STANDARDS
ALGEBRA STANDARDS Gal 1: Students will use the language f algebra t explre, describe, represent, and analyze number expressins and relatins 7 TH GRADE MATH STANDARDS 7.M.1.1: (Cmprehensin) Select, use,
More informationDepartment of Economics, University of California, Davis Ecn 200C Micro Theory Professor Giacomo Bonanno. Insurance Markets
Department f Ecnmics, University f alifrnia, Davis Ecn 200 Micr Thery Prfessr Giacm Bnann Insurance Markets nsider an individual wh has an initial wealth f. ith sme prbability p he faces a lss f x (0
More informationInterference is when two (or more) sets of waves meet and combine to produce a new pattern.
Interference Interference is when tw (r mre) sets f waves meet and cmbine t prduce a new pattern. This pattern can vary depending n the riginal wave directin, wavelength, amplitude, etc. The tw mst extreme
More informationcfl Cpyright by Ji Zhu 2003 All Rights Reserved ii
FLEXIBLE STATISTICAL MODELING a dissertatin submitted t the department f statistics and the cmmittee n graduate studies f stanfrd university in partial fulfillment f the requirements fr the degree f dctr
More informationThe Entire Regularization Path for the Support Vector Machine
Jurnal f Machine Learning esearch 0 (200) 2 Submitted 3/0; Published?? The Entire egularizatin Path fr the Supprt Vectr Machine Trevr Hastie Department f Statistics Stanfrd University Stanfrd, CA 9305,
More informationBuilding to Transformations on Coordinate Axis Grade 5: Geometry Graph points on the coordinate plane to solve real-world and mathematical problems.
Building t Transfrmatins n Crdinate Axis Grade 5: Gemetry Graph pints n the crdinate plane t slve real-wrld and mathematical prblems. 5.G.1. Use a pair f perpendicular number lines, called axes, t define
More informationSupport Vector Machines
Wien, June, 2010 Paul Hofmarcher, Stefan Theussl, WU Wien Hofmarcher/Theussl SVM 1/21 Linear Separable Separating Hyperplanes Non-Linear Separable Soft-Margin Hyperplanes Hofmarcher/Theussl SVM 2/21 (SVM)
More informationCS 477/677 Analysis of Algorithms Fall 2007 Dr. George Bebis Course Project Due Date: 11/29/2007
CS 477/677 Analysis f Algrithms Fall 2007 Dr. Gerge Bebis Curse Prject Due Date: 11/29/2007 Part1: Cmparisn f Srting Algrithms (70% f the prject grade) The bjective f the first part f the assignment is
More informationDistributions, spatial statistics and a Bayesian perspective
Distributins, spatial statistics and a Bayesian perspective Dug Nychka Natinal Center fr Atmspheric Research Distributins and densities Cnditinal distributins and Bayes Thm Bivariate nrmal Spatial statistics
More informationLyapunov Stability Stability of Equilibrium Points
Lyapunv Stability Stability f Equilibrium Pints 1. Stability f Equilibrium Pints - Definitins In this sectin we cnsider n-th rder nnlinear time varying cntinuus time (C) systems f the frm x = f ( t, x),
More informationCHAPTER 24: INFERENCE IN REGRESSION. Chapter 24: Make inferences about the population from which the sample data came.
MATH 1342 Ch. 24 April 25 and 27, 2013 Page 1 f 5 CHAPTER 24: INFERENCE IN REGRESSION Chapters 4 and 5: Relatinships between tw quantitative variables. Be able t Make a graph (scatterplt) Summarize the
More informationFigure 1a. A planar mechanism.
ME 5 - Machine Design I Fall Semester 0 Name f Student Lab Sectin Number EXAM. OPEN BOOK AND CLOSED NOTES. Mnday, September rd, 0 Write n ne side nly f the paper prvided fr yur slutins. Where necessary,
More informationHomology groups of disks with holes
Hmlgy grups f disks with hles THEOREM. Let p 1,, p k } be a sequence f distinct pints in the interir unit disk D n where n 2, and suppse that fr all j the sets E j Int D n are clsed, pairwise disjint subdisks.
More informationSUPPLEMENTARY MATERIAL GaGa: a simple and flexible hierarchical model for microarray data analysis
SUPPLEMENTARY MATERIAL GaGa: a simple and flexible hierarchical mdel fr micrarray data analysis David Rssell Department f Bistatistics M.D. Andersn Cancer Center, Hustn, TX 77030, USA rsselldavid@gmail.cm
More informationPreparation work for A2 Mathematics [2017]
Preparatin wrk fr A2 Mathematics [2017] The wrk studied in Y12 after the return frm study leave is frm the Cre 3 mdule f the A2 Mathematics curse. This wrk will nly be reviewed during Year 13, it will
More information22.54 Neutron Interactions and Applications (Spring 2004) Chapter 11 (3/11/04) Neutron Diffusion
.54 Neutrn Interactins and Applicatins (Spring 004) Chapter (3//04) Neutrn Diffusin References -- J. R. Lamarsh, Intrductin t Nuclear Reactr Thery (Addisn-Wesley, Reading, 966) T study neutrn diffusin
More informationMargin Distribution and Learning Algorithms
ICML 03 Margin Distributin and Learning Algrithms Ashutsh Garg IBM Almaden Research Center, San Jse, CA 9513 USA Dan Rth Department f Cmputer Science, University f Illinis, Urbana, IL 61801 USA ASHUTOSH@US.IBM.COM
More informationEDA Engineering Design & Analysis Ltd
EDA Engineering Design & Analysis Ltd THE FINITE ELEMENT METHOD A shrt tutrial giving an verview f the histry, thery and applicatin f the finite element methd. Intrductin Value f FEM Applicatins Elements
More informationthe results to larger systems due to prop'erties of the projection algorithm. First, the number of hidden nodes must
M.E. Aggune, M.J. Dambrg, M.A. El-Sharkawi, R.J. Marks II and L.E. Atlas, "Dynamic and static security assessment f pwer systems using artificial neural netwrks", Prceedings f the NSF Wrkshp n Applicatins
More informationComputational modeling techniques
Cmputatinal mdeling techniques Lecture 11: Mdeling with systems f ODEs In Petre Department f IT, Ab Akademi http://www.users.ab.fi/ipetre/cmpmd/ Mdeling with differential equatins Mdeling strategy Fcus
More informationENSC Discrete Time Systems. Project Outline. Semester
ENSC 49 - iscrete Time Systems Prject Outline Semester 006-1. Objectives The gal f the prject is t design a channel fading simulatr. Upn successful cmpletin f the prject, yu will reinfrce yur understanding
More informationChapter 4. Unsteady State Conduction
Chapter 4 Unsteady State Cnductin Chapter 5 Steady State Cnductin Chee 318 1 4-1 Intrductin ransient Cnductin Many heat transfer prblems are time dependent Changes in perating cnditins in a system cause
More informationModeling the Nonlinear Rheological Behavior of Materials with a Hyper-Exponential Type Function
www.ccsenet.rg/mer Mechanical Engineering Research Vl. 1, N. 1; December 011 Mdeling the Nnlinear Rhelgical Behavir f Materials with a Hyper-Expnential Type Functin Marc Delphin Mnsia Département de Physique,
More informationChapter 15 & 16: Random Forests & Ensemble Learning
Chapter 15 & 16: Randm Frests & Ensemble Learning DD3364 Nvember 27, 2012 Ty Prblem fr Bsted Tree Bsted Tree Example Estimate this functin with a sum f trees with 9-terminal ndes by minimizing the sum
More informationOn Huntsberger Type Shrinkage Estimator for the Mean of Normal Distribution ABSTRACT INTRODUCTION
Malaysian Jurnal f Mathematical Sciences 4(): 7-4 () On Huntsberger Type Shrinkage Estimatr fr the Mean f Nrmal Distributin Department f Mathematical and Physical Sciences, University f Nizwa, Sultanate
More informationKinetic Model Completeness
5.68J/10.652J Spring 2003 Lecture Ntes Tuesday April 15, 2003 Kinetic Mdel Cmpleteness We say a chemical kinetic mdel is cmplete fr a particular reactin cnditin when it cntains all the species and reactins
More informationLecture 17: Free Energy of Multi-phase Solutions at Equilibrium
Lecture 17: 11.07.05 Free Energy f Multi-phase Slutins at Equilibrium Tday: LAST TIME...2 FREE ENERGY DIAGRAMS OF MULTI-PHASE SOLUTIONS 1...3 The cmmn tangent cnstructin and the lever rule...3 Practical
More informationFloating Point Method for Solving Transportation. Problems with Additional Constraints
Internatinal Mathematical Frum, Vl. 6, 20, n. 40, 983-992 Flating Pint Methd fr Slving Transprtatin Prblems with Additinal Cnstraints P. Pandian and D. Anuradha Department f Mathematics, Schl f Advanced
More informationDifferentiation Applications 1: Related Rates
Differentiatin Applicatins 1: Related Rates 151 Differentiatin Applicatins 1: Related Rates Mdel 1: Sliding Ladder 10 ladder y 10 ladder 10 ladder A 10 ft ladder is leaning against a wall when the bttm
More informationAdmissibility Conditions and Asymptotic Behavior of Strongly Regular Graphs
Admissibility Cnditins and Asympttic Behavir f Strngly Regular Graphs VASCO MOÇO MANO Department f Mathematics University f Prt Oprt PORTUGAL vascmcman@gmailcm LUÍS ANTÓNIO DE ALMEIDA VIEIRA Department
More informationEric Klein and Ning Sa
Week 12. Statistical Appraches t Netwrks: p1 and p* Wasserman and Faust Chapter 15: Statistical Analysis f Single Relatinal Netwrks There are fur tasks in psitinal analysis: 1) Define Equivalence 2) Measure
More informationMaterials Engineering 272-C Fall 2001, Lecture 7 & 8 Fundamentals of Diffusion
Materials Engineering 272-C Fall 2001, Lecture 7 & 8 Fundamentals f Diffusin Diffusin: Transprt in a slid, liquid, r gas driven by a cncentratin gradient (r, in the case f mass transprt, a chemical ptential
More informationinitially lcated away frm the data set never win the cmpetitin, resulting in a nnptimal nal cdebk, [2] [3] [4] and [5]. Khnen's Self Organizing Featur
Cdewrd Distributin fr Frequency Sensitive Cmpetitive Learning with One Dimensinal Input Data Aristides S. Galanpuls and Stanley C. Ahalt Department f Electrical Engineering The Ohi State University Abstract
More informationand the Doppler frequency rate f R , can be related to the coefficients of this polynomial. The relationships are:
Algrithm fr Estimating R and R - (David Sandwell, SIO, August 4, 2006) Azimith cmpressin invlves the alignment f successive eches t be fcused n a pint target Let s be the slw time alng the satellite track
More informationStatistical Learning. 2.1 What Is Statistical Learning?
2 Statistical Learning 2.1 What Is Statistical Learning? In rder t mtivate ur study f statistical learning, we begin with a simple example. Suppse that we are statistical cnsultants hired by a client t
More informationThermodynamics and Equilibrium
Thermdynamics and Equilibrium Thermdynamics Thermdynamics is the study f the relatinship between heat and ther frms f energy in a chemical r physical prcess. We intrduced the thermdynamic prperty f enthalpy,
More informationLinear Methods for Regression
3 Linear Methds fr Regressin This is page 43 Printer: Opaque this 3.1 Intrductin A linear regressin mdel assumes that the regressin functin E(Y X) is linear in the inputs X 1,...,X p. Linear mdels were
More informationFall 2013 Physics 172 Recitation 3 Momentum and Springs
Fall 03 Physics 7 Recitatin 3 Mmentum and Springs Purpse: The purpse f this recitatin is t give yu experience wrking with mmentum and the mmentum update frmula. Readings: Chapter.3-.5 Learning Objectives:.3.
More informationABSORPTION OF GAMMA RAYS
6 Sep 11 Gamma.1 ABSORPTIO OF GAMMA RAYS Gamma rays is the name given t high energy electrmagnetic radiatin riginating frm nuclear energy level transitins. (Typical wavelength, frequency, and energy ranges
More informationA Comparison of Methods for Computing the Eigenvalues and Eigenvectors of a Real Symmetric Matrix. By Paul A. White and Robert R.
A Cmparisn f Methds fr Cmputing the Eigenvalues and Eigenvectrs f a Real Symmetric Matrix By Paul A. White and Rbert R. Brwn Part I. The Eigenvalues I. Purpse. T cmpare thse methds fr cmputing the eigenvalues
More informationLecture 3: Principal Components Analysis (PCA)
Lecture 3: Principal Cmpnents Analysis (PCA) Reading: Sectins 6.3.1, 10.1, 10.2, 10.4 STATS 202: Data mining and analysis Jnathan Taylr, 9/28 Slide credits: Sergi Bacallad 1 / 24 The bias variance decmpsitin
More information1 The limitations of Hartree Fock approximation
Chapter: Pst-Hartree Fck Methds - I The limitatins f Hartree Fck apprximatin The n electrn single determinant Hartree Fck wave functin is the variatinal best amng all pssible n electrn single determinants
More informationGeneral Chemistry II, Unit I: Study Guide (part I)
1 General Chemistry II, Unit I: Study Guide (part I) CDS Chapter 14: Physical Prperties f Gases Observatin 1: Pressure- Vlume Measurements n Gases The spring f air is measured as pressure, defined as the
More information