Linear programming III
1 Linear programming III
2 Review 1/33
What we have covered in the previous two classes:
LP problem setup: linear objective function, linear constraints; there exists an extreme-point optimal solution.
Simplex method: go through extreme points to find the optimal solution.
Primal-dual property of the LP problem.
Interior point algorithm: based on the primal-dual property, travels through the interior of the feasible solution space.
Quadratic programming: based on the KKT conditions.
LP application: quantile regression, which minimizes the asymmetric absolute deviations.
3 LP/QP application in statistics II: LASSO 2/33
Consider the usual regression setting with data $(x_i, y_i)$, where $x_i = (x_{i1}, \ldots, x_{ip})$ is a vector of predictors and $y_i$ is the response for the i-th object. The ordinary linear regression setting is: find the coefficients that minimize the residual sum of squares:
$\hat{b} = \arg\min_b \sum_{i=1}^{n} (y_i - x_i b)^2$
Here $b = (b_1, b_2, \ldots, b_p)^T$ is a vector of coefficients. The solution happens to be the MLE assuming a normal model:
$y_i = x_i b + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2)$
This is not ideal when the number of predictors (p) is large, because
1. it requires p < n, or there must be some degrees of freedom left for the residual.
2. one often wants a small subset of predictors in the model, but OLS provides an estimated coefficient for every predictor.
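As a quick illustration of point 1, a minimal R sketch (on made-up simulated data) showing that OLS cannot estimate all coefficients once p >= n:

# OLS with more predictors than observations
set.seed(1)
n <- 20; p <- 30
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n) + x[, 1] - 2 * x[, 2]      # only the first two predictors truly matter
fit <- lm(y ~ x)
sum(is.na(coef(fit)))                    # > 0: the design is rank deficient, so some coefficients are NA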
4 The LASSO 3/33
LASSO stands for Least Absolute Shrinkage and Selection Operator, and aims at model selection when p is large (it works even when p > n). The LASSO procedure shrinks the coefficients toward 0, and eventually forces some to be exactly 0 (so predictors with a 0 coefficient are selected out).
The LASSO estimates are defined as:
$\hat{b} = \arg\min_b \sum_{i=1}^{n} (y_i - x_i b)^2, \quad \text{s.t. } \|b\|_1 \le t$
Here $\|b\|_1 = \sum_{j=1}^{p} |b_j|$ is the L1 norm, and $t \ge 0$ is a tuning parameter controlling the strength of shrinkage. So LASSO tries to minimize the residual sum of squares, with a constraint on the sum of the absolute values of the coefficients.
NOTE: There are other types of regularized regression. For example, regression with an L2 penalty, e.g., $\sum_j b_j^2 \le t$, is called ridge regression.
5 Model selection by LASSO 4/33
The feasible solution space for LASSO is defined by linear constraints, so the optimal solution often sits at a corner point. The implication: at the optimum, many coefficients (the non-basic variables) will be 0, which gives variable selection. On the contrary, ridge regression usually doesn't set any coefficient exactly to 0, so it doesn't do model selection.
The LASSO problem can be solved by a standard quadratic programming algorithm.
6 LASSO model fitting 5/33
In LASSO, we need to solve the following optimization problem:
$\min_b \sum_{i=1}^{n} \big(y_i - \sum_j b_j x_{ij}\big)^2 \quad \text{s.t. } \sum_j |b_j| \le t$
The trick is to convert the problem into the standard QP setting, i.e., remove the absolute value operator. The easiest way is to let $b_j = b_j^+ - b_j^-$, where $b_j^+, b_j^- \ge 0$. Then $|b_j| = b_j^+ + b_j^-$, and the problem can be written as:
$\min_{b^+, b^-} \sum_{i=1}^{n} \big(y_i - \sum_j b_j^+ x_{ij} + \sum_j b_j^- x_{ij}\big)^2 \quad \text{s.t. } \sum_j (b_j^+ + b_j^-) \le t, \quad b_j^+, b_j^- \ge 0$
This is a standard QP problem that can be handled by standard QP solvers, as in the sketch below.
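For concreteness, here is a minimal sketch of this QP using the quadprog package. The split into $(b^+, b^-)$ follows the slide; the small ridge eps (added only because solve.QP requires a strictly positive definite matrix) and the choice of t are assumptions of this illustration:

library(quadprog)

lasso_qp <- function(x, y, t, eps = 1e-8) {
  p <- ncol(x)
  xtx <- crossprod(x)                       # X'X
  xty <- drop(crossprod(x, y))              # X'y
  # Variables z = (b+, b-); objective ||y - X(b+ - b-)||^2 up to a constant
  Dmat <- rbind(cbind(xtx, -xtx), cbind(-xtx, xtx)) + eps * diag(2 * p)
  dvec <- c(xty, -xty)
  # Constraints: -sum(z) >= -t (the L1 budget) and z >= 0
  Amat <- cbind(rep(-1, 2 * p), diag(2 * p))
  bvec <- c(-t, rep(0, 2 * p))
  z <- solve.QP(Dmat, dvec, Amat, bvec)$solution
  z[1:p] - z[(p + 1):(2 * p)]               # b = b+ - b-
}
# usage: b_hat <- lasso_qp(x, y, t = 1)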
7 A little more on LASSO 6/33
The Lagrangian for the LASSO optimization problem is:
$L(b, \lambda) = \sum_{i=1}^{n} \big(y_i - \sum_j b_j x_{ij}\big)^2 + \lambda \sum_{j=1}^{p} |b_j|$
This is equivalent to the likelihood function of a hierarchical model with a double exponential (DE) prior on the b's (remember the ADE used in quantile regression?):
$b_j \sim DE(1/\lambda), \quad Y \mid X, b \sim N(Xb, 1)$
The DE density function is $f(x; \tau) = \frac{1}{2\tau} \exp\left(-\frac{|x|}{\tau}\right)$.
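To see the equivalence, here is a short sketch of the negative log-posterior under this hierarchical model (constants dropped):
$-\log p(b \mid Y, X) = -\log p(Y \mid X, b) - \sum_{j=1}^{p} \log p(b_j) + \text{const} = \frac{1}{2}\sum_{i=1}^{n}\big(y_i - \sum_j b_j x_{ij}\big)^2 + \lambda \sum_{j=1}^{p} |b_j| + \text{const}$
So the posterior mode (MAP estimate) under the $DE(1/\lambda)$ prior minimizes the LASSO objective $L(b, \lambda)$, up to a factor of 2 on the residual sum of squares (i.e., a rescaling of $\lambda$).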
8 As a side note, ridge regression is equivalent to the hierarchical model with a Normal prior on b (verify it).
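A one-line verification sketch, assuming a prior $b_j \sim N(0, \sigma_b^2)$ (the prior variance $\sigma_b^2$ is introduced here only for the illustration):
$-\log p(b \mid Y, X) = \frac{1}{2}\sum_{i=1}^{n}\big(y_i - \sum_j b_j x_{ij}\big)^2 + \frac{1}{2\sigma_b^2}\sum_{j=1}^{p} b_j^2 + \text{const}$
which is the ridge objective with penalty parameter $\lambda = 1/\sigma_b^2$ (again up to a factor of 2).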
9 LASSO in R 8/33
The glmnet package has the function glmnet:

glmnet                 package:glmnet                 R Documentation

Fit a GLM with lasso or elasticnet regularization

Description:
     Fit a generalized linear model via penalized maximum likelihood. The
     regularization path is computed for the lasso or elasticnet penalty at
     a grid of values for the regularization parameter lambda. Can deal with
     all shapes of data, including very large sparse data matrices. Fits
     linear, logistic and multinomial, poisson, and Cox regression models.

Usage:
     glmnet(x, y, family=c("gaussian","binomial","poisson","multinomial","cox","mgaussian"),
            weights, offset=NULL, alpha=1, nlambda=100,
            lambda.min.ratio=ifelse(nobs<nvars, 0.01, 0.0001), lambda=NULL,
            standardize=TRUE, intercept=TRUE, thresh=1e-07,
            dfmax=nvars+1, pmax=min(dfmax*2+20, nvars), exclude,
            penalty.factor=rep(1, nvars), lower.limits=-Inf, upper.limits=Inf,
            maxit=100000, type.gaussian=ifelse(nvars<500,"covariance","naive"),
            type.logistic=c("Newton","modified.Newton"), standardize.response=FALSE,
            type.multinomial=c("ungrouped","grouped"))
10 LASSO in R example 9/33
> x = matrix(rnorm(100*10), 100, 10)
> b = c(-1, 2)
> y = rnorm(100) + x[,1:2] %*% b
> fit1 = glmnet(x, y)
>
> coef(fit1, s=0.05)
11 x 1 sparse Matrix of class "dgCMatrix"
                     1
(Intercept)  …
V1           …
V2           …
V3           …
V4           …
V5           .
V6           .
V7           …
V8           .
V9           .
V10          .
(Numeric values were not preserved in the transcription; "." marks coefficients shrunk exactly to 0.)
11 > coef(fit1, s=0.1)
11 x 1 sparse Matrix of class "dgCMatrix"
                     1
(Intercept)  …
V1           …
V2           …
V3           …
V4           .
V5           .
V6           .
V7           …
V8           .
V9           .
V10          .
> coef(fit1, s=0.5)
11 x 1 sparse Matrix of class "dgCMatrix"
                     1
(Intercept)  …
V1           …
V2           …
V3           .
V4           .
V5           .
V6           .
...
12 > plot(fit1, "lambda")
#### run cross validation
> cv = cv.glmnet(x, y)
> plot(cv)
[Plots: coefficient paths (Coefficients vs. log Lambda), and cross-validated Mean Squared Error vs. log(lambda).]
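Continuing the example, the cross-validated choice of lambda and the corresponding coefficients can be extracted directly from the cv.glmnet object:

> cv$lambda.min                    # lambda giving the smallest mean cross-validated error
> cv$lambda.1se                    # largest lambda within one standard error of the minimum
> coef(cv, s = "lambda.min")       # coefficients at the selected lambda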
13 Support Vector Machine (SVM) 12/33
Figures for the slides are obtained from Hastie et al., The Elements of Statistical Learning.
Problem setting: given training data pairs $(x_1, y_1), \ldots, (x_N, y_N)$. The $x_i$'s are p-vector predictors. The $y_i \in \{-1, 1\}$ are outcomes.
Our goal: to predict y based on x (find a classifier). Such a classifier is defined as a function of x, G(x). G is estimated based on the training data (x, y) pairs. Once G is obtained, it can be used for future predictions.
There are many ways to construct G(x), and the Support Vector Machine (SVM) is one of them. We'll first consider the simple case: G(x) is based on a linear function of x. It is often called the linear SVM or the support vector classifier.
14 Simple case: perfectly separable case 13/33
First define a linear hyperplane by $\{x : f(x) = x^T b + b_0 = 0\}$. It is required that b is a unit vector with $\|b\| = 1$ for identifiability. A classification rule can be defined as $G(x) = \text{sign}[x^T b + b_0]$. The problem is to estimate the b's.
Consider a simple case where the two groups are perfectly separated. We want to find a border to separate the two groups. There are infinitely many borders that can perfectly separate the two groups. Which one is optimal?
Conceptually, the optimal border should separate the two classes with the largest margin. We define the optimal border to be the one satisfying: (1) the distances between the closest points and the border are the same in both groups, and we denote this distance by M; and (2) M is maximized. M is called the margin.
15 Problem setup 14/33
The problem of finding the best border can be framed as the following optimization problem:
$\max_{b, b_0: \|b\| = 1} M \quad \text{s.t. } y_i(x_i^T b + b_0) \ge M, \; i = 1, \ldots, N$
This is not a typical LP/QP problem, so we do some transformations to make it look more familiar. Divide both sides of the constraint by M, and define $\beta = b/M$, $\beta_0 = b_0/M$; the constraints become: $y_i(x_i^T \beta + \beta_0) \ge 1$.
This means that we rescale the coefficients of the border hyperplane, so that the margin lines are of the form $x^T \beta + \beta_0 - 1 = 0$ (upper margin) and $x^T \beta + \beta_0 + 1 = 0$ (lower margin).
16 Now we have $\|\beta\| = \|b\|/M = 1/M$. So the objective function (maximizing M) is equivalent to minimizing $\|\beta\|$. After this transformation, the optimization problem can be expressed in a simpler, more familiar form:
$\min_{\beta, \beta_0} \|\beta\| \quad \text{s.t. } y_i(x_i^T \beta + \beta_0) \ge 1, \; i = 1, \ldots, N$
This is a typical quadratic programming problem (equivalently, one can minimize $\frac{1}{2}\|\beta\|^2$, the form used on the following slides).
17 Illustration of the optimal border (solid line) with margins (dashed lines).
[Figure from Hastie et al.: support vector classifiers. The left panel shows the separable case; the decision boundary is the solid line $x^T\beta + \beta_0 = 0$, and the broken lines bound the maximal margin of width $2M = 2/\|\beta\|$. The right panel shows the non-separable (overlap) case; the points labeled $\xi_j^*$ are on the wrong side of their margin by an amount $\xi_j^* = M\xi_j$, and points on the correct side have $\xi_j^* = 0$. The margin is maximized subject to a total budget $\sum \xi_i \le$ constant.]
18 Non-separable case 17/33
When the two classes are not perfectly separable, we still want to find a border with two margins, but now there will be points on the wrong side. We introduce slack variables to account for those points.
[Same figure as above: the separable case on the left, the non-separable (overlap) case on the right.]
Define slack variables $\{\xi_1, \ldots, \xi_N\}$, where $\xi_i \ge 0$:
$\xi_i = 0$ when the point is on the correct side of the margin.
$\xi_i > 1$ when the point passes the border to the wrong side.
$0 < \xi_i < 1$ when the point is inside the margin but still on the correct side.
19 Now the constraints in the original optimization problem are modified to:
$y_i(x_i^T \beta + \beta_0) \ge 1 - \xi_i, \; i = 1, \ldots, N$
$\xi_i$ can be interpreted as the proportional amount by which the prediction is on the wrong side of its margin. Another constraint $\sum_i \xi_i \le C$ is added to bound the total amount of slack, and hence the number of misclassifications. Together, the optimization problem for this case is written as:
$\min_{\beta, \beta_0} \frac{1}{2}\|\beta\|^2 \quad \text{s.t. } y_i(x_i^T \beta + \beta_0) \ge 1 - \xi_i, \quad \sum_i \xi_i \le C, \quad \xi_i \ge 0$
Again this is a quadratic programming problem. What are the unknowns?
20 Computation 19/33
The primal Lagrangian is:
$L_P = \frac{1}{2}\|\beta\|^2 + \gamma \sum_i \xi_i - \sum_i \alpha_i [y_i(x_i^T \beta + \beta_0) - (1 - \xi_i)] - \sum_i \mu_i \xi_i$
Take derivatives with respect to $\beta$, $\beta_0$, $\xi_i$ and set them to zero to get (the stationarity conditions):
$\beta = \sum_i \alpha_i y_i x_i$
$0 = \sum_i \alpha_i y_i$
$\alpha_i = \gamma - \mu_i, \; \forall i$
Plug these back into the primal Lagrangian to get the following dual objective function (verify):
$L_D = \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_{i'} \alpha_i \alpha_{i'} y_i y_{i'} x_i^T x_{i'}$
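A short verification sketch: substituting the stationarity conditions ($\beta = \sum_i \alpha_i y_i x_i$, $\sum_i \alpha_i y_i = 0$, $\gamma - \alpha_i - \mu_i = 0$) into $L_P$ gives
$L_P = \frac{1}{2}\|\beta\|^2 + \sum_i \xi_i(\gamma - \alpha_i - \mu_i) - \sum_i \alpha_i y_i x_i^T\beta - \beta_0 \sum_i \alpha_i y_i + \sum_i \alpha_i = \sum_i \alpha_i - \frac{1}{2}\beta^T\beta = L_D,$
since $\sum_i \alpha_i y_i x_i^T\beta = \beta^T\beta = \sum_i\sum_{i'} \alpha_i \alpha_{i'} y_i y_{i'} x_i^T x_{i'}$.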
21 The $L_D$ needs to be maximized subject to the constraints:
$\sum_i \alpha_i y_i = 0$
$0 \le \alpha_i \le \gamma$
The KKT conditions for the problem (in addition to the stationarity conditions) include the following complementary slackness and primal/dual feasibility conditions:
$\alpha_i [y_i(x_i^T \beta + \beta_0) - (1 - \xi_i)] = 0$
$\mu_i \xi_i = 0$
$y_i(x_i^T \beta + \beta_0) - (1 - \xi_i) \ge 0$
$\alpha_i, \mu_i, \xi_i \ge 0$
The QP problem can be solved using the interior point method based on these.
22 Solve for $\hat{\beta}_0$ 21/33
With the $\hat{\alpha}_i$ and $\hat{\beta}$ given, we still need $\hat{\beta}_0$ to construct the decision boundary. One of the complementary slackness conditions is:
$\alpha_i [y_i(x_i^T \beta + \beta_0) - (1 - \xi_i)] = 0$
Any point with $\hat{\alpha}_i > 0$ and $\hat{\xi}_i = 0$ (a point on the margin) can be used to solve for $\hat{\beta}_0$. In practice we often use the average over those points to get a stable result for $\hat{\beta}_0$.
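Putting slides 20-22 together, here is a minimal R sketch that solves the linear SVM dual with the quadprog package and then recovers $\hat{\beta}$ and $\hat{\beta}_0$. The small ridge eps (added because solve.QP requires a strictly positive definite matrix), the thresholds used to pick the margin points, and the cost value gamma are assumptions of this illustration, not part of the slides:

library(quadprog)

svm_dual <- function(x, y, gamma = 1, eps = 1e-8) {
  # y must be a vector of +1/-1 labels; x is the N x p predictor matrix
  N <- nrow(x)
  Q <- outer(y, y) * (x %*% t(x))              # Q[i,i'] = y_i y_i' x_i^T x_i'
  Dmat <- Q + eps * diag(N)                    # ridge so solve.QP sees a positive definite matrix
  dvec <- rep(1, N)                            # maximize sum(alpha) - 1/2 alpha' Q alpha
  # Constraints: sum(alpha * y) = 0 (equality), alpha >= 0, alpha <= gamma
  Amat <- cbind(y, diag(N), -diag(N))
  bvec <- c(0, rep(0, N), rep(-gamma, N))
  alpha <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)$solution
  beta  <- colSums(alpha * y * x)              # beta = sum_i alpha_i y_i x_i
  sv    <- which(alpha > 1e-5 & alpha < gamma - 1e-5)        # points on the margin
  beta0 <- mean(y[sv] - x[sv, , drop = FALSE] %*% beta)      # average over margin points
  list(alpha = alpha, beta = beta, beta0 = beta0)
}

A prediction for a new point xnew is then sign(sum(xnew * beta) + beta0), and the points with alpha > 0 are the support vectors discussed on the next slides.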
23 The support vectors 22/33
At the optimal solution, $\beta$ is of the form: $\hat{\beta} = \sum_i \hat{\alpha}_i y_i x_i$.
This means $\hat{\beta}$ is a linear combination of the $y_i x_i$, and only depends on those data points with $\hat{\alpha}_i \neq 0$. These data points are called support vectors.
According to the complementary slackness in the KKT conditions, at the optimal point we have:
$\alpha_i [y_i(x_i^T \beta + \beta_0) - (1 - \xi_i)] = 0, \; \forall i$
which means $\alpha_i$ can be non-zero only when $y_i(x_i^T \beta + \beta_0) - (1 - \xi_i) = 0$. What does this result tell us?
24 For points with non-zero $\alpha_i$:
The points with $\xi_i = 0$ have $y_i(x_i^T \beta + \beta_0) = 1$, i.e., these points are on the margin lines.
The other points, with $y_i(x_i^T \beta + \beta_0) = 1 - \xi_i$, are on the wrong side of the margins.
So only the points on the margin or on the wrong side of the margin are informative for the separating hyperplane. These points are called the support vectors, because they provide support for the decision boundary.
This makes sense, because the points that can be correctly separated and are far away from the margin (the easy points) don't tell us anything about the classification rule (the hyperplane).
25 Support Vector Machine 24/33
We have discussed the support vector classifier, which uses a hyperplane to separate two groups. The Support Vector Machine enlarges the feature space to make the procedure more flexible.
To be specific, we transform the input data $x_i$ using some basis functions $h_m(x), m = 1, \ldots, M$. The input data become $h(x_i) = (h_1(x_i), \ldots, h_M(x_i))$. This basically transforms the data into another space, which could be nonlinear in the original space.
We then find the SV classifier in the transformed space using the same procedure, i.e., find the optimal $\hat{f}(x) = h(x)^T \hat{\beta} + \hat{\beta}_0$. The decision is made by: $\hat{G}(x) = \text{sign}(\hat{f}(x))$.
Note: the classifier is linear in the transformed space, but nonlinear in the original one.
26 Choose basis functions? 25/33
Now the problem becomes the choice of basis functions, or whether we even need to choose basis functions explicitly. Recall that in the linear space, $\beta$ is of the form: $\beta = \sum_i \alpha_i y_i x_i$. In the transformed space, it becomes: $\beta = \sum_i \alpha_i y_i h(x_i)$.
So the decision boundary is:
$f(x) = h(x)^T \sum_i \alpha_i y_i h(x_i) + \beta_0 = \sum_i \alpha_i y_i \langle h(x), h(x_i) \rangle + \beta_0$.
27 Moreover, the dual objective function in the transformed space becomes:
$L_D = \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_{i'} \alpha_i \alpha_{i'} y_i y_{i'} \langle h(x_i), h(x_{i'}) \rangle$
What does this tell us? Both the objective function and the decision boundary in the transformed space involve only the inner products of the transformed data, not the transformation itself! So the basis functions themselves are not important, as long as we know $\langle h(x), h(x_i) \rangle$.
28 Kernel tricks 27/33
Define the kernel function $K : \mathbb{R}^p \times \mathbb{R}^p \to \mathbb{R}$ to represent the inner product in the transformed space: $K(x, x') = \langle h(x), h(x') \rangle$. K needs to be symmetric and positive semi-definite.
With the kernel trick, the decision boundary becomes:
$f(x) = \sum_i \alpha_i y_i K(x, x_i) + \beta_0$.
Some popular choices of kernel functions are:
Polynomial of degree d: $K(x, x') = (a_0 + a_1 \langle x, x' \rangle)^d$.
Radial basis function (RBF): $K(x, x') = \exp\{-\|x - x'\|^2 / c\}$.
Sigmoid: $K(x, x') = \tanh(a_0 + a_1 \langle x, x' \rangle)$.
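A minimal R sketch of two of these kernels and of the resulting decision rule $f(x) = \sum_i \alpha_i y_i K(x, x_i) + \beta_0$; the function names and parameter defaults are illustrative assumptions (for fitting, the Gram matrix x %*% t(x) in the dual sketch above would simply be replaced by the kernel matrix):

# Two of the kernels listed above, evaluated for a single pair of points
rbf_kernel  <- function(x, z, c = 1) exp(-sum((x - z)^2) / c)
poly_kernel <- function(x, z, a0 = 1, a1 = 1, d = 3) (a0 + a1 * sum(x * z))^d

# Kernel decision function f(xnew), given alpha, y, beta0 from the dual solution
svm_decision <- function(xnew, x, y, alpha, beta0, kernel = rbf_kernel, ...) {
  sum(sapply(seq_len(nrow(x)), function(i) alpha[i] * y[i] * kernel(xnew, x[i, ], ...))) + beta0
}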
29 Computation of SVM 28/33
With kernels defined, the Lagrangian dual function is:
$L_D = \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_{i'} \alpha_i \alpha_{i'} y_i y_{i'} K(x_i, x_{i'})$
Maximize $L_D$, with the $\alpha_i$'s as the unknowns, subject to the same constraints:
$\sum_i \alpha_i y_i = 0$
$0 \le \alpha_i \le \gamma$
This is a standard QP problem that can be solved easily.
30 The role of γ 29/33
γ controls the smoothness of the boundary. Remember that γ was introduced in the primal problem to control the total misclassification, i.e., it is the dual variable for the original constraint $\sum_i \xi_i \le C$.
We can always project the original data to a higher-dimensional space so that they can be better separated by a linear classifier (in the transformed space), but:
Large γ: fewer errors in the transformed space, a more wiggly boundary in the original space.
Small γ: more errors in the transformed space, a smoother boundary in the original space.
γ is a tuning parameter, often obtained from cross-validation.
31 A little more about the decision rule 30/33
Recall that the decision boundary only depends on the support vectors, i.e., the points with $\alpha_i \neq 0$. So $f(x)$ can be written as:
$f(x) = \sum_{i \in S} \alpha_i y_i K(x, x_i) + \beta_0$, where S is the set of support vectors.
The kernel $K(x, x')$ can be seen as a similarity measure between $x$ and $x'$. So to classify a point $x$, the decision is made essentially by a weighted sum of the similarities of $x$ to all the support vectors.
32 An example 31/33
SVM using a degree-4 polynomial kernel; decision boundary projected into the 2-D space.
[Figures from Hastie et al.: SVM with a degree-4 polynomial kernel in feature space, and SVM with a radial kernel in feature space, each reported with training error, test error, and Bayes error.]
33 SVM in R 32/33
There are several R packages that include SVM functions: e1071, kernlab, klaR, svmpath, etc.
The table below (from the Journal of Statistical Software paper "Support Vector Machines in R", available at the class website) summarizes the R SVM functions. For more details please refer to that paper.

                  ksvm()                   svm()                  svmlight()            svmpath()
                  (kernlab)                (e1071)                (klaR)                (svmpath)
Formulations      C-SVC, nu-SVC, C-BSVC,   C-SVC, nu-SVC,         C-SVC, eps-SVR        binary C-SVC
                  spoc-SVC, one-SVC,       one-SVC, eps-SVR,
                  eps-SVR, nu-SVR,         nu-SVR
                  eps-BSVR
Kernels           Gaussian, polynomial,    Gaussian, polynomial,  Gaussian, polynomial, Gaussian,
                  linear, sigmoid,         linear, sigmoid        linear, sigmoid       polynomial
                  Laplace, Bessel,
                  Anova, Spline
Optimizer         SMO, TRON                SMO                    chunking              NA
Model Selection   hyperparameter           grid-search            NA                    NA
                  estimation for           function
                  Gaussian kernels
Data              formula, matrix          formula, matrix        formula, matrix       matrix
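A small usage sketch with e1071 (the simulated data and the cost/gamma values are illustrative assumptions; note the naming clash: the slides use gamma for the cost parameter, while e1071's cost argument plays that role and its gamma argument is the RBF kernel scale):

library(e1071)

set.seed(1)
x <- matrix(rnorm(200 * 2), 200, 2)
y <- factor(ifelse(x[, 1]^2 + x[, 2]^2 > 1.5, 1, -1))   # nonlinear (circular) class boundary

fit <- svm(x, y, kernel = "radial", cost = 1, gamma = 0.5)
summary(fit)                       # reports the number of support vectors, etc.
table(predict(fit, x), y)          # training confusion matrix

# Tune cost and the kernel parameter by cross-validation
tuned <- tune.svm(x, y, cost = c(0.1, 1, 10), gamma = c(0.1, 0.5, 1))
tuned$best.parameters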
34 Summary of SVM 33/33
Strengths of SVM:
flexibility.
scales well for high-dimensional data.
can control the complexity and error trade-off explicitly.
as long as a kernel can be defined, non-traditional (non-vector) data, such as strings and trees, can be used as input.
Weakness: how to choose a good kernel (a low-degree polynomial or a radial basis function can be a good start).