CN700 Additive Models and Trees Chapter 9: Hastie et al. (2001)
1 CN700 Additive Models and Trees
Chapter 9: Hastie et al. (2001)
Madhusudana Shashanka
Department of Cognitive and Neural Systems, Boston University
CN700 - Additive Models and Trees March 02, 2004 p.1/34

2 Overview
- Generalized additive models
- Tree-based models
- PRIM: bump hunting
- MARS

3 Generalized Additive Models
- Techniques that use predefined basis functions achieve nonlinearity.
- Another approach: generalized additive models, which are more automatic and flexible.
- In the regression setting, the model can be expressed as
  $$E(Y \mid X_1, X_2, \ldots, X_p) = \alpha + f_1(X_1) + f_2(X_2) + \cdots + f_p(X_p).$$
- The $f_j$ are unspecified smooth (nonparametric) functions.
- Fit each function using a scatterplot smoother (cubic smoothing spline or kernel smoother).
- Simultaneously estimate all $p$ functions.

4 Examples
- In general, the conditional mean $\mu(X)$ of a response $Y$ is related to an additive function of the predictors via a link function $g$:
  $$g[\mu(X)] = \alpha + f_1(X_1) + \cdots + f_p(X_p).$$
- $g(\mu) = \mu$ is the identity link, used for linear and additive models for Gaussian response data.
- $g(\mu) = \operatorname{logit}(\mu)$ or $g(\mu) = \operatorname{probit}(\mu)$ for binomial probabilities. The probit is the inverse Gaussian cumulative distribution function, $\operatorname{probit}(\mu) = \Phi^{-1}(\mu)$.
- $g(\mu) = \log(\mu)$ for log-linear or log-additive models for Poisson count data.
- More flexibility:
  - Mix linear and other parametric forms with the nonlinear terms.
  - Nonlinear components in two or more variables.
  - Separate curves in $X_j$ for each level of the factor $X_k$.

5 Examples
- $g(\mu) = X^T\beta + \alpha_k + f(Z)$: a semiparametric model, where $\alpha_k$ is the effect for the $k$th level of a qualitative input $V$.
- $g(\mu) = f(X) + g_k(Z)$, where $g_k(Z) = g(V, Z)$ is an interaction term for the effect of $V$ and $Z$.
- $g(\mu) = f(X) + g(Z, W)$, where $g$ is nonparametric in two features.
- Example where additive models apply: additive decomposition of time series, $Y_t = S_t + T_t + \epsilon_t$, where $S_t$ is a seasonal component, $T_t$ is a trend and $\epsilon_t$ is an error term.
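
The link functions named in the Examples slides can be written out directly. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of the slides.

```python
import math
from statistics import NormalDist


def identity_link(mu):
    """Identity link: used with Gaussian response data."""
    return mu


def logit_link(mu):
    """Logit link for a binomial probability mu in (0, 1)."""
    return math.log(mu / (1.0 - mu))


def probit_link(mu):
    """Probit link: inverse of the standard Gaussian CDF."""
    return NormalDist().inv_cdf(mu)


def log_link(mu):
    """Log link for a positive Poisson mean."""
    return math.log(mu)


def inverse_logit(eta):
    """Map an additive predictor eta back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))
```

Each link maps the conditional mean onto the scale where the additive predictor $\alpha + \sum_j f_j(X_j)$ lives; `inverse_logit` goes the other way for the logit case.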

6 Fitting additive models
- The model: $Y = \alpha + \sum_{j=1}^{p} f_j(X_j) + \epsilon$.
- Criterion: penalized sum of squares
  $$\mathrm{PRSS}(\alpha, f_1, \ldots, f_p) = \sum_{i=1}^{N} \Big\{ y_i - \alpha - \sum_{j=1}^{p} f_j(x_{ij}) \Big\}^2 + \sum_{j=1}^{p} \lambda_j \int f_j''(t_j)^2 \, dt_j.$$
- $\lambda_j \ge 0$ are tuning parameters.
- The minimizer is an additive cubic spline model: each $f_j$ is a cubic spline in $X_j$, with knots at each unique $x_{ij}$, $i = 1, \ldots, N$.
- However, the solution is not unique without more restrictions: assume $\sum_{i=1}^{N} f_j(x_{ij}) = 0$ for all $j$, and that the matrix of input values is nonsingular.

7 Backfitting Algorithm
1. Initialize: $\hat\alpha = \frac{1}{N}\sum_{i=1}^{N} y_i$, $\hat f_j \equiv 0$ for all $i, j$.
2. Cycle: $j = 1, 2, \ldots, p, \ldots, 1, 2, \ldots, p, \ldots$
   $$\hat f_j \leftarrow S_j\Big[\Big\{ y_i - \hat\alpha - \sum_{k \ne j} \hat f_k(x_{ik}) \Big\}_{1}^{N}\Big],$$
   $$\hat f_j \leftarrow \hat f_j - \frac{1}{N}\sum_{i=1}^{N} \hat f_j(x_{ij}),$$
   until the functions $\hat f_j$ change less than a specified threshold.
- The algorithm is analogous to multiple regression for linear models.
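
The two backfitting steps can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation behind the slides: the smoother $S_j$ is replaced by a simple running mean over the nearest neighbours in sorted order (a stand-in for a cubic smoothing spline), and the names `running_mean_smoother` and `backfit` are hypothetical.

```python
def running_mean_smoother(x, r, k=5):
    """Smooth the values r against x with a k-point running mean.

    Assumption: this stands in for the smoother S_j; the slides use
    cubic smoothing splines or kernel smoothers instead.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    fitted = [0.0] * len(x)
    half = k // 2
    for pos, i in enumerate(order):
        lo = max(0, pos - half)
        hi = min(len(x), pos + half + 1)
        window = [r[order[q]] for q in range(lo, hi)]
        fitted[i] = sum(window) / len(window)
    return fitted


def backfit(X, y, smoother=running_mean_smoother, tol=1e-6, max_iter=100):
    """Return (alpha, f) where f[j][i] estimates f_j(x_ij)."""
    n, p = len(y), len(X[0])
    alpha = sum(y) / n                      # step 1: alpha-hat = mean of y
    f = [[0.0] * n for _ in range(p)]       # step 1: every f_j starts at 0
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(p):                  # step 2: cycle over predictors
            # Partial residuals: remove alpha and all other fitted functions.
            partial = [
                y[i] - alpha - sum(f[k][i] for k in range(p) if k != j)
                for i in range(n)
            ]
            new_fj = smoother([row[j] for row in X], partial)
            mean_fj = sum(new_fj) / n       # recentre so f_j averages to 0
            new_fj = [v - mean_fj for v in new_fj]
            change = max(abs(a - b) for a, b in zip(new_fj, f[j]))
            max_change = max(max_change, change)
            f[j] = new_fj
        if max_change < tol:                # stop when the f_j stabilise
            break
    return alpha, f
```

Because the smoother is passed in as an argument, a different fitting method can be swapped in for each input, which is the modularity the slides describe.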

8 Backfitting Algorithm
Other fitting methods can be accommodated by specifying appropriate smoothing operators $S_j$:
- Univariate regression smoothers: local polynomial regression and kernel methods.
- Linear regression operators: polynomial fits, piecewise constant fits, parametric spline fits, series and Fourier fits.
- Others: surface smoothers for second (or higher) order interactions, and periodic smoothers for seasonal effects.

9 Example: Additive Logistic Regression
$$\log \frac{\Pr(Y = 1 \mid X)}{\Pr(Y = 0 \mid X)} = \alpha + f_1(X_1) + \cdots + f_p(X_p).$$
Local Scoring Algorithm
1. Compute starting values: $\hat\alpha = \log[\bar y / (1 - \bar y)]$, where $\bar y = \operatorname{ave}(y_i)$. Set $\hat f_j \equiv 0$ for all $j$.
2. Define $\hat\eta_i = \hat\alpha + \sum_j \hat f_j(x_{ij})$ and $\hat p_i = 1/[1 + \exp(-\hat\eta_i)]$. Iterate:
   - Construct the working target $z_i = \hat\eta_i + \dfrac{y_i - \hat p_i}{\hat p_i (1 - \hat p_i)}$.
   - Construct weights $w_i = \hat p_i (1 - \hat p_i)$.
   - Fit an additive model to the targets $z_i$ with weights $w_i$, using a weighted backfitting algorithm. This gives new estimates $\hat\alpha$, $\hat f_j$ for all $j$.
3. Continue step 2 until the change is less than a specified threshold.
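
The quantities built inside one pass of the local scoring algorithm are simple to compute. The sketch below covers only the starting value and the working targets and weights; the weighted backfitting fit itself is omitted, and the function names are illustrative.

```python
import math


def starting_alpha(y):
    """Step 1: alpha-hat = log(ybar / (1 - ybar)) for 0/1 responses y."""
    ybar = sum(y) / len(y)
    return math.log(ybar / (1.0 - ybar))


def working_response(eta, y):
    """Step 2: return working targets z_i and weights w_i.

    eta holds the current additive predictors eta-hat_i,
    y the observed 0/1 responses.
    """
    z, w = [], []
    for eta_i, y_i in zip(eta, y):
        p_i = 1.0 / (1.0 + math.exp(-eta_i))   # fitted probability
        w_i = p_i * (1.0 - p_i)                # weight
        z.append(eta_i + (y_i - p_i) / w_i)    # working target
        w.append(w_i)
    return z, w
```

For example, with $\hat\eta_i = 0$ the fitted probability is 0.5, the weight is 0.25, and an observation with $y_i = 1$ gets the working target $z_i = 2$.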

10 Summary: Additive Models
- Flexible, yet interpretable.
- Familiar tools for modelling and inference in linear models are also available here.
- Backfitting is simple and modular; one can choose a fitting method appropriate for each input variable.
- Limitations for large data-mining applications: backfitting fits all predictors, which is not feasible or desirable with large data.

11 Overview
- Generalized additive models
- Tree-based models
- PRIM: bump hunting
- MARS

12 Introduction
- Partition the feature space into a set of rectangles.
- Fit a simple model (like a constant) in each one.
- Key advantage: interpretability.
- Prediction: $\hat f(X) = \sum_{m=1}^{5} c_m I\{(X_1, X_2) \in R_m\}$.

13 Regression Trees
- Data: $(x_i, y_i)$ for $i = 1, \ldots, N$, with $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$.
- Aim: an algorithm to automatically decide the splitting variables and split points, and the tree topology.
- Model: $M$ regions $R_1, R_2, \ldots, R_M$ and a constant response $c_m$ in each region,
  $$f(x) = \sum_{m=1}^{M} c_m I(x \in R_m).$$
- Criterion: minimization of the sum of squares $\sum (y_i - f(x_i))^2$.
- Best $\hat c_m$: the average of $y_i$ in $R_m$, i.e. $\hat c_m = \operatorname{ave}(y_i \mid x_i \in R_m)$.
- Best binary partition: computationally infeasible. How to proceed? Use a greedy algorithm.

14 Best Split
- Consider a splitting variable $j$ and a split point $s$.
- Define the pair of half-planes $R_1(j, s) = \{X \mid X_j \le s\}$ and $R_2(j, s) = \{X \mid X_j > s\}$.
- Find $j$ and $s$ that solve
  $$\min_{j,\,s} \Big[ \min_{c_1} \sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2} \sum_{x_i \in R_2(j,s)} (y_i - c_2)^2 \Big].$$
- The inner minimization is solved by $\hat c_1 = \operatorname{ave}(y_i \mid x_i \in R_1(j, s))$ and $\hat c_2 = \operatorname{ave}(y_i \mid x_i \in R_2(j, s))$.
- Find the best pair $(j, s)$ by scanning through all split points for each splitting variable, and then scanning through all variables.
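
The exhaustive scan over $(j, s)$ can be sketched as follows. This is a deliberately naive version that recomputes both region means for every candidate split; a practical implementation would sort each variable once and update running sums. The names `sse` and `best_split` are illustrative.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    c_hat = sum(values) / len(values)      # inner minimiser: the average
    return sum((v - c_hat) ** 2 for v in values)


def best_split(X, y):
    """Return (j, s, rss) minimising the two-region sum of squares."""
    n, p = len(y), len(X[0])
    best = None
    for j in range(p):                     # scan all splitting variables
        values = sorted(set(row[j] for row in X))
        for s in values[:-1]:              # largest value would empty R2
            left = [y[i] for i in range(n) if X[i][j] <= s]    # R1(j, s)
            right = [y[i] for i in range(n) if X[i][j] > s]    # R2(j, s)
            rss = sse(left) + sse(right)
            if best is None or rss < best[2]:
                best = (j, s, rss)
    return best
```

Applying `best_split` recursively to the two resulting regions gives the greedy tree-growing procedure.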

15 Tree Size
- Adaptively chosen from the data.
- Grow a large tree $T_0$ until some minimum node size is reached.
- Prune this tree using cost-complexity pruning.
- Cost-complexity criterion:
  $$C_\alpha(T) = \sum_{m=1}^{|T|} N_m Q_m(T) + \alpha |T|,$$
  where $|T|$ is the number of terminal nodes of $T$,
  $$Q_m(T) = \frac{1}{N_m} \sum_{x_i \in R_m} (y_i - \hat c_m)^2 \quad\text{and}\quad \hat c_m = \frac{1}{N_m} \sum_{x_i \in R_m} y_i.$$
- Idea: for each $\alpha$, find the subtree $T_\alpha \subseteq T_0$ that minimizes $C_\alpha(T)$.
- The tuning parameter $\alpha \ge 0$ governs the tradeoff between tree size and goodness of fit.

16 Tree Size
Tuning parameter:
- For each $\alpha$, there is a unique smallest subtree $T_\alpha$ that minimizes $C_\alpha(T)$.
- Use weakest link pruning to find $T_\alpha$: successively collapse the internal node that produces the smallest per-node increase in $\sum_m N_m Q_m(T)$, and continue until the single-node tree is reached. This sequence of subtrees must contain $T_\alpha$.
- Estimation of $\alpha$ is by cross-validation: choose $\hat\alpha$ to minimize the cross-validated sum of squares.
- The final tree is $T_{\hat\alpha}$.
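
To see how $\alpha$ trades tree size against fit, the cost-complexity criterion can be evaluated for two candidate subtrees. In this sketch a tree is represented only by the responses falling in each terminal node, which is an assumption made for brevity; `cost_complexity` is an illustrative name.

```python
def cost_complexity(leaves, alpha):
    """C_alpha(T) = sum_m N_m * Q_m(T) + alpha * |T|.

    leaves: one list of responses y_i per terminal node of T.
    """
    total = 0.0
    for ys in leaves:
        n_m = len(ys)
        c_hat = sum(ys) / n_m                             # node mean
        q_m = sum((v - c_hat) ** 2 for v in ys) / n_m     # node impurity
        total += n_m * q_m
    return total + alpha * len(leaves)


# Two candidate subtrees of the same data: two leaves versus the root only.
two_leaves = [[1, 1], [5, 5]]
root_only = [[1, 1, 5, 5]]
```

With a small penalty ($\alpha = 1$) the two-leaf tree has the lower criterion; with a large penalty ($\alpha = 20$) the single-node tree wins.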

17 Classification Trees
- $K$ classes. The proportion of class $k$ observations in node $m$ is
  $$\hat p_{mk} = \frac{1}{N_m} \sum_{x_i \in R_m} I(y_i = k).$$
- Observations in node $m$ are classified to class $k(m) = \arg\max_k \hat p_{mk}$.
- Different measures $Q_m(T)$ of node impurity:
  - Misclassification error: $\frac{1}{N_m} \sum_{i \in R_m} I(y_i \ne k(m)) = 1 - \hat p_{mk(m)}$.
  - Gini index: $\sum_{k \ne k'} \hat p_{mk} \hat p_{mk'} = \sum_{k=1}^{K} \hat p_{mk} (1 - \hat p_{mk})$.
  - Cross-entropy (deviance): $-\sum_{k=1}^{K} \hat p_{mk} \log \hat p_{mk}$.
- Cross-entropy and the Gini index are differentiable, and hence more amenable to numerical optimization.

18 Classification Trees
- When growing the tree, either the Gini index or cross-entropy should be used.
- To guide cost-complexity pruning, the misclassification rate is typically used.
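
The three node impurity measures are short enough to write out. The sketch below takes the vector of class proportions $\hat p_{mk}$ for a single node; the function names are illustrative.

```python
import math


def node_proportions(labels, classes):
    """p-hat_mk: proportion of each class among the labels in one node."""
    return [labels.count(k) / len(labels) for k in classes]


def misclassification_error(p):
    """1 - p-hat_{m k(m)}, where k(m) is the majority class."""
    return 1.0 - max(p)


def gini_index(p):
    """sum_k p_k * (1 - p_k)."""
    return sum(pk * (1.0 - pk) for pk in p)


def cross_entropy(p):
    """-sum_k p_k * log(p_k), with 0 * log(0) taken as 0."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0.0)
```

All three are zero for a pure node and largest when the classes are evenly mixed; for two classes at proportions (0.5, 0.5) they equal 0.5, 0.5 and $\log 2$ respectively.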

19 Other issues and modifications
- Categorical predictors: given a predictor with $q$ possible unordered values and a binary outcome, order the predictor classes according to the proportion falling in outcome class 1, then split the predictor as if it were ordered.
- Loss matrix: in the multi-class case, modify the Gini index to $\sum_{k \ne k'} L_{kk'} \hat p_{mk} \hat p_{mk'}$. For two classes, weight the observations in class $k$ by $L_{kk'}$.
- Missing predictor values: for categorical predictors, make a new "missing" category. The general approach is to construct surrogate variables.
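
The ordering trick for categorical predictors can be sketched as below. Sorting the $q$ categories by their proportion of class 1 leaves only $q - 1$ ordered split points to examine, instead of $2^{q-1} - 1$ possible two-group partitions. The function name is illustrative.

```python
def order_categories(x, y):
    """Order category values by the proportion of outcome class 1.

    x: categorical predictor values, y: matching 0/1 outcomes.
    """
    counts = {}
    for cat, label in zip(x, y):
        n, ones = counts.get(cat, (0, 0))
        counts[cat] = (n + 1, ones + label)
    # Sort categories by their class-1 proportion, lowest first.
    return sorted(counts, key=lambda c: counts[c][1] / counts[c][0])


x = ["a", "b", "c", "a", "b", "c", "c"]
y = [1, 0, 1, 1, 0, 0, 0]
ordered = order_categories(x, y)
# Candidate splits follow the ordering: first group | remaining group.
candidate_splits = [
    (ordered[:i], ordered[i:]) for i in range(1, len(ordered))
]
```

Here category `b` has proportion 0, `c` has 1/3 and `a` has 1, so the only splits to consider are `{b} | {c, a}` and `{b, c} | {a}`.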

20 Disadvantages
- Instability and high variance, caused by the hierarchical nature of the process.
- Lack of smoothness: can degrade performance in the regression setting.
- Difficulty with additive structures. Consider $Y = c_1 I(X_1 < t_1) + c_2 I(X_2 < t_2) + \epsilon$. The first split would be on $X_1$ near $t_1$; the next split at both nodes should then be on $X_2$ at $t_2$.

21 Overview
- Generalized additive models
- Tree-based models
- PRIM: bump hunting
- MARS

22 Introduction
- PRIM: patient rule induction method.
- Finds boxes in feature space where the response average is high.
- Looks for maxima in the target function: bump hunting.
- Box definitions are not described by a binary tree.
- Characterized by peeling and pasting.

23 Algorithm
1. Start with a maximal box containing all training data.
2. Shrink the box by compressing one face, so as to peel off a proportion $\alpha$ of observations, such that the peeling produces the highest response mean in the remaining box.
3. Repeat step 2 until some minimal number of observations remain in the box.
4. Expand the box along any face, as long as the resulting box mean increases.
5. Steps 1-4 give a sequence of boxes, with different numbers of observations in each box. Use cross-validation to choose a member of the sequence and call the box $B_1$.
6. Remove the data in box $B_1$ from the dataset and repeat steps 2-5 to obtain a second box, and continue to get as many boxes as desired.
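
The peeling phase (steps 1 to 3) can be sketched as follows. This is a simplified illustration: a box is tracked as the list of observation indices it contains rather than as explicit face coordinates, ties in the predictor values are ignored, and the pasting and cross-validation steps are left out. The names `peel_once` and `peel_sequence` are hypothetical.

```python
def peel_once(X, y, inside, alpha):
    """Peel a proportion alpha off one face of the current box.

    Returns the indices kept and their response mean, choosing the face
    whose removal leaves the highest mean in the remaining box.
    """
    n_peel = max(1, int(alpha * len(inside)))
    best_keep, best_mean = None, None
    for j in range(len(X[0])):
        by_xj = sorted(inside, key=lambda i: X[i][j])
        # Either drop the n_peel smallest or the n_peel largest X_j values.
        for keep in (by_xj[n_peel:], by_xj[:-n_peel]):
            mean_keep = sum(y[i] for i in keep) / len(keep)
            if best_mean is None or mean_keep > best_mean:
                best_keep, best_mean = keep, mean_keep
    return best_keep, best_mean


def peel_sequence(X, y, alpha=0.1, min_obs=2):
    """Steps 1-3: start from all data, peel until min_obs remain."""
    inside = list(range(len(y)))                  # step 1: maximal box
    sequence = [(inside, sum(y) / len(y))]
    while len(inside) > min_obs:                  # step 3: repeat step 2
        inside, box_mean = peel_once(X, y, inside, alpha)
        sequence.append((inside, box_mean))
    return sequence
```

Each entry of the returned sequence is a candidate box; in the full algorithm the pasting step would then try to expand it, and cross-validation would pick one member of the sequence.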

24 Algorithm Illustration
Two classes: blue (class 0) and red (class 1).

25 PRIM and CART
- PRIM handles categorical variables and missing values like CART.
- PRIM: no simple way to deal with $k > 2$ classes simultaneously. Run PRIM separately for each class versus a baseline class.
- Advantage of PRIM over CART: patience.
  - CART fragments the data quite quickly: about $\log_2(N) - 1$ splits (assuming splits of equal size) before running out of data.
  - PRIM: approximately $-\log(N)/\log(1 - \alpha)$ peeling steps before running out of data.
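
A quick calculation makes the patience argument concrete. The sketch below evaluates the two step counts; the function names are illustrative, and the CART figure assumes every split halves the data.

```python
import math


def cart_splits(n):
    """Number of equal-size binary splits before running out of data."""
    return math.log2(n) - 1


def prim_peels(n, alpha):
    """Approximate number of peels when a proportion alpha is removed each time."""
    return -math.log(n) / math.log(1.0 - alpha)
```

For $N = 128$ and $\alpha = 0.10$, CART can make 6 successive splits, while PRIM can make about 46 peeling steps.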

26 Overview
- Generalized additive models
- Tree-based models
- PRIM: bump hunting
- MARS

27 Introduction
- MARS: Multivariate Adaptive Regression Splines.
- Generalization of stepwise linear regression.
- Modification of CART for the regression setting.
- Well-suited for high-dimensional problems.
- Uses expansions in piecewise linear basis functions of the form $(x - t)_+$ and $(t - x)_+$. The two functions form a reflected pair with a knot at the value $t$.
[Figure: the basis functions $(t - x)_+$ and $(x - t)_+$ plotted against $x$, with the knot at $t$.]

28 MARS description
- Idea: form reflected pairs for each input $X_j$ with knots at each observed value $x_{ij}$ of that input.
- The collection of basis functions is
  $$\mathcal{C} = \{(X_j - t)_+,\ (t - X_j)_+\}, \quad t \in \{x_{1j}, x_{2j}, \ldots, x_{Nj}\},\ j = 1, 2, \ldots, p.$$
- The model has the form
  $$f(X) = \beta_0 + \sum_{m=1}^{M} \beta_m h_m(X),$$
  where each $h_m(X)$ is a function in $\mathcal{C}$, or a product of two or more such functions.
- Given the $h_m$, the $\beta_m$ are estimated by standard linear regression.
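
The reflected pairs and the candidate collection $\mathcal{C}$ can be built directly from the data. This is a minimal sketch with illustrative names; each basis function takes a full input vector `x` and reads coordinate `j`.

```python
def reflected_pair(j, t):
    """Return the pair (X_j - t)_+ and (t - X_j)_+ as functions of x."""
    def pos(x):
        return max(x[j] - t, 0.0)

    def neg(x):
        return max(t - x[j], 0.0)

    return pos, neg


def candidate_set(X):
    """Build C: one reflected pair per input j and per observed value t."""
    C = []
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            C.extend(reflected_pair(j, t))
    return C


X = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0]]
C = candidate_set(X)
```

With $N = 3$ distinct values on each of $p = 2$ inputs, the collection holds $2Np = 12$ functions.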

29 Basis Functions
- Start with the constant function $h_0(X) = 1$ in the model set $\mathcal{M}$.
- At each stage, consider as a new basis function pair all products of a function $h_m$ in $\mathcal{M}$ with one of the reflected pairs in $\mathcal{C}$.
- Add to $\mathcal{M}$ the term of the form
  $$\hat\beta_{M+1}\, h_\ell(X) \cdot (X_j - t)_+ + \hat\beta_{M+2}\, h_\ell(X) \cdot (t - X_j)_+, \quad h_\ell \in \mathcal{M},$$
  that produces the largest decrease in training error.
- Estimate the coefficients by least squares.
- Continue until $\mathcal{M}$ contains some preset maximum number of terms.
- Restriction: each input can appear at most once in a product.

30 MARS illustration
[Figure: $\mathcal{M}$ in the left column and $\mathcal{C}$ in the right column. Selected functions are shown in red.]

31 Backward deletion
- The model $\mathcal{M}$ is large and typically overfits the data.
- At each stage, the term whose removal causes the smallest increase in residual squared error is deleted.
- This gives an estimated best model $\hat f_\lambda$ of each size, where $\lambda$ is the number of terms.
- Use generalized cross-validation (GCV) to choose the optimal $\lambda$.
- GCV criterion:
  $$\mathrm{GCV}(\lambda) = \frac{\sum_{i=1}^{N} (y_i - \hat f_\lambda(x_i))^2}{(1 - M(\lambda)/N)^2}.$$
- $M(\lambda)$ is the effective number of parameters: $M(\lambda) = r + cK$, with $r$ linearly independent basis functions in the model, $K$ knots, and $c = 3$. Use $c = 2$ when the model is restricted to be additive.
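
The GCV criterion is a one-line computation once the fitted values are available. The sketch below follows the formula on the slide; the function name and argument names are illustrative.

```python
def gcv(y, fitted, r, K, c=3.0):
    """GCV(lambda) = RSS / (1 - M(lambda)/N)^2 with M(lambda) = r + c*K.

    r: number of linearly independent basis functions in the model,
    K: number of knots, c: per-knot charge (3, or 2 for additive models).
    """
    n = len(y)
    m_eff = r + c * K                       # effective number of parameters
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return rss / (1.0 - m_eff / n) ** 2
```

For instance, with $N = 10$, a residual sum of squares of 10, $r = 2$ and $K = 1$, the effective number of parameters is 5 and the criterion equals $10 / 0.5^2 = 40$. The denominator inflates the training error more as the model uses up more effective parameters.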

32 Advantages
- Piecewise linear functions operate locally.
[Figure: plot with axes $X_1$ and $X_2$.]
- Computations: consider the product of a function in $\mathcal{M}$ with each of the $N$ reflected pairs for an input $X_j$. It takes only $O(N)$ operations to try every knot.
- Multiway products are built up from products involving terms already in the model, a reasonable working assumption.

33 MARS for classification
- Two classes: code the response as 0/1 and treat the problem as regression.
- Multiclass: use 0/1 indicator response variables and multiresponse MARS regression.
- There are masking problems with the above approach.
- PolyMARS is specifically designed for classification:
  - Multiple logistic framework.
  - Uses a quadratic approximation to the multinomial log-likelihood to search for the next basis-function pair.
  - Fits the enlarged model by maximum likelihood.

34 References
Hastie, T., Tibshirani, R., and Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer-Verlag.
More informationEngineering Decision Methods
GSOE9210 vicj@cse.unsw.edu.au www.cse.unsw.edu.au/~gs9210 Maximin and minimax regret 1 2 Indifference; equal preference 3 Graphing decisin prblems 4 Dminance The Maximin principle Maximin and minimax Regret
More informationand the Doppler frequency rate f R , can be related to the coefficients of this polynomial. The relationships are:
Algrithm fr Estimating R and R - (David Sandwell, SIO, August 4, 2006) Azimith cmpressin invlves the alignment f successive eches t be fcused n a pint target Let s be the slw time alng the satellite track
More informationYou need to be able to define the following terms and answer basic questions about them:
CS440/ECE448 Sectin Q Fall 2017 Midterm Review Yu need t be able t define the fllwing terms and answer basic questins abut them: Intr t AI, agents and envirnments Pssible definitins f AI, prs and cns f
More informationGraduate AI Lecture 16: Planning 2. Teachers: Martial Hebert Ariel Procaccia (this time)
Graduate AI Lecture 16: Planning 2 Teachers: Martial Hebert Ariel Prcaccia (this time) Reminder State is a cnjunctin f cnditins, e.g., at(truck 1,Shadyside) at(truck 2,Oakland) States are transfrmed via
More informationTuring Machines. Human-aware Robotics. 2017/10/17 & 19 Chapter 3.2 & 3.3 in Sipser Ø Announcement:
Turing Machines Human-aware Rbtics 2017/10/17 & 19 Chapter 3.2 & 3.3 in Sipser Ø Annuncement: q q q q Slides fr this lecture are here: http://www.public.asu.edu/~yzhan442/teaching/cse355/lectures/tm-ii.pdf
More informationAgenda. What is Machine Learning? Learning Type of Learning: Supervised, Unsupervised and semi supervised Classification
Agenda Artificial Intelligence and its applicatins Lecture 6 Supervised Learning Prfessr Daniel Yeung danyeung@ieee.rg Dr. Patrick Chan patrickchan@ieee.rg Suth China University f Technlgy, China Learning
More informationA Scalable Recurrent Neural Network Framework for Model-free
A Scalable Recurrent Neural Netwrk Framewrk fr Mdel-free POMDPs April 3, 2007 Zhenzhen Liu, Itamar Elhanany Machine Intelligence Lab Department f Electrical and Cmputer Engineering The University f Tennessee
More informationChecking the resolved resonance region in EXFOR database
Checking the reslved resnance regin in EXFOR database Gttfried Bertn Sciété de Calcul Mathématique (SCM) Oscar Cabells OECD/NEA Data Bank JEFF Meetings - Sessin JEFF Experiments Nvember 0-4, 017 Bulgne-Billancurt,
More informationCHAPTER 24: INFERENCE IN REGRESSION. Chapter 24: Make inferences about the population from which the sample data came.
MATH 1342 Ch. 24 April 25 and 27, 2013 Page 1 f 5 CHAPTER 24: INFERENCE IN REGRESSION Chapters 4 and 5: Relatinships between tw quantitative variables. Be able t Make a graph (scatterplt) Summarize the
More informationComputational Statistics
Cmputatinal Statistics Spring 2008 Peter Bühlmann and Martin Mächler Seminar für Statistik ETH Zürich February 2008 (February 23, 2011) ii Cntents 1 Multiple Linear Regressin 1 1.1 Intrductin....................................
More informationFIELD QUALITY IN ACCELERATOR MAGNETS
FIELD QUALITY IN ACCELERATOR MAGNETS S. Russenschuck CERN, 1211 Geneva 23, Switzerland Abstract The field quality in the supercnducting magnets is expressed in terms f the cefficients f the Furier series
More informationSection 6-2: Simplex Method: Maximization with Problem Constraints of the Form ~
Sectin 6-2: Simplex Methd: Maximizatin with Prblem Cnstraints f the Frm ~ Nte: This methd was develped by Gerge B. Dantzig in 1947 while n assignment t the U.S. Department f the Air Frce. Definitin: Standard
More informationMATHEMATICS SYLLABUS SECONDARY 5th YEAR
Eurpean Schls Office f the Secretary-General Pedaggical Develpment Unit Ref. : 011-01-D-8-en- Orig. : EN MATHEMATICS SYLLABUS SECONDARY 5th YEAR 6 perid/week curse APPROVED BY THE JOINT TEACHING COMMITTEE
More informationCS 109 Lecture 23 May 18th, 2016
CS 109 Lecture 23 May 18th, 2016 New Datasets Heart Ancestry Netflix Our Path Parameter Estimatin Machine Learning: Frmally Many different frms f Machine Learning We fcus n the prblem f predictin Want
More informationAdministrativia. Assignment 1 due thursday 9/23/2004 BEFORE midnight. Midterm exam 10/07/2003 in class. CS 460, Sessions 8-9 1
Administrativia Assignment 1 due thursday 9/23/2004 BEFORE midnight Midterm eam 10/07/2003 in class CS 460, Sessins 8-9 1 Last time: search strategies Uninfrmed: Use nly infrmatin available in the prblem
More informationMathematics Methods Units 1 and 2
Mathematics Methds Units 1 and 2 Mathematics Methds is an ATAR curse which fcuses n the use f calculus and statistical analysis. The study f calculus prvides a basis fr understanding rates f change in
More informationMATCHING TECHNIQUES Technical Track Session VI Céline Ferré The World Bank
MATCHING TECHNIQUES Technical Track Sessin VI Céline Ferré The Wrld Bank When can we use matching? What if the assignment t the treatment is nt dne randmly r based n an eligibility index, but n the basis
More informationCAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank
CAUSAL INFERENCE Technical Track Sessin I Phillippe Leite The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Phillippe Leite fr the purpse f this wrkshp Plicy questins are causal
More informationFloating Point Method for Solving Transportation. Problems with Additional Constraints
Internatinal Mathematical Frum, Vl. 6, 20, n. 40, 983-992 Flating Pint Methd fr Slving Transprtatin Prblems with Additinal Cnstraints P. Pandian and D. Anuradha Department f Mathematics, Schl f Advanced
More information1 The limitations of Hartree Fock approximation
Chapter: Pst-Hartree Fck Methds - I The limitatins f Hartree Fck apprximatin The n electrn single determinant Hartree Fck wave functin is the variatinal best amng all pssible n electrn single determinants
More informationMore Tutorial at
Answer each questin in the space prvided; use back f page if extra space is needed. Answer questins s the grader can READILY understand yur wrk; nly wrk n the exam sheet will be cnsidered. Write answers,
More informationMultiple Source Multiple. using Network Coding
Multiple Surce Multiple Destinatin Tplgy Inference using Netwrk Cding Pegah Sattari EECS, UC Irvine Jint wrk with Athina Markpulu, at UCI, Christina Fraguli, at EPFL, Lausanne Outline Netwrk Tmgraphy Gal,
More informationAP Statistics Practice Test Unit Three Exploring Relationships Between Variables. Name Period Date
AP Statistics Practice Test Unit Three Explring Relatinships Between Variables Name Perid Date True r False: 1. Crrelatin and regressin require explanatry and respnse variables. 1. 2. Every least squares
More informationComputational modeling techniques
Cmputatinal mdeling techniques Lecture 11: Mdeling with systems f ODEs In Petre Department f IT, Ab Akademi http://www.users.ab.fi/ipetre/cmpmd/ Mdeling with differential equatins Mdeling strategy Fcus
More informationDataflow Analysis and Abstract Interpretation
Dataflw Analysis and Abstract Interpretatin Cmputer Science and Artificial Intelligence Labratry MIT Nvember 9, 2015 Recap Last time we develped frm first principles an algrithm t derive invariants. Key
More informationChapter 8: The Binomial and Geometric Distributions
Sectin 8.1: The Binmial Distributins Chapter 8: The Binmial and Gemetric Distributins A randm variable X is called a BINOMIAL RANDOM VARIABLE if it meets ALL the fllwing cnditins: 1) 2) 3) 4) The MOST
More informationOnline Model Racing based on Extreme Performance
Online Mdel Racing based n Extreme Perfrmance Tiantian Zhang, Michael Gergipuls, Gergis Anagnstpuls Electrical & Cmputer Engineering University f Central Flrida Overview Racing Algrithm Offline vs Online
More informationMaximum A Posteriori (MAP) CS 109 Lecture 22 May 16th, 2016
Maximum A Psteriri (MAP) CS 109 Lecture 22 May 16th, 2016 Previusly in CS109 Game f Estimatrs Maximum Likelihd Nn spiler: this didn t happen Side Plt argmax argmax f lg Mther f ptimizatins? Reviving an
More information