COMP 551 Applied Machine Learning Lecture 4: Linear classification


1 COMP 551 Applied Machine Learning Lecture 4: Linear classification
Instructor: Joelle Pineau
Class web page:
Unless otherwise noted, all material posted for this course is copyright of the instructor, and cannot be reused or reposted without the instructor's written permission.

2 Today's Quiz
1. What is meant by the term overfitting? What can cause overfitting? How can one avoid overfitting?
2. Which of the following increases the chances of overfitting (assuming everything else is held constant):
a) Reducing the size of the training set.
b) Increasing the size of the training set.
c) Reducing the size of the test set.
d) Increasing the size of the test set.
e) Reducing the number of features.
f) Increasing the number of features.

3 Evaluation
Use cross-validation for model selection.
Training set is used to select a hypothesis f from a class of hypotheses F (e.g. regression of a given degree).
Validation set is used to compare the best f from each hypothesis class across different classes (e.g. different degree regression). It must be untouched during the process of looking for f within a class F.
Test set: Ideally, a separate set of (labeled) data is withheld to get a true estimate of the generalization error. (Often the validation set is called the test set, without distinction.)
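A minimal sketch of the training / validation / test roles described above, assuming NumPy and a single hold-out split (k-fold cross-validation would repeat the train/validation step over several folds). The synthetic sine data and the names `fit_poly`, `mse` and `select_degree` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Least-squares fit of a polynomial of the given degree (training step)."""
    return np.polyfit(x, y, degree)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

def select_degree(x_train, y_train, x_val, y_val, degrees):
    """Pick the degree whose training-set fit has the lowest validation error."""
    best = None
    for d in degrees:
        coeffs = fit_poly(x_train, y_train, d)  # uses training data only
        err = mse(coeffs, x_val, y_val)         # compares hypothesis classes
        if best is None or err < best[1]:
            best = (d, err, coeffs)
    return best

# Hypothetical data: noisy samples of a sine curve.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=300)
y = np.sin(x) + rng.normal(scale=0.2, size=300)

# Split once into training, validation and test parts.
x_train, y_train = x[:180], y[:180]
x_val, y_val = x[180:240], y[180:240]
x_test, y_test = x[240:], y[240:]

degree, val_err, coeffs = select_degree(x_train, y_train, x_val, y_val, range(1, 10))
test_err = mse(coeffs, x_test, y_test)  # the test set is touched only once, at the end
print(degree, val_err, test_err)
```

The test error is computed only after the degree has been chosen, so it is not contaminated by the selection step.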

4 Validation vs Train error
[From Hastie et al. textbook]
[Figure: prediction error on the test sample and on the training sample, plotted against model complexity (low to high). Low complexity: high bias, low variance. High complexity: low bias, high variance.]
FIGURE: Test and training error as a function of model complexity.

5 Bias vs Variance
Gauss-Markov Theorem says: The least-squares estimates of the parameters w have the smallest variance among all linear unbiased estimates.
Insight: Find a lower variance solution, at the expense of some bias.
E.g. Include a penalty for model complexity in the error to reduce overfitting:
$Err(w) = \sum_{i=1}^{n} (y_i - w^T x_i)^2 + \lambda \cdot \text{model\_size}$
λ is a hyper-parameter that controls the penalty size.

6 Ridge regression (aka L2-regularization)
Constrains the weights by imposing a penalty on their size:
$\hat{w}^{ridge} = \arg\min_w \left\{ \sum_{i=1}^{n} (y_i - w^T x_i)^2 + \lambda \sum_{j=0}^{m} w_j^2 \right\}$
where λ can be selected manually, or by cross-validation.
Do a little algebra to get the solution: $\hat{w}^{ridge} = (X^T X + \lambda I)^{-1} X^T Y$
The ridge solution is not equivariant under scaling of the data, so typically need to normalize the inputs first.
Ridge gives a smooth solution, effectively shrinking the weights, but drives few weights to 0.
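The closed-form ridge solution can be sketched in a few lines, assuming NumPy. The data, the true weights and the function name `ridge_fit` are hypothetical, and this sketch has no separate intercept term, so every weight is penalized.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution w = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    # Solving the linear system is more stable than forming the inverse.
    return np.linalg.solve(A, X.T @ y)

# Hypothetical data: 3 features, known weights, small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # normalize the inputs first
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

w_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 recovers ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)  # a larger lam shrinks the weights
print(w_ols, w_ridge)
```

Comparing the two weight vectors shows the shrinkage effect: the ridge weights are smaller in norm but none of them is exactly zero.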

7 Lasso regression (aka L1-regularization)
Constrains the weights by penalizing the absolute value of their size:
$\hat{w}^{lasso} = \arg\min_w \left\{ \sum_{i=1}^{n} (y_i - w^T x_i)^2 + \lambda \sum_{j=1}^{m} |w_j| \right\}$
Now the solution is non-linear in the output y, and there is no closed-form solution. Need to solve a quadratic programming problem instead.
More computationally expensive than Ridge regression.
Effectively sets the weights of less relevant input features to zero.
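To see the sparsity effect concretely, here is a small sketch that minimizes the lasso objective above. It swaps in coordinate descent with soft-thresholding rather than a quadratic programming solver, and assumes NumPy; the data and the names `soft_threshold` and `lasso_fit` are hypothetical.

```python
import numpy as np

def soft_threshold(rho, t):
    """Shrink rho toward zero by t; values within [-t, t] become exactly 0."""
    return np.sign(rho) * max(abs(rho) - t, 0.0)

def lasso_fit(X, y, lam, n_iters=200):
    """Coordinate descent for sum_i (y_i - w^T x_i)^2 + lam * sum_j |w_j|."""
    n_features = X.shape[1]
    w = np.zeros(n_features)
    for _ in range(n_iters):
        for j in range(n_features):
            # Residual with feature j's current contribution removed.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            z = X[:, j] @ X[:, j]
            # Exact minimizer of the objective in coordinate j.
            w[j] = soft_threshold(rho, lam / 2.0) / z
    return w

# Hypothetical data: only the first two of five features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

w = lasso_fit(X, y, lam=20.0)
print(w)
```

The threshold `lam / 2` comes from setting the derivative of the one-dimensional objective to zero; the factor of one half appears because the squared-error term here is not halved.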

8 Comparing Ridge and Lasso
[Figure: two panels, Ridge and Lasso, drawn in the (w_1, w_2) plane. Each shows the contours of equal regression error and the contours of equal model complexity penalty.]

9 A quick look at evaluation functions
We call $L(Y, f_w(X))$ the loss function.
Least-square / Mean squared-error (MSE) loss: $L(Y, f_w(X)) = \sum_{i=1}^{n} (y_i - w^T x_i)^2$
Other loss functions?
Absolute error loss: $L(Y, f_w(X)) = \sum_{i=1}^{n} |y_i - w^T x_i|$
0-1 loss (for classification): $L(Y, f_w(X)) = \sum_{i=1}^{n} I(y_i \neq f_w(x_i))$
Different loss functions make different assumptions. Squared error loss assumes the data can be approximated by a global linear model with Gaussian noise.
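The three losses above can be written directly in plain Python. This is an illustrative sketch; the function names and the toy values are assumptions, and the functions take predictions rather than weights to keep the example short.

```python
def squared_error_loss(y, y_hat):
    """Sum of squared differences between targets and predictions."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))

def absolute_error_loss(y, y_hat):
    """Sum of absolute differences between targets and predictions."""
    return sum(abs(yi - pi) for yi, pi in zip(y, y_hat))

def zero_one_loss(y, y_hat):
    """Number of predictions that differ from the target label."""
    return sum(1 for yi, pi in zip(y, y_hat) if yi != pi)

# Hypothetical regression targets and predictions.
y_reg, pred_reg = [3.0, 1.0], [1.0, 1.5]
print(squared_error_loss(y_reg, pred_reg))   # 4.25
print(absolute_error_loss(y_reg, pred_reg))  # 2.5

# Hypothetical class labels and predicted labels.
y_cls, pred_cls = [1, 0, 1, 1], [1, 1, 0, 1]
print(zero_one_loss(y_cls, pred_cls))        # 2
```

The squared loss weights the large residual (2.0) much more heavily than the small one (0.5), which is why it is more sensitive to outliers than the absolute loss.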

10 Next: Linear models for classification
Linear Regression of 0/1 Response
FIGURE 2.1. A classification example in two dimensions. The classes are coded as a binary variable (BLUE = 0, ORANGE = 1), and then fit by linear regression. The line is the decision boundary defined by $x^T \hat{\beta} = 0.5$. The orange shaded region denotes that part of input space classified as ORANGE, while the blue region is classified as BLUE.

11 Classification problems
Given data set $D = \langle x_i, y_i \rangle$, $i = 1:n$, with discrete $y_i$, find a hypothesis which best fits the data.
If $y_i \in \{0, 1\}$ this is binary classification.
If $y_i$ can take more than two values, the problem is called multi-class classification.

12 Applications of classification
Text classification (spam filtering, news filtering, building web directories, etc.)
Image classification (face detection, object recognition, etc.)
Prediction of cancer recurrence.
Financial forecasting.
Many, many more!

13 Simple example
Given nucleus size, predict cancer recurrence.
Univariate input: X = nucleus size.
Binary output: Y = {NoRecurrence = 0; Recurrence = 1}
Try: Minimize the least-square error.
[Figure: two histograms over nucleus size, one of the non-recurrence count (NoRecurrence) and one of the recurrence count (Recurrence).]

14 Predicting a class from linear regression
Here the red line is: $\hat{Y} = X (X^T X)^{-1} X^T Y$
How to get a binary output?
1. Threshold the output: {y <= t for NoRecurrence, y > t for Recurrence}
2. Interpret output as probability: y = Pr(Recurrence)
[Figure: linear regression fit (red line) to the recurrence labels, plotted against nucleus size.]
Can we find a better model?
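A small sketch of option 1 (thresholding a least-squares fit), assuming NumPy. The nucleus sizes and labels below are made up for illustration, as are the names `fit_least_squares` and `predict_class`.

```python
import numpy as np

def fit_least_squares(x, y):
    """Fit y ~ w0 + w1 * x by least squares; returns the array (w0, w1)."""
    X = np.column_stack([np.ones_like(x), x])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def predict_class(w, x, t=0.5):
    """Threshold the real-valued output: 1 (recurrence) if y_hat > t, else 0."""
    y_hat = w[0] + w[1] * x
    return (y_hat > t).astype(int)

# Hypothetical nucleus sizes and labels (0 = no recurrence, 1 = recurrence).
size = np.array([10.0, 12.0, 13.0, 15.0, 20.0, 22.0, 25.0, 27.0])
label = np.array([0, 0, 0, 0, 1, 1, 1, 1])

w = fit_least_squares(size, label)
print(predict_class(w, size))   # [0 0 0 0 1 1 1 1]
print(w[0] + w[1] * 27.0)       # raw output for the largest nucleus
```

On this toy data the raw output for the largest nucleus is greater than 1 and the output for the smallest is negative, which is one reason option 2 (reading the output as a probability) is unsatisfying and motivates a better model.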

15 Modeling for binary classification
Two probabilistic approaches:
1. Discriminative learning: Directly estimate $P(y \mid x)$.
2. Generative learning: Separately model $P(x \mid y)$ and $P(y)$. Use Bayes rule to estimate $P(y \mid x)$:
$P(y=1 \mid x) = \dfrac{P(x \mid y=1) P(y=1)}{P(x)}$

16 Probabilistic view of discriminative learning
Suppose we have 2 classes: $y \in \{0, 1\}$
What is the probability of a given input x having class y = 1?
Consider Bayes rule:
$P(y=1 \mid x) = \dfrac{P(x, y=1)}{P(x)} = \dfrac{P(x \mid y=1) P(y=1)}{P(x \mid y=1) P(y=1) + P(x \mid y=0) P(y=0)}$
$= \dfrac{1}{1 + \dfrac{P(x \mid y=0) P(y=0)}{P(x \mid y=1) P(y=1)}} = \dfrac{1}{1 + \exp\left( \ln \dfrac{P(x \mid y=0) P(y=0)}{P(x \mid y=1) P(y=1)} \right)} = \dfrac{1}{1 + \exp(-a)} = \sigma(a)$
where $a = \ln \dfrac{P(x \mid y=1) P(y=1)}{P(x \mid y=0) P(y=0)} = \ln \dfrac{P(y=1 \mid x)}{P(y=0 \mid x)}$
(By Bayes rule; P(x) on top and bottom cancels out.)
Here σ has a special form, called the logistic function, and a is the log-odds ratio of data being class 1 vs. class 0.
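The identity between the Bayes-rule posterior and the logistic function of the log-odds can be checked numerically. The sketch below assumes one-dimensional Gaussian class-conditional densities with made-up parameters; the Gaussian choice and all function names are illustrative assumptions.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used here as a hypothetical P(x | y)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def sigmoid(a):
    """Logistic function sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def posterior_bayes(x, prior1, mean0, mean1, std):
    """P(y=1 | x) computed directly from Bayes' rule."""
    joint1 = gaussian_pdf(x, mean1, std) * prior1
    joint0 = gaussian_pdf(x, mean0, std) * (1.0 - prior1)
    return joint1 / (joint1 + joint0)

def posterior_sigmoid(x, prior1, mean0, mean1, std):
    """The same posterior written as sigma(a), with a the log-odds."""
    joint1 = gaussian_pdf(x, mean1, std) * prior1
    joint0 = gaussian_pdf(x, mean0, std) * (1.0 - prior1)
    a = math.log(joint1 / joint0)
    return sigmoid(a)

p_direct = posterior_bayes(1.2, prior1=0.3, mean0=0.0, mean1=2.0, std=1.0)
p_sigma = posterior_sigmoid(1.2, prior1=0.3, mean0=0.0, mean1=2.0, std=1.0)
print(p_direct, p_sigma)
```

Both routes give the same number up to floating-point error, whatever densities are plugged in, because the derivation never uses the form of $P(x \mid y)$.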

17 Discriminative learning: Logistic regression
Idea: Directly model the log-odds with a linear function:
$a = \ln \dfrac{P(x \mid y=1) P(y=1)}{P(x \mid y=0) P(y=0)} = w_0 + w_1 x_1 + \dots + w_m x_m$
The decision boundary is the set of points for which a = 0.
The logistic function (= sigmoid curve): $\sigma(w^T x) = 1 / (1 + e^{-w^T x})$
How do we find the weights? Need an optimization function.

18 Fitting the weights
Recall: $\sigma(w^T x_i)$ is the probability that $y_i = 1$ (given $x_i$), and $1 - \sigma(w^T x_i)$ is the probability that $y_i = 0$.
For $y \in \{0, 1\}$, the likelihood function, $\Pr(x_1, y_1, \dots, x_n, y_n \mid w)$, is:
$\prod_{i=1}^{n} \sigma(w^T x_i)^{y_i} (1 - \sigma(w^T x_i))^{(1 - y_i)}$ (samples are i.i.d.)
Goal: Maximize the log-likelihood, i.e. minimize the negative log-likelihood (also called the cross-entropy error function):
$-\sum_{i=1}^{n} \left[ y_i \log(\sigma(w^T x_i)) + (1 - y_i) \log(1 - \sigma(w^T x_i)) \right]$
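A short plain-Python sketch relating the likelihood to the cross-entropy error. The data points, the weights and the function names are made up for illustration; the first component of each input is a constant 1 so that the first weight plays the role of w_0.

```python
import math

def sigmoid(a):
    """Logistic function sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def likelihood(w, X, y):
    """Product over samples of sigma^y * (1 - sigma)^(1 - y)."""
    total = 1.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        total *= p ** yi * (1.0 - p) ** (1 - yi)
    return total

def cross_entropy(w, X, y):
    """Negative log-likelihood: -sum of y log(sigma) + (1 - y) log(1 - sigma)."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Hypothetical data and weights.
X = [[1.0, 0.5], [1.0, 2.0], [1.0, -1.0]]
y = [0, 1, 0]
w = [-1.0, 1.5]

print(likelihood(w, X, y), cross_entropy(w, X, y))
print(cross_entropy([0.0, 0.0], X, y))  # equals 3 * ln 2: every sample gets p = 0.5
```

Because the logarithm is monotonic, the weights that maximize the likelihood are the same weights that minimize the cross-entropy.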

19-24 Gradient descent for logistic regression
Error function: $Err(w) = -\left[ \sum_{i=1}^{n} y_i \log(\sigma(w^T x_i)) + (1 - y_i) \log(1 - \sigma(w^T x_i)) \right]$
Take the derivative, using the chain rule with $a = w^T x$ and these pieces:
$d \log(\sigma) / d\sigma = 1/\sigma$
$d\sigma(a) / da = \sigma(a)(1 - \sigma(a))$
$\partial (w^T x) / \partial w = x$
$d(1 - \sigma(a)) / da = (1 - \sigma(a))\,\sigma(a)\,(-1)$
$\partial Err(w) / \partial w = -\left[ \sum_{i=1}^{n} y_i \dfrac{1}{\sigma(w^T x_i)} (1 - \sigma(w^T x_i))\,\sigma(w^T x_i)\,x_i + (1 - y_i) \dfrac{1}{1 - \sigma(w^T x_i)} (1 - \sigma(w^T x_i))\,\sigma(w^T x_i)\,(-1)\,x_i \right]$
$= -\sum_{i=1}^{n} x_i \left( y_i (1 - \sigma(w^T x_i)) - (1 - y_i)\,\sigma(w^T x_i) \right)$
$= -\sum_{i=1}^{n} x_i \left( y_i - \sigma(w^T x_i) \right)$
Now apply iteratively: $w_{k+1} = w_k + \alpha_k \sum_{i=1}^{n} x_i \left( y_i - \sigma(w_k^T x_i) \right)$
Can also apply other iterative methods, e.g. Newton's method, coordinate descent, L-BFGS, etc.
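The update rule can be sketched as batch gradient descent, assuming NumPy, a fixed step size instead of a schedule α_k, and synthetic one-dimensional data. The step size, the iteration count, the data and the function names are all illustrative choices.

```python
import numpy as np

def sigmoid(a):
    """Logistic function applied elementwise."""
    return 1.0 / (1.0 + np.exp(-a))

def cross_entropy(w, X, y):
    """Err(w) = -sum_i [ y_i log(sigma_i) + (1 - y_i) log(1 - sigma_i) ]."""
    p = sigmoid(X @ w)
    return float(-np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def gradient(w, X, y):
    """dErr/dw = -sum_i x_i (y_i - sigma(w^T x_i))."""
    return -X.T @ (y - sigmoid(X @ w))

def fit_logistic(X, y, alpha=0.005, n_iters=2000):
    """Batch gradient descent: w <- w + alpha * sum_i x_i (y_i - sigma(w^T x_i))."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        w = w - alpha * gradient(w, X, y)
    return w

# Hypothetical 1-D data with overlapping classes; the column of ones carries w_0.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1.0, 1.0, 100), rng.normal(1.0, 1.0, 100)])
y = np.concatenate([np.zeros(100), np.ones(100)])
X = np.column_stack([np.ones_like(x), x])

w = fit_logistic(X, y)
accuracy = float(np.mean((sigmoid(X @ w) > 0.5) == (y == 1)))
print(w, accuracy)
```

The classes overlap on purpose: with perfectly separable data the cross-entropy has no finite minimizer and the weights would keep growing.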

25 Modeling for binary classification
Two probabilistic approaches:
1. Discriminative learning: Directly estimate $P(y \mid x)$.
2. Generative learning: Separately model $P(x \mid y)$ and $P(y)$. Use Bayes rule to estimate $P(y \mid x)$:
$P(y=1 \mid x) = \dfrac{P(x \mid y=1) P(y=1)}{P(x)}$

26 What you should know
Basic definition of linear classification problem.
Derivation of logistic regression.
Linear discriminant analysis: definition, decision boundary.
Quadratic discriminant analysis: basic idea, decision boundary.
LDA vs QDA pros/cons.
Worth reading further:
Under some conditions, linear regression for classification and LDA are the same (Hastie et al., p ).
Relation between Logistic regression and LDA (Hastie et al., 4.4.5)

27 Final notes
You don't yet have a team for Project #1? => Use mycourses.
You don't yet have a plan for Project #1? => Start planning!
Feedback on tutorial 1?


More information

Kinetic Model Completeness

Kinetic Model Completeness 5.68J/10.652J Spring 2003 Lecture Ntes Tuesday April 15, 2003 Kinetic Mdel Cmpleteness We say a chemical kinetic mdel is cmplete fr a particular reactin cnditin when it cntains all the species and reactins

More information

Optimization Programming Problems For Control And Management Of Bacterial Disease With Two Stage Growth/Spread Among Plants

Optimization Programming Problems For Control And Management Of Bacterial Disease With Two Stage Growth/Spread Among Plants Internatinal Jurnal f Engineering Science Inventin ISSN (Online): 9 67, ISSN (Print): 9 676 www.ijesi.rg Vlume 5 Issue 8 ugust 06 PP.0-07 Optimizatin Prgramming Prblems Fr Cntrl nd Management Of Bacterial

More information

Hypothesis Tests for One Population Mean

Hypothesis Tests for One Population Mean Hypthesis Tests fr One Ppulatin Mean Chapter 9 Ala Abdelbaki Objective Objective: T estimate the value f ne ppulatin mean Inferential statistics using statistics in rder t estimate parameters We will be

More information

Smoothing, penalized least squares and splines

Smoothing, penalized least squares and splines Smthing, penalized least squares and splines Duglas Nychka, www.image.ucar.edu/~nychka Lcally weighted averages Penalized least squares smthers Prperties f smthers Splines and Reprducing Kernels The interplatin

More information

Data Mining: Concepts and Techniques. Classification and Prediction. Chapter February 8, 2007 CSE-4412: Data Mining 1

Data Mining: Concepts and Techniques. Classification and Prediction. Chapter February 8, 2007 CSE-4412: Data Mining 1 Data Mining: Cncepts and Techniques Classificatin and Predictin Chapter 6.4-6 February 8, 2007 CSE-4412: Data Mining 1 Chapter 6 Classificatin and Predictin 1. What is classificatin? What is predictin?

More information

Computational modeling techniques

Computational modeling techniques Cmputatinal mdeling techniques Lecture 11: Mdeling with systems f ODEs In Petre Department f IT, Ab Akademi http://www.users.ab.fi/ipetre/cmpmd/ Mdeling with differential equatins Mdeling strategy Fcus

More information

Chapter 15 & 16: Random Forests & Ensemble Learning

Chapter 15 & 16: Random Forests & Ensemble Learning Chapter 15 & 16: Randm Frests & Ensemble Learning DD3364 Nvember 27, 2012 Ty Prblem fr Bsted Tree Bsted Tree Example Estimate this functin with a sum f trees with 9-terminal ndes by minimizing the sum

More information

MODULE 1. e x + c. [You can t separate a demominator, but you can divide a single denominator into each numerator term] a + b a(a + b)+1 = a + b

MODULE 1. e x + c. [You can t separate a demominator, but you can divide a single denominator into each numerator term] a + b a(a + b)+1 = a + b . REVIEW OF SOME BASIC ALGEBRA MODULE () Slving Equatins Yu shuld be able t slve fr x: a + b = c a d + e x + c and get x = e(ba +) b(c a) d(ba +) c Cmmn mistakes and strategies:. a b + c a b + a c, but

More information

Lyapunov Stability Stability of Equilibrium Points

Lyapunov Stability Stability of Equilibrium Points Lyapunv Stability Stability f Equilibrium Pints 1. Stability f Equilibrium Pints - Definitins In this sectin we cnsider n-th rder nnlinear time varying cntinuus time (C) systems f the frm x = f ( t, x),

More information

Materials Engineering 272-C Fall 2001, Lecture 7 & 8 Fundamentals of Diffusion

Materials Engineering 272-C Fall 2001, Lecture 7 & 8 Fundamentals of Diffusion Materials Engineering 272-C Fall 2001, Lecture 7 & 8 Fundamentals f Diffusin Diffusin: Transprt in a slid, liquid, r gas driven by a cncentratin gradient (r, in the case f mass transprt, a chemical ptential

More information

Feedforward Neural Networks

Feedforward Neural Networks Feedfrward Neural Netwrks Yagmur Gizem Cinar, Eric Gaussier AMA, LIG, Univ. Grenble Alpes 17 March 2017 Yagmur Gizem Cinar, Eric Gaussier Multilayer Perceptrns (MLP) 17 March 2017 1 / 42 Reference Bk Deep

More information

Contents. This is page i Printer: Opaque this

Contents. This is page i Printer: Opaque this Cntents This is page i Printer: Opaque this Supprt Vectr Machines and Flexible Discriminants. Intrductin............. The Supprt Vectr Classifier.... Cmputing the Supprt Vectr Classifier........ Mixture

More information

Lead/Lag Compensator Frequency Domain Properties and Design Methods

Lead/Lag Compensator Frequency Domain Properties and Design Methods Lectures 6 and 7 Lead/Lag Cmpensatr Frequency Dmain Prperties and Design Methds Definitin Cnsider the cmpensatr (ie cntrller Fr, it is called a lag cmpensatr s K Fr s, it is called a lead cmpensatr Ntatin

More information

NUMBERS, MATHEMATICS AND EQUATIONS

NUMBERS, MATHEMATICS AND EQUATIONS AUSTRALIAN CURRICULUM PHYSICS GETTING STARTED WITH PHYSICS NUMBERS, MATHEMATICS AND EQUATIONS An integral part t the understanding f ur physical wrld is the use f mathematical mdels which can be used t

More information

Online Model Racing based on Extreme Performance

Online Model Racing based on Extreme Performance Online Mdel Racing based n Extreme Perfrmance Tiantian Zhang, Michael Gergipuls, Gergis Anagnstpuls Electrical & Cmputer Engineering University f Central Flrida Overview Racing Algrithm Offline vs Online

More information

CAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank

CAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank CAUSAL INFERENCE Technical Track Sessin I Phillippe Leite The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Phillippe Leite fr the purpse f this wrkshp Plicy questins are causal

More information

SAMPLING DYNAMICAL SYSTEMS

SAMPLING DYNAMICAL SYSTEMS SAMPLING DYNAMICAL SYSTEMS Melvin J. Hinich Applied Research Labratries The University f Texas at Austin Austin, TX 78713-8029, USA (512) 835-3278 (Vice) 835-3259 (Fax) hinich@mail.la.utexas.edu ABSTRACT

More information

Comparing Several Means: ANOVA. Group Means and Grand Mean

Comparing Several Means: ANOVA. Group Means and Grand Mean STAT 511 ANOVA and Regressin 1 Cmparing Several Means: ANOVA Slide 1 Blue Lake snap beans were grwn in 12 pen-tp chambers which are subject t 4 treatments 3 each with O 3 and SO 2 present/absent. The ttal

More information

A Matrix Representation of Panel Data

A Matrix Representation of Panel Data web Extensin 6 Appendix 6.A A Matrix Representatin f Panel Data Panel data mdels cme in tw brad varieties, distinct intercept DGPs and errr cmpnent DGPs. his appendix presents matrix algebra representatins

More information

MATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank

MATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank MATCHING TECHNIQUES Technical Track Sessin VI Emanuela Galass The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Emanuela Galass fr the purpse f this wrkshp When can we use

More information

Data Analysis, Statistics, Machine Learning

Data Analysis, Statistics, Machine Learning Data Analysis, Statistics, Machine Learning Leland Wilkinsn Adjunct Prfessr UIC Cmputer Science Chief Scien

More information

1 PreCalculus AP Unit G Rotational Trig (MCR) Name:

1 PreCalculus AP Unit G Rotational Trig (MCR) Name: 1 PreCalculus AP Unit G Rtatinal Trig (MCR) Name: Big idea In this unit yu will extend yur knwledge f SOH CAH TOA t wrk with btuse and reflex angles. This extensin will invlve the unit circle which will

More information

Checking the resolved resonance region in EXFOR database

Checking the resolved resonance region in EXFOR database Checking the reslved resnance regin in EXFOR database Gttfried Bertn Sciété de Calcul Mathématique (SCM) Oscar Cabells OECD/NEA Data Bank JEFF Meetings - Sessin JEFF Experiments Nvember 0-4, 017 Bulgne-Billancurt,

More information

ELT COMMUNICATION THEORY

ELT COMMUNICATION THEORY ELT 41307 COMMUNICATION THEORY Matlab Exercise #2 Randm variables and randm prcesses 1 RANDOM VARIABLES 1.1 ROLLING A FAIR 6 FACED DICE (DISCRETE VALIABLE) Generate randm samples fr rlling a fair 6 faced

More information

Bootstrap Method > # Purpose: understand how bootstrap method works > obs=c(11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38) > n=length(obs) >

Bootstrap Method > # Purpose: understand how bootstrap method works > obs=c(11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38) > n=length(obs) > Btstrap Methd > # Purpse: understand hw btstrap methd wrks > bs=c(11.96, 5.03, 67.40, 16.07, 31.50, 7.73, 11.10, 22.38) > n=length(bs) > mean(bs) [1] 21.64625 > # estimate f lambda > lambda = 1/mean(bs);

More information

Maximum A Posteriori (MAP) CS 109 Lecture 22 May 16th, 2016

Maximum A Posteriori (MAP) CS 109 Lecture 22 May 16th, 2016 Maximum A Psteriri (MAP) CS 109 Lecture 22 May 16th, 2016 Previusly in CS109 Game f Estimatrs Maximum Likelihd Nn spiler: this didn t happen Side Plt argmax argmax f lg Mther f ptimizatins? Reviving an

More information

7 TH GRADE MATH STANDARDS

7 TH GRADE MATH STANDARDS ALGEBRA STANDARDS Gal 1: Students will use the language f algebra t explre, describe, represent, and analyze number expressins and relatins 7 TH GRADE MATH STANDARDS 7.M.1.1: (Cmprehensin) Select, use,

More information

MATCHING TECHNIQUES Technical Track Session VI Céline Ferré The World Bank

MATCHING TECHNIQUES Technical Track Session VI Céline Ferré The World Bank MATCHING TECHNIQUES Technical Track Sessin VI Céline Ferré The Wrld Bank When can we use matching? What if the assignment t the treatment is nt dne randmly r based n an eligibility index, but n the basis

More information

ANSWER KEY FOR MATH 10 SAMPLE EXAMINATION. Instructions: If asked to label the axes please use real world (contextual) labels

ANSWER KEY FOR MATH 10 SAMPLE EXAMINATION. Instructions: If asked to label the axes please use real world (contextual) labels ANSWER KEY FOR MATH 10 SAMPLE EXAMINATION Instructins: If asked t label the axes please use real wrld (cntextual) labels Multiple Chice Answers: 0 questins x 1.5 = 30 Pints ttal Questin Answer Number 1

More information

Support Vector Machines and Flexible Discriminants

Support Vector Machines and Flexible Discriminants Supprt Vectr Machines and Flexible Discriminants This is page Printer: Opaque this. Intrductin In this chapter we describe generalizatins f linear decisin bundaries fr classificatin. Optimal separating

More information