Resampling Methods. Cross-validation, Bootstrapping. Marek Petrik 2/21/2017
Transcription
1 Resampling Methods: Cross-validation, Bootstrapping. Marek Petrik, 2/21/2017. Some of the figures in this presentation are taken from An Introduction to Statistical Learning, with Applications in R (Springer, 2013) with permission from the authors: G. James, D. Witten, T. Hastie and R. Tibshirani.
2 So Far in ML. Regression vs classification. Linear regression. Logistic regression. Linear discriminant analysis, QDA. Maximum likelihood.
3 Discriminative vs Generative Models. Discriminative models: estimate conditional models $\Pr[Y \mid X]$ (linear regression, logistic regression). Generative models: estimate the joint probability $\Pr[Y, X] = \Pr[Y \mid X] \Pr[X]$; estimate not only the probability of the labels but also of the features; once the model is fit, it can be used to generate data (LDA, QDA, Naive Bayes).
4 Logistic Regression. $Y = 1$ if default, $0$ otherwise. [Figure: two panels, linear regression and logistic regression; x-axis Balance, y-axis Probability of Default.] Predict: $\Pr[\text{default} = \text{yes} \mid \text{balance}]$
5 LDA: Linear Discriminant Analysis. Generative model: capture the probability of the predictors for each label. Predict: 1. $\Pr[\text{balance} \mid \text{default} = \text{yes}]$ and $\Pr[\text{default} = \text{yes}]$ 2. $\Pr[\text{balance} \mid \text{default} = \text{no}]$ and $\Pr[\text{default} = \text{no}]$. Classes are normal: $\Pr[\text{balance} \mid \text{default} = \text{yes}]$
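6 Bayes Theorem. Classification from label distributions: $\Pr[Y = k \mid X = x] = \frac{\Pr[X = x \mid Y = k] \, \Pr[Y = k]}{\Pr[X = x]}$. Example: $\Pr[\text{default} = \text{yes} \mid \text{balance} = \$100] = \frac{\Pr[\text{balance} = \$100 \mid \text{default} = \text{yes}] \, \Pr[\text{default} = \text{yes}]}{\Pr[\text{balance} = \$100]}$. Notation: $\Pr[Y = k \mid X = x] = \frac{\pi_k f_k(x)}{\sum_{l=1}^{K} \pi_l f_l(x)}$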
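The notation on this slide can be turned into a few lines of code. The sketch below assumes univariate normal class densities $f_k$ (as LDA does on the previous slide); the priors, means, and standard deviations are made-up numbers for illustration, not values from the lecture.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(x, priors, means, stds):
    """Return Pr[Y = k | X = x] for every class k via Bayes' theorem."""
    # Numerators pi_k * f_k(x), one per class.
    joint = [p * normal_pdf(x, m, s) for p, m, s in zip(priors, means, stds)]
    evidence = sum(joint)  # denominator: sum over l of pi_l * f_l(x)
    return [j / evidence for j in joint]

# Hypothetical numbers: class 0 = no default, class 1 = default.
priors = [0.97, 0.03]
means = [800.0, 1750.0]   # mean balance per class (made up)
stds = [450.0, 450.0]     # shared standard deviation, as in LDA (made up)

post = posterior(2000.0, priors, means, stds)
print(post)
```

With a high balance the posterior probability of default rises well above its prior, because the class density $f_k(x)$ outweighs the small $\pi_k$.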
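7 LDA with Multiple Features. Multivariate normal distributions. [Figure: two density plots over $x_1$ and $x_2$.] Multivariate normal distribution density (mean vector $\mu$, covariance matrix $\Sigma$): $p(x) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu) \right)$
8 Multivariate Classification Using LDA. Linear: decision boundaries are linear. [Figure: LDA decision boundaries plotted over the features.]
9 QDA: Quadratic Discriminant Analysis. [Figure: QDA decision boundaries plotted over the features.]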
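10 Confusion Matrix: Predict default.
                 True Yes   True No   Total
Predicted Yes    a          b         a + b
Predicted No     c          d         c + d
Total            a + c      b + d     N
Result of LDA classification: predict default if $\Pr[\text{default} = \text{yes} \mid \text{balance}] > 1/2$. [Table: predicted vs. true counts (Yes, No, Total); the cell values are missing from the transcript.]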
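As a small illustration of how the cells a, b, c, d are counted, here is a sketch that thresholds posterior probabilities at 1/2 and tabulates the result. The probabilities and labels are invented for the example, and `confusion_matrix` is a hypothetical helper, not a function from the lecture.

```python
import numpy as np

def confusion_matrix(y_true, y_pred):
    """2x2 counts: rows are predicted (Yes, No), columns are true (Yes, No)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    a = int(np.sum(y_pred & y_true))     # predicted Yes, truly Yes
    b = int(np.sum(y_pred & ~y_true))    # predicted Yes, truly No
    c = int(np.sum(~y_pred & y_true))    # predicted No, truly Yes
    d = int(np.sum(~y_pred & ~y_true))   # predicted No, truly No
    return np.array([[a, b], [c, d]])

# Hypothetical posteriors Pr[default = yes | balance] and true labels.
prob_default = np.array([0.9, 0.2, 0.6, 0.4, 0.1, 0.7])
truth = np.array([1, 0, 0, 1, 0, 1])

table = confusion_matrix(truth, prob_default > 0.5)
print(table)
```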
12 Today. Successfully using basic machine learning methods. Problems: 1. How well is the machine learning method doing? 2. Which method is best for my problem? 3. How many features (and which ones) to use? 4. What is the uncertainty in the learned parameters? Methods: 1. Validation set 2. Leave-one-out cross-validation 3. k-fold cross-validation 4. Bootstrapping
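13 Problem: How to design features? [Figure: Miles per gallon vs. Horsepower with linear, degree-2, and higher-degree polynomial fits.]
14 Benefit of Good Features. [Figure: left panel, Y vs. X; right panel, Mean Squared Error vs. Flexibility. Gray: training error; red: test error.]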
17 Just Use Training Data? Using more features will always reduce the training MSE. Error on the test set will be greater. [Figure: left panel, Y vs. X; right panel, Mean Squared Error vs. Flexibility. Gray: training error; red: test error.]
19 Solution 1: Validation Set. Just evaluate how well the method works on the test set. Randomly split the data into: 1. Training set: about half of all data 2. Validation set (AKA hold-out set): remaining half. Choose the number of features/representation by minimizing the error on the validation set.
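A minimal sketch of the validation-set approach, assuming polynomial degree plays the role of "number of features". The data are synthetic (a made-up quadratic signal plus noise), and `validation_mse` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (made up): a quadratic signal plus noise.
n = 200
x = rng.uniform(-3.0, 3.0, size=n)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=1.0, size=n)

# Randomly split the data: about half for training, the rest for validation.
perm = rng.permutation(n)
train, valid = perm[: n // 2], perm[n // 2 :]

def validation_mse(degree):
    """Fit a polynomial on the training half, score it on the validation half."""
    coefs = np.polyfit(x[train], y[train], deg=degree)
    pred = np.polyval(coefs, x[valid])
    return float(np.mean((y[valid] - pred) ** 2))

degrees = range(1, 9)
errors = {d: validation_mse(d) for d in degrees}
best_degree = min(errors, key=errors.get)   # degree with the lowest validation MSE
print(errors, best_degree)
```

Changing the seed changes the split and can change `best_degree`, which is the variability problem the following slides discuss.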
20 Feature Selection Using Validation Set. [Figure: left panel, Y vs. X; right panel, Mean Squared Error vs. Flexibility. Gray: training error; red: test error (validation set).]
22 Problems using Validation Set. 1. Highly variable (imprecise) estimates: each line shows the validation error for one possible division of the data. [Figure: two panels, Mean Squared Error vs. Degree of Polynomial.] 2. Only a subset of the data is used (the validation set is excluded, so only about half of the data is used).
23 Solution 2: Leave-one-out. Addresses the problems with the validation set. Split the data set into 2 parts: 1. Training: size $n - 1$ 2. Validation: size $1$. Repeat $n$ times to get $n$ learning problems.
24 Leave-one-out. Get $n$ learning problems: train on $n - 1$ instances (blue), test on $1$ instance (red). $\mathrm{MSE}_i = (y_i - \hat{y}_i)^2$. LOOCV estimate: $\mathrm{CV}_{(n)} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{MSE}_i$
30 Leave-one-out vs Validation Set. Advantages: 1. Uses almost all data, not just half 2. Stable results: does not have any randomness 3. Evaluation is performed with more test data. Disadvantages: can be very computationally expensive, since it fits the model $n$ times.
36 Speeding Up Leave-One-Out. 1. Solve each fit independently and distribute the computation. 2. Linear regression: solve only one linear regression using all data, then compute the leave-one-out error as $\mathrm{CV}_{(n)} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_i - \hat{y}_i}{1 - h_i} \right)^2$. True value: $y_i$; prediction: $\hat{y}_i$. $h_i$ is the leverage of data point $i$: $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$
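The shortcut can be checked numerically against the brute-force definition of LOOCV. This sketch assumes simple linear regression with one predictor (the case the leverage formula above covers) and uses synthetic data; `fit_line` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0.0, 10.0, size=n)
y = 3.0 - 0.7 * x + rng.normal(scale=1.0, size=n)   # made-up data

def fit_line(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    slope = np.sum((xs - xs.mean()) * (ys - ys.mean())) / np.sum((xs - xs.mean()) ** 2)
    return ys.mean() - slope * xs.mean(), slope

# Brute force: refit n times, each time leaving out one observation.
brute = 0.0
for i in range(n):
    keep = np.arange(n) != i
    b0, b1 = fit_line(x[keep], y[keep])
    brute += (y[i] - (b0 + b1 * x[i])) ** 2
brute /= n

# Shortcut: a single fit on all data plus the leverage of each point.
b0, b1 = fit_line(x, y)
y_hat = b0 + b1 * x
h = 1.0 / n + (x - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)
shortcut = np.mean(((y - y_hat) / (1.0 - h)) ** 2)

print(brute, shortcut)
```

The two numbers agree up to floating-point error, while the shortcut needs one fit instead of $n$.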
37 Solution 3: k-fold Cross-validation. Hybrid between validation set and LOO. Split the training set into $k$ subsets: 1. Training set: $n - n/k$ 2. Test set: $n/k$. This gives $k$ learning problems. Cross-validation error: $\mathrm{CV}_{(k)} = \frac{1}{k} \sum_{i=1}^{k} \mathrm{MSE}_i$
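A minimal sketch of k-fold cross-validation for choosing a polynomial degree. The data are synthetic, `k_fold_cv_mse` is a hypothetical helper, and each call draws its own random partition into folds.

```python
import numpy as np

def k_fold_cv_mse(x, y, degree, k, rng):
    """Average of the k per-fold test MSEs for a polynomial fit of the given degree."""
    folds = np.array_split(rng.permutation(len(x)), k)  # k roughly equal subsets
    fold_mse = []
    for test_idx in folds:
        train_mask = np.ones(len(x), dtype=bool)
        train_mask[test_idx] = False  # train on the other k - 1 folds
        coefs = np.polyfit(x[train_mask], y[train_mask], deg=degree)
        pred = np.polyval(coefs, x[test_idx])
        fold_mse.append(np.mean((y[test_idx] - pred) ** 2))
    return float(np.mean(fold_mse))

rng = np.random.default_rng(2)
x = rng.uniform(-3.0, 3.0, size=120)
y = np.sin(x) + rng.normal(scale=0.3, size=120)   # made-up data

cv_errors = {d: k_fold_cv_mse(x, y, d, k=10, rng=rng) for d in range(1, 7)}
print(cv_errors)
```

Setting `k=len(x)` puts one observation in each fold, which recovers leave-one-out.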
38 Cross-validation vs Leave-One-Out. [Figure: data splits for k-fold cross-validation and for leave-one-out.]
39 Cross-validation vs Leave-One-Out. [Figure: Mean Squared Error vs. Degree of Polynomial; left panel LOOCV, right panel 10-fold CV.]
40 Empirical Evaluation: 3 Examples. [Figure: three panels, Mean Squared Error vs. Flexibility. Blue: true error; dashed: LOOCV estimate; orange: 10-fold CV.]
41 How to Choose k in CV? As k increases we have: 1. Increasing computational complexity 2. Decreasing bias (more training data) 3. Increasing variance (bigger overlap between training sets). Empirically good values: 5-10
42 Cross-validation in Classification
43 Logistic Regression. Predict the probability of a class: $p(X)$. Example: $p(\text{balance})$, the probability of default for a person with a given balance. Linear regression: $p(X) = \beta_0 + \beta_1 X$. Logistic regression: $p(X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$, the same as $\log\left( \frac{p(X)}{1 - p(X)} \right) = \beta_0 + \beta_1 X$. Linear decision boundary (derive from the log odds by comparing $p(x_1)$ and $p(x_2)$).
44 Features in Logistic Regression. The logistic regression decision boundary is also linear... non-linear decisions? [Figure: four panels, Degree=1, Degree=2, Degree=3, Degree=4.]
45 Logistic Regression with Nonlinear Features. Linear: $\log\left( \frac{p(X)}{1 - p(X)} \right) = \beta_0 + \beta_1 X$. Nonlinear odds: $\log\left( \frac{p(X)}{1 - p(X)} \right) = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3$. Nonlinear probability: $p(X) = \frac{e^{\beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3}}{1 + e^{\beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3}}$
46 Cross-validation in Classification. Works the same as for regression. Do not use MSE but: $\mathrm{CV}_{(n)} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{Err}_i$. The error is an indicator function: $\mathrm{Err}_i = I(y_i \neq \hat{y}_i)$
48 K in KNN. How to decide on the right k to use in KNN? Cross-validation! [Figure: left panel, logistic regression, Error Rate vs. Order of Polynomials Used; right panel, KNN, Error Rate vs. 1/K. Brown: test error; blue: training error; black: CV error.]
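One way to pick k is to compute the cross-validated misclassification rate for a few candidate values and keep the smallest. The sketch below uses a plain NumPy nearest-neighbor vote on made-up two-class data; `knn_predict` and `cv_error_rate` are hypothetical helpers, and 10 folds is an arbitrary choice.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k):
    """Majority vote among the k nearest training points (labels are 0 or 1)."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    dist = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = train_y[nearest].mean(axis=1)
    return (votes > 0.5).astype(int)

def cv_error_rate(x, y, k_neighbors, n_folds, rng):
    """Cross-validation estimate of the misclassification rate."""
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    fold_err = []
    for test_idx in folds:
        mask = np.ones(len(x), dtype=bool)
        mask[test_idx] = False
        pred = knn_predict(x[mask], y[mask], x[test_idx], k_neighbors)
        fold_err.append(np.mean(pred != y[test_idx]))  # Err_i = I(y_i != y_hat_i)
    return float(np.mean(fold_err))

rng = np.random.default_rng(3)
# Made-up two-class data: two Gaussian blobs in the plane.
x = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.repeat([0, 1], 100)

candidates = [1, 3, 5, 9, 15, 25]   # odd values avoid tied votes
cv_err = {k: cv_error_rate(x, y, k, n_folds=10, rng=rng) for k in candidates}
best_k = min(cv_err, key=cv_err.get)
print(cv_err, best_k)
```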
53 Overfitting and CV. Is it possible to overfit when using cross-validation? Yes! Inferring k in KNN using cross-validation is learning. Insightful theoretical analysis: Probably Approximately Correct (PAC) Learning. Cross-validation will not overfit when learning simple concepts.
55 Overfitting with Cross-validation. Task: predict mpg from power. Define a new feature for some $\beta$s: $f = \beta_0 + \beta_1 \, \text{power} + \beta_2 \, \text{power}^2 + \beta_3 \, \text{power}^3 + \beta_4 \, \text{power}^4$. Linear regression: find $\alpha$ such that $\text{mpg} = \alpha f$. Cross-validation: find the values of the $\beta$s. Will overfit: same solution as using linear regression on the entire data (no cross-validation).
56 Preventing Overfitting. Gold standard: have a test set that is used only once. Rarely possible. $1M Netflix prize design: 1. Publicly available training set 2. Leader-board results using a test set 3. Private data set used to determine the final winner
59 Bootstrap. Goal: understand the confidence in learned parameters. Most useful in inference. How confident are we in the learned values of $\beta$: $\text{mpg} = \beta_0 + \beta_1 \, \text{power}$. Approach: run the learning algorithm multiple times with different data sets, creating each new data set by sampling with replacement from the original one.
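The approach can be sketched in a few lines. The data below are a made-up stand-in for the mpg and power measurements, B = 1000 resamples is an arbitrary choice, and the percentile interval is one common way to summarize the bootstrap estimates, not necessarily the one used in the lecture.

```python
import numpy as np

rng = np.random.default_rng(4)

# Made-up stand-in for the mpg vs. power data.
n = 100
power = rng.uniform(50.0, 200.0, size=n)
mpg = 40.0 - 0.15 * power + rng.normal(scale=3.0, size=n)

def fit_coefficients(xs, ys):
    """Least-squares (beta_0, beta_1) for ys = beta_0 + beta_1 * xs."""
    beta_1, beta_0 = np.polyfit(xs, ys, deg=1)  # polyfit returns highest degree first
    return beta_0, beta_1

B = 1000
boot = np.empty((B, 2))
for b in range(B):
    idx = rng.integers(0, n, size=n)        # sample n rows with replacement
    boot[b] = fit_coefficients(power[idx], mpg[idx])

beta_hat = fit_coefficients(power, mpg)
std_err = boot.std(axis=0, ddof=1)                  # bootstrap standard errors
ci_slope = np.percentile(boot[:, 1], [2.5, 97.5])   # percentile interval for beta_1
print(beta_hat, std_err, ci_slope)
```

The spread of the rows of `boot` is the bootstrap's answer to "how confident are we in $\beta$".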
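60 Bootstrap Illustration. [Figure: the original data $Z$ (columns Obs, X, Y) is resampled into bootstrap data sets $Z^{*1}, Z^{*2}, \dots, Z^{*B}$, which yield the estimates $\hat{\alpha}^{*1}, \hat{\alpha}^{*2}, \dots, \hat{\alpha}^{*B}$.]
61 Bootstrap Results. [Figure: distributions of the estimates of $\alpha$; panels True and Bootstrap.]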
More informationECEN 4872/5827 Lecture Notes
ECEN 4872/5827 Lecture Ntes Lecture #5 Objectives fr lecture #5: 1. Analysis f precisin current reference 2. Appraches fr evaluating tlerances 3. Temperature Cefficients evaluatin technique 4. Fundamentals
More informationCHAPTER 24: INFERENCE IN REGRESSION. Chapter 24: Make inferences about the population from which the sample data came.
MATH 1342 Ch. 24 April 25 and 27, 2013 Page 1 f 5 CHAPTER 24: INFERENCE IN REGRESSION Chapters 4 and 5: Relatinships between tw quantitative variables. Be able t Make a graph (scatterplt) Summarize the
More information24 Multiple Eigenvectors; Latent Factor Analysis; Nearest Neighbors
Multiple Eigenvectrs; Latent Factr Analysis; Nearest Neighbrs 47 24 Multiple Eigenvectrs; Latent Factr Analysis; Nearest Neighbrs Clustering w/multiple Eigenvectrs [When we use the Fiedler vectr fr spectral
More informationName AP CHEM / / Chapter 1 Chemical Foundations
Name AP CHEM / / Chapter 1 Chemical Fundatins Metric Cnversins All measurements in chemistry are made using the metric system. In using the metric system yu must be able t cnvert between ne value and anther.
More informationInference in the Multiple-Regression
Sectin 5 Mdel Inference in the Multiple-Regressin Kinds f hypthesis tests in a multiple regressin There are several distinct kinds f hypthesis tests we can run in a multiple regressin. Suppse that amng
More informationLecture 13: Markov Chain Monte Carlo. Gibbs sampling
Lecture 13: Markv hain Mnte arl Gibbs sampling Gibbs sampling Markv chains 1 Recall: Apprximate inference using samples Main idea: we generate samples frm ur Bayes net, then cmpute prbabilities using (weighted)
More informationCS:4420 Artificial Intelligence
CS:4420 Artificial Intelligence Spring 2017 Learning frm Examples Cesare Tinelli The University f Iwa Cpyright 2004 17, Cesare Tinelli and Stuart Russell a a These ntes were riginally develped by Stuart
More informationCAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank
CAUSAL INFERENCE Technical Track Sessin I Phillippe Leite The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Phillippe Leite fr the purpse f this wrkshp Plicy questins are causal
More informationEASTERN ARIZONA COLLEGE Introduction to Statistics
EASTERN ARIZONA COLLEGE Intrductin t Statistics Curse Design 2014-2015 Curse Infrmatin Divisin Scial Sciences Curse Number PSY 220 Title Intrductin t Statistics Credits 3 Develped by Adam Stinchcmbe Lecture/Lab
More informationContents. This is page i Printer: Opaque this
Cntents This is page i Printer: Opaque this 9 Additive Mdels, Trees, and Related Methds 1 9.1 Generalized Additive Mdels................. 1 9.1.1 Fitting Additive Mdels................ 3 9.1.2 Example:
More informationHiding in plain sight
Hiding in plain sight Principles f stegangraphy CS349 Cryptgraphy Department f Cmputer Science Wellesley Cllege The prisners prblem Stegangraphy 1-2 1 Secret writing Lemn juice is very nearly clear s it
More informationStatistical Learning. 2.1 What Is Statistical Learning?
2 Statistical Learning 2.1 What Is Statistical Learning? In rder t mtivate ur study f statistical learning, we begin with a simple example. Suppse that we are statistical cnsultants hired by a client t
More informationLecture 10, Principal Component Analysis
Principal Cmpnent Analysis Lecture 10, Principal Cmpnent Analysis Ha Helen Zhang Fall 2017 Ha Helen Zhang Lecture 10, Principal Cmpnent Analysis 1 / 16 Principal Cmpnent Analysis Lecture 10, Principal
More informationOn Out-of-Sample Statistics for Financial Time-Series
On Out-f-Sample Statistics fr Financial Time-Series Françis Gingras Yshua Bengi Claude Nadeau CRM-2585 January 1999 Département de physique, Université de Mntréal Labratire d infrmatique des systèmes adaptatifs,
More informationChapter 11: Neural Networks
Chapter 11: Neural Netwrks DD3364 December 16, 2012 Prjectin Pursuit Regressin Prjectin Pursuit Regressin mdel: Prjectin Pursuit Regressin f(x) = M g m (wmx) t i=1 where X R p and have targets Y R. Additive
More informationMATHEMATICS SYLLABUS SECONDARY 5th YEAR
Eurpean Schls Office f the Secretary-General Pedaggical Develpment Unit Ref. : 011-01-D-8-en- Orig. : EN MATHEMATICS SYLLABUS SECONDARY 5th YEAR 6 perid/week curse APPROVED BY THE JOINT TEACHING COMMITTEE
More informationANALYTICAL SOLUTIONS TO THE PROBLEM OF EDDY CURRENT PROBES
ANALYTICAL SOLUTIONS TO THE PROBLEM OF EDDY CURRENT PROBES CONSISTING OF LONG PARALLEL CONDUCTORS B. de Halleux, O. Lesage, C. Mertes and A. Ptchelintsev Mechanical Engineering Department Cathlic University
More informationSome Theory Behind Algorithms for Stochastic Optimization
Sme Thery Behind Algrithms fr Stchastic Optimizatin Zelda Zabinsky University f Washingtn Industrial and Systems Engineering May 24, 2010 NSF Wrkshp n Simulatin Optimizatin Overview Prblem frmulatin Theretical
More information15-381/781 Bayesian Nets & Probabilistic Inference
15-381/781 Bayesian Nets & Prbabilistic Inference Emma Brunskill (this time) Ariel Prcaccia With thanks t Dan Klein (Berkeley), Percy Liang (Stanfrd) and Past 15-381 Instructrs fr sme slide cntent, and
More information