DISCRIMINANT ANALYSIS: LDA AND QDA
Stat 427/627 Statistical Machine Learning (Baron)
HOMEWORK 6, Solutions

1. Chap 4, exercise 5.

(a) On a training set, LDA and QDA are both expected to perform well. LDA uses a correct model, and QDA includes the linear model as a special case. QDA, covering a wider range of models, will fit the given particular data set better. On a test set, however, LDA should be more accurate, because QDA, with its additional parameters and resulting overfit, will have a higher variance.

(b) For a nonlinear boundary, QDA should perform better on both the training and the test sets, because LDA cannot capture the nonlinear pattern.

(c) As the sample size increases, QDA will perform better, because its higher variance will be offset by the large sample size.

(d) False. The higher variance of the QDA prediction, without any gain in bias, will result in a higher error rate.

2. See the last page.

3. Chap 4, exercise 10 (e, f, g, h).

(e)
> library(ISLR)
> attach(Weekly)
> Weekly_training <- Weekly[ Year < 2009, ]    # Split the data by year into
> Weekly_testing  <- Weekly[ Year > 2008, ]    # the training and testing sets
> library(MASS)                                # This library contains lda()
> lda.fit <- lda( Direction ~ Lag2, data = Weekly_training )   # Fit using training data only
# Now use this model to predict classification of the testing data
> Direction_Predicted_LDA <- predict( lda.fit, data.frame(Weekly_testing) )$class
> table( Weekly_testing$Direction, Direction_Predicted_LDA )   # Confusion matrix
         Direction_Predicted_LDA
Direction Down Up
     Down    9 34
     Up      5 56
> mean( Weekly_testing$Direction == Direction_Predicted_LDA )
[1] 0.625
# Correct classification rate of LDA is 62.5%.

(f)
# Similarly with QDA.
> qda.fit <- qda( Direction ~ Lag2, data = Weekly_training )
> Direction_Predicted_QDA <- predict( qda.fit, data.frame(Weekly_testing) )$class
> table( Weekly_testing$Direction, Direction_Predicted_QDA )
         Direction_Predicted_QDA
Direction Down Up
     Down    0 43    # Correct classification rate of QDA is only 58.7%.
     Up      0 61    # However, QDA always predicted that the market goes Up.
                     # LDA has a higher prediction power. QDA seems to be an overfit.
> mean( Weekly_testing$Direction == Direction_Predicted_QDA )
[1] 0.587
# Logistic regression and LDA yield the best results so far.

(g)
# KNN with k = 1 is a very flexible method, matching the local features.
> library(class)    # This library contains knn()
> X.train <- matrix( Lag2[ Year < 2009 ], sum(Year < 2009), 1 )
> X.test  <- matrix( Lag2[ Year > 2008 ], sum(Year > 2008), 1 )
> Y.train <- Direction[ Year < 2009 ]
> Y.test  <- Direction[ Year > 2008 ]
> knn.fit <- knn( X.train, X.test, Y.train, 1 )
> table( Y.test, knn.fit )
      knn.fit
Y.test Down Up
  Down   21 22
  Up     30 31
> mean( Y.test == knn.fit )
[1] 0.5
# With k = 4, the KNN method gives a better classification rate for these data:
> knn.fit <- knn( X.train, X.test, Y.train, 4 )
> mean( Y.test == knn.fit )
[1] ...

(h) Among the methods that we explored, logistic regression and LDA yield the highest classification rate (62.5%), followed by KNN with K = 4; QDA yields 58.7%, and KNN with K = 1 yields 50%.

4. Chap 4, exercise 13.

> attach(Boston)    # Attach a new dataset, Boston, from the MASS package
> ?Boston           # Data description
> names(Boston)
 [1] "crim"    "zn"      "indus"   "chas"    "nox"     "rm"      "age"
 [8] "dis"     "rad"     "tax"     "ptratio" "black"   "lstat"   "medv"
> Crime_Rate <- ( crim > median(crim) )    # Categorical variable Crime_Rate
> table(Crime_Rate)
Crime_Rate
FALSE  TRUE
  253   253
# The median splits the data evenly.
> Boston <- cbind( Boston, Crime_Rate )

# Logistic regression. First, we'll fit the full model.
> lreg.fit <- glm( Crime_Rate ~ zn + indus + chas + nox + rm + age + dis + rad + tax
+                  + ptratio + black + lstat + medv, family = "binomial" )
> summary(lreg.fit)
# (The numeric estimates did not survive transcription; the significance codes were:)
(Intercept) ***
zn          *
indus
chas
nox         ***
rm
age
dis         **
rad         ***
tax         *
ptratio     **
black       *
lstat
medv        *

# Some variables are not significant. We'll fit a reduced model without them.
> lreg.fit <- glm( Crime_Rate ~ zn + nox + age + dis + rad + tax + ptratio + black + medv,
+                  family = "binomial" )
> summary(lreg.fit)
(Intercept) ***
zn          **
nox         ***
age         *
dis         **
rad         ***
tax         **
ptratio     **
black       *
medv        **

# To evaluate the model's predictive performance, we'll use the validation set method.
> n <- length(crim)    # Split the data randomly into training and testing subsets
> n
[1] 506
> Z <- sample(n, n/2)
> Data_training <- Boston[ Z, ]
> Data_testing  <- Boston[ -Z, ]
> lreg.fit <- glm( Crime_Rate ~ zn + nox + age + dis + rad + tax + ptratio + black + medv,
+                  data = Data_training, family = "binomial" )
# Fit a logistic regression model on the training data
# and use this model to predict the testing data
> Predicted_probability <- predict( lreg.fit, data.frame(Data_testing), type = "response" )
> Predicted_Crime_Rate <- 1*(Predicted_probability > 0.5)   # Predicted crime rate: 1 = high, 0 = low
> table( Data_testing$Crime_Rate, Predicted_Crime_Rate )
        Predicted_Crime_Rate
> mean( Data_testing$Crime_Rate == Predicted_Crime_Rate )
[1] 0.895
# Logistic regression shows a correct classification rate of 89.5%; the error rate is 10.5%.

# Now, try LDA. For a fair comparison, use the same training data, then predict the testing data.
> lda.fit <- lda( Crime_Rate ~ zn + nox + age + dis + rad + tax + ptratio + black + medv,
+                 data = Data_training )
> Predicted_Crime_Rate_LDA <- predict( lda.fit, data.frame(Data_testing) )$class
> table( Data_testing$Crime_Rate, Predicted_Crime_Rate_LDA )
        Predicted_Crime_Rate_LDA
> mean( Data_testing$Crime_Rate != Predicted_Crime_Rate_LDA )
[1] 0.159
# LDA's error rate is higher, 15.9%.

# What about QDA? If the logit has a nonlinear relation with the predictors, QDA may be a better method.
> qda.fit <- qda( Crime_Rate ~ zn + nox + age + dis + rad + tax + ptratio + black + medv,
+                 data = Data_training )
> Predicted_Crime_Rate_QDA <- predict( qda.fit, data.frame(Data_testing) )$class
> table( Data_testing$Crime_Rate, Predicted_Crime_Rate_QDA )
        Predicted_Crime_Rate_QDA
> mean( Data_testing$Crime_Rate != Predicted_Crime_Rate_QDA )
[1] 0.134
# QDA's error rate of 13.4% is between LDA's and logistic regression's. According to these
# quick results, logistic regression has the best prediction power.

# It may be interesting to look at the distributions of the predicted probabilities of
# high-crime zones produced by the three classification methods. All of them mostly
# discriminate between the two groups pretty well.
> hist( Predicted_probability, main = 'PREDICTED PROBABILITY BY LOGISTIC REGRESSION' )
> hist( predict(lda.fit, data.frame(Data_testing))$posterior, main = 'POSTERIOR DISTRIBUTION BY LDA' )
> hist( predict(qda.fit, data.frame(Data_testing))$posterior, main = 'POSTERIOR DISTRIBUTION BY QDA' )

# KNN method. Define the training and testing data.
> names(Data_training)
 [1] "crim"    "zn"      "indus"   "chas"    "nox"
 [6] "rm"      "age"     "dis"     "rad"     "tax"
[11] "ptratio" "black"   "lstat"   "medv"    "Crime_Rate"
# Variables zn + nox + age + dis + rad + tax + ptratio + black + medv are in columns 2,5,7,8,9,10,11,12,14.
> X.train <- Data_training[ , c(2,5,7,8,9,10,11,12,14) ]
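The LDA/QDA comparison carried out above in R can also be sketched from scratch. The following is a minimal pure-Python illustration (synthetic one-dimensional data and hypothetical helper names of my own, not part of the original solution): it estimates per-class Gaussian parameters on a training half, then classifies a held-out half by the largest posterior, once with a pooled variance (LDA) and once with separate variances (QDA).

```python
import math
import random

def fit(x, y):
    """Per-class sample mean, variance (n_c - 1 denominator), and prior."""
    stats = {}
    for c in set(y):
        xc = [xi for xi, yi in zip(x, y) if yi == c]
        m = sum(xc) / len(xc)
        v = sum((xi - m) ** 2 for xi in xc) / (len(xc) - 1)
        stats[c] = (m, v, len(xc) / len(x))
    return stats

def predict(x0, stats, pooled_var=None):
    """Classify by largest posterior: LDA if pooled_var is given, else QDA."""
    def score(c):
        m, v, p = stats[c]
        if pooled_var is not None:
            v = pooled_var
        # prior times the one-dimensional normal density (constants cancel)
        return p * math.exp(-(x0 - m) ** 2 / (2 * v)) / math.sqrt(v)
    return max(stats, key=score)

random.seed(1)
# Synthetic data: class 0 ~ N(0, 1), class 1 ~ N(2, 3^2); unequal variances
data = [(random.gauss(0, 1), 0) for _ in range(200)] + \
       [(random.gauss(2, 3), 1) for _ in range(200)]
random.shuffle(data)
train, test = data[:200], data[200:]
x_tr, y_tr = zip(*train)

stats = fit(x_tr, y_tr)
n0 = y_tr.count(0)
pooled = ((n0 - 1) * stats[0][1]
          + (len(y_tr) - n0 - 1) * stats[1][1]) / (len(y_tr) - 2)

errors = {}
for name, pv in [("LDA", pooled), ("QDA", None)]:
    errors[name] = sum(predict(x0, stats, pv) != y0 for x0, y0 in test) / len(test)
    print(f"{name} test error rate: {errors[name]:.3f}")
```

As in the Boston analysis, the honest comparison is the held-out error rate computed from the same train/test split for both classifiers.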
2. (a) For Class A, based on (4, 2): X̄_A = 3, s²_A = [(4−3)² + (2−3)²]/(2−1) = 2, and s_A ≈ 1.4.
For Class B, based on (9, 6, 3): X̄_B = 6, s²_B = [(9−6)² + (6−6)² + (3−6)²]/(3−1) = 9, and s_B = 3.
The pooled variance is
    s² = [(4−3)² + (2−3)² + (9−6)² + (6−6)² + (3−6)²] / (2 + 3 − 2) = 20/3,
and the pooled standard deviation is s = √(20/3) ≈ 2.58.

(b) We know that in this situation, when π_A = π_B = 0.5, L(A|B) = L(B|A), and σ_A = σ_B, the point is classified to the closest class mean. x = 5 is closer to X̄_B = 6, so we classify it to Class B.

(c) Calculate the posterior probabilities:
    P{A | x = 5} = π_A f_A(5) / [π_A f_A(5) + π_B f_B(5)]
                 = [π_A (1/(σ√(2π))) exp{−(5−μ_A)²/2σ²}]
                   / [π_A (1/(σ√(2π))) exp{−(5−μ_A)²/2σ²} + π_B (1/(σ√(2π))) exp{−(5−μ_B)²/2σ²}],
estimated by
    (0.8) exp{−(5−3)²/(2·20/3)} / [(0.8) exp{−(5−3)²/(2·20/3)} + (0.2) exp{−(5−6)²/(2·20/3)}] ≈ 0.7616,
    P{B | x = 5} = 1 − P{A | x = 5} ≈ 0.2384.
With a zero-one loss function, we classify the new observation according to the higher posterior probability, i.e., to Class A.

(d) We already have the posterior probabilities. Then:
The risk of classifying the new observation into Class A is Risk(A) = (3) P{B | x = 5} ≈ 0.715.
The risk of classifying the new observation into Class B is Risk(B) = (1) P{A | x = 5} ≈ 0.762.
So, we classify x = 5 into Class A because this decision has the lower risk.

(e) Now we have to use the separate variances instead of the pooled variance. With equal prior probabilities, we have
    P{A | x = 5} = [π_A (1/(σ_A√(2π))) exp{−(5−μ_A)²/2σ²_A}]
                   / [π_A (1/(σ_A√(2π))) exp{−(5−μ_A)²/2σ²_A} + π_B (1/(σ_B√(2π))) exp{−(5−μ_B)²/2σ²_B}],
estimated by
    (0.5)(1/1.4) exp{−(5−3)²/(2·2)} / [(0.5)(1/1.4) exp{−(5−3)²/(2·2)} + (0.5)(1/3) exp{−(5−6)²/(2·9)}] ≈ 0.452,
    P{B | x = 5} = 1 − P{A | x = 5} ≈ 0.548.
With a zero-one loss function, we classify the new observation according to the higher posterior probability, i.e., to Class B.
With unequal probabilities π_A = 0.8 and π_B = 0.2,
    P{A | x = 5} = (0.8)(1/1.4) exp{−(5−3)²/(2·2)} / [(0.8)(1/1.4) exp{−(5−3)²/(2·2)} + (0.2)(1/3) exp{−(5−6)²/(2·9)}] ≈ 0.7674,
    P{B | x = 5} = 1 − P{A | x = 5} ≈ 0.2326,
and we classify x = 5 to Class A. With L(A|B) = 3 L(B|A), we compute the risks
    Risk(A) = (3) P{B | x = 5} = (3)(0.2326) ≈ 0.698,
    Risk(B) = (1) P{A | x = 5} ≈ 0.767,
and classify x = 5 into Class A because this decision has the lower risk.

(f) In fact, we used LDA in (b)-(d) and QDA in (e), because LDA assumes σ_A = σ_B, whereas QDA does not make this assumption.
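The arithmetic in parts (c)-(e) can be checked numerically. The short Python sketch below (my own verification, not part of the original solution) recomputes the pooled and separate-variance posteriors at x = 5 and the two risks; the normalizing constant √(2π) cancels from every ratio, so only the 1/σ factors and exponents matter.

```python
import math

A, B = [4, 2], [9, 6, 3]
mA, mB = sum(A) / len(A), sum(B) / len(B)              # class means: 3 and 6
sA2 = sum((v - mA) ** 2 for v in A) / (len(A) - 1)     # s^2_A = 2
sB2 = sum((v - mB) ** 2 for v in B) / (len(B) - 1)     # s^2_B = 9
pooled = (sum((v - mA) ** 2 for v in A)
          + sum((v - mB) ** 2 for v in B)) / (len(A) + len(B) - 2)   # 20/3

def posterior_A(x, pi_A, var_A, var_B):
    """P(A | x) with one-dimensional Gaussian class densities."""
    f_A = math.exp(-(x - mA) ** 2 / (2 * var_A)) / math.sqrt(var_A)
    f_B = math.exp(-(x - mB) ** 2 / (2 * var_B)) / math.sqrt(var_B)
    return pi_A * f_A / (pi_A * f_A + (1 - pi_A) * f_B)

p_lda  = posterior_A(5, 0.8, pooled, pooled)   # part (c), pooled variance
p_qda1 = posterior_A(5, 0.5, sA2, sB2)         # part (e), equal priors
p_qda2 = posterior_A(5, 0.8, sA2, sB2)         # part (e), priors 0.8 / 0.2
risk_A = 3 * (1 - p_qda2)                      # loss L(A|B) = 3 L(B|A)
risk_B = 1 * p_qda2
print(round(p_lda, 4), round(1 - p_qda2, 4), risk_A < risk_B)
```

Running it reproduces the posteriors above (P{A | x = 5} ≈ 0.7616 under the pooled model, P{B | x = 5} ≈ 0.2326 under separate variances with priors 0.8/0.2) and confirms that Risk(A) < Risk(B), so x = 5 goes to Class A.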
CPSC 340: Machine Learning and Data Mining Stochastic Gradient Fall 2017 Assignment 3: Admin Check update thread on Piazza for correct definition of trainndx. This could make your cross-validation code
More informationChapter 10 Logistic Regression
Chapter 10 Logistic Regression Data Mining for Business Intelligence Shmueli, Patel & Bruce Galit Shmueli and Peter Bruce 2010 Logistic Regression Extends idea of linear regression to situation where outcome
More informationMS1b Statistical Data Mining Part 3: Supervised Learning Nonparametric Methods
MS1b Statistical Data Mining Part 3: Supervised Learning Nonparametric Methods Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Supervised Learning: Nonparametric
More informationNaive Bayes classification
Naive Bayes classification Christos Dimitrakakis December 4, 2015 1 Introduction One of the most important methods in machine learning and statistics is that of Bayesian inference. This is the most fundamental
More informationIntroduction to Signal Detection and Classification. Phani Chavali
Introduction to Signal Detection and Classification Phani Chavali Outline Detection Problem Performance Measures Receiver Operating Characteristics (ROC) F-Test - Test Linear Discriminant Analysis (LDA)
More informationWeek 5: Logistic Regression & Neural Networks
Week 5: Logistic Regression & Neural Networks Instructor: Sergey Levine 1 Summary: Logistic Regression In the previous lecture, we covered logistic regression. To recap, logistic regression models and
More informationLINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception
LINEAR MODELS FOR CLASSIFICATION Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification,
More informationStatistisches Data Mining (StDM) Woche 6. Read and do the excersises of chapter 4.6.1, 4.6.2, and in ILSR
Statistisches Data Mining (StDM) HS 2017 Woche 6 Aufgabe 1 Lab Read and do the excersises of chapter 4.6.1, 4.6.2, and 4.6.5 in ILSR Aufgabe 2 Logistic Regression (based on an excerice in ISLR) In this
More informationSUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION
SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION 1 Outline Basic terminology Features Training and validation Model selection Error and loss measures Statistical comparison Evaluation measures 2 Terminology
More informationCSE 151 Machine Learning. Instructor: Kamalika Chaudhuri
CSE 151 Machine Learning Instructor: Kamalika Chaudhuri Ensemble Learning How to combine multiple classifiers into a single one Works well if the classifiers are complementary This class: two types of
More informationMethods and Criteria for Model Selection. CS57300 Data Mining Fall Instructor: Bruno Ribeiro
Methods and Criteria for Model Selection CS57300 Data Mining Fall 2016 Instructor: Bruno Ribeiro Goal } Introduce classifier evaluation criteria } Introduce Bias x Variance duality } Model Assessment }
More information1 A Support Vector Machine without Support Vectors
CS/CNS/EE 53 Advanced Topics in Machine Learning Problem Set 1 Handed out: 15 Jan 010 Due: 01 Feb 010 1 A Support Vector Machine without Support Vectors In this question, you ll be implementing an online
More informationANOVA, ANCOVA and MANOVA as sem
ANOVA, ANCOVA and MANOVA as sem Robin Beaumont 2017 Hoyle Chapter 24 Handbook of Structural Equation Modeling (2015 paperback), Examples converted to R and Onyx SEM diagrams. This workbook duplicates some
More information