Computer Assignment 8 - Discriminant Analysis

Created by James D. Wilson, UNC Chapel Hill
Edited by Andrew Nobel and Kelly Bodwin, UNC Chapel Hill

In this assignment, we will investigate another tool for classifying Gaussian mixture data: Fisher's discriminant analysis.

1 Linear Discriminant Analysis

Below, we provide code for Fisher's linear discriminant analysis on a Gaussian dataset as an illustrative example. We will walk you through how to extend this code to the case of quadratic discriminant analysis. All of these calculations follow directly from the theory given in class. Once we have built our own function, we will see how to perform QDA using built-in R commands.

My.Gaussian.LDA = function(training, test){
  # INPUT:
  #   training = n.1 x (p+1) matrix whose first p columns are observed variables
  #              and whose (p+1)st column contains the class labels.
  #   test     = n.2 x p matrix of observed variables with no class labels.
  # OUTPUT:
  #   norm.vec = direction/normal vector associated with the projection of x
  #   offset   = offset of the projection of x onto norm.vec
  #   pred     = predicted class labels for the test set
  # NOTES:
  #   1) We assume that there are only 2 possible class labels.
  #   2) We assume the classes have equal covariance and that the data are Gaussian.
  #   3) The linear discriminant of the data is g(x) = t(x) %*% norm.vec + offset.

  # Extract summary information
  n = dim(training)[1]                  # total number of samples
  p = dim(training)[2] - 1              # number of variables
  labels = unique(training[, p + 1])    # the unique labels
  set.0 = which(training[, p + 1] == labels[1])
  set.1 = which(training[, p + 1] == labels[2])
  data.set.0 = training[set.0, 1:p]     # observed variables for set.0
  data.set.1 = training[set.1, 1:p]     # observed variables for set.1

  # Calculate MLEs
  pi.0.hat = (1/n) * length(set.0)
  pi.1.hat = (1/n) * length(set.1)
  mu.0.hat = colSums(data.set.0) / length(set.0)
  mu.1.hat = colSums(data.set.1) / length(set.1)
  # Pooled MLE of the common covariance: the class scatter matrices summed and divided by n
  sigma.hat = ((t(data.set.0) - mu.0.hat) %*% t(t(data.set.0) - mu.0.hat) +
               (t(data.set.1) - mu.1.hat) %*% t(t(data.set.1) - mu.1.hat)) / n

  # Coefficients of the linear discriminant
  b = log(pi.1.hat/pi.0.hat) - 0.5 * (mu.0.hat + mu.1.hat) %*% solve(sigma.hat) %*% (mu.1.hat - mu.0.hat)
  a = solve(sigma.hat) %*% (mu.1.hat - mu.0.hat)

  # Prediction of the test set
  Pred = rep(0, dim(test)[1])
  g = as.matrix(test) %*% a + as.numeric(b)
  Pred[which(g > 0)] = 1

  # Return the coefficients and predictions
  return(list(norm.vec = a, offset = b, pred = Pred))
}

Notes:

1. By using the above function, we assume that the covariance matrices of both classes of data are the same. Suppose that this were not the case. Write two lines of code that could be used in the above function to calculate the covariance matrices of each class, sigma.0 and sigma.1. (A minimal sketch of one possibility appears after these notes.)

2. Given sigma.0 and sigma.1 from (1), write out two lines of code that would calculate the normal vector and offset in the scenario where the sample covariances are not equal. Note that these results will correspond to quadratic discriminant analysis on Gaussian data.

3. How might you adjust the above function's INPUT arguments to handle the case of unequal covariances? Hint: think about using a logical value for an argument named equal.covariance.matrices.

4. (Optional) Once (3) has been completed, one can use an if() statement to handle the equal and unequal covariance cases separately, by writing if(equal.covariance.matrices == TRUE){ } and if(equal.covariance.matrices == FALSE){ }. Give this a try if you'd like. Once done, you will have written flexible code for linear or quadratic discriminant analysis on Gaussian data.
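As a starting point for note (1), here is a minimal sketch of per-class covariance estimation. It is only one possible approach, and it assumes the data.set.0 and data.set.1 objects already defined inside My.Gaussian.LDA.

# Per-class sample covariance matrices; one possible answer to note (1).
# cov() uses the (n - 1) divisor rather than the MLE divisor n.
sigma.0 = cov(data.set.0)   # p x p covariance of the observations in data.set.0
sigma.1 = cov(data.set.1)   # p x p covariance of the observations in data.set.1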

2 Visualizing Linear Discriminants

Let's now see how our LDA function performs. Load the training and test data from the course website using the following commands:

training = read.table("http://www.unc.edu/%7ejameswd/data/training.txt", header = TRUE)
test = read.table("http://www.unc.edu/%7ejameswd/data/test.txt", header = TRUE)

The training data are simulated from a Gaussian mixture in the following manner (a hypothetical simulation sketch of this generating process is given at the end of this section's instructions):

* Labels are first chosen at random with probability P(Label = 1) = 0.5.
* The first of the two variables (X1) is generated conditionally on the label:
  X1 | Label = 1 ~ N(0, 1)
  X1 | Label = 0 ~ N(3, 1)
* The second of the two variables (X2) is generated as X2 = X1 + N(0, 1).

The test data are also generated as normal random variables with mean either 1 or 3 and standard deviation 1.

Plot the training data to look for any noticeable structure by using the following commands:

# plot the two variables of the training data
plot(training$x.1, training$x.2, col = as.factor(training$labels),
     xlab = "x1", ylab = "x2", main = "Training Data")
# add a legend to the plot
legend("bottomright", c("0", "1"), col = c("black", "red"), pch = c(1, 1))

Keep this plot up so that we can add a curve momentarily. Now, calculate the linear discriminant of these data using your function My.Gaussian.LDA from (1). Do this using the following code:

Results = My.Gaussian.LDA(training, test)

Type Results in your console to review the output of your function. Now, let's add the linear discriminant to your plot (which should still be visible). The decision boundary x^T norm.vec + offset = 0 can be rewritten as x2 = -(offset + norm.vec[1] * x1) / norm.vec[2], which gives the intercept and slope used below. Add the line of the discriminant using the abline() command as in the following:

abline(a = -Results$offset/Results$norm.vec[2],
       b = -Results$norm.vec[1]/Results$norm.vec[2], col = "green")

Now, let's view the test data and see how these points will be classified according to our discriminant rule. Make this plot using the following code:

plot(test$x.1, test$x.2, xlab = "x1", ylab = "x2", main = "Test Data")
# Add the discriminant line
abline(a = -Results$offset/Results$norm.vec[2],
       b = -Results$norm.vec[1]/Results$norm.vec[2], col = "green")
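The following is a hypothetical sketch, not the actual script used to create training.txt, of how data with the structure described above could be simulated in R. The column names x.1, x.2, and labels are assumptions chosen to match the plotting commands, and the sample size of 300 is arbitrary.

# Hypothetical simulation of Gaussian mixture data like the training set described above
set.seed(1)                                    # arbitrary seed, for reproducibility
n.sim = 300                                    # assumed sample size
labels = rbinom(n.sim, size = 1, prob = 0.5)   # P(Label = 1) = 0.5
x.1 = rnorm(n.sim, mean = ifelse(labels == 1, 0, 3), sd = 1)   # X1 conditional on the label
x.2 = x.1 + rnorm(n.sim, mean = 0, sd = 1)                     # X2 = X1 + N(0, 1)
sim.training = data.frame(x.1 = x.1, x.2 = x.2, labels = labels)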

1. Comment on the discriminant and the training data. Are there any misclassifications on these data?

2. Comment on the discriminant and the test data. How many test points are classified as 1? How many are classified as 0? Discuss any potential uncertainty based on your plot.

3 Quadratic Discriminant Analysis

We can perform quadratic discriminant analysis in R simply by using the qda() command. Note that we can also use the lda() command to run linear discriminant analysis. Run quadratic discriminant analysis on these data using the following code:

# Split the training data into variables and labels
training.variables = training[, 1:2]
training.labels = training[, 3]
# Run QDA
library(MASS)
quad.disc = qda(training.variables, grouping = training.labels)

Now, the quad.disc object contains summary information about the training data and can be used to predict the class of new data with the predict() command. Predict the values of the test and training data using the following commands:

# Predict the test set
Prediction.test = predict(quad.disc, test)$class
# Predict the training set
Prediction.training = predict(quad.disc, training.variables)$class

1. Are there any misclassifications in the predictions for the training variables? (A sketch of one way to tabulate this appears after these questions.)

2. Are there any differences between the predictions for the test variables under the quadratic discriminant rule here and the linear discriminant rule in Question (2)? Based on the plot in Question (2), which point do you think was classified differently?
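For questions (1) and (2) above, one convenient way, though certainly not the only one, to compare predictions is a cross-tabulation. The sketch below assumes the training.labels, Prediction.training, Prediction.test, and Results objects created earlier; recall that My.Gaussian.LDA codes its predictions as 0/1 in the order the labels appear in the training data, so the 0/1 coding of Results$pred may be flipped relative to the original labels.

# Cross-tabulate true vs. QDA-predicted labels on the training data
table(truth = training.labels, qda = Prediction.training)

# Compare the QDA test-set predictions with the LDA predictions from Question (2).
# Results$pred uses the 0/1 coding assigned inside My.Gaussian.LDA.
table(lda = Results$pred, qda = Prediction.test)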

4 Discriminant Analysis Application

Let's try quadratic discriminant analysis on the iris dataset. We will see how well we can distinguish the setosa and virginica species based on the measured variables. First pre-process the data using the following code:

# Load the data
data(iris)
# Keep only the setosa and virginica species
iris.sample = iris[which(iris$Species == "setosa" | iris$Species == "virginica"), ]
# Keep a random sample of 50 of these for training
rand.sample = sample(1:100, 50, replace = FALSE)
training = iris.sample[rand.sample, ]

# Separate the training set into labels and variables
training.labels = training$Species
training.variables = training[, 1:4]

# Build the test set from the remaining observations
test.sample = setdiff(1:100, rand.sample)
test = iris.sample[test.sample, ]
# Split the test set into variables and labels
test.labels = test$Species
test.variables = test[, 1:4]

Now, run quadratic discriminant analysis on the training set that you just created. Then, predict the labels of test.variables. Use the following code:

quad.disc = qda(training.variables, grouping = as.numeric(training.labels))
test.predict = predict(quad.disc, test.variables)$class

1. Comment on the results of QDA on the iris dataset. Compare test.labels with test.predict. How many misclassifications were there on the test set? (One way to tabulate this comparison is sketched after these questions.)

2. Would the My.Gaussian.LDA function that you created in Question (1) be appropriate for this dataset? Why or why not? If not, how could you adjust the function to be applicable to this dataset?
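As in Question (3), a cross-table is one convenient way to make the comparison asked for in question (1). The conversion below is needed because qda() was fit with numeric group labels: under as.numeric() of the Species factor, setosa becomes 1 and virginica becomes 3. Because the 50/50 split above uses sample() without a fixed seed, your counts may differ from run to run; calling set.seed() before sample() would make the split reproducible.

# Cross-tabulate the true species against the QDA predictions on the test set
table(truth = test.labels, predicted = test.predict)
# Count the misclassifications: both sides are converted to the numeric codes 1 (setosa) and 3 (virginica)
sum(as.numeric(test.labels) != as.numeric(as.character(test.predict)))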