
1 SF2935: MODERN METHODS OF STATISTICAL LEARNING LECTURE 3 SUPERVISED CLASSIFICATION, LINEAR DISCRIMINANT ANALYSIS Tatjana Pavlenko 5 November 2015

2 SUPERVISED LEARNING (REP.) Starting point: we have an outcome measurement Y, quantitative (such as a stock price or blood pressure) or categorical (such as heart attack/no heart attack), that we wish to predict based on a set of features (also called inputs, regressors, covariates, independent variables), X = (X_1, ..., X_p) (such as diet or clinical measurements). We have a training set of data, {(x_1, y_1), ..., (x_n, y_n)}, for a set of n objects (such as patients). On the basis of the training data we would like to build a prediction model/rule, or learner, which will enable us to predict the outcome for new unseen objects, understand which input variables affect the outcome and how, and assess the accuracy of our predictions and inferences. The scenario above is called supervised because of the presence of the outcome variable to guide the learning process.

3 SUPERVISED VS UNSUPERVISED LEARNING (CONT.) In the unsupervised learning problem: there is no outcome variable, just a set of predictors (features) measured on a set of samples; the objective is more fuzzy: find groups of samples that behave similarly, find features that behave similarly, find linear combinations of features with the most variation; it is difficult to know how well we are performing; unsupervised learning is different from supervised learning, but can be useful as a pre-processing step for supervised learning.

4 CLASSIFICATION PROBLEMS EXAMPLES Use information on sex, age, income, education level, marital status, debts, etc. to classify a potential borrower as eligible or ineligible for a bank loan. Use measurements on blood proteins and family history to classify women as carriers or non-carriers of a genetic disorder. Predict whether a patient, hospitalized due to a heart attack, will have a second heart attack; the prediction is to be based on demographic, diet and clinical measurements for that patient. Classify a tissue sample into one of several cancer classes, based on a gene expression profile (see the fragment of the data in the figure on the next slide).

5 [Figure: fragment of a gene expression data set, referenced on the previous slide.]

6 GENERAL CLASSIFICATION SET-UP, NOTATIONS AND OBJECTIVES Given is an observed feature vector X = (X_1, ..., X_p) and a qualitative (response) variable Y (outcome measurement, target) that takes values in an unordered set C of several predefined categories (classes, populations). The classification task is to build a function C(X) that takes as input the feature vector X and predicts a value for Y, i.e., C(X) ∈ C. More often, we focus on estimating the probabilities that X belongs to each category in the set C. To construct C(X), we have training data (x_1, y_1), ..., (x_n, y_n), where each observed x_i is accompanied by its known class membership (supervised framework).

7 DISCRIMINANT ANALYSIS/CLASSIFICATION The goal is to find a discrimination rule for classification which, in general, classifies observations correctly, minimizes the probability of misclassification, and minimizes the expected cost of misclassification. If the population distributions are known, we can use this knowledge to obtain the discriminant function. Otherwise we have to use training data to derive an accurate discrimination rule.

8 CLASSIFICATION: TWO POPULATIONS Let Π_i, i = 1, 2, denote two populations and π_i be the prior probability that a randomly chosen observation comes from the i-th population: π_1 = Pr(Π_1), π_2 = Pr(Π_2) = 1 − π_1. Given is the observed X = (X_1, ..., X_p). We model the distribution of X in each of the Π_i's separately: let f_i(x) denote the probability density function (pdf) of X corresponding to Π_i. Partition the total sample space so that Ω = R_1 ∪ R_2 and R_1 ∩ R_2 = ∅, where R_i = {x : x is assigned to Π_i} represents the region of observed X corresponding to Π_i. Define the misclassification probability p(2|1) = Pr(assign X to Π_2 | actually X ∈ Π_1), and similarly p(1|2).

9 DISCRIMINATION RULES [Figure: training data for two 2-dimensional populations with linear (left) and quadratic (right) decision boundaries; discussed on the next slide.]

10 CLASSIFICATION: TWO POPULATIONS (CONT.) The figure above shows an example of a classification problem for two 2-dimensional populations (with two random variables, p = 2); the training data (with class labels) are shown in the scatter plots. The red-dotted lines show the linear (left) and quadratic (right) decision boundaries that are used to define the decision regions R_1 and R_2. New observations will be assigned to population Π_1 or Π_2 depending on which decision region they fall into. We can already anticipate that our discrimination rule for a new (unseen) observation will not be perfect: some percentage of samples will likely be misclassified. With random X, we can, at best, reduce the probabilities of misclassification.

11 TWO POPULATIONS: MISCLASSIFICATION PROB. Theoretically, the misclassification probabilities can be computed as p(2|1) = Pr(X is in Π_1 but is assigned to Π_2) = ∫_{R_2} f_1(x) dx, where R_2 = Ω − R_1, and analogously p(1|2) = Pr(X ∈ R_1 | Π_2) = ∫_{R_1} f_2(x) dx. Notation: ∫_{R_i} f(x_1, ..., x_p) dx_1 ⋯ dx_p is a p-fold integral. For p = 1, see the figure on the board! The more separated f_1(x) and f_2(x) are, the smaller the misclassification probabilities p(2|1) and p(1|2). Goal: we need to fix an optimality criterion to construct the discrimination rule!
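For p = 1 these integrals are ordinary normal tail areas. A minimal sketch (not from the lecture; the two normal populations and the cut-off t below are illustrative assumptions):

```python
# Misclassification probabilities for p = 1 under the cut-off rule
# R_1 = {x : x < t}, R_2 = {x : x >= t}; all parameter values are made up.
from scipy.stats import norm

mu1, sd1 = 0.0, 1.0   # Pi_1 ~ N(mu1, sd1^2)
mu2, sd2 = 2.0, 1.0   # Pi_2 ~ N(mu2, sd2^2)
t = 1.0               # decision boundary between R_1 and R_2

p_2_given_1 = 1 - norm.cdf(t, mu1, sd1)  # mass of f_1 that falls in R_2
p_1_given_2 = norm.cdf(t, mu2, sd2)      # mass of f_2 that falls in R_1
print(p_2_given_1, p_1_given_2)          # both shrink as f_1, f_2 separate
```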

12 TWO POPULATIONS: OPTIMALITY CRITERIA Decision theory strategy. Decision theory is concerned with finding optimal decisions given certain information. Both estimation and hypothesis testing can be viewed as techniques of decision theory, as can classification. A) What if we know in advance (a priori) that, say, 80% of the observations come from Π_1 and the remaining 20% from Π_2? How can we use this information to optimize the classifier? B) What if one kind of misclassification costs more than the other? How can we use this information when designing the classifier? C) What if we wish to minimize the total probability of misclassification? How does this affect the structure of the classifier?

13 TWO POPULATIONS: OPTIMALITY CRITERIA (CONT.) A) Maximize the posterior probability that the observation X belongs to the i-th population. This is the Bayesian approach (see the corresponding section in ISL), which results in the so-called Bayes classifier: assign a test observation x_0 to the population with the largest posterior probability, given the feature values for that observation. Let p(Π_i | x) denote the posterior probability that an observation X = x belongs to Π_i. We compute the probabilities of the populations Π_1 and Π_2 after observing x (hence the name posterior probabilities); the prior probabilities satisfy π_1 + π_2 = 1. By Bayes' theorem (details on the board), p(Π_1 | x) = π_1 f_1(x) / (π_1 f_1(x) + π_2 f_2(x)), p(Π_2 | x) = 1 − p(Π_1 | x). The discriminant rule that maximizes the class posterior probability is: assign x to Π_1 if p(Π_1 | x) > p(Π_2 | x), otherwise to Π_2.
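As a sketch of how this posterior rule can be evaluated numerically (the Gaussian densities and all parameter values below are illustrative assumptions, not part of the lecture):

```python
# Bayes classifier for two univariate normal populations:
# compute p(Pi_1 | x) via Bayes' theorem and pick the larger posterior.
from scipy.stats import norm

pi1, pi2 = 0.8, 0.2              # prior probabilities, pi1 + pi2 = 1
f1 = norm(0.0, 1.0).pdf          # density f_1 of Pi_1
f2 = norm(2.0, 1.0).pdf          # density f_2 of Pi_2

def posterior_pi1(x):
    num = pi1 * f1(x)
    return num / (num + pi2 * f2(x))

x = 1.3
p1 = posterior_pi1(x)
# p(Pi_1|x) > p(Pi_2|x) is equivalent to p(Pi_1|x) > 1/2
print(p1, 1 if p1 > 0.5 else 2)
```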

14 OPTIMALITY CRITERIA (CONT.) What if the costs of the different misclassifications differ? For example, the cost of classifying a patient with a dangerous or deadly disease as healthy is higher than the cost of classifying a healthy patient as having a dangerous disease. B) Minimize the expected cost of misclassification (ECM). Let c(1|2) and c(2|1) be the costs associated with p(1|2) and p(2|1), and let c(1|1) = c(2|2) = 0 (no cost for correct decisions). The expected cost of misclassification is ECM = c(1|2) p(1|2) π_2 + c(2|1) p(2|1) π_1. The discriminant rule that minimizes the ECM is (see proof on the board): assign x to Π_1 if f_1(x)/f_2(x) > (π_2 c(1|2)) / (π_1 c(2|1)), else to Π_2.

15 OPTIMALITY CRITERIA (CONT.) C) Minimize the total probability of misclassification (TPM): TPM = p(2|1) π_1 + p(1|2) π_2 = π_1 ∫_{R_2} f_1(x) dx + π_2 ∫_{R_1} f_2(x) dx. This leads (the proof is similar to the ECM case) to the discriminant rule: assign x to Π_1 if f_1(x)/f_2(x) > π_2/π_1, else to Π_2. Special cases of the ECM rule: if π_1 = π_2, the rule becomes f_1(x)/f_2(x) > c(1|2)/c(2|1). If c(1|2) = c(2|1), it becomes f_1(x)/f_2(x) > π_2/π_1, the same as the TPM and Bayes classifiers. If c(1|2) = c(2|1) and π_1 = π_2, it becomes f_1(x)/f_2(x) > 1: the likelihood ratio classification rule.
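All three criteria reduce to thresholding the likelihood ratio f_1(x)/f_2(x); only the threshold changes. A hedged sketch (the costs, priors and densities are illustrative assumptions):

```python
# Likelihood-ratio rule: assign x to Pi_1 iff
#   f1(x) / f2(x) > (pi2 * c(1|2)) / (pi1 * c(2|1)).
from scipy.stats import norm

f1, f2 = norm(0.0, 1.0).pdf, norm(2.0, 1.0).pdf
pi1, pi2 = 0.8, 0.2
c12, c21 = 5.0, 1.0   # c(1|2): cost of assigning a Pi_2 member to Pi_1

def classify(x):
    threshold = (pi2 * c12) / (pi1 * c21)      # ECM threshold
    return 1 if f1(x) / f2(x) > threshold else 2

# With c12 = c21 the threshold is pi2/pi1 (TPM/Bayes rule); with equal
# priors as well it becomes 1 (pure likelihood ratio rule).
print(classify(0.5), classify(1.9))
```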

16 TWO MULTIVARIATE NORMAL POPULATIONS: LDA AND QDA When we use normal (Gaussian) distributions for each population Π_i, this leads to linear or quadratic discriminant analysis, LDA or QDA. However, the approach is quite general, and other distributions can be used as well. We will focus on normal distributions. Assume that f_i(x) is N_p(µ_i, Σ_i) corresponding to Π_i, i = 1, 2; µ_i is a class-specific mean vector and Σ_i is the class covariance matrix. For X ∼ N_p(µ, Σ) we have E(X) = µ, Cov(X) = Σ, and the density is f(x) = (2π)^{-p/2} |Σ|^{-1/2} exp(−(1/2)(x − µ)′ Σ^{-1} (x − µ)).

17 TWO BIVARIATE NORMAL DENSITY FUNCTIONS. FIGURE: Two two-dimensional (p = 2) density functions. Left: X_1 and X_2 are uncorrelated. Right: Corr(X_1, X_2) = 0.7.

18 TWO MULTIVARIATE NORMAL POPULATIONS: LDA Assume that f_i(x) is N_p(µ_i, Σ) corresponding to Π_i, where Σ is the covariance matrix common to both populations. Then log(f_1(x)/f_2(x)) = (µ_1 − µ_2)′ Σ^{-1} x − (1/2)(µ_1 − µ_2)′ Σ^{-1} (µ_1 + µ_2) = D(x), say. The discriminant rule that minimizes the ECM is: assign x to Π_1 if D(x) > log(π_2 c(1|2) / (π_1 c(2|1))), else to Π_2. Proof on the board. For Σ_1 = Σ_2, D(x) is linear in x, which is the reason for the name linear discriminant function, or linear discriminant analysis (LDA). Σ_1 ≠ Σ_2 results in a quadratic discriminant rule, QDA, which will be discussed more later.
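In matrix form the population rule takes a few lines; a sketch with made-up µ_i and Σ (equal priors and costs, so the threshold is log(1) = 0):

```python
# Population LDA: D(x) = (mu1-mu2)' Sigma^{-1} x
#                        - (1/2)(mu1-mu2)' Sigma^{-1} (mu1+mu2)
import numpy as np

mu1 = np.array([1.0, 1.0])
mu2 = np.array([-1.0, 0.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])        # common covariance matrix

w = np.linalg.solve(Sigma, mu1 - mu2) # Sigma^{-1} (mu1 - mu2)

def D(x):
    return w @ x - 0.5 * w @ (mu1 + mu2)

x = np.array([0.2, 0.5])
print(1 if D(x) > 0 else 2)           # assign to Pi_1 if D(x) > 0
```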

19 LINEAR DISCRIMINANT FUNCTIONS.

20 TWO MULTIVARIATE NORMAL POPULATIONS: LDA In practice, the population parameters µ_i and Σ are unknown: we estimate D(x) from the data! Estimation technique: plug the estimated µ_i and Σ into D(x). This gives a sample discriminant rule. Given the data X_i : n_i × p from Π_i, calculate x̄_i and S_i (unbiased). Since Σ_1 = Σ_2, use S_pooled = ((n_1 − 1) S_1 + (n_2 − 1) S_2) / (n_1 + n_2 − 2) and obtain D̂(x) = (x̄_1 − x̄_2)′ S_pooled^{-1} x − (1/2)(x̄_1 − x̄_2)′ S_pooled^{-1} (x̄_1 + x̄_2). The sample ECM rule is: assign x to Π_1 if D̂(x) > log(π_2 c(1|2) / (π_1 c(2|1))), else to Π_2.
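A sketch of the plug-in estimation; simulated data stand in for a real training set:

```python
# Sample LDA: pooled covariance estimate and plug-in discriminant D_hat.
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(size=(30, 2)) + np.array([1.0, 1.0])  # n1 x p from Pi_1
X2 = rng.normal(size=(22, 2))                         # n2 x p from Pi_2
n1, n2 = len(X1), len(X2)

xbar1, xbar2 = X1.mean(axis=0), X2.mean(axis=0)
S1 = np.cov(X1, rowvar=False)                 # unbiased sample covariances
S2 = np.cov(X2, rowvar=False)
S_pooled = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

d = np.linalg.solve(S_pooled, xbar1 - xbar2)  # S_pooled^{-1}(xbar1 - xbar2)

def D_hat(x):
    return d @ x - 0.5 * d @ (xbar1 + xbar2)

x = np.array([0.8, 0.9])
print(1 if D_hat(x) > 0 else 2)               # threshold 0: equal priors/costs
```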

21 SOME REMARKS ON LDA Estimation of the π_i's: usually π̂_i = n_i/(n_1 + n_2) is assumed; otherwise, use a Bayes prior, a guess, etc. With D̂(x) there is no assurance that the resulting rule will minimize the ECM in a particular application, because the optimal rule was derived assuming the population densities f_i(x) were known completely. But D̂(x) is expected to perform well if the sample sizes n_i are large. Denote d̂′ = (x̄_1 − x̄_2)′ S_pooled^{-1} and let ŷ(x) = d̂′ x and ȳ_i = d̂′ x̄_i, i = 1, 2. When π_2 c(1|2) / (π_1 c(2|1)) = 1, the discriminant rule becomes: assign x to Π_1 if ŷ(x) > (1/2)(ȳ_1 + ȳ_2). As ŷ(x) and the ȳ_i's are linear combinations, the multivariate expressions convert to univariate ones.

22 LDA. EXAMPLE The example is adapted from the AMSA book and is concerned with the detection of hemophilia A carriers. The goal is to construct a procedure for classifying patients as hemophilia A carriers or not. Measurements on two variables were conducted for n_1 = 30 women from the non-carrier population Π_1 and n_2 = 22 from the carrier population Π_2: X_1 = log_10(AHF activity), X_2 = log_10(AHF-like antigen). Data: X is approximately bivariate normal on the log-transformed scale. Assume Σ_1 = Σ_2. To construct a sample-based LD function, the following is provided: x̄_1 = (−0.0065, −0.0390)′, x̄_2 = (−0.2483, 0.0262)′, S_pooled^{-1} = (131.158, −90.423; −90.423, 108.147).

23 LDA. EXAMPLE (CONT.) For the sample-based LD function we have ŷ(x) = (x̄_1 − x̄_2)′ S_pooled^{-1} x = 37.61 x_1 − 28.92 x_2, i.e., d̂′ = (37.61, −28.92). Then ȳ_1 = (x̄_1 − x̄_2)′ S_pooled^{-1} x̄_1 = 0.88 and ȳ_2 = (x̄_1 − x̄_2)′ S_pooled^{-1} x̄_2 = −10.10. The midpoint between these two means is (1/2)(ȳ_1 + ȳ_2) = −4.61. Recall that the discriminant rule, when π_2 c(1|2) / (π_1 c(2|1)) = 1, is ŷ(x) > (1/2)(ȳ_1 + ȳ_2). Hence assign x to Π_1 (normal) if ŷ(x) > (1/2)(ȳ_1 + ȳ_2) = −4.61, else to Π_2 (carrier).

24 LDA. EXAMPLE (CONT.) The discriminant rule is: assign x to the normal group Π_1 if ŷ(x) > (1/2)(ȳ_1 + ȳ_2) = −4.61, else to the carrier group Π_2. Q: Measurements of AHF activity and AHF-like antigen on a woman who might be a hemophilia A carrier give x_1 = −0.210, x_2 = −0.044. Should this woman be classified as Π_1 (normal) or Π_2 (carrier)? Given the new observation x_0 = (−0.210, −0.044)′ we obtain ŷ(x_0) = −6.62 < (1/2)(ȳ_1 + ȳ_2) = −4.61. Hence we assign her to the carrier group, i.e., assign the new observation x_0 to Π_2. This classifier assumes equal costs and equal priors! R estimates the prior probabilities from the training data by default if none are specified.
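The slide's arithmetic can be checked directly; a sketch that reuses the estimates printed above (d̂ and the sample means as reconstructed on the previous slides):

```python
# Reproducing the hemophilia A example with the slide's estimates.
import numpy as np

d = np.array([37.61, -28.92])        # d_hat' = (xbar1-xbar2)' S_pooled^{-1}
xbar1 = np.array([-0.0065, -0.0390]) # non-carriers, Pi_1
xbar2 = np.array([-0.2483, 0.0262])  # carriers, Pi_2

ybar1, ybar2 = d @ xbar1, d @ xbar2  # approx 0.88 and -10.10
midpoint = 0.5 * (ybar1 + ybar2)     # approx -4.61

x0 = np.array([-0.210, -0.044])      # the new observation
y0 = d @ x0                          # approx -6.62
print(1 if y0 > midpoint else 2)     # 2: assign x0 to the carrier group
```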

25 SOME REMARKS ON LDA (CONT.) Consider the rule (with π_2 c(1|2) / (π_1 c(2|1)) = 1) again. We have D(x) = log(f_1(x)/f_2(x)) = (µ_1 − µ_2)′ Σ^{-1} x − (1/2)(µ_1 − µ_2)′ Σ^{-1} (µ_1 + µ_2). Rule: assign x to Π_1 if D(x) > 0, otherwise to Π_2. The rule is called the linear discriminant function (LDF). The Bayes decision boundary {x : D(x) = 0} is a hyperplane (of dimension p − 1) dividing the two classes. See ISL, p. 144: the Bayes decision boundary represents the set of values x for which δ_1(x) = δ_2(x), where δ_i(x) = x′ Σ^{-1} µ_i − (1/2) µ_i′ Σ^{-1} µ_i, i = 1, 2 (the term log(π_i) disappears since it is the same for Π_1 and Π_2).

26 EVALUATION OF PERFORMANCE ACCURACY The optimality criteria are based on probabilities of misclassification: the smaller they are, the better the classifier performs. Recall TPM = π_1 ∫_{R_2} f_1(x) dx + π_2 ∫_{R_1} f_2(x) dx, where min(TPM) = the optimum misclassification rate, OMR. For completely known distributions of the Π_i's, the OMR can be computed exactly. For example, for LDA with the Π_i's defined by N_p(µ_i, Σ) we have OMR = min(TPM) = Φ(−Δ/2), where Δ² = (µ_1 − µ_2)′ Σ^{-1} (µ_1 − µ_2), Δ being the Mahalanobis distance between Π_1 and Π_2, and Φ(·) is the cdf of N(0, 1).
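A sketch of the exact OMR computation for known parameters (the µ_i and Σ below are illustrative assumptions):

```python
# Optimum misclassification rate for two N_p(mu_i, Sigma) populations:
# OMR = Phi(-Delta/2), with Delta^2 = (mu1-mu2)' Sigma^{-1} (mu1-mu2).
import numpy as np
from scipy.stats import norm

mu1 = np.array([1.0, 1.0])
mu2 = np.array([-1.0, 0.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])

diff = mu1 - mu2
Delta = np.sqrt(diff @ np.linalg.solve(Sigma, diff))  # Mahalanobis distance
print(norm.cdf(-Delta / 2))  # larger Delta -> smaller optimum error rate
```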

27 MULTI-SAMPLE LDA FIGURE: Three Gaussian classes with p = 2. Left: ellipses represent regions containing 95% of the probability for each of the three classes; the dashed lines are the Bayes decision boundaries. Right: 20 observations were generated from each class, and the corresponding LDA decision boundaries are shown as solid lines along with the Bayes boundaries (dashed lines). Overall, the sample-based LDA boundaries are close to the Bayes decision boundaries.
