On misclassification probabilities of linear and quadratic classifiers


Afrika Statistika, Vol. 11(1), 2016, pages 943-953.

Olusola Samuel Makinde
Department of Statistics, Federal University of Technology, Akure, Nigeria
Corresponding author: osmakinde@futa.edu.ng

Received March 17, 2016; Accepted May 20, 2016.
Copyright © 2016, Afrika Statistika and Statistics and Probability African Society (SPAS). All rights reserved.

Abstract. We study the theoretical misclassification probabilities of linear and quadratic classifiers and examine the performance of these classifiers under distributional variations, both in theory and by simulation. We derive expressions for the Bayes error for some competing distributions from the same family under location shift.

Résumé. We study the theoretical misclassification probabilities of linear and quadratic classification methods. We then examine the performance of these classification methods under different distributions, theoretically, supported by simulation studies. We express the Bayes error for competing distributions from the same family under a shift of the location parameter.

Key words: Misclassification probability; linear classifier; quadratic classifier; Bayes risk.
AMS 2010 Mathematics Subject Classification: 62H30; 60E.

1. Introduction

Classification is aimed at extracting maximum information about the separability of, or distinction among, classes or populations, and then assigning each observation to one of these populations on the basis of a vector of measurements or features. It has many important applications in different fields, such as disease diagnosis in the medical sciences, risk identification in finance, and admission of prospective students into a university based on a battery of tests, among others. Anderson (1984) described the classification problem as a problem of statistical decision making. A good classification procedure is one that classifies observations from unknown populations correctly. Suppose the competing populations have well-defined distributions which are characterised by some location and scale parameters. Classification of observations to any of these populations can be viewed, through this characterisation, in terms of shifts in the location

and scale of each of the population distributions. Competing populations may exhibit a location shift, a scale shift, or both.

Consider populations $\pi_j$, $j = 1, 2, \ldots, J$, from multivariate distributions $F_j$ having probability density functions $f_j$ and prior probabilities $p_j$. The Bayes rule, proposed in Welch (1939), is to assign each observation $x$ to the population $\pi_j$ whose posterior probability $P(\pi_j \mid x)$ is highest. In a two-class problem, it assigns $x$ to population $\pi_1$ if $f_1(x)p_1 / (f_2(x)p_2) > 1$ and to $\pi_2$ otherwise. Wald (1944) argued that if there is a cost $C(i \mid j)$ associated with misclassifying $x$, whose true population is $\pi_j$, into $\pi_i$, then observations should be assigned to the class or population with the smallest expected cost of misclassification. Welch (1939) showed that for any two normally distributed populations, the ratio of the log-likelihood functions of the two populations is the theoretical basis for building the discriminant function that best classifies new individuals to either of the two populations, given that the prior probabilities of the populations are known. For two competing populations whose principal difference is in location, Fisher (1936) described the separation between these two populations as the ratio of the variance between the populations to the variance within the populations. This postulation leads to the discriminant analysis called Fisher's discriminant analysis. Suppose there are two populations from the same family of multivariate distributions to which observations can be classified. If these populations are normally distributed and have the same covariance matrix, the discriminant analysis is referred to as linear discriminant analysis (LDA). Similarly, if these populations are normally distributed but have different covariance matrices, the optimal rule is nonlinear and is referred to as quadratic discriminant analysis (QDA). Welch (1939) and Wald (1944) showed that the linear discriminant function has optimal properties for two-group classification if the populations are multivariate normally distributed. Krzanowski (1977) reviewed the performance of Fisher's linear discriminant function when the underlying assumptions are violated. Many classification methods, both parametric and nonparametric, have been compared with LDA and QDA under normality and non-normality; see Ghosh and Chaudhuri (2005), Kim et al. (2011) and Li et al. (2012), among others.

One way of evaluating the performance of a classification rule is to calculate its misclassification probabilities. One can define the total probability of misclassification as
$$\Delta = p_1 \int_{R_2} f_1(x)\,dx + p_2 \int_{R_1} f_2(x)\,dx,$$
where
$$R_1 = \Big\{ x \in \mathbb{R}^d : \frac{f_1(x)}{f_2(x)} \ge \frac{C(1 \mid 2)\,p_2}{C(2 \mid 1)\,p_1} \Big\} \quad\text{and}\quad R_2 = \Big\{ x \in \mathbb{R}^d : \frac{f_1(x)}{f_2(x)} < \frac{C(1 \mid 2)\,p_2}{C(2 \mid 1)\,p_1} \Big\}.$$
The classification regions $R_1$ and $R_2$ can be constructed only when the distributions $F$ and $G$ are fully known (Makinde and Chakraborty, 2015). This will rarely be the case, so we have to work with the empirical versions of the classification regions and calculate the resulting error rates. In this paper, we study the misclassification probabilities of linear and quadratic classifiers, with emphasis on multivariate normal distributions, and deduce mathematical expressions for the Bayes error for some multivariate distributions under suitable conditions.
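As a complement to this definition, the following minimal Python sketch estimates $\Delta$ by Monte Carlo for two fully specified bivariate normal densities. All numerical values (means, priors, costs, sample size) are illustrative choices for the sketch, not quantities taken from the paper.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Monte Carlo sketch of the total misclassification probability
#   Delta = p1 * P(x in R2 | pi_1) + p2 * P(x in R1 | pi_2),
# with R1 = {x : f1(x)/f2(x) >= C(1|2) p2 / (C(2|1) p1)}.
rng = np.random.default_rng(0)
p1, p2 = 0.5, 0.5          # prior probabilities (illustrative)
c12, c21 = 1.0, 1.0        # misclassification costs C(1|2), C(2|1)
f1 = multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2))
f2 = multivariate_normal(mean=[1.0, 0.0], cov=np.eye(2))
threshold = (c12 * p2) / (c21 * p1)

def in_R1(x):
    # x belongs to R1 when the density ratio exceeds the cost/prior threshold
    return f1.pdf(x) / f2.pdf(x) >= threshold

n = 200_000
x1 = f1.rvs(size=n, random_state=rng)   # draws from pi_1
x2 = f2.rvs(size=n, random_state=rng)   # draws from pi_2
delta = p1 * np.mean(~in_R1(x1)) + p2 * np.mean(in_R1(x2))
print(delta)
```

For this configuration the estimate should be close to $\Phi(-1/2) \approx 0.3085$, the LDA value derived in Section 2.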

2. Classification Rules Based on Normality

We consider $J$ populations having density functions of the form
$$f_j(x) = g\big((x - \mu_j)'\Sigma_j^{-1}(x - \mu_j)\big), \quad x \in \mathbb{R}^d,\ j = 1, \ldots, J,$$
for some strictly decreasing, continuous, non-negative scalar function $g$, where $\mu_j$ and $\Sigma_j$ are the mean vector and covariance matrix of the $j$th population respectively. Assuming normality, equal prior probabilities and equal costs of misclassification for the $J$ populations, the Bayes rule can be defined as: assign $x$ to $\pi_k$ if
$$D_k(x, \mu_k, \Sigma_k) = \min_{1 \le j \le J} D_j(x, \mu_j, \Sigma_j), \tag{1}$$
where $D_j(x, \mu_j, \Sigma_j) = (x - \mu_j)'\Sigma_j^{-1}(x - \mu_j) + \log|\Sigma_j|$. This classification rule is known as quadratic discriminant analysis (QDA). It is linear if $\Sigma_1 = \Sigma_2 = \cdots = \Sigma_J$. Define $R_k$, the region of classification into the $k$th population, as
$$R_k = \{X : D_k(X, \mu_k, \Sigma_k) = \min_{1 \le j \le J} D_j(X, \mu_j, \Sigma_j)\}. \tag{2}$$
Then the probability of misclassification of the optimal rule in (1) is
$$\Delta = \sum_{j=1}^{J} p_j\, P(x \notin R_j \mid x \in \pi_j).$$
A good classification method is one that minimises $\Delta$.

Let us consider two classes for simplicity. Suppose $\pi_i \sim N(\mu_i, \Sigma_i)$, $i = 1, 2$. For $\Sigma_1 = \Sigma_2 = \Sigma$, the classification rule in (1) can be expressed as: assign $x$ to $\pi_1$ if
$$(\mu_1 - \mu_2)'\Sigma^{-1}\Big(x - \frac{\mu_1 + \mu_2}{2}\Big) > 0. \tag{3}$$
Observe that if $\Sigma$ is a constant multiple of the identity matrix, (3) is a generalisation of the component-wise centroid classifier (see Hall et al., 2009). The probability of misclassifying $x$ into either $\pi_1$ or $\pi_2$ is then
$$\Delta = \Phi(-c_0/2), \quad\text{where } c_0^2 = (\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)$$
and $\Phi$ is the cumulative distribution function of the standard normal distribution. See Johnson and Wichern (2007) for further discussion.

To illustrate the probability of misclassification, let $\pi_1$ and $\pi_2$ be two $d$-variate normal populations with mean vectors and covariance matrices $(\mu_1, \Sigma_1)$ and $(\mu_2, \Sigma_2)$ respectively. Assume that the prior probabilities of $\pi_1$ and $\pi_2$ are equal. Consider $\mu_1 = (0, 0)'$, $\mu_2 = (\delta, 0)'$ and $\Sigma_1 = \Sigma_2 = I$. The total probability of misclassification associated with LDA is a function of the non-centrality parameter $\delta$ and is obtained as $\Delta = \Phi(-\delta/2)$. When the covariance matrix of one population is a scalar multiple of that of the other, the results in Theorem 1 below hold.
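The closed-form LDA error is straightforward to evaluate; a minimal sketch, with illustrative parameter values:

```python
import numpy as np
from scipy.stats import norm

# Sketch of the closed-form LDA error Delta = Phi(-c0/2), with
# c0^2 = (mu1 - mu2)' Sigma^{-1} (mu1 - mu2), equal priors, common Sigma.
mu1 = np.array([0.0, 0.0])
mu2 = np.array([1.5, 0.0])          # delta = 1.5, an illustrative value
Sigma = np.eye(2)
diff = mu1 - mu2
c0 = np.sqrt(diff @ np.linalg.solve(Sigma, diff))
print(norm.cdf(-c0 / 2))            # equals Phi(-delta/2) when Sigma = I
```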

Theorem 1. Let $F$ and $G$ be two competing distributions with prior probabilities $p_1$ and $p_2$ respectively. Suppose $F \sim N(\mu_1, \Sigma_1)$ and $G \sim N(\mu_2, \Sigma_2)$. Take $\mu_1 = \mu_2 = (0, 0)'$, $\Sigma_1 = I$, $\Sigma_2 = \sigma^2 I$ for $\sigma^2 \ne 1$ and $p_1 = p_2 = 0.5$. Then:

1. For $\sigma > 1$,
$$\Delta = \frac{1}{2}\Big[1 - F\Big(\frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2\Big) + F\Big(\frac{2}{\sigma^2 - 1}\log_e \sigma^2\Big)\Big],$$
where $F(\cdot)$ denotes the distribution function of the central chi-square distribution with 2 degrees of freedom.

2. For $0 < \sigma < 1$,
$$\Delta = \frac{1}{2}\Big[1 + F\Big(\frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2\Big) - F\Big(\frac{2}{\sigma^2 - 1}\log_e \sigma^2\Big)\Big].$$

Proof of Theorem 1. For $x \in \mathbb{R}^d$, if $x \sim N(\mu_1, \Sigma_1)$ then $x'x \sim \chi^2_d$, and if $x \sim N(\mu_2, \Sigma_2)$ then $x'x \sim \sigma^2\chi^2_d$. Now $f_1(x)/f_2(x) \ge 1$ implies
$$(x - \mu_1)'\Sigma_1^{-1}(x - \mu_1) - (x - \mu_2)'\Sigma_2^{-1}(x - \mu_2) \le \log_e|\Sigma_2| - \log_e|\Sigma_1|.$$
This gives $\frac{\sigma^2 - 1}{\sigma^2}\, x'x \le 2\log_e \sigma^2$. For $\sigma > 0$, we consider two cases: $\sigma > 1$ and $\sigma < 1$.

1. When $\sigma > 1$, the regions of classification are
$$R_1 : x'x \le \frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2 \quad\text{and}\quad R_2 : x'x > \frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2.$$
Define $P(2 \mid 1)$ as the probability that $x$ comes from population $\pi_1$ but falls in the region of classification into population $\pi_2$, and $P(1 \mid 2)$ as the probability that $x$ comes from population $\pi_2$ but falls in the region of classification into population $\pi_1$. Then
$$P(2 \mid 1) = P\Big(x'x > \frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2 \,\Big|\, x'x \sim \chi^2_2\Big) = 1 - F\Big(\frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2\Big),$$
$$P(1 \mid 2) = P\Big(x'x \le \frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2 \,\Big|\, x'x \sim \sigma^2\chi^2_2\Big) = F\Big(\frac{2}{\sigma^2 - 1}\log_e \sigma^2\Big),$$
and the probability of misclassification is
$$\Delta = \frac{1}{2}\Big[1 - F\Big(\frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2\Big) + F\Big(\frac{2}{\sigma^2 - 1}\log_e \sigma^2\Big)\Big].$$

2. When $\sigma < 1$, the regions of classification are
$$R_1 : x'x \ge \frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2 \quad\text{and}\quad R_2 : x'x < \frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2,$$
so that
$$P(2 \mid 1) = P\Big(x'x < \frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2 \,\Big|\, x'x \sim \chi^2_2\Big) = F\Big(\frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2\Big),$$
$$P(1 \mid 2) = P\Big(x'x \ge \frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2 \,\Big|\, x'x \sim \sigma^2\chi^2_2\Big) = 1 - F\Big(\frac{2}{\sigma^2 - 1}\log_e \sigma^2\Big).$$
The probability of misclassification is
$$\Delta = \frac{1}{2}\Big[1 + F\Big(\frac{2\sigma^2}{\sigma^2 - 1}\log_e \sigma^2\Big) - F\Big(\frac{2}{\sigma^2 - 1}\log_e \sigma^2\Big)\Big].$$
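The two branches of Theorem 1 can be checked numerically. The sketch below evaluates the closed-form error and compares it with a direct Monte Carlo estimate of the same probabilities; the value of $\sigma^2$ and the simulation size are illustrative choices.

```python
import numpy as np
from scipy.stats import chi2

# Theorem 1 (d = 2): Bayes error for N(0, I) vs N(0, sigma^2 I), equal priors.
def theorem1_error(sigma2):
    # k is the classification threshold on x'x derived in the proof
    k = 2.0 * sigma2 / (sigma2 - 1.0) * np.log(sigma2)
    F = chi2(df=2).cdf
    if sigma2 > 1.0:
        return 0.5 * (1.0 - F(k) + F(k / sigma2))
    return 0.5 * (1.0 + F(k) - F(k / sigma2))

# Monte Carlo check of the same quantity
rng = np.random.default_rng(1)
def mc_error(sigma2, n=400_000):
    k = 2.0 * sigma2 / (sigma2 - 1.0) * np.log(sigma2)
    q1 = np.sum(rng.standard_normal((n, 2)) ** 2, axis=1)            # x'x under pi_1
    q2 = sigma2 * np.sum(rng.standard_normal((n, 2)) ** 2, axis=1)   # x'x under pi_2
    if sigma2 > 1.0:   # R1: x'x <= k
        return 0.5 * (np.mean(q1 > k) + np.mean(q2 <= k))
    return 0.5 * (np.mean(q1 < k) + np.mean(q2 >= k))   # R1: x'x >= k

print(theorem1_error(4.0), mc_error(4.0))   # the two values should agree closely
```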

The results in Theorem 1 are compared with empirical results based on simulation. The procedure for the simulation follows Section 3. The mean vectors and covariance matrices of the competing distributions are taken to be $\mu_1 = \mu_2 = (0, 0)'$, $\Sigma_1 = I$, $\Sigma_2 = \sigma^2 I$ for $\sigma^2 \ne 1$. The numerical results are presented in Figure 1(b).

Theorem 2. Suppose the conditions of Theorem 1 hold and take $\mu_1 = (0, 0)'$, $\mu_2 = (\delta, 0)'$, $\Sigma_1 = I$ and $\Sigma_2 = \sigma^2 I$. Then
$$\Delta = \begin{cases} p_1 P(\chi^2_{f_1} > k c_1) + p_2 P(\chi^2_{f_2} < k c_2), & \text{for } \sigma > 1,\\ p_1 P(\chi^2_{f_1} < k c_1) + p_2 P(\chi^2_{f_2} > k c_2), & \text{for } \sigma < 1,\end{cases}$$
where the threshold $k$, the constants $c_i$ and the degrees of freedom $f_i$, $i = 1, 2$, depend on $\delta$ and $\sigma^2$ and are as given in Gilbert (1969). The proof of Theorem 2 follows from Gilbert (1969).

3. Numerical Examples

As an illustration of the actual error rates of LDA and QDA, we present a small simulation study. Let us consider the two populations $\pi_1$ and $\pi_2$ to be bivariate spherically symmetric with centres of symmetry $\mu_1 = (0, 0)'$ and $\mu_2 = (\delta, 0)'$ respectively. The sample sizes for $X_1, \ldots, X_n$ from $\pi_1$ and $Y_1, \ldots, Y_m$ from $\pi_2$ are taken to be $n = m = 100$. We simulate a new random sample $Z_1, \ldots, Z_m$ from $\pi_1$ and $Z_{m+1}, \ldots, Z_{2m}$ from $\pi_2$ with $m = 100$ and estimate the actual error rates by the proportion of misclassifications in $Z_1, \ldots, Z_{2m}$, averaged over the simulation replications.

Figure 1 presents the comparison between the theoretical and simulation results described above for the misclassification probabilities of linear and quadratic classifiers under location shift and scale shift, as described in Section 2. It is clearly seen that the sample estimate of the probability of misclassification is a good approximation to its population version. As expected, in the location-shift case the error rate is nearly 0.5 when $\delta = 0$, and it decreases as $\delta$ moves away from 0 and the separation between the populations increases.

Consider the nonlinear case where $\mu_1 = (0, 0)'$, $\mu_2 = (\delta, 0)'$, $\Sigma_1 = I$ and $\Sigma_2 = \sigma^2 I$. Figure 2 shows that the error rate is affected by the difference in both the location and the scale parameters of the competing populations. For $\sigma > 1$, the error rate for scale $\sigma^2$ is smaller than that for scale $\sigma^{-2}$. It can also be seen from Figure 2 that the error rates for $\sigma^2$ and $\sigma^{-2}$ are the same when the medians of the symmetric distributions coincide ($\delta = 0$). Moreover, the misclassification rate decreases as $\sigma^2$ moves farther from 1.
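A minimal implementation of this simulation protocol for QDA might look as follows. The plug-in QDA follows rule (1); the number of replications and the $(\delta, \sigma^2)$ configuration are illustrative choices.

```python
import numpy as np

# Sketch of the Section 3 protocol: train QDA on n = m = 100 points per class,
# then estimate the actual error rate on a fresh sample of 100 + 100 points.
rng = np.random.default_rng(2)

def qda_fit(X, Y):
    # plug-in estimates of (mu_j, Sigma_j) for rule (1)
    return [(Z.mean(axis=0), np.cov(Z, rowvar=False)) for Z in (X, Y)]

def qda_predict(Z, params):
    scores = []
    for mu, S in params:
        d = Z - mu
        Sinv = np.linalg.inv(S)
        # D_j(x) = (x - mu_j)' Sigma_j^{-1} (x - mu_j) + log|Sigma_j|
        scores.append(np.einsum('ij,jk,ik->i', d, Sinv, d) + np.log(np.linalg.det(S)))
    return np.argmin(np.stack(scores, axis=1), axis=1)

delta, sigma2, n = 1.0, 4.0, 100
errors = []
for _ in range(200):                               # illustrative replication count
    X = rng.multivariate_normal([0, 0], np.eye(2), size=n)
    Y = rng.multivariate_normal([delta, 0], sigma2 * np.eye(2), size=n)
    params = qda_fit(X, Y)
    Z1 = rng.multivariate_normal([0, 0], np.eye(2), size=n)
    Z2 = rng.multivariate_normal([delta, 0], sigma2 * np.eye(2), size=n)
    labels = qda_predict(np.vstack([Z1, Z2]), params)
    errors.append(np.mean(labels != np.repeat([0, 1], n)))
print(np.mean(errors))   # empirical counterpart of the theoretical error
```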

Fig. 1: Misclassification probability, theoretical versus simulation: (a) LDA, error rate against $\delta$; (b) QDA, error rate against $\sigma^2$.

Theoretically, LDA and QDA cannot be used for discriminating between distributions whose first and second moments do not exist. Furthermore, the presence of an outlying training sample point will affect the performance of LDA and QDA. Hence, both linear and quadratic classifiers are not robust against outliers and extreme values. However, Hubert and Van Driessen (2004) proposed replacing the estimates of $\mu_1$, $\mu_2$, $\Sigma_1$ and $\Sigma_2$ in Equation (1) by the reweighted minimum covariance determinant (MCD) estimator of multivariate location and scatter, based on the FAST-MCD algorithm of Rousseeuw and Van Driessen (1999).
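A sketch of this robust plug-in idea, using the FAST-MCD implementation available in scikit-learn (`sklearn.covariance.MinCovDet`); the contaminated training data and the test point are illustrative.

```python
import numpy as np
from sklearn.covariance import MinCovDet

# Replace the sample mean and covariance in rule (1) with reweighted MCD
# estimates of location and scatter. A few gross outliers are added to the
# first training sample to illustrate the point of robustification.
rng = np.random.default_rng(3)
X = rng.multivariate_normal([0, 0], np.eye(2), size=100)
X[:5] += 20.0                                        # contamination
Y = rng.multivariate_normal([2, 0], np.eye(2), size=100)

params = []
for Z in (X, Y):
    mcd = MinCovDet(random_state=0).fit(Z)           # robust location/scatter
    params.append((mcd.location_, mcd.covariance_))

def robust_qda_score(x, mu, S):
    d = x - mu
    return d @ np.linalg.solve(S, d) + np.log(np.linalg.det(S))

x_new = np.array([0.5, 0.0])
scores = [robust_qda_score(x_new, mu, S) for mu, S in params]
print(int(np.argmin(scores)))    # 0: assign x_new to pi_1
```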

Fig. 2: Plot of error rate against $\delta$ for the location-scale shift problem using QDA, for several values of $\sigma^2$.

4. Theoretical Bayes Risk - Location Shift

We want to derive the misclassification probability associated with the Bayes rule for some competing distributions with location shift in a two-class problem. The distributions considered are the multivariate $t$ distribution with $k$ degrees of freedom and the multivariate Laplace distribution. For multivariate normal distributions, the probability of misclassification associated with the Bayes rule is discussed in Section 2 above.

Multivariate t distributions

Let $Z \sim N_d(0, \Sigma)$ and $U \sim \chi^2_k$ be independent, where $k$ is the degrees of freedom of the chi-squared distribution. Define $X = Z/\sqrt{U/k} + \mu$. The distribution of $X$ is the multivariate $t$ distribution with $k$ degrees of freedom, denoted by $t(k, \mu, \Sigma)$. The probability density function of $x$ is
$$f(x) = \frac{\Gamma\big(\frac{k+d}{2}\big)}{(k\pi)^{d/2}\,\Gamma\big(\frac{k}{2}\big)\,|\Sigma|^{1/2}}\, \Big\{1 + \frac{1}{k}(x - \mu)'\Sigma^{-1}(x - \mu)\Big\}^{-\frac{k+d}{2}}. \tag{4}$$
Suppose $\pi_1$ has distribution $t(k, \mu_1, \Sigma)$ with probability density function $f_1(x)$ and $\pi_2$ has distribution $t(k, \mu_2, \Sigma)$ with probability density function $f_2(x)$. Under the assumption of equal prior probabilities, the Bayes rule is to assign $x$ to $\pi_1$ if $f_1(x) > f_2(x)$, which is equivalent to assigning $x$ to $\pi_1$ if
$$(x - \mu_1)'\Sigma^{-1}(x - \mu_1) < (x - \mu_2)'\Sigma^{-1}(x - \mu_2).$$
This holds since the competing distributions have the same degrees of freedom. The expression above is equivalent to
$$x'\Sigma^{-1}(\mu_1 - \mu_2) - \frac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) > 0.$$
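The stochastic representation $X = Z/\sqrt{U/k} + \mu$ gives a direct way to sample from $t(k, \mu, \Sigma)$, as in the following sketch (parameter values illustrative):

```python
import numpy as np

# X = Z / sqrt(U/k) + mu with Z ~ N_d(0, Sigma) and U ~ chi^2_k independent,
# so X ~ t(k, mu, Sigma).
rng = np.random.default_rng(4)
k, mu = 5.0, np.array([1.0, 0.0])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])

n = 100_000
Z = rng.multivariate_normal([0, 0], Sigma, size=n)
U = rng.chisquare(k, size=n)
X = Z / np.sqrt(U / k)[:, None] + mu

print(X.mean(axis=0))            # ~ mu (the mean exists since k > 1)
print(np.cov(X, rowvar=False))   # ~ k/(k-2) * Sigma for k > 2
```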

Using the representation $x = \sqrt{k/u}\,z + \mu$, where $z$ is distributed as $N_d(0, \Sigma)$, the rule can also be written as
$$\sqrt{\frac{k}{u}}\, z'\Sigma^{-1}(\mu_1 - \mu_2) + \Big(\mu - \frac{\mu_1 + \mu_2}{2}\Big)'\Sigma^{-1}(\mu_1 - \mu_2) > 0.$$
If $x$ is from $\pi_1$, then $\mu = \mu_1$; similarly, if $x$ is from $\pi_2$, then $\mu = \mu_2$. It follows that
$$\begin{aligned} P(2 \mid 1) &= P\Big(\sqrt{\tfrac{k}{u}}\, z'\Sigma^{-1}(\mu_1 - \mu_2) + \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) < 0\Big)\\ &= P\Big(z'\Sigma^{-1}(\mu_1 - \mu_2) < -\tfrac{1}{2}\sqrt{\tfrac{u}{k}}\,(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\Big). \end{aligned}$$
This holds because $u$ takes values in $(0, \infty)$. For either population, $E[z'\Sigma^{-1}(\mu_1 - \mu_2)] = 0$ and $\mathrm{var}[z'\Sigma^{-1}(\mu_1 - \mu_2)] = (\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)$; hence
$$P(2 \mid 1) = P\Big(R < -\tfrac{1}{2}\sqrt{\tfrac{u}{k}}\,\big[(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big]^{1/2}\Big) = P(R < c_1) = \int_0^\infty \Phi(c_1)\, f_u(u)\, du,$$
where
$$R = \frac{z'\Sigma^{-1}(\mu_1 - \mu_2) - E[z'\Sigma^{-1}(\mu_1 - \mu_2)]}{\big[\mathrm{var}(z'\Sigma^{-1}(\mu_1 - \mu_2))\big]^{1/2}}$$
is a standard normal random variable, $\Phi$ is the distribution function of the standard normal distribution, $f_u$ is the probability density function of $\chi^2_k$, $u$ is the chi-squared distributed random variable, and
$$c_0 = \big[(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big]^{1/2}, \quad c_1 = -\frac{1}{2}\sqrt{\frac{u}{k}}\, c_0, \quad c_2 = \frac{1}{2}\sqrt{\frac{u}{k}}\, c_0.$$
Similarly,
$$\begin{aligned} P(1 \mid 2) &= P\Big(\sqrt{\tfrac{k}{u}}\, z'\Sigma^{-1}(\mu_1 - \mu_2) - \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) > 0\Big)\\ &= P\Big(z'\Sigma^{-1}(\mu_1 - \mu_2) > \tfrac{1}{2}\sqrt{\tfrac{u}{k}}\,(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\Big)\\ &= P(R > c_2) = \int_0^\infty \big[1 - \Phi(c_2)\big] f_u(u)\, du. \end{aligned}$$
The probability of misclassification associated with the Bayes rule, denoted by $\Delta_B$, is
$$\Delta_B = p_1 P(2 \mid 1) + p_2 P(1 \mid 2) = p_1 \int_0^\infty \Phi(c_1)\, f_u(u)\, du + p_2 \int_0^\infty \big[1 - \Phi(c_2)\big] f_u(u)\, du.$$
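The Bayes risk above reduces to a one-dimensional integral against the $\chi^2_k$ density, which is easy to evaluate numerically. A sketch with illustrative parameters follows; note that $P(1 \mid 2) = P(2 \mid 1)$ here, since $1 - \Phi(c_2) = \Phi(c_1)$.

```python
import numpy as np
from scipy.stats import chi2, norm
from scipy.integrate import quad

# Bayes error for t(k, mu1, Sigma) vs t(k, mu2, Sigma), equal priors,
# as the integral of Phi(c1(u)) against the chi^2_k density.
k = 5.0
mu1, mu2 = np.array([0.0, 0.0]), np.array([1.0, 0.0])
Sigma = np.eye(2)
diff = mu1 - mu2
c0 = np.sqrt(diff @ np.linalg.solve(Sigma, diff))   # Mahalanobis distance

def integrand(u):
    c1 = -0.5 * np.sqrt(u / k) * c0
    return norm.cdf(c1) * chi2.pdf(u, df=k)

bayes_error, _ = quad(integrand, 0.0, np.inf)
# With p1 = p2 and P(1|2) = P(2|1), the integral itself is the Bayes risk.
print(bayes_error)
```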

Multivariate Laplace distributions

Suppose the distribution of $X \in \mathbb{R}^d$ is the multivariate Laplace distribution $L(\mu, \Sigma)$, where $\mu$ and $\Sigma$ are the mean vector and covariance matrix of the distribution respectively. The probability density function of $x$ is of the form
$$f(x) \propto e^{-\sqrt{(x - \mu)'\Sigma^{-1}(x - \mu)}}.$$
Without loss of generality, let $d = 2$, $r \sim \mathrm{Gamma}(d)$, $\theta \sim \mathrm{Uniform}(0, 2\pi)$ and $\mu = (\mu_1, \mu_2)'$. Define
$$Z_1 = r\cos\theta, \quad Z_2 = r\sin\theta, \quad Z = \binom{Z_1}{Z_2}.$$
Then $X = \Sigma^{1/2} Z + \mu$ has the bivariate Laplace distribution $BL(\mu, \Sigma)$, where $\mu$ and $\Sigma$ are the mean and covariance of the distribution respectively. It follows that $X - \mu = \Sigma^{1/2} Z$.

Suppose populations $\pi_1$ and $\pi_2$ have distribution functions $BL(\mu_1, \Sigma)$ and $BL(\mu_2, \Sigma)$ respectively. If $x \in \pi_1$, then $Z = \Sigma^{-1/2}(X - \mu_1)$, $(x - \mu_1)'\Sigma^{-1}(x - \mu_1) = z'z = r^2$ with $r \sim \mathrm{Gamma}(d)$, and $(x - \mu_2)'\Sigma^{-1}(x - \mu_2) \ne r^2$ except when $\mu_1 = \mu_2$, where $d = 2$. Similarly, if $x \in \pi_2$, then $Z = \Sigma^{-1/2}(X - \mu_2)$, $(x - \mu_2)'\Sigma^{-1}(x - \mu_2) = z'z = r^2$ with $r \sim \mathrm{Gamma}(d)$, and $(x - \mu_1)'\Sigma^{-1}(x - \mu_1) \ne r^2$ except when $\mu_1 = \mu_2$. It follows that
$$\log\frac{f_1(x)}{f_2(x)} = -\sqrt{(x - \mu_1)'\Sigma^{-1}(x - \mu_1)} + \sqrt{(x - \mu_2)'\Sigma^{-1}(x - \mu_2)}.$$
Assuming equal prior probabilities, and for $x, \mu_1, \mu_2 \in \mathbb{R}^d$ and $d \ge 2$, the separating hyperplane between $\pi_1$ and $\pi_2$ can be written as
$$\sqrt{(x - \mu_1)'\Sigma^{-1}(x - \mu_1)} = \sqrt{(x - \mu_2)'\Sigma^{-1}(x - \mu_2)}.$$
This is equivalent to
$$x'\Sigma^{-1}(\mu_1 - \mu_2) = \frac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2).$$
It follows that if $x$ is distributed as population $\pi_1$, then $x'\Sigma^{-1}(\mu_1 - \mu_2) = \frac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)$ implies $(x - \mu_1)'\Sigma^{-1}(\mu_1 - \mu_2) = -\frac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)$, which can be written as $z'a = -\frac{1}{2}a'a$, where $a = \Sigma^{-1/2}(\mu_1 - \mu_2)$ and $z$ is a standard multivariate Laplace random vector. Kotz et al. (2001) have shown that a linear combination of the components of a standard multivariate Laplace random vector has a univariate symmetric Laplace distribution $L(0, \sigma_l)$. That is, $w = a'z$ has a univariate Laplace distribution with mean 0 and variance $\sigma_l^2$, where $\sigma_l^2 = \mathrm{var}(a'z)$ and $a$ is a vector of real constants. Similarly, if $x$ is distributed as population $\pi_2$, the separating hyperplane remains unchanged.
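The polar construction above can be sampled directly; a minimal sketch (parameter values illustrative):

```python
import numpy as np
from scipy.linalg import sqrtm

# Bivariate Laplace sample via the polar construction:
# r ~ Gamma(d), theta ~ Uniform(0, 2*pi), Z = (r cos(theta), r sin(theta)),
# X = Sigma^{1/2} Z + mu.
rng = np.random.default_rng(5)
d = 2
mu = np.array([1.0, -1.0])
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])

n = 100_000
r = rng.gamma(shape=d, scale=1.0, size=n)
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
Z = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
X = Z @ sqrtm(Sigma).real + mu   # Sigma^{1/2} is symmetric, so Z @ S^{1/2} works row-wise

print(X.mean(axis=0))            # ~ mu
```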

Observe that $f_1(x) > f_2(x)$ implies $(x - \mu_1)'\Sigma^{-1}(x - \mu_1) < (x - \mu_2)'\Sigma^{-1}(x - \mu_2)$, that is, $x'\Sigma^{-1}(\mu_1 - \mu_2) > \frac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)$. It follows that
$$\begin{aligned} P(2 \mid 1) &= P\Big(x'\Sigma^{-1}(\mu_1 - \mu_2) < \tfrac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) \,\Big|\, x \in \pi_1\Big)\\ &= P\Big(x'\Sigma^{-1}(\mu_1 - \mu_2) - \mu_1'\Sigma^{-1}(\mu_1 - \mu_2) < \tfrac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) - \mu_1'\Sigma^{-1}(\mu_1 - \mu_2)\Big)\\ &= P\big(z'a < -\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big)\\ &= P\big(w < -\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big) = F\big(-\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big), \end{aligned}$$
where $F$ is the distribution function of the one-dimensional symmetric Laplace distribution $L(0, c_0)$ with $c_0 > 0$. Similarly,
$$\begin{aligned} P(1 \mid 2) &= P\Big(x'\Sigma^{-1}(\mu_1 - \mu_2) > \tfrac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) \,\Big|\, x \in \pi_2\Big)\\ &= P\Big(x'\Sigma^{-1}(\mu_1 - \mu_2) - \mu_2'\Sigma^{-1}(\mu_1 - \mu_2) > \tfrac{1}{2}(\mu_1 + \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) - \mu_2'\Sigma^{-1}(\mu_1 - \mu_2)\Big)\\ &= P\big(z'a > \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big)\\ &= P\big(w > \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big) = 1 - F\big(\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big), \end{aligned}$$
where $F$ is as defined above. The Bayes probability of misclassifying $x$ into either $\pi_1$ or $\pi_2$, denoted by $\Delta_B$, is
$$\Delta_B = p_1 P(2 \mid 1) + p_2 P(1 \mid 2) = p_1 F\big(-\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big) + p_2\big[1 - F\big(\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big)\big],$$
where $p_1 + p_2 = 1$. Suppose $G$ is a Laplace distribution function which is symmetric about 0; then $G(-c) = 1 - G(c)$ for all $c \in \mathbb{R}$. Hence
$$\Delta_B = p_1 F\big(-\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big) + p_2 F\big(-\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big) = F\big(-\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2)\big). \tag{5}$$
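Evaluating (5) requires only the univariate Laplace distribution function. In the sketch below the scale of $w = a'z$ is matched by its variance; taking $\mathrm{var}(a'z) = a'a$, as for a standardised $z$ with identity covariance, is an assumption of this sketch, and the parameter values are illustrative.

```python
import numpy as np
from scipy.stats import laplace

# Sketch of (5): Bayes error F(-(1/2) (mu1-mu2)' Sigma^{-1} (mu1-mu2)), with
# F the cdf of the univariate symmetric Laplace law of w = a'z.
mu1, mu2 = np.array([0.0, 0.0]), np.array([1.0, 0.0])
Sigma = np.eye(2)
diff = mu1 - mu2
m = diff @ np.linalg.solve(Sigma, diff)   # (mu1-mu2)' Sigma^{-1} (mu1-mu2) = a'a

# Assumption: var(a'z) = a'a = m. A Laplace(0, b) law has variance 2*b^2,
# so b = sqrt(m / 2) under this convention.
b = np.sqrt(m / 2.0)
print(laplace.cdf(-0.5 * m, loc=0.0, scale=b))
```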

5. Concluding Remarks

In this paper, we have considered probabilities of misclassification between two populations only; the results can, however, easily be extended to more than two populations. The optimal performance of linear and quadratic discriminant functions is investigated, and solutions of some theoretical examples are provided. The theoretical probabilities of misclassification are compared with empirical error rates based on simulation when the competing populations differ in location and scale. The sample estimates of the probabilities of misclassification associated with LDA and QDA are good approximations of their respective population versions. We also derive expressions for the Bayes error for multivariate Laplace distributions and for multivariate t distributions with the same degrees of freedom, under location shift.

References

Anderson, T.W., 1984. An Introduction to Multivariate Statistical Analysis. John Wiley & Sons, New York.
Chang, P.C. and Afifi, A.A., 1974. Classification based on dichotomous and continuous variables. Journal of the American Statistical Association, 69.
Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 179-188.
Ghosh, A.K. and Chaudhuri, P., 2005. On maximum depth and related classifiers. Scandinavian Journal of Statistics, 32.
Gilbert, E.S., 1969. The effect of unequal variance-covariance matrices on Fisher's linear discriminant function. Biometrics, 25.
Hall, P., Titterington, D.M. and Xue, J., 2009. Median-based classifiers for high-dimensional data. Journal of the American Statistical Association, 104, 1597-1608.
Hubert, M. and Van Driessen, K., 2004. Fast and robust discriminant analysis. Computational Statistics and Data Analysis, 45.
Johnson, R.A. and Wichern, D.W., 2007. Applied Multivariate Statistical Analysis. Sixth edition, Pearson Prentice Hall, New Jersey.
Kim, K.S., Choi, H.H., Moon, C.S. and Mun, C.W., 2011. Comparison of k-nearest neighbor, quadratic discriminant and linear discriminant analysis in classification of electromyogram signals based on the wrist-motion directions. Current Applied Physics, 11.
Kotz, S., Kozubowski, T. and Podgorski, K., 2001. The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance. Springer Science+Business Media, LLC.
Krzanowski, W.J., 1977. The performance of Fisher's linear discriminant function under non-optimal conditions. Technometrics, 19.
Li, J., Cuesta-Albertos, J.A. and Liu, R.Y., 2012. DD-classifier: nonparametric classification procedure based on DD-plot. Journal of the American Statistical Association, 107, 737-753.
Makinde, O.S. and Chakraborty, B., 2015. On some nonparametric classifiers based on distribution functions of multivariate ranks. In Nordhausen, K. and Taskinen, S. (eds): Modern Nonparametric, Robust and Multivariate Methods, Festschrift in Honour of Hannu Oja. Springer.
Rousseeuw, P.J. and Van Driessen, K., 1999. A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212-223.
Wald, A., 1944. On a statistical problem arising in the classification of an individual into one of two groups. Annals of Mathematical Statistics, 15, 145-162.
Welch, B.L., 1939. Note on discriminant functions. Biometrika, 31, 218-220.
