MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 4: Measures of Robustness, Robust Principal Component Analysis


Contents: robust statistical methods, influence function, empirical influence function, breakdown point, robust principal component analysis.

Robust Statistical Methods. In statistics, robust methods are methods that perform well, or at least do not perform too poorly, in the presence of outlying observations.

Robust Statistical Methods, Example: mean vs. median...
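The elided mean-versus-median example can be sketched numerically (a minimal illustration with an arbitrary artificial sample; the numbers are not from the lecture):

```python
import numpy as np

# A small clean sample, and the same sample with one wild observation.
clean = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
contaminated = np.array([1.0, 2.0, 3.0, 4.0, 1000.0])

# The mean is dragged far away by a single outlier...
print(np.mean(clean), np.mean(contaminated))      # 3.0 vs 202.0
# ...while the median does not move at all.
print(np.median(clean), np.median(contaminated))  # 3.0 vs 3.0
```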

Let $x$ denote a random variable or a random vector with cumulative distribution function $F_x$, and let $X = \{x_1, x_2, \ldots, x_n\}$, where $x_1, x_2, \ldots, x_n$ are $n$ independent and identically distributed observations from the distribution $F_x$. Consider a functional $Q(F_x)$ (or $Q(F_n)$, where $F_n$ denotes the empirical distribution function). We wish to measure the robustness of that functional.

The influence function measures the effect on the functional $Q$ when the underlying distribution deviates slightly from the assumed one:
\[
\mathrm{IF}(y, Q, F_x) = \lim_{\varepsilon \to 0^+} \frac{Q\bigl((1-\varepsilon)F_x + \varepsilon\delta_y\bigr) - Q(F_x)}{\varepsilon},
\]
where $\delta_y$ is the cumulative distribution function having all its probability mass at $y$, i.e.
\[
\delta_y(t) = \begin{cases} 0, & t < y, \\ 1, & t \geq y. \end{cases}
\]

The influence function measures the effect of point-mass contamination, and it is thus considered a measure of local robustness. A functional with a bounded influence function (with respect to, for example, the $L_2$ norm) is considered robust and desirable.

Influence Function, Examples.
Mean: $\mathrm{IF}(y, \mu, F_x) = y - \mu(F_x)$. (Proof...)
Covariance matrix: $\mathrm{IF}(y, \Sigma, F_x) = (y - \mu)(y - \mu)^T - \Sigma(F_x)$.
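The proof elided on the slide is, for the mean, a one-line computation using $\mu(\delta_y) = y$:

```latex
\[
\mu\bigl((1-\varepsilon)F_x + \varepsilon\delta_y\bigr)
  = (1-\varepsilon)\,\mu(F_x) + \varepsilon y
  = \mu(F_x) + \varepsilon\bigl(y - \mu(F_x)\bigr),
\]
\[
\mathrm{IF}(y, \mu, F_x)
  = \lim_{\varepsilon \to 0^+}
    \frac{\varepsilon\bigl(y - \mu(F_x)\bigr)}{\varepsilon}
  = y - \mu(F_x).
\]
```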

Empirical Influence Function

The empirical influence function (also called the sensitivity curve) measures the dependence of the estimator on the value of one of the points in the sample. It can be seen as an estimate of the theoretical influence function.

Let $X = \{x_1, x_2, \ldots, x_n\}$, and let $X_y = \{x_1, x_2, \ldots, x_n, y\}$. Now
\[
\mathrm{IF}_E(y, Q, F_n)
  = \frac{Q\bigl(\bigl(1 - \tfrac{1}{n+1}\bigr)F_n + \tfrac{1}{n+1}\delta_y\bigr) - Q(F_n)}{\tfrac{1}{n+1}}
  = (n+1)\bigl(Q(X_y) - Q(X)\bigr).
\]

Empirical Influence Function, Example: the empirical influence function of the mean...
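A minimal numerical sketch of this example, using an arbitrary simulated sample: for the mean, the sensitivity curve collapses algebraically to exactly $y - \bar{x}$, which is unbounded in $y$.

```python
import numpy as np

def empirical_if(y, stat, X):
    # Sensitivity curve: IF_E(y, Q, F_n) = (n + 1) * (Q(X_y) - Q(X)),
    # where X_y is the sample X with the point y appended.
    n = len(X)
    return (n + 1) * (stat(np.append(X, y)) - stat(X))

rng = np.random.default_rng(1)
X = rng.normal(size=50)

# For the mean the algebra gives exactly y - mean(X):
# (n + 1) * ((S + y)/(n + 1) - S/n) = y - S/n, with S the sample sum.
for y in (-10.0, 0.0, 10.0):
    print(y, empirical_if(y, np.mean, X), y - X.mean())
```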

Another very commonly used measure of robustness is the breakdown point. Whereas the influence function measures local robustness, the breakdown point can be seen as a measure of global robustness.

Let $X_n = \{x_1, x_2, \ldots, x_n\}$, where $x_1, x_2, \ldots, x_n$ are $n$ independent and identically distributed observations from the distribution $F_x$. Assume that $m < n$ and replace $x_1, x_2, \ldots, x_m$ with arbitrary values $x_1^*, x_2^*, \ldots, x_m^*$. Let $X_n^* = \{x_1^*, x_2^*, \ldots, x_m^*, x_{m+1}, \ldots, x_n\}$. Now, the maximum bias is
\[
\mathrm{maxbias}(m, X_n, Q) = \sup_{x_1^*, x_2^*, \ldots, x_m^*} d\bigl(Q(X_n), Q(X_n^*)\bigr),
\]
where $d(\cdot, \cdot)$ denotes some distance function (for example, the Euclidean distance). The finite-sample breakdown point is now given by
\[
\mathrm{BP}(Q, n) = \min_m \Bigl\{ \tfrac{m}{n} : \mathrm{maxbias}(m, X_n, Q) = \infty \Bigr\},
\]
and the (asymptotic) breakdown point is
\[
\mathrm{BP}(Q) = \lim_{n \to \infty} \mathrm{BP}(Q, n).
\]

A functional with a large breakdown point is considered robust. If $\mathrm{BP}(Q) = \frac{1}{2}$, then $Q$ is very robust (according to its breakdown point), and if $\mathrm{BP}(Q) = 0$, then $Q$ is very nonrobust. When the value lies strictly between $0$ and $\frac{1}{2}$, it is a matter of taste ;-), but values close to $\frac{1}{2}$ are desirable.

Breakdown Point, Example: mean, covariance matrix, median...
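The elided example can be illustrated by replacement contamination (a minimal sketch; the value 1e9 stands in for an "arbitrarily bad" replacement):

```python
import numpy as np

X = np.arange(11, dtype=float)   # 0, 1, ..., 10; mean = median = 5
bad = X.copy()
bad[0] = 1e9                     # replace a single observation

# One corrupted point already sends the mean arbitrarily far away:
# the breakdown point of the mean is 0.
print(np.mean(X), np.mean(bad))      # 5.0 vs about 9.1e7

# The median shifts by only one order statistic and stays bounded:
print(np.median(X), np.median(bad))  # 5.0 vs 6.0

# It remains bounded even when almost half of the points are replaced,
# reflecting its breakdown point of 1/2.
half_bad = X.copy()
half_bad[:5] = 1e9
print(np.median(half_bad))           # 10.0, still within the original range
```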

If the data can be assumed to arise from an elliptical distribution, then principal component analysis can be robustified by replacing the sample covariance matrix with some robust scatter estimate. The reason is that, under an elliptical distribution, all scatter estimates estimate the same population quantity (up to scale). Note that, in general (without the ellipticity assumption), this does not hold!

Minimum Covariance Determinant (MCD) Method. The determinant of a covariance matrix, which governs the volume of the associated concentration ellipsoid, can be seen as a measure of the total variation of the data, and it is then called the generalized variance. Data points that are far away from the data cloud inflate this determinant.
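The inflation of the generalized variance can be seen directly (a minimal sketch with arbitrary simulated bivariate data):

```python
import numpy as np

rng = np.random.default_rng(0)
# A well-behaved bivariate sample...
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=200)
det_clean = np.linalg.det(np.cov(X, rowvar=False))

# ...and the same sample with a single far-away point appended.
X_out = np.vstack([X, [50.0, -50.0]])
det_out = np.linalg.det(np.cov(X_out, rowvar=False))

# The single outlier inflates the generalized variance dramatically.
print(det_clean, det_out)
```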

The Minimum Covariance Determinant (MCD) method is a well-known method for robustifying the estimation of the covariance matrix and the mean vector under the assumption of multivariate ellipticity. The MCD method is based on considering all subsets containing p% (usually 50%) of the original observations, and estimating the covariance matrix and the mean vector from the subset whose sample covariance matrix has the smallest determinant. This is equivalent to finding the sub-sample with the smallest multivariate spread. The MCD sample covariance matrix and the MCD sample mean vector are then defined as the sample covariance matrix (up to scale) and the sample mean vector computed over this sub-sample.

Under the ellipticity assumption, PCA can be performed using the MCD scatter estimate instead of the traditional sample covariance matrix. MCD estimates are very robust, and thus, as a consequence, robust PCA is obtained. Note that MCD is not the only possible robust scatter estimate: there exist several robust scatter estimates that all estimate the same population quantity (up to scale) under the assumption of multivariate ellipticity.
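A crude sketch of the whole pipeline, assuming nothing beyond NumPy. This is a random-subset approximation with concentration steps, not the FAST-MCD algorithm used in practice; the helper `naive_mcd` and all parameters are illustrative.

```python
import numpy as np

def naive_mcd(X, n_starts=100, n_csteps=10, seed=0):
    """Crude MCD approximation: from random half-samples, repeat
    'concentration' steps (keep the h points with the smallest
    Mahalanobis distances), and return the mean and covariance of the
    half-sample whose covariance has the smallest determinant."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = (n + p + 1) // 2                     # roughly 50% of the data
    best_det, best_mean, best_cov = np.inf, None, None
    for _ in range(n_starts):
        idx = rng.choice(n, size=h, replace=False)
        for _ in range(n_csteps):
            m = X[idx].mean(axis=0)
            S = np.cov(X[idx], rowvar=False)
            d2 = np.einsum('ij,jk,ik->i', X - m, np.linalg.inv(S), X - m)
            idx = np.argsort(d2)[:h]         # concentration step
        S = np.cov(X[idx], rowvar=False)
        if np.linalg.det(S) < best_det:
            best_det = np.linalg.det(S)
            best_mean, best_cov = X[idx].mean(axis=0), S
    return best_mean, best_cov

rng = np.random.default_rng(1)
# Elliptical bulk whose first principal axis lies along (1, 1)...
good = rng.multivariate_normal([0, 0], [[4.0, 3.0], [3.0, 4.0]], size=180)
# ...contaminated by a cluster of outliers off that axis.
outliers = rng.multivariate_normal([15.0, -15.0], np.eye(2), size=20)
X = np.vstack([good, outliers])

# Robust PCA: eigendecompose the MCD scatter instead of np.cov(X).
_, robust_cov = naive_mcd(X)
eigvals, eigvecs = np.linalg.eigh(robust_cov)
first_pc = eigvecs[:, np.argmax(eigvals)]
print(first_pc)   # (up to sign) close to the (1, 1) direction
```

With the plain sample covariance of `X`, the outlier cluster would instead pull the leading eigenvector toward the (1, -1) direction.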

Words of Warning

Some Words of Warning. It is possible that a functional Q has a bounded influence function but a breakdown point of 0! Robust PCA, based on some robust scatter matrix, can be performed under the assumption of multivariate ellipticity. If the ellipticity assumption does not hold, then instead of estimating the PCA transformation matrix Γ, one may be estimating some other population quantity.

Next Week Next week we will talk about bivariate correspondence analysis (CA).

I
K. V. Mardia, J. T. Kent, J. M. Bibby, Multivariate Analysis, Academic Press, London, 2003 (reprint of 1979).

II
R. V. Hogg, J. W. McKean, A. T. Craig, Introduction to Mathematical Statistics, Pearson Education, Upper Saddle River, 2005.
R. A. Horn, C. R. Johnson, Matrix Analysis, Cambridge University Press, New York, 1985.
R. A. Horn, C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, New York, 1991.

III
P. J. Rousseeuw, Multivariate estimation with high breakdown point, in Mathematical Statistics and Applications, Vol. B (W. Grossmann, G. Pflug, I. Vincze, W. Wertz, eds.), pp. 283-297, 1985.
P. J. Rousseeuw, K. Van Driessen, A fast algorithm for the minimum covariance determinant estimator, Technometrics 41, pp. 212-223, 1999.