Robust scale estimation with extensions


Garth Tarr, Samuel Müller and Neville Weber, School of Mathematics and Statistics, The University of Sydney

Outline: the robust scale estimator P_n; robust covariance estimation; robust covariance matrices; conclusion and key references.


[Figure: history of scale efficiencies at the Gaussian (n = 20), plotting relative efficiency against year of introduction: SD 1.00 (1894) and P_n 0.85 (2012), with the IQR (1931), MAD (1974) and Q_n (1993) at lower efficiencies in the 0.38 to 0.67 range.]

Pairwise mean scale estimator: P_n

Consider the U-statistic based on the pairwise mean kernel,

U_n(X) = \binom{n}{2}^{-1} \sum_{i<j} \frac{X_i + X_j}{2}.

Let H(t) = P((X_i + X_j)/2 \le t) be the cdf of the kernels, with corresponding empirical distribution function

H_n(t) = \binom{n}{2}^{-1} \sum_{i<j} I\left\{ \frac{X_i + X_j}{2} \le t \right\}, \quad t \in \mathbb{R}.

Definition (interquartile range of pairwise means):

P_n = c \left[ H_n^{-1}(0.75) - H_n^{-1}(0.25) \right],

where c \approx 1.048 is a correction factor that makes P_n consistent for the standard deviation when the underlying observations are Gaussian.
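The definition above translates almost directly into code. A minimal Python sketch, assuming NumPy; the function name pn_scale is my own, and np.quantile's default linear interpolation stands in for the formal quantile H_n^{-1}:

```python
import numpy as np

def pn_scale(x, c=1.048):
    """P_n: c times the interquartile range of the n-choose-2
    pairwise means (X_i + X_j)/2, i < j.  The factor c ~ 1.048
    makes the estimate consistent for the SD at the Gaussian."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)     # all index pairs i < j
    pairwise_means = (x[i] + x[j]) / 2.0
    q25, q75 = np.quantile(pairwise_means, [0.25, 0.75])
    return c * (q75 - q25)

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=500)
print(pn_scale(x))  # close to the true sigma = 2
```

Forming all pairwise means is O(n^2) in memory, so this sketch suits moderate sample sizes only.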

[Figure: influence curves IC(x; T, Φ) of the SD and of P_n at the standard normal, for x in [-4, 4].]

Asymptotic variance and relative efficiency

Tarr, Müller and Weber (2012) show that P_n is asymptotically normal with variance V given by the expected square of its influence function. When the underlying data are Gaussian,

V(P_n, \Phi) = \int IC(x; P_n, \Phi)^2 \, d\Phi(x) = 0.579,

which equates to an asymptotic efficiency of 0.86. So we have a robust scale estimator that is still quite efficient at the Gaussian distribution; how can we extend it to the multivariate setting?
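The 0.86 efficiency figure can be checked by simulation. A rough Monte Carlo sketch of my own (not from the talk), comparing the sampling variances of the SD and of P_n at the Gaussian; pn_scale computes P_n as the scaled IQR of the pairwise means:

```python
import numpy as np

def pn_scale(x, c=1.048):
    # P_n: c times the IQR of the pairwise means (X_i + X_j)/2
    i, j = np.triu_indices(len(x), k=1)
    pm = (x[i] + x[j]) / 2.0
    q25, q75 = np.quantile(pm, [0.25, 0.75])
    return c * (q75 - q25)

rng = np.random.default_rng(1)
n, reps = 100, 2000
sd_vals = np.empty(reps)
pn_vals = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    sd_vals[r] = x.std(ddof=1)
    pn_vals[r] = pn_scale(x)

# relative efficiency of P_n = ratio of the two sampling variances
eff = sd_vals.var() / pn_vals.var()
print(round(eff, 2))  # in the vicinity of 0.86 at this sample size
```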

Outline The robust scale estimator P n Robust covariance estimation Robust covariance matrices Conclusion and key references

From scale to covariance: the GK device

Gnanadesikan and Kettenring (1972) relate scale and covariance through the identity

cov(X, Y) = \frac{1}{4\alpha\beta} \left[ var(\alpha X + \beta Y) - var(\alpha X - \beta Y) \right],

where X and Y are random variables. In general X and Y can have different units, so we set \alpha = 1/\sqrt{var(X)} and \beta = 1/\sqrt{var(Y)}. Replacing the variance with P_n^2, we can similarly construct

\gamma_P(X, Y) = \frac{1}{4\alpha\beta} \left[ P_n^2(\alpha X + \beta Y) - P_n^2(\alpha X - \beta Y) \right],

where \alpha = 1/P_n(X) and \beta = 1/P_n(Y).
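A minimal Python sketch of γ_P, assuming NumPy; pn_scale is a small implementation of P_n as the scaled IQR of pairwise means, and the function names are my own:

```python
import numpy as np

def pn_scale(x, c=1.048):
    # P_n: c times the IQR of the pairwise means (X_i + X_j)/2
    i, j = np.triu_indices(len(x), k=1)
    pm = (x[i] + x[j]) / 2.0
    q25, q75 = np.quantile(pm, [0.25, 0.75])
    return c * (q75 - q25)

def gamma_p(x, y):
    """Pairwise covariance via the GK identity, with P_n^2 in place
    of the variance; alpha and beta standardise the two scales."""
    alpha, beta = 1.0 / pn_scale(x), 1.0 / pn_scale(y)
    return (pn_scale(alpha * x + beta * y) ** 2
            - pn_scale(alpha * x - beta * y) ** 2) / (4.0 * alpha * beta)

rng = np.random.default_rng(2)
z = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=500)
print(round(gamma_p(z[:, 0], z[:, 1]), 2))  # near the true covariance 0.5
```

Note that γ_P(X, X) reduces exactly to P_n^2(X), since the identity collapses when the two arguments coincide.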

Robustness properties

Breakdown value. P_n can break down if 25% of the pairwise means are contaminated. Since contaminating a fraction ε of the observations contaminates a fraction 1 - (1 - ε)^2 of the pairwise means, this occurs when 13.4% of the observations are contaminated. γ_P inherits the 13.4% breakdown value from P_n. Optionally, adaptive trimming lifts this to a 50% breakdown value.

Influence curve. Following Genton and Ma (1999), we have shown that the influence curve, and therefore the gross error sensitivity, of γ_P can be derived from the IC of P_n. Key point: the IC is bounded, and hence the gross error sensitivity is finite.
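The boundedness is easy to see empirically. A small sketch of my own, with an arbitrary gross-outlier value, placing 5% of the observations well away from the bulk (comfortably below the 13.4% breakdown value):

```python
import numpy as np

def pn_scale(x, c=1.048):
    # P_n: c times the IQR of the pairwise means (X_i + X_j)/2
    i, j = np.triu_indices(len(x), k=1)
    pm = (x[i] + x[j]) / 2.0
    q25, q75 = np.quantile(pm, [0.25, 0.75])
    return c * (q75 - q25)

rng = np.random.default_rng(3)
x = rng.normal(size=200)   # clean N(0, 1) sample, true scale 1
x[:10] = 1000.0            # contaminate 5% with gross outliers
print(round(np.std(x), 1))    # the SD is driven far from 1
print(round(pn_scale(x), 2))  # P_n is inflated only mildly
```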

Outline The robust scale estimator P n Robust covariance estimation Robust covariance matrices Conclusion and key references

Covariance matrices

Good covariance matrices should be:
1. Positive semidefinite.
2. Affine equivariant. That is, if \hat{C} is a covariance matrix estimator, A and a are constants and X is random, then \hat{C}(AX + a) = A \hat{C}(X) A^\top.

How can we ensure that a matrix full of pairwise covariances is a good covariance matrix? Maronna and Zamar (2002) outline the OGK routine, which ensures that the resulting covariance matrix is positive definite and approximately affine equivariant.
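To make the pipeline concrete: a sketch that fills a matrix with pairwise γ_P entries and then repairs it by clipping negative eigenvalues. The clipping step is a simple stand-in of my own for illustration, not the actual OGK orthogonalisation of Maronna and Zamar (2002):

```python
import numpy as np

def pn_scale(x, c=1.048):
    # P_n: c times the IQR of the pairwise means (X_i + X_j)/2
    i, j = np.triu_indices(len(x), k=1)
    pm = (x[i] + x[j]) / 2.0
    q25, q75 = np.quantile(pm, [0.25, 0.75])
    return c * (q75 - q25)

def gamma_p(x, y):
    # GK identity with P_n^2 replacing the variance
    a, b = 1.0 / pn_scale(x), 1.0 / pn_scale(y)
    return (pn_scale(a * x + b * y) ** 2
            - pn_scale(a * x - b * y) ** 2) / (4.0 * a * b)

def pairwise_cov(X):
    """Matrix of gamma_P entries with P_n^2 on the diagonal; such
    a matrix need not be positive semidefinite."""
    p = X.shape[1]
    C = np.empty((p, p))
    for a in range(p):
        C[a, a] = pn_scale(X[:, a]) ** 2
        for b in range(a + 1, p):
            C[a, b] = C[b, a] = gamma_p(X[:, a], X[:, b])
    return C

def clip_to_psd(C):
    """Clip negative eigenvalues to zero (NOT the OGK routine)."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(np.clip(w, 0.0, None)) @ V.T

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 4))
C = clip_to_psd(pairwise_cov(X))
print(np.linalg.eigvalsh(C).min())  # non-negative, up to float error
```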

Application: principal component analysis

Definition (Principal Component Analysis). PCA is a dimension reduction technique that transforms a set of variables into a set of uncorrelated variables, the principal components. PCA can be formulated as an eigendecomposition of a covariance matrix, and so is inherently susceptible to outliers.

Example (versicolor iris data). 50 observations on 4 variables: the length and width of the petals and sepals.
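PCA from a covariance matrix takes only a few lines; this is a generic NumPy sketch of my own, not the talk's code. Robustifying it amounts to swapping a robust covariance estimate in for np.cov:

```python
import numpy as np

def pca(C):
    """Eigendecomposition of a covariance matrix: eigenvalues are
    the component variances, eigenvectors the loadings, sorted so
    the largest component comes first."""
    w, V = np.linalg.eigh(C)        # eigh returns ascending order
    order = np.argsort(w)[::-1]
    return w[order], V[:, order]

rng = np.random.default_rng(5)
X = rng.multivariate_normal([0, 0, 0], np.diag([4.0, 1.0, 0.25]), size=400)
eigvals, loadings = pca(np.cov(X, rowvar=False))
print(np.round(eigvals, 1))  # roughly 4, 1, 0.25 up to sampling noise
```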

[Figure: scatterplot matrix of the versicolor iris data: Sepal Length, Sepal Width, Petal Length, Petal Width.]

Covariance matrices for the iris data

Classical covariance (lower triangle):

                Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
  Sepal.Length      0.27
  Sepal.Width       0.09          0.10
  Petal.Length      0.18          0.08         0.22
  Petal.Width       0.06          0.04         0.07          0.04

Robust, before OGK (lower triangle):

                Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
  Sepal.Length      0.28
  Sepal.Width       0.09          0.14
  Petal.Length      0.20          0.10         0.23
  Petal.Width       0.06          0.05         0.08          0.03

Covariance matrices for the iris data (continued)

Robust, before OGK (lower triangle):

                Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
  Sepal.Length      0.28
  Sepal.Width       0.09          0.14
  Petal.Length      0.20          0.10         0.23
  Petal.Width       0.06          0.05         0.08          0.03

Robust, after OGK (lower triangle):

                Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
  Sepal.Length      0.29
  Sepal.Width       0.09          0.10
  Petal.Length      0.19          0.09         0.23
  Petal.Width       0.06          0.04         0.08          0.04

[Figure: loadings of the first two principal components for the iris data, classical versus robust, for Sepal Length, Sepal Width, Petal Length and Petal Width.]

[Figure: scatterplot matrix of the iris data with 2 new observations added.]

[Figure: iris data loadings with 2 observations from a different species: classical PCA without versus with the outliers.]

[Figure: iris data loadings with 2 observations from a different species: robust PCA without versus with the outliers.]

[Figure: scatterplot matrix of the iris data with an extreme outlier in Petal Length, with values out to around 25.]

[Figure: iris data loadings with an extreme Petal Length outlier: classical PCA without versus with the outlier.]

[Figure: iris data loadings with an extreme Petal Length outlier: robust PCA without versus with the outlier.]


Summary
1. Aim: efficient, robust and widely applicable scale and covariance estimators.
2. Method: the P_n scale estimator, combined with the GK identity and an orthogonalisation procedure for pairwise covariance matrices.
3. Results: 86% asymptotic efficiency at the Gaussian distribution; the robustness properties flow through to the covariance estimates and beyond.

References

Genton, M. and Ma, Y. (1999). Robustness properties of dispersion estimators. Statistics & Probability Letters, 44(4):343-350.

Gnanadesikan, R. and Kettenring, J. R. (1972). Robust estimates, residuals and outlier detection with multiresponse data. Biometrics, 28(1):81-124.

Maronna, R. and Zamar, R. (2002). Robust estimates of location and dispersion for high-dimensional datasets. Technometrics, 44(4):307-317.

Rousseeuw, P. and Croux, C. (1993). Alternatives to the median absolute deviation. Journal of the American Statistical Association, 88(424):1273-1283.

Tarr, G., Müller, S. and Weber, N. C. (2012). A robust scale estimator based on pairwise means. Journal of Nonparametric Statistics, 24(1):187-199.