Cellwise robust regularized discriminant analysis

Size: px

Start display at page:

Download "Cellwise robust regularized discriminant analysis"

Aileen Gilmore
5 years ago
Views:

1 Cellwise robust regularized discriminant analysis JSM 2017 Stéphanie Aerts University of Liège, Belgium Ines Wilms KU Leuven, Belgium Cellwise robust regularized discriminant analysis 1

2 Discriminant analysis X = {x 1,..., x N } of dimension p, splitted into K groups, each having n k observations Goal : Classify new data x π k prior probability N p (µ k, Σ k ) conditional distribution Cellwise robust regularized discriminant analysis 2

3 Discriminant analysis X = {x 1,..., x N } of dimension p, splitted into K groups, each having n k observations Goal : Classify new data x π k prior probability N p (µ k, Σ k ) conditional distribution Quadratic discriminant analysis (QDA) : ( max (x µk ) T ) Θ k (x µ k ) + log(det Θ k ) + 2 log π k k where Θ k := Σ 1 k Linear discriminant analysis (LDA) : Homoscedasticity : Θ k = Θ k Cellwise robust regularized discriminant analysis 2

4 Discriminant analysis In practice, the parameters µ k, Θ k or Θ are often estimated by the sample means x k the inverse of the sample covariance matrices Σ k the inverse of the sample pooled covariance matrix Σ pool = 1 N K K k=1 n k Σ k Cellwise robust regularized discriminant analysis 3

5 Example - Phoneme dataset Hastie et al., 2009 K = 2 phonemes: aa (like in barn) or ao (like in more) p = 256: log intensity of the sound across 256 frequencies N = 1717 records of a male voice Correct classification performance s-lda s-qda Cellwise robust regularized discriminant analysis 4

6 Example - Phoneme dataset Hastie et al., 2009 K = 2 phonemes: aa (like in barn) or ao (like in more) p = 256: log intensity of the sound across 256 frequencies N = 1717 records of a male voice Correct classification performance s-lda s-qda Σ 1 k inaccurate when p/n k is large, not computable when p > n k Σ 1 pool inaccurate when p/n is large, not computable when p > N Cellwise robust regularized discriminant analysis 4

7 Objectives Propose a family of discriminant methods that, unlike the classical approach, are 1 computable and accurate in high dimension 2 cover the path between LDA and QDA 3 robust against cellwise outliers Cellwise robust regularized discriminant analysis 5

8 1. Computable in high dimension Graphical Lasso QDA (Xu et al., 2014) Step 1 : Compute the sample means x k and covariance matrices Σ k Step 2 : Graphical Lasso (Friedman et al, 2008) to estimate Θ 1,..., Θ K : ) arg max n k log det(θ k ) n k tr (Θ k Σk λ 1 θ k,ij Θ k Step 3 : Plug x 1,..., x K and Θ 1,..., Θ K into the quadratic rule Note : Use pooled covariance matrix for Graphical Lasso LDA. i j Cellwise robust regularized discriminant analysis 6

9 QDA does not exploit similarities LDA ignores the group specificities Cellwise robust regularized discriminant analysis 7

10 2. Covering path between LDA and QDA Joint Graphical Lasso DA (Price et al., 2015) Step 1 : Compute the sample means x k and covariance matrices Σ k Step 2 : Joint Graphical Lasso (Danaher et al, 2014) to estimate Θ 1,..., Θ K : max Θ 1,...,Θ K K n k log det(θ k ) n k tr(θ k Σk ) λ 1 k=1 k=1 i j k<k i,j K θ k,ij λ 2 θ k,ij θ k,ij, Step 3 : Plug x 1,..., x K and Θ 1,..., Θ K into the quadratic discriminant rule Cellwise robust regularized discriminant analysis 8

11 Lack of robustness Cellwise robust regularized discriminant analysis 9

12 3. Robustness against cellwise outliers Robust Joint Graphical Lasso DA Step 1 : Compute robust mean m k and covariance matrice S k estimates Step 2 : Joint Graphical Lasso to estimate Θ 1,..., Θ K max Θ 1,...,Θ K K K n k log det(θ k ) n k tr(θ k S k ) λ 1 θ k,ij λ 2 θ k,ij θ k,ij, k=1 k=1 i j k<k i,j Step 3 : Plug m 1,..., m K and Θ 1,..., Θ K into the quadratic discriminant rule. Cellwise robust regularized discriminant analysis 10

13 Rowwise and cellwise contamination Cellwise robust regularized discriminant analysis 11

14 Rowwise and cellwise contamination Cellwise robust regularized discriminant analysis 11

15 Rowwise and cellwise contamination m k : vector of marginal medians S k : cellwise robust covariance matrices Cellwise robust regularized discriminant analysis 11

16 Cellwise robust covariance estimators s s 1i... s 1p... S k = s i1... s ij... s ip... s p1... s pj... s pp s ij = scale(x i ) scale(x j ) ĉorr(x i, X j ) scale(.) : Q n -estimator (Rousseeuw and Croux,93) ĉorr(.,.) : Kendall s correlation ĉorr K (X i, X j ) = 2 n(n 1) l<m ( ) sign (xl i xm)(x i j l xm) j. (see Croux and Öllerer, 2015; Tarr et al., 2016) Cellwise robust regularized discriminant analysis 12

17 Simulation study Setting K = 10 groups n k = 30 p = 30 M = 1000 training and test sets Classification Performance Average percentage of correct classification Estimation accuracy Average Kullback-Leibler distance : KL( Θ 1,..., Θ K ; Θ 1,..., Θ K ) = ( K k=1 log det( Θ k Θ 1 k ) + tr( Θ k Θ 1 k ) ) Kp. Cellwise robust regularized discriminant analysis 13

18 Uncontaminated scheme Non-robust estimators s-lda s-qda GL-LDA GL-QDA JGL-DA p = 30 CC 77.7 NA KL NA Robust estimators r-lda r-qda rgl-lda rgl-qda rjgl-da p = 30 CC KL Cellwise robust regularized discriminant analysis 14

19 Contaminated scheme: 5% of cellwise contamination Correct classification percentages, p = s LDA r LDA s QDA r QDA GL LDA rgl LDA GL QDA GL QDA JGL DA rjgl DA Cellwise robust regularized discriminant analysis 15

20 Example - Phoneme dataset K = 2 p = 256 N = 1717 Correct classification performance s-lda s-qda GL-LDA GL-QDA JGL-DA r-lda r-qda rgl-lda rgl-qda rjgl-da N train = 1030, N test = 687, averaged over 10 splits Cellwise robust regularized discriminant analysis 16

21 Conclusion The proposed discriminant methods : 1 are computable in high dimension 2 cover the path between LDA and QDA 3 are robust against cellwise outliers 4 detect rowwise and cellwise outliers Code publicly available Cellwise robust regularized discriminant analysis 17

22 References S. Aerts, I. Wilms. Cellwise robust regularized discriminant analysis. FEB Research Report, KBI 1701, C. Croux and V. Öllerer. Robust high-dimensional precision matrix estimation, Modern Multivariate and Robust Methods. Springer, 2015 P. Danaher, P. Wang, and D. Witten. The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society, Series B, 76: , B. Price, C. Geyer, and A. Rothman. Ridge fusion in statistical learning. Journal of Computational and Graphical Statistics, 24(2): , P. Rousseeuw and C. Croux. Alternatives to the median absolute deviation. Journal of the American Statistical Association, 88(424): , G. Tarr, S. Müller, and N.C. Weber. Robust estimation of precision matrices under cellwise contamination. Computational Statistics and Data Analysis, 93: , B. Xu, K. Huang, King I., C. Liu, J. Sun, and N. Satoshi. Graphical lasso quadratic discriminant function and its application to character recognition. Neurocomputing, 129: 33 40, Cellwise robust regularized discriminant analysis 18

Cellwise robust regularized discriminant analysis

Cellwise robust regularized discriminant analysis Ines Wilms (KU Leuven) and Stéphanie Aerts (University of Liège) ICORS, July 2017 Wilms and Aerts Cellwise robust regularized discriminant analysis 1 Discriminant