Sparse Principal Component Analysis for multiblock data and its extension to Sparse Multiple Correspondence Analysis


Anne Bernard (1,5), Hervé Abdi (2), Arthur Tenenhaus (3), Christiane Guinot (4), Gilbert Saporta (5)

(1) CE.R.I.E.S., Neuilly-sur-Seine, France
(2) Department of Brain and Behavioral Sciences, The University of Texas at Dallas, Richardson, TX, USA
(3) SUPELEC, Gif-sur-Yvette, France
(4) Université François Rabelais, Département d'Informatique, Tours, France
(5) Laboratoire Cedric, CNAM, Paris, France

June 13, 2013

Background

Variable selection and dimensionality reduction are necessary in many application domains, and more specifically in genetics.
- Quantitative data analysis: PCA
- Categorical data analysis: MCA

With high-dimensional data ($n \ll p$), the principal components (PCs) become difficult to interpret.

Two ways of obtaining simple structures:
1. Sparse methods, such as sparse PCA for quantitative data: PCs are combinations of a few original variables. Extended here to a multiblock data structure (Group Sparse PCA, GSPCA).
2. Extension to categorical variables: development of a new sparse method (Sparse MCA).

Data and methods

Uniblock data structure

- Principal Component Analysis (PCA): each PC is a linear combination of all the original variables, so all loadings are nonzero; PCA results are difficult to interpret.
- Multiple Correspondence Analysis (MCA): the counterpart of PCA for categorical data.
- Sparse PCA (SPCA): modified PCs with sparse loadings, obtained via a regularized SVD.

Principal Component Analysis (PCA)

$\mathbf{X}$: matrix of quantitative variables. PCA can be computed via the SVD of $\mathbf{X}$.

Singular value decomposition (SVD) of $\mathbf{X}$:
$$\mathbf{X} = \mathbf{U}\mathbf{D}\mathbf{V}^{T} \quad \text{with} \quad \mathbf{U}^{T}\mathbf{U} = \mathbf{V}^{T}\mathbf{V} = \mathbf{I} \qquad (1)$$
where $\mathbf{Z} = \mathbf{U}\mathbf{D}$ gives the PCs and $\mathbf{V}$ the corresponding (nonzero) loadings. PCA results are difficult to interpret.

SVD as low-rank approximation of matrices:
$$\min_{\mathbf{X}^{(1)}} \|\mathbf{X} - \mathbf{X}^{(1)}\|_{2}^{2} = \min_{\tilde{u},\tilde{v}} \|\mathbf{X} - \tilde{u}\tilde{v}^{T}\|_{2}^{2} \qquad (2)$$
where $\mathbf{X}^{(1)}$ ranges over rank-one matrices. $\mathbf{X}^{(1)} = \tilde{u}\tilde{v}^{T}$ is the best rank-one approximation of $\mathbf{X}$, with $\tilde{u} = \delta_{1} u_{1}$ the first component and $\tilde{v} = v_{1}$ the first loading vector.
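To make equations (1) and (2) concrete, here is a minimal NumPy sketch (the random data and variable names are ours, not from the slides) that computes the PCs as Z = UD and checks the rank-one approximation:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
X -= X.mean(axis=0)                    # column-center before PCA

# SVD: X = U D V^T, with U^T U = V^T V = I   (equation (1))
U, d, Vt = np.linalg.svd(X, full_matrices=False)
Z = U * d                              # principal components Z = U D
V = Vt.T                               # loadings: in general every entry is nonzero

# Best rank-one approximation X(1) = delta_1 u_1 v_1^T   (equation (2))
X1 = d[0] * np.outer(U[:, 0], V[:, 0])
print(np.linalg.norm(X - X1, "fro"))   # minimal error over all rank-one matrices
```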

Sparse Principal Component Analysis (SPCA)

Challenge: facilitate the interpretation of PCA results. How? PCs with many zero loadings (sparse loadings).

Several approaches:
- SCoTLASS, Jolliffe et al. (2003)
- SPCA, Zou et al. (2006)
- SPCA-rSVD, Shen and Huang (2008)

Principle: use the connection of PCA with the SVD, i.e. a low-rank matrix approximation problem with a regularization penalty on the loadings.

Regularization in regression

Lasso penalty (Tibshirani, 1996): a variable selection technique that produces sparse models. It imposes the $L_{1}$ norm on the linear regression coefficients.

Regression with $L_{1}$ penalization:
$$\hat{\beta}_{lasso} = \arg\min_{\beta} \|y - \mathbf{X}\beta\|^{2} + \lambda \sum_{j=1}^{J} |\beta_{j}| \quad \text{with} \quad P_{\lambda}(\beta) = \lambda \sum_{j=1}^{J} |\beta_{j}|$$

- If $\lambda = 0$: usual regression (all coefficients nonzero).
- As $\lambda$ increases: some coefficients are set to zero, hence a selection of variables; but the number of selected variables is bounded by the number of units.

Regularization in regression

Other kinds of penalties:

- Hard thresholding penalty (Antoniadis, 1997):
$$P_{\lambda}(|\beta|) = \lambda^{2} - (|\beta| - \lambda)^{2}\, I(|\beta| < \lambda)$$

- SCAD penalty (Smoothly Clipped Absolute Deviation; Fan and Li, 2001), defined through its derivative for $\beta > 0$, with $a > 2$:
$$P'_{\lambda}(\beta) = \lambda \left( I(\beta \le \lambda) + \frac{(a\lambda - \beta)_{+}}{(a-1)\lambda}\, I(\beta > \lambda) \right)$$

- Group lasso penalty (Yuan and Lin, 2006):
$$P_{\lambda}(\beta) = \lambda \sum_{k=1}^{K} \sqrt{J_{[k]}}\, \|\beta_{k}\|_{2} \qquad (3)$$
with $J_{[k]}$ the number of variables in group $k$.
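For illustration, a small hedged sketch (plain NumPy; the function names and the test vector are ours) evaluating these penalties; $a = 3.7$ is the default value suggested by Fan and Li:

```python
import numpy as np

def lasso_penalty(beta, lam):
    # L1 penalty: lam * sum_j |beta_j|
    return lam * np.sum(np.abs(beta))

def hard_threshold_penalty(beta, lam):
    # Antoniadis (1997): lam^2 - (|beta| - lam)^2 * I(|beta| < lam), summed over j
    b = np.abs(beta)
    return np.sum(lam**2 - (b - lam)**2 * (b < lam))

def scad_penalty_derivative(beta, lam, a=3.7):
    # Fan & Li (2001), derivative form; a = 3.7 is their suggested default
    b = np.abs(beta)
    return lam * ((b <= lam) + np.maximum(a * lam - b, 0.0) / ((a - 1) * lam) * (b > lam))

beta = np.array([-2.0, -0.3, 0.0, 0.5, 3.0])
print(lasso_penalty(beta, 0.5), hard_threshold_penalty(beta, 0.5))
print(scad_penalty_derivative(beta, 0.5))
```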

Sparse PCA via regularized SVD (SPCA-rSVD)

Low-rank matrix approximation problem, plus a regularization penalty applied to the loading vector $\tilde{v}$:
$$\min_{\tilde{u},\tilde{v}} \|\mathbf{X} - \tilde{u}\tilde{v}^{T}\|_{2}^{2} + P_{\lambda}(\tilde{v}) \qquad (4)$$
with, assuming $\tilde{u}^{T}\tilde{u} = 1$,
$$\begin{aligned}
\|\mathbf{X} - \tilde{u}\tilde{v}^{T}\|_{2}^{2} + P_{\lambda}(\tilde{v})
&= \operatorname{tr}\!\left((\mathbf{X} - \tilde{u}\tilde{v}^{T})^{T}(\mathbf{X} - \tilde{u}\tilde{v}^{T})\right) + P_{\lambda}(\tilde{v}) \\
&= \|\mathbf{X}\|_{2}^{2} + \operatorname{tr}(\tilde{v}^{T}\tilde{v}) - 2\operatorname{tr}(\mathbf{X}^{T}\tilde{u}\tilde{v}^{T}) + P_{\lambda}(\tilde{v}) \\
&= \|\mathbf{X}\|_{2}^{2} + \sum_{j=1}^{J}\left(\tilde{v}_{j}^{2} - 2(\mathbf{X}^{T}\tilde{u})_{j}\tilde{v}_{j} + P_{\lambda}(\tilde{v}_{j})\right)
\end{aligned} \qquad (5)$$

Sparse PCA via regularized SVD (SPCA-rSVD)

Iterative procedure:
1. $\tilde{v}$ fixed: the minimizer of (4) is $\tilde{u} = \mathbf{X}\tilde{v} / \|\mathbf{X}\tilde{v}\|$.
2. $\tilde{u}$ fixed: the minimizer of (4) is $\tilde{v} = h_{\lambda}(\mathbf{X}^{T}\tilde{u})$ with
$$h_{\lambda}(\mathbf{X}^{T}\tilde{u}) = \arg\min_{\tilde{v}_{j}} \left( \tilde{v}_{j}^{2} - 2(\mathbf{X}^{T}\tilde{u})_{j}\tilde{v}_{j} + P_{\lambda}(\tilde{v}_{j}) \right) \qquad (6)$$

- For the lasso penalty: $h_{\lambda}(\mathbf{X}^{T}\tilde{u}) = \operatorname{sign}(\mathbf{X}^{T}\tilde{u})\,(|\mathbf{X}^{T}\tilde{u}| - \lambda)_{+}$
- For the hard thresholding penalty: $h_{\lambda}(\mathbf{X}^{T}\tilde{u}) = I(|\mathbf{X}^{T}\tilde{u}| > \lambda)\,\mathbf{X}^{T}\tilde{u}$

The minimum is obtained by applying $h_{\lambda}$ to the vector $\mathbf{X}^{T}\tilde{u}$ componentwise, which yields $v_{1}$. Subsequent sparse loadings $v_{i}$ ($i > 1$) are obtained via rank-one approximation of residual matrices.
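The following is a minimal sketch of this alternating procedure with the lasso (soft-thresholding) rule, assuming a column-centered X; the function names and stopping rule are ours, not from the slides:

```python
import numpy as np

def soft_threshold(z, lam):
    # Lasso solution of (6): sign(z) * (|z| - lam)_+
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def spca_rank_one(X, lam, n_iter=200, tol=1e-8):
    """One sparse rank-one factor of X via alternating updates (SPCA-rSVD style)."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], d[0] * Vt[0]               # warm start from the plain SVD
    for _ in range(n_iter):
        v_new = soft_threshold(X.T @ u, lam)   # step 2: u fixed
        Xv = X @ v_new
        if np.linalg.norm(Xv) == 0.0:
            return u, v_new                    # everything was thresholded away
        u_new = Xv / np.linalg.norm(Xv)        # step 1: v fixed
        if np.linalg.norm(v_new - v) < tol:
            return u_new, v_new
        u, v = u_new, v_new
    return u, v

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 8))
u, v = spca_rank_one(X - X.mean(axis=0), lam=2.0)
print(v)   # sparse loading vector: exact zeros for large enough lam
```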

Multiblock data structure

Group Sparse PCA (GSPCA):
- Challenge: create sparseness in a multiblock data structure.
- Principle: modified PCs with sparse loadings, using the group lasso penalty.
- Result: all the variables of a block are selected or removed together.

Group Sparse PCA

Group Sparse PCA: an extension of SPCA-rSVD to blocks of quantitative variables.
- Challenge: select groups of continuous variables, i.e. assign zero coefficients to entire blocks of variables.
- How? As a compromise between SPCA-rSVD and the group lasso.

Group Sparse PCA

$\mathbf{X}$: matrix of quantitative variables divided into $K$ sub-matrices. SVD of $\mathbf{X}$:
$$\mathbf{X} = \left[\mathbf{X}_{[1]} \cdots \mathbf{X}_{[k]} \cdots \mathbf{X}_{[K]}\right] = \mathbf{U}\mathbf{D}\mathbf{V}^{T} = \mathbf{U}\mathbf{D}\left[\mathbf{V}_{[1]}^{T} \cdots \mathbf{V}_{[k]}^{T} \cdots \mathbf{V}_{[K]}^{T}\right]$$
where each row of $\mathbf{V}^{T}$ is a vector divided into $K$ blocks.

Group Sparse PCA

Recall SPCA-rSVD:
$$\min_{\tilde{u},\tilde{v}} \|\mathbf{X} - \tilde{u}\tilde{v}^{T}\|_{2}^{2} + P_{\lambda}(\tilde{v}) \qquad (7)$$
For $\tilde{u}$ fixed: $\tilde{v} = h_{\lambda}(\mathbf{X}^{T}\tilde{u}) = \arg\min_{\tilde{v}_{j}}\left(\tilde{v}_{j}^{2} - 2(\mathbf{X}^{T}\tilde{u})_{j}\tilde{v}_{j} + P_{\lambda}(\tilde{v}_{j})\right)$.

In GSPCA:
$$\tilde{v} = h_{\lambda}(\mathbf{X}^{T}\tilde{u}) = \arg\min_{\tilde{v}_{[k]}} \left( \operatorname{tr}(\tilde{v}_{[k]}\tilde{v}_{[k]}^{T}) - 2\operatorname{tr}(\tilde{v}_{[k]}\tilde{u}^{T}\mathbf{X}_{[k]}) + P_{\lambda}(\tilde{v}_{[k]}) \right)$$
Here $P_{\lambda}$ is the group lasso regularizer:
$$P_{\lambda}(\tilde{v}) = \lambda \sum_{k=1}^{K} \sqrt{J_{[k]}}\, \|\tilde{v}_{[k]}\|_{2} \qquad (8)$$
The solution is
$$h_{\lambda}(\mathbf{X}_{[k]}^{T}\tilde{u}) = \left(1 - \frac{\lambda \sqrt{J_{[k]}}}{2\,\|\mathbf{X}_{[k]}^{T}\tilde{u}\|_{2}}\right)_{+} \mathbf{X}_{[k]}^{T}\tilde{u} \qquad (9)$$
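Equation (9) is a blockwise soft-thresholding operator: the whole block is shrunk by a common factor and is zeroed out once that factor reaches zero. A minimal sketch (our names, not from the slides):

```python
import numpy as np

def group_threshold(z_block, lam):
    # Equation (9): (1 - lam * sqrt(J_k) / (2 * ||z_block||_2))_+ * z_block
    norm = np.linalg.norm(z_block)
    if norm == 0.0:
        return np.zeros_like(z_block)
    shrink = max(0.0, 1.0 - lam * np.sqrt(z_block.size) / (2.0 * norm))
    return shrink * z_block   # the whole block survives or dies together

z = np.array([0.8, -0.1, 0.3])
print(group_threshold(z, lam=0.5))   # shrunk block
print(group_threshold(z, lam=5.0))   # zeroed block
```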

Group Sparse PCA

Group Sparse PCA Algorithm
1. Apply the SVD to $\mathbf{X}$ and obtain the best rank-one approximation of $\mathbf{X}$ as $\delta\tilde{u}\tilde{v}^{T}$. Set $\tilde{v}_{old} = \tilde{v} = [\tilde{v}_{[1]}, \ldots, \tilde{v}_{[K]}]$ and $\tilde{u}_{old} = \delta\tilde{u}$.
2. Update:
   a) $\tilde{v}_{new} = [h_{\lambda}(\mathbf{X}_{[1]}^{T}\tilde{u}_{old}), \ldots, h_{\lambda}(\mathbf{X}_{[K]}^{T}\tilde{u}_{old})]$
   b) $\tilde{u}_{new} = \mathbf{X}\tilde{v}_{new} / \|\mathbf{X}\tilde{v}_{new}\|$
3. Repeat Step 2, replacing $\tilde{u}_{old}$ and $\tilde{v}_{old}$ by $\tilde{u}_{new}$ and $\tilde{v}_{new}$, until convergence.

Sparse loadings $v_{i}$ ($i > 1$) are obtained via rank-one approximation of residual matrices. The tuning parameter $\lambda$ is chosen by cross-validation or by an ad hoc approach.
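A hedged sketch of the full loop, assuming contiguous blocks of columns; the data, block layout, and stopping rule are ours for illustration:

```python
import numpy as np

def group_threshold(z_block, lam):
    # Block thresholding of equation (9), repeated so this sketch runs on its own
    norm = np.linalg.norm(z_block)
    if norm == 0.0:
        return np.zeros_like(z_block)
    return max(0.0, 1.0 - lam * np.sqrt(z_block.size) / (2.0 * norm)) * z_block

def gspca_rank_one(X, blocks, lam, n_iter=200, tol=1e-8):
    # Step 1: SVD start, u_old = delta * u, v_old = v
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = d[0] * U[:, 0], Vt[0].copy()
    for _ in range(n_iter):
        # Step 2a: group thresholding of X_[k]^T u_old, block by block
        v_new = np.concatenate([group_threshold(X[:, b].T @ u, lam) for b in blocks])
        Xv = X @ v_new
        if np.linalg.norm(Xv) == 0.0:
            return u, v_new                  # every block was zeroed out
        u_new = Xv / np.linalg.norm(Xv)      # step 2b
        if np.linalg.norm(v_new - v) < tol:
            return u_new, v_new              # step 3: converged
        u, v = u_new, v_new
    return u, v

rng = np.random.default_rng(2)
X = rng.standard_normal((40, 9))
X -= X.mean(axis=0)
blocks = [np.arange(0, 3), np.arange(3, 6), np.arange(6, 9)]  # three blocks of 3
u, v = gspca_rank_one(X, blocks, lam=1.5)
print(v.round(3))   # whole blocks are zero or nonzero together
```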

Multiblock data structure

Selecting one column of the original table (a categorical variable of $\mathbf{X}$) amounts to selecting the corresponding block of indicator variables in the complete disjunctive table.

Sparse MCA (SMCA): an extension of GSPCA to blocks of indicator variables.

MCA

Generalized SVD (GSVD) of $\mathbf{F}$, with:
- $r = \mathbf{F}\mathbf{1}$ and $c = \mathbf{F}^{T}\mathbf{1}$ the row and column margins, $\mathbf{D}_{r} = \operatorname{diag}(r)$, $\mathbf{D}_{c} = \operatorname{diag}(c)$
- $p_{[j]}$ = number of modalities of variable $j$

$$\mathbf{F} = \mathbf{P}\boldsymbol{\Delta}\mathbf{Q}^{T} \quad \text{with} \quad \mathbf{P}^{T}\mathbf{M}\mathbf{P} = \mathbf{Q}^{T}\mathbf{W}\mathbf{Q} = \mathbf{I}$$
where $\mathbf{F} = [\mathbf{F}_{[1]} \cdots \mathbf{F}_{[j]} \cdots \mathbf{F}_{[J]}]$ and $\mathbf{Q} = [\mathbf{Q}_{[1]} \cdots \mathbf{Q}_{[j]} \cdots \mathbf{Q}_{[J]}]$.

- In the case of PCA: $\mathbf{M} = \mathbf{W} = \mathbf{I}$.
- In the case of MCA: $\mathbf{M} = \mathbf{D}_{r}$, $\mathbf{W} = \mathbf{D}_{c}$.

From MCA to SMCA

GSVD as low-rank matrix approximation:
$$\min_{\mathbf{F}^{(1)}} \|\mathbf{F} - \mathbf{F}^{(1)}\|_{\mathbf{W}}^{2} = \min_{\tilde{p},\tilde{q}} \|\mathbf{F} - \tilde{p}\tilde{q}^{T}\|_{\mathbf{W}}^{2} \qquad (10)$$
where $\|\mathbf{F}\|_{\mathbf{W}}^{2} = \operatorname{tr}(\mathbf{M}^{\frac{1}{2}}\mathbf{F}\mathbf{W}\mathbf{F}^{T}\mathbf{M}^{\frac{1}{2}})$ is the $\mathbf{W}$-generalized norm; the norm is taken under the constraints given by $\mathbf{M}$ and $\mathbf{W}$.
$$\|\mathbf{F} - \tilde{p}\tilde{q}^{T}\|_{\mathbf{W}}^{2} = \operatorname{tr}\left(\mathbf{M}^{\frac{1}{2}}(\mathbf{F} - \tilde{p}\tilde{q}^{T})\mathbf{W}(\mathbf{F} - \tilde{p}\tilde{q}^{T})^{T}\mathbf{M}^{\frac{1}{2}}\right) \qquad (11)$$

Sparse MCA problem:
$$\min_{\tilde{p},\tilde{q}} \|\mathbf{F} - \tilde{p}\tilde{q}^{T}\|_{\mathbf{W}}^{2} + P_{\lambda}(\tilde{q}) \quad \text{subject to} \quad \tilde{p}^{T}\mathbf{M}\tilde{p} = \tilde{q}^{T}\mathbf{W}\tilde{q} = 1 \qquad (12)$$
$$\min_{\tilde{p},\tilde{q}} \operatorname{tr}\left(\mathbf{M}^{\frac{1}{2}}(\mathbf{F} - \tilde{p}\tilde{q}^{T})\mathbf{W}(\mathbf{F} - \tilde{p}\tilde{q}^{T})^{T}\mathbf{M}^{\frac{1}{2}}\right) + P_{\lambda}(\tilde{q}) \qquad (13)$$

From MCA to SMCA

Recall the GSPCA solution:
$$\arg\min_{\tilde{v}_{[k]}}\left( \operatorname{tr}(\tilde{v}_{[k]}\tilde{v}_{[k]}^{T}) - 2\operatorname{tr}(\tilde{v}_{[k]}\tilde{u}^{T}\mathbf{X}_{[k]}) + P_{\lambda}(\tilde{v}_{[k]}) \right)$$
$$\tilde{v} = h_{\lambda}(\mathbf{X}_{[k]}^{T}\tilde{u}) = \left(1 - \frac{\lambda\sqrt{J_{[k]}}}{2\,\|\mathbf{X}_{[k]}^{T}\tilde{u}\|_{2}}\right)_{+}\mathbf{X}_{[k]}^{T}\tilde{u}$$

Sparse MCA solution:
$$\arg\min_{\tilde{q}_{[k]}}\left( \operatorname{tr}(\tilde{q}_{[k]}^{T}\mathbf{W}_{[k]}\tilde{q}_{[k]}) - 2\operatorname{tr}(\mathbf{M}\mathbf{F}_{[k]}\mathbf{W}_{[k]}\tilde{q}_{[k]}\tilde{p}^{T}) + P_{\lambda}(\tilde{q}_{[k]}) \right)$$
$$\tilde{q} = h_{\lambda}(\mathbf{F}_{[k]}^{T}\mathbf{M}\tilde{p}) = \left(1 - \frac{\lambda\sqrt{J_{[k]}}}{2\,\|\mathbf{F}_{[k]}^{T}\mathbf{M}\tilde{p}\|_{\mathbf{W}}}\right)_{+}\mathbf{F}_{[k]}^{T}\mathbf{M}\tilde{p}$$

Sparse MCA

Sparse MCA Algorithm
1. Apply the GSVD to $\mathbf{F}$ and obtain the best rank-one approximation of $\mathbf{F}$ as $\delta\tilde{p}\tilde{q}^{T}$. Set $\tilde{p}_{old} = \delta\tilde{p}$ and $\tilde{q}_{old} = \tilde{q}$.
2. Update:
   a) $\tilde{q}_{new} = [h_{\lambda}(\mathbf{F}_{[1]}^{T}\mathbf{M}\tilde{p}_{old}), \ldots, h_{\lambda}(\mathbf{F}_{[J]}^{T}\mathbf{M}\tilde{p}_{old})]$
   b) $\tilde{p}_{new} = \mathbf{F}\tilde{q}_{new} / \|\mathbf{F}\tilde{q}_{new}\|$
3. Repeat Step 2, replacing $\tilde{p}_{old}$ and $\tilde{q}_{old}$ by $\tilde{p}_{new}$ and $\tilde{q}_{new}$, until convergence.

Sparse loadings $q_{i}$ ($i > 1$) are obtained via rank-one approximation of residual matrices.
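A minimal sketch of this loop for diagonal metrics M and W, computing the GSVD start through the plain SVD of M^(1/2) F W^(1/2) (so that P = M^(-1/2) U and Q = W^(-1/2) V); all names and the toy masses are ours, not from the slides:

```python
import numpy as np

def block_threshold_W(z, lam, w_diag):
    # Group thresholding in the W-norm: ||z||_W = sqrt(z^T W z), W diagonal here
    norm_w = np.sqrt(z @ (w_diag * z))
    if norm_w == 0.0:
        return np.zeros_like(z)
    return max(0.0, 1.0 - lam * np.sqrt(z.size) / (2.0 * norm_w)) * z

def smca_rank_one(F, blocks, m_diag, w_diag, lam, n_iter=200, tol=1e-8):
    # GSVD start: F = P Delta Q^T with P^T M P = Q^T W Q = I
    ms, ws = np.sqrt(m_diag), np.sqrt(w_diag)
    U, d, Vt = np.linalg.svd((ms[:, None] * F) * ws[None, :], full_matrices=False)
    p = d[0] * (U[:, 0] / ms)            # p_old = delta * p, with P = M^(-1/2) U
    q = Vt[0] / ws                       # q_old = q, with Q = W^(-1/2) V
    for _ in range(n_iter):
        # Step 2a: blockwise thresholding of F_[j]^T M p_old
        q_new = np.concatenate([block_threshold_W(F[:, b].T @ (m_diag * p),
                                                  lam, w_diag[b]) for b in blocks])
        Fq = F @ q_new
        if np.linalg.norm(Fq) == 0.0:
            return p, q_new
        p_new = Fq / np.linalg.norm(Fq)  # step 2b
        if np.linalg.norm(q_new - q) < tol:
            return p_new, q_new          # step 3: converged
        p, q = p_new, q_new
    return p, q

# Toy usage: random nonnegative F, two blocks, toy row/column masses (assumptions)
rng = np.random.default_rng(3)
F = rng.random((10, 6))
blocks = [np.arange(0, 3), np.arange(3, 6)]
p, q = smca_rank_one(F, blocks, np.full(10, 0.1), F.sum(axis=0) / F.sum(), lam=0.5)
print(q.round(3))
```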

Properties of the SMCA

Property                     MCA                          Sparse MCA
Barycentric properties       TRUE                         TRUE
Non-correlated components    TRUE                         FALSE
Orthogonal loadings          TRUE                         FALSE
Inertia                      (δ_j / total) × 100          Adjusted variance

Sparse MCA: example on breeds of dogs

From Tenenhaus, M. (2007).

Data:
- I = 27 breeds of dogs
- J = 6 variables
- Q = 16 (total number of modalities)
- X: 27 × 6 matrix of categorical variables
- D: complete disjunctive table, D = (D_[1], ..., D_[6])
- 1 block = 1 descriptive variable = 1 sub-matrix D_[j]
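To make the block structure concrete, a small pandas sketch (the toy data are ours, not the actual 27 breeds) building a complete disjunctive table in which each categorical variable yields one block of indicator columns:

```python
import pandas as pd

# Toy stand-in for the dogs data: each column is one categorical variable
X = pd.DataFrame({
    "size":   ["large", "small", "medium", "large"],
    "weight": ["heavy", "lightweight", "lightweight", "very heavy"],
})

# Complete disjunctive table: one 0/1 indicator column per modality
D = pd.get_dummies(X)
print(D)

# One block per original variable: the indicator columns derived from it
blocks = {var: [c for c in D.columns if c.startswith(var + "_")] for var in X.columns}
print(blocks)
```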

Sparse MCA: example on breeds of dogs

Result of the MCA

[Figure: MCA factor map of the breeds and modalities; not reproduced in this transcription.]

Sparse MCA: example on breeds of dogs

Result of the Sparse MCA

Beyond λ = 0.25, the CPEV (cumulative percentage of explained variance) drops rapidly. We choose λ = 0.25 as a compromise between the number of variables selected and the percentage of variance lost.

[Figures: CPEV as a function of λ, and the resulting Sparse MCA results; not reproduced in this transcription.]

Sparse MCA: example on breeds of dogs

Comparison of the loadings

[Table: loadings of the 16 modalities (large, medium, small; lightweight, heavy, very heavy; slow, fast, very fast; unintelligent, average intelligence, very intelligent; unloving, very affectionate; aggressive, non-aggressive) on components 1-4, for MCA and for Sparse MCA, together with the number of non-zero loadings and the adjusted variance (%). The numeric values were lost in this transcription.]

Sparse MCA: application to SNP data

Single nucleotide polymorphism (SNP): a single base-pair mutation at a specific locus; the most frequent type of polymorphism.

Data:
- I = 502 women
- J = 537 SNPs
- Q = 1554 (2 or 3 modalities per SNP)
- X: matrix of categorical variables (SNPs)
- D: complete disjunctive matrix, D = (D_[1], ..., D_[537])
- 1 block = 1 SNP = 1 sub-matrix D_[j]

Sparse MCA: application to SNP data

For a smaller value of λ (lost in this transcription): CPEV = 0.36%, with 300 columns selected on component 1. For λ = 0.005: CPEV = 0.32%, with 174 columns selected on component 1. We choose λ = 0.005.

[Figure: CPEV as a function of λ for the SNP data; not reproduced in this transcription.]

Sparse MCA: application to SNP data

Comparison of the loadings

[Table: loadings of the modalities of the first four SNPs (SNP1.AA, SNP1.AG, SNP1.GG; SNP2.AA, SNP2.AG, SNP2.GG; SNP3.CC, SNP3.CG, SNP3.GG; SNP4.AA, SNP4.AG, SNP4.GG) on components 1-2, for MCA and SMCA, with the number of non-zero loadings, the variance (%), and the cumulative variance (%). The numeric values were lost in this transcription.]


Conclusions

In an unsupervised multiblock data context: GSPCA for continuous variables, SMCA for categorical variables.
- Both produce sparse loading structures (with a limited loss of explained variance), which makes the results easier to interpret and understand.
- Very powerful for variable selection in high-dimensional problems: they reduce noise as well as computation time.

Research in progress: extending Sparse MCA to select both groups and predictors within a group, i.e. sparsity at both the group and the individual feature level, as a compromise between Sparse MCA and the sparse-group lasso developed by Simon et al. (2012).

References

[1] Jolliffe, I. T. et al. (2003), A modified principal component technique based on the LASSO, Journal of Computational and Graphical Statistics, 12, 531-547.
[2] Shen, H. and Huang, J. Z. (2008), Sparse principal component analysis via regularized low rank matrix approximation, Journal of Multivariate Analysis, 99, 1015-1034.
[3] Simon, N., Friedman, J., Hastie, T., and Tibshirani, R. (2012), A Sparse-Group Lasso, Journal of Computational and Graphical Statistics.
[4] Tenenhaus, M. (2007), Statistique : méthodes pour décrire, expliquer et prévoir, Dunod.
[5] Yuan, M. and Lin, Y. (2006), Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B, 68, 49-67.
[6] Zou, H., Hastie, T., and Tibshirani, R. (2006), Sparse Principal Component Analysis, Journal of Computational and Graphical Statistics, 15, 265-286.
