Regularized PCA to denoise and visualise data


Regularized PCA to denoise and visualise data
Marie Verbanck, Julie Josse, François Husson
Laboratoire de statistique, Agrocampus Ouest, Rennes, France
CNAM, Paris, 16 January 2013
1 / 30

Outline
1. PCA
2. Regularized PCA (MSE point of view; Bayesian point of view)
3. Results
2 / 30

Geometrical point of view
Maximize the variance of the projected points; equivalently, minimize the reconstruction error: approximate X_{n×p} by a matrix of low rank S < p, minimizing ||X_{n×p} − X̂_{n×p}||².
SVD solution: X̂^PCA = U_{n×S} Λ_{S×S}^{1/2} V'_{p×S}, with F = UΛ^{1/2} the principal components and V the principal axes.
Alternating least squares (ALS) algorithm for ||X − FV'||²: F = XV(V'V)^{-1}, V = X'F(F'F)^{-1}.
3 / 30
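As a concrete illustration (a minimal NumPy sketch, not the authors' code), the rank-S SVD reconstruction and the ALS updates above can be written and compared as follows:

```python
import numpy as np

def pca_reconstruction(X, S):
    """Rank-S least-squares approximation of X via the truncated SVD."""
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :S] @ np.diag(sv[:S]) @ Vt[:S, :]

def pca_als(X, S, n_iter=500, seed=0):
    """Same approximation obtained by alternating the two regressions
    F = X V (V'V)^{-1} and V = X' F (F'F)^{-1}."""
    V = np.random.default_rng(seed).standard_normal((X.shape[1], S))
    for _ in range(n_iter):
        F = X @ V @ np.linalg.inv(V.T @ V)
        V = X.T @ F @ np.linalg.inv(F.T @ F)
    return F @ V.T

X = np.random.default_rng(1).standard_normal((100, 20))
X = X - X.mean(axis=0)                      # PCA works on centred columns
print(np.linalg.norm(pca_reconstruction(X, 2) - pca_als(X, 2)))  # close to 0
```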

Model in PCA: x = signal + noise
Random effect model: Factor Analysis, Probabilistic PCA (Lawley, 1953; Roweis, 1998; Tipping & Bishop, 1999): individuals are exchangeable; focus on the relationships between variables.
Fixed effect model (Lawley, 1941; Caussinus, 1986): individuals are not i.i.d.; study both the individuals and the variables, e.g. sensory analysis (products × sensory descriptors), plant breeding (genotypes × environments).
4 / 30

Fixed effect model
X_{n×p} = X̃_{n×p} + ε_{n×p}, i.e. x_ij = Σ_{s=1}^S √d_s q_is r_js + ε_ij, with ε_ij ~ N(0, σ²).
Randomness comes from the noise only; individuals have different expectations (individual parameters).
Inferential framework: when n → ∞ the number of parameters → ∞; asymptotics are taken as σ → 0.
Maximum likelihood estimates (θ̂) = least squares estimates; the EM algorithm reduces to the ALS algorithm.
Not necessarily the best in terms of MSE. Shrinkage estimates minimize E((φθ̂ − θ)²).
5 / 30

Outline
1. PCA
2. Regularized PCA (MSE point of view; Bayesian point of view)
3. Results
6 / 30

Minimizing the MSE
x_ij = x̃_ij + ε_ij = Σ_{s=1}^S x̃_ij^(s) + ε_ij
MSE = E[ Σ_{i,j} ( Σ_{s=1}^{min(n−1,p)} φ_s x̂_ij^(s) − x̃_ij )² ], with x̂_ij^(s) = √λ_s u_is v_js
MSE = E[ Σ_{i,j} ( Σ_{s=1}^S (φ_s x̂_ij^(s) − x̃_ij^(s)) + Σ_{s=S+1}^{min(n−1,p)} (φ_s x̂_ij^(s) − x̃_ij^(s)) )² ]
For all s ≥ S+1, x̃_ij^(s) = 0, therefore φ_{S+1} = ... = φ_{min(n−1,p)} = 0.
7 / 30

Minimizing the MSE
MSE = E[ Σ_{i,j} ( Σ_{s=1}^S φ_s x̂_ij^(s) − x̃_ij )² ], with x̂_ij^(s) = √λ_s u_is v_js
φ_s = Σ_{i,j} E(x̂_ij^(s)) x̃_ij^(s) / Σ_{i,j} E(x̂_ij^{(s)2}) = Σ_{i,j} E(x̂_ij^(s)) x̃_ij^(s) / Σ_{i,j} ( V(x̂_ij^(s)) + (E(x̂_ij^(s)))² )
When σ → 0: E(x̂_ij^(s)) = x̃_ij^(s) (unbiased) and V(x̂_ij^(s)) = σ²/min(n−1; p) (Pazman & Denis, 1996; Denis & Gower, 1999), so
φ_s = Σ_{i,j} x̃_ij^{(s)2} / ( npσ²/min(n−1; p) + Σ_{i,j} x̃_ij^{(s)2} )
8 / 30

Minimizing the MSE
x_ij = x̃_ij + ε_ij = Σ_{s=1}^S √d_s q_is r_js + ε_ij, ε_ij ~ N(0, σ²)
MSE = E[ Σ_{i,j} ( Σ_{s=1}^S φ_s x̂_ij^(s) − x̃_ij )² ], with x̂_ij^(s) = √λ_s u_is v_js
φ_s = Σ_{i,j} x̃_ij^{(s)2} / ( npσ²/min(n−1; p) + Σ_{i,j} x̃_ij^{(s)2} ) = d_s / ( npσ²/min(n−1; p) + d_s ) = signal variance / total variance
Estimated weights: φ̂_s = (λ_s − σ̂²)/λ_s for s = 1, ..., S
9 / 30

Regularized PCA
x̂_ij^rPCA = Σ_{s=1}^S ( (λ_s − σ̂²)/λ_s ) √λ_s u_is v_js = Σ_{s=1}^S ( √λ_s − σ̂²/√λ_s ) u_is v_js, whereas x̂_ij^PCA = Σ_{s=1}^S √λ_s u_is v_js.
A compromise between hard and soft thresholding.
If σ̂² is small, regularized PCA ≈ PCA; if σ̂² is large, the individual coordinates are shrunk toward the center of gravity.
σ̂² = RSS/df = Σ_{s=S+1}^{min(n−1,p)} λ_s / ( np − p − nS − pS + S² + S ), for X_{n×p}, U_{n×S}, V_{p×S} (only asymptotically unbiased).
10 / 30
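A minimal sketch of this shrinkage (assuming NumPy and a data matrix X that has already been centred, as in the slides; the helper name rpca_reconstruction is mine, not the authors'):

```python
import numpy as np

def rpca_reconstruction(X, S):
    """Shrink each retained singular value sqrt(lambda_s) to
    sqrt(lambda_s) - sigma2_hat / sqrt(lambda_s)."""
    n, p = X.shape
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    lam = sv ** 2                                     # lambda_s
    # residual sum of squares of the rank-S fit over the residual df
    df = n * p - p - n * S - p * S + S**2 + S
    sigma2 = lam[S:min(n - 1, p)].sum() / df
    shrunk = sv[:S] - sigma2 / sv[:S]                 # assumes lambda_s > sigma2_hat
    return U[:, :S] @ np.diag(shrunk) @ Vt[:S, :], sigma2
```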

Illustration of the regularization
X^{sim}_{4×10} = X̃_{4×10} + ε^{sim}_{4×10}, sim = 1, ..., 1000.
[Figure: PCA vs rPCA configurations over the simulations.] rPCA is more biased but less variable.
11 / 30

Outline
1. PCA
2. Regularized PCA (MSE point of view; Bayesian point of view)
3. Results
12 / 30

Outline
1. PCA
2. Regularized PCA (MSE point of view; Bayesian point of view)
   Y = Xβ + ε, ε ~ N(0, σ²); β ~ N(0, τ²)
   β̂_MAP = (X'X + κI)^{-1} X'Y, with κ = σ²/τ²
3. Results
12 / 30
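For concreteness, a minimal NumPy sketch of this MAP/ridge estimate (the identity matrix in the penalty is implicit on the slide):

```python
import numpy as np

def ridge_map(X, Y, sigma2, tau2):
    """beta_MAP = (X'X + kappa I)^{-1} X'Y with kappa = sigma^2 / tau^2."""
    kappa = sigma2 / tau2
    return np.linalg.solve(X.T @ X + kappa * np.eye(X.shape[1]), X.T @ Y)
```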

Probabilistic PCA (Roweis, 1998; Tipping & Bishop, 1999)
A specific case of an exploratory factor analysis model:
X_{n×p} = Z_{n×S} B'_{p×S} + ε, with z_i ~ N(0, I_S) and ε_i ~ N(0, σ² I_p).
Conditional independence: x_i | z_i ~ N(Bz_i, σ² I_p).
Distribution of the observations: x_i ~ N(0, Σ) with Σ = BB' + σ² I_p.
Explicit solution: σ̂² = 1/(p − S) Σ_{s=S+1}^p λ_s and B̂ = V(Λ − σ̂² I_S)^{1/2}.
13 / 30

Probabilistic PCA
X = Z_{n×S} B'_{p×S} + ε, z_i ~ N(0, I_S), ε_i ~ N(0, σ² I_p).
Random effect model: no individual parameters.
BLUP E(z_i | x_i): Ẑ = X B̂ (B̂'B̂ + σ̂² I_S)^{-1}, and X̂^PPCA = Ẑ B̂'.
Using B̂ = V(Λ − σ̂² I_S)^{1/2} and the SVD X = UΛ^{1/2}V' (so that XV = UΛ^{1/2}):
X̂^PPCA = U(Λ − σ̂² I_S)Λ^{-1/2}V', i.e. x̂_ij^PPCA = Σ_{s=1}^S ( √λ_s − σ̂²/√λ_s ) u_is v_js.
Bayesian treatment of the fixed effect model with a prior on z_i: Regularized PCA.
14 / 30
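A minimal NumPy check of this identity (using the slide's convention that the λ_s are the squared singular values of X and taking σ̂² as the mean of the discarded λ_s; the simulated data are only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, S = 100, 20, 2
X = rng.standard_normal((n, p))
X = X - X.mean(axis=0)

U, sv, Vt = np.linalg.svd(X, full_matrices=False)
lam = sv ** 2
sigma2 = lam[S:].mean()                    # sigma2_hat = (1/(p-S)) sum_{s>S} lambda_s

B_hat = Vt[:S, :].T @ np.diag(np.sqrt(lam[:S] - sigma2))
Z_hat = X @ B_hat @ np.linalg.inv(B_hat.T @ B_hat + sigma2 * np.eye(S))  # BLUP scores
X_ppca = Z_hat @ B_hat.T
X_rpca = U[:, :S] @ np.diag((lam[:S] - sigma2) / np.sqrt(lam[:S])) @ Vt[:S, :]
print(np.abs(X_ppca - X_rpca).max())       # ~0: PPCA fitted values = shrunk SVD
```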

Probabilistic PCA
EM algorithm in PPCA: two ridge regressions.
E step: Ẑ = X B̂ (B̂'B̂ + σ̂² I_S)^{-1}
M step: B̂ = X'Ẑ (Ẑ'Ẑ + σ̂² Λ^{-1})^{-1}
EM algorithm in PCA: two ordinary regressions, minimizing ||X − FV'||²: F = XV(V'V)^{-1}, V = X'F(F'F)^{-1}.
15 / 30
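A minimal NumPy sketch of the EM iterations for PPCA in their standard Tipping & Bishop form (the slide writes them compactly as two ridge regressions; the exact conditioning matrix used below is the textbook one, which is an assumption on my part):

```python
import numpy as np

def ppca_em(X, S, n_iter=200, sigma2=1.0, seed=0):
    """EM for probabilistic PCA: the E and M steps are ridge-type regressions."""
    n, p = X.shape
    B = np.random.default_rng(seed).standard_normal((p, S))
    for _ in range(n_iter):
        # E step: posterior mean and second moment of the latent scores
        Minv = np.linalg.inv(B.T @ B + sigma2 * np.eye(S))
        Z = X @ B @ Minv                         # E[z_i | x_i], stacked row-wise
        Ezz = n * sigma2 * Minv + Z.T @ Z        # sum_i E[z_i z_i' | x_i]
        # M step: ridge-type update of the loadings, then of the noise variance
        B = X.T @ Z @ np.linalg.inv(Ezz)
        sigma2 = (np.sum(X**2) - np.trace(Z.T @ X @ B)) / (n * p)
    return B, sigma2
```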

An empirical Bayesian approach
Model: x_ij = Σ_{s=1}^S x̃_ij^(s) + ε_ij, ε_ij ~ N(0, σ²); prior: x̃_ij^(s) ~ N(0, τ_s²).
Posterior: E( x̃_ij^(s) | x_ij^(s) ) = Φ_s x_ij^(s), with Φ_s = τ_s² / ( τ_s² + σ²/min(n−1; p) ).
Maximizing the likelihood of x_ij^(s) ~ N(0, τ_s² + σ²/min(n−1; p)) as a function of τ_s² gives τ̂_s² = λ_s/np − σ̂²/min(n−1; p), hence
Φ̂_s = ( λ_s − np σ̂²/min(n−1; p) ) / λ_s.
16 / 30

Outline
1. PCA
2. Regularized PCA (MSE point of view; Bayesian point of view)
3. Results
17 / 30

Simulation design
The simulated data: X = X̃ + ε. Several parameters are varied to construct X:
- n/p: 100/20 = 5, 50/50 = 1 and 20/100 = 0.2
- S, the number of underlying dimensions: 2, 4, 10
- the ratio d_1/d_2: 4, 1
- ε is drawn from a Gaussian distribution with a standard deviation chosen to obtain a signal-to-noise ratio (SNR) of 4, 1 or 0.8
18 / 30
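A minimal NumPy sketch of one cell of this design (how exactly the authors set d_1/d_2 and define the SNR is an assumption here; this just draws a rank-S signal plus Gaussian noise):

```python
import numpy as np

def simulate(n=100, p=20, S=4, snr=1.0, d_ratio=4.0, seed=0):
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, S)))   # orthonormal individual factors
    R, _ = np.linalg.qr(rng.standard_normal((p, S)))   # orthonormal variable factors
    d = np.concatenate(([d_ratio], np.ones(S - 1)))    # so that d_1 / d_2 = d_ratio
    X_tilde = Q @ np.diag(d) @ R.T                      # rank-S signal
    sigma = X_tilde.std() / snr                         # noise sd from the target SNR
    return X_tilde + sigma * rng.standard_normal((n, p)), X_tilde

X, X_tilde = simulate(n=100, p=20, S=4, snr=0.8, d_ratio=4.0)
```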

Soft thresholding and SURE (Candès et al., 2012)
x̂_ij^Soft = Σ_{s=1}^{min(n,p)} ( √λ_s − λ )_+ u_is v_js, the solution of min_{X̂} ||X − X̂||² + λ ||X̂||_*.
Choosing the tuning parameter λ? MSE(λ) = E || X̃ − SVT_λ(X) ||².
Stein's Unbiased Risk Estimate (SURE):
SURE(SVT_λ)(X) = −npσ² + Σ_{i=1}^{min(n,p)} min(λ², λ_i) + 2σ² div(SVT_λ)(X)
λ is obtained by minimizing SURE. Remark: σ̂² is required.
19 / 30
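A minimal NumPy sketch of the singular-value soft-thresholding operator SVT_λ referred to here; selecting λ by SURE additionally needs σ̂² and the closed-form divergence of Candès et al. (2012), which is not reproduced in this sketch:

```python
import numpy as np

def svt(X, thresh):
    """Soft-threshold the singular values of X at level `thresh`."""
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(sv - thresh, 0.0)) @ Vt
```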

Simulation study (S = 4)

n/p     SNR  d1/d2  MSE(X̂_PCA, X̃)  MSE(X̂_rPCA, X̃)  MSE(X̂_SURE, X̃)
100/20  4    4      8.25E-04        8.24E-04         1.42E-03
100/20  4    1      8.26E-04        8.25E-04         1.43E-03
100/20  1    4      2.60E-01        1.96E-01         2.43E-01
100/20  1    1      2.47E-01        2.04E-01         2.62E-01
100/20  0.8  4      7.41E-01        4.27E-01         4.36E-01
100/20  0.8  1      6.68E-01        4.40E-01         4.83E-01
50/50   4    4      5.48E-04        5.48E-04         1.04E-03
50/50   4    1      5.46E-04        5.46E-04         1.04E-03
50/50   1    4      1.75E-01        1.53E-01         2.00E-01
50/50   1    1      1.68E-01        1.52E-01         2.09E-01
50/50   0.8  4      5.07E-01        3.87E-01         3.85E-01
50/50   0.8  1      4.67E-01        3.76E-01         4.13E-01
20/100  4    4      8.28E-04        8.27E-04         1.41E-03
20/100  4    1      8.29E-04        8.28E-04         1.42E-03
20/100  1    4      2.55E-01        1.97E-01         2.45E-01
20/100  1    1      2.48E-01        2.04E-01         2.60E-01
20/100  0.8  4      7.13E-01        4.15E-01         4.37E-01
20/100  0.8  1      6.66E-01        4.34E-01         4.78E-01

MSE in favor of rPCA.
20 / 30

Simulation study (same table as above): rPCA clearly outperforms PCA when the SNR is small; rPCA ≈ PCA when the SNR is high. 20 / 30

Simulation study (same table as above): rPCA behaves similarly whether d_1/d_2 = 4 or d_1/d_2 = 1. 20 / 30

Simulation study (same table as above): same order of errors for n/p = 0.2 or 5, smaller errors for n/p = 1. 20 / 30

Simulation study (same table as above): SURE performs well when the data are very noisy. 20 / 30

Simulation from Candès et al. (2012)
Same simulation model, n = 500, p = 200, SNR in (4, 2, 1, 0.5), S in (10, 100).

S    SNR  MSE(X̂_PCA, X̃)  MSE(X̂_rPCA, X̃)  MSE(X̂_SURE, X̃)
10   4    4.31E-03        4.29E-03         8.74E-03
10   2    1.74E-02        1.71E-02         3.29E-02
10   1    7.16E-02        6.75E-02         1.16E-01
10   0.5  3.19E-01        2.57E-01         3.53E-01
100  4    3.79E-02        3.69E-02         4.50E-02
100  2    1.58E-01        1.41E-01         1.56E-01
100  1    7.29E-01        4.91E-01         4.48E-01
100  0.5  3.16E+00        1.48E+00         8.52E-01

21 / 30

Individuals representation: 100 individuals, 20 variables, 2 dimensions, SNR = 0.8, d_1/d_2 = 4. [Figure: true configuration vs PCA.] 22 / 30

Individuals representation: 100 individuals, 20 variables, 2 dimensions, SNR = 0.8, d_1/d_2 = 4. [Figure: PCA vs rPCA.] 23 / 30

Individuals representation: 100 individuals, 20 variables, 2 dimensions, SNR = 0.8, d_1/d_2 = 4. [Figure: PCA vs SURE.] 23 / 30

Individuals representation: 100 individuals, 20 variables, 2 dimensions, SNR = 0.8, d_1/d_2 = 4. [Figure: true configuration vs rPCA.] rPCA improves the recovery of the inner-product matrix XX'. 24 / 30

Variables representation. [Figure: true configuration vs PCA.] cor(X_7, X_9) = 1 (true) vs 0.68 (PCA). 25 / 30

Variables representation. [Figure: PCA vs rPCA.] cor(X_7, X_9) estimated as 0.68 (PCA) vs 0.81 (rPCA). 26 / 30

Variables representation. [Figure: PCA vs SURE.] cor(X_7, X_9) estimated as 0.68 (PCA) vs 0.82 (SURE). 26 / 30

Variables representation. [Figure: true configuration vs rPCA.] cor(X_7, X_9) = 1 (true) vs 0.81 (rPCA). rPCA improves the recovery of the covariance matrix X'X. 27 / 30

Real data set: 27 chickens submitted to 4 nutritional statuses, 12,664 gene expressions. [Figure: PCA vs rPCA.] 28 / 30

Real data set: 27 chickens submitted to 4 nutritional statuses, 12,664 gene expressions. [Figure: SURE vs rPCA.] 28 / 30

Real data set: 27 chickens submitted to 4 nutritional statuses, 12,664 gene expressions. [Figure: SPCA vs rPCA.] 28 / 30

Real data set. [Figure: bi-clustering, PCA vs rPCA.] Bi-clustering from rPCA better finds the 4 nutritional statuses! 29 / 30

Real data set. [Figure: bi-clustering, SURE vs rPCA.] Bi-clustering from rPCA better finds the 4 nutritional statuses! 29 / 30

Real data set. [Figure: bi-clustering, SPCA vs rPCA.] Bi-clustering from rPCA better finds the 4 nutritional statuses! 29 / 30

Conclusion
Recover a low-rank matrix from noisy observations; singular value thresholding is a popular estimation strategy.
Here, a regularization term defined a priori, ( √λ_s − σ̂²/√λ_s ): it minimizes the MSE and corresponds to a Bayesian treatment of the fixed effect model.
Good results in terms of recovery of the structure, of the distances between individuals and of the covariance matrix, and in terms of visualization.
Asymptotic results. Tuning parameter: the number of dimensions S (cross-validation?).
Regularized MCA (Takane, 2006).
30 / 30