Estimation of large dimensional sparse covariance matrices
1 Estimation of large dimensional sparse covariance matrices
Department of Statistics, UC Berkeley. May 5, 2009
2 Sample covariance matrix and its eigenvalues
Data: n × p matrix X; n i.i.d. (independent identically distributed) observations of a random vector (X_i)_{i=1}^n in R^p.
Suppose the X_i's have covariance matrix Σ_p.
Often interested in estimating Σ_p (e.g. for PCA); eigenvalues: λ_i.
Standard estimator: Σ̂_p = (X − X̄)'(X − X̄)/(n − 1); eigenvalues l_i.
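As a concrete sketch of the estimator above (with simulated Gaussian data standing in for real observations, and hypothetical dimensions n = 200, p = 50), the sample covariance and its eigenvalues can be computed as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))          # n i.i.d. observations in R^p

# Centered sample covariance: (X - Xbar)'(X - Xbar) / (n - 1)
Xc = X - X.mean(axis=0)
Sigma_hat = Xc.T @ Xc / (n - 1)

# Eigenvalues l_1 >= ... >= l_p of the estimator
l = np.sort(np.linalg.eigvalsh(Sigma_hat))[::-1]
print(l[0], l[-1])                       # largest and smallest sample eigenvalues
```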
3 Role of eigenvalues: examples
Principal Component Analysis (PCA): large eigenvalues.
Optimization problems; example in risk management: invest a_j in X_{t,j}. Risk measured by var(X_t'a) = a'Σ_p a. (Constraints on a in more realistic versions, e.g. Σ_i a_i = 1.)
Role of small (/all) eigenvalues.
4 Classical results from multivariate statistics
Classical theory: p fixed, n → ∞.
Fact: Σ̂ is an unbiased, √n-consistent estimator of Σ: √n(Σ̂ − Σ) = O_P(1).
Also, fluctuation theory (CLT) for the largest eigenvalues (Anderson, '63), under e.g. normality of the data.
Much more is known about estimation of covariance matrices: Efron, Haff, Morris, Stein, etc.
Will focus here on the large n, large p situation (quite common now), so p/n has a finite non-zero limit.
5 Visual Example
[Figure: histogram of the eigenvalues of X'X/n; X_i: N(0, Id), p = 2000, n = ; "true" (i.e. population) eigenvalues marked.]
6 Graphical illustration: n = 500, p = 200
[Figure: Marchenko-Pastur law and histogram of empirical eigenvalues.]
X: 500 × 200 matrix, entries i.i.d. N(0, 1).
Histogram of eigenvalues of X'X/n with the corresponding Marchenko-Pastur density superimposed.
Note: Σ = Id, so all population (or "true") eigenvalues equal 1.
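The slide's experiment is easy to reproduce numerically: a 500 × 200 matrix with i.i.d. N(0, 1) entries has sample eigenvalues that spread over the Marchenko-Pastur support rather than concentrating near the true value 1. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 200
rho = p / n                              # rho = 0.4

X = rng.standard_normal((n, p))          # Sigma = Id: all true eigenvalues are 1
l = np.linalg.eigvalsh(X.T @ X / n)      # eigenvalues of X'X/n

# Marchenko-Pastur support [a, b] for Sigma = Id
a = (1 - np.sqrt(rho)) ** 2              # ~0.135
b = (1 + np.sqrt(rho)) ** 2              # ~2.665
print(l.min(), l.max())                  # spread over roughly [a, b], far from 1
```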
7 Surprising?
Previous plots are perhaps surprising: in particular, the CLT implies that σ̂_ij − σ_ij = O_P(n^{−1/2}): good entrywise estimation, but poor spectral estimation.
8 Importance of ρ_n = p/n: case of Σ = Id
Suppose the entries of the n × p matrix X are i.i.d. with mean 0, sd 1, and finite 4th moment. Population covariance = Id.
Let ρ_n = p/n.
Theorem (Marčenko-Pastur, '67). Let l_i be the (decreasing) eigenvalues of (1/n) X'X. Assume ρ_n → ρ ∈ (0, 1]. If F_p(x) = (1/p) #{l_i ≤ x}, then F_ρ has a known density, and F_p → F_ρ in probability.
Support: [a, b], with a = (1 − ρ^{1/2})², b = (1 + ρ^{1/2})².
Bias issue/inconsistency problems for these asymptotics.
9 Marcenko-Pastur law illustration
Population covariance: Σ = Id. Here the density of the limiting spectral distribution is computable.
[Figure: density of the Marcenko-Pastur law for varying ρ.]
10 Remark about extensions/limitations
RMT also gives results about the limiting spectral distribution of eigenvalues when Σ ≠ Id.
Models of the form X_i = Σ^{1/2} Y_i, entries of Y_i i.i.d. (say 4 + ɛ moments).
Problem: geometric implications for the data: near orthogonality, near constant norm of X_i/√p, when λ_1(Σ) is bounded.
Two models can have the same Σ and different limiting spectral distributions.
11 Estimation problems
Previous results highlight the fact that naive estimation of the covariance matrix results in inconsistency in high dimension.
Question: can we find procedures that consistently estimate spectral properties of these matrices (i.e. also eigenspaces), and are more robust to distributional assumptions?
Can we exploit good entrywise estimation to improve spectral estimation?
12 Previous work
Related ideas: banding scheme proposed by P. Bickel and L. Levina ('06).
Consider population covariance matrices with small entries away from the diagonal. Technically: Σ_p satisfies max_i Σ_{j: |i−j| > m} |σ(i, j)| ≤ C m^{−α}.
Recommend banding: σ̂_{i,j} = 0 if |i − j| > k, otherwise keep the estimate from Σ̂_p. Call this estimator B_k(Σ̂_p).
Theorem (Bickel and Levina). If k ≍ (n^{−1/2} log(p))^{−1/(α+1)}, plus tail decay conditions, then ||B_k(Σ̂_p) − Σ_p||_2 → 0.
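The banding operator B_k itself is a one-liner; here is a minimal sketch (the bandwidth `k` would be chosen as in the theorem, which is not attempted here):

```python
import numpy as np

def band(S, k):
    """B_k(S): keep entries with |i - j| <= k, zero out the rest
    (the Bickel-Levina banding operator)."""
    p = S.shape[0]
    i, j = np.indices((p, p))
    return np.where(np.abs(i - j) <= k, S, 0.0)

# Tiny check on a 4x4 matrix of ones: banding at k = 1 leaves a tridiagonal pattern
S = np.ones((4, 4))
print(band(S, 1).astype(int))
```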
13 Sparsity: introduction
Oft-made assumption: many σ(i, j) = 0 or small.
Appeal of thresholding (i.e. keep large values, and set small ones to 0).
Aim: find simple methods for improving estimation.
Practical requirements: speed, parallelizability, limited assumptions.
Theoretical requirements: good convergence properties, in spectral norm (||·||_2); estimator not sensitive to the ordering of the variables.
14 What is a good estimator?
Requirement: ||Σ̂_p − Σ_p||_2 → 0, where (for symmetric matrices) ||M||_2 = σ_1(M) = √λ_1(M'M) = max_i |λ_i(M)|.
Convergence in ||·||_2 implies: consistency of all eigenvalues (Weyl's inequality); consistency of stable subspaces corresponding to separated eigenvalues (Davis-Kahan sin θ theorem).
Our aim: a permutation-equivariant, operator-norm consistent estimator.
Our strategy: (hard) thresholding.
15 Standard notion of sparsity: ill-suited for spectral problems
Standard notion: count the number of non-zero elements.
E_1: 1's on the diagonal, 1/√p on the first row and column; E_2: 1's on the diagonal, 1/√p on the sub- and super-diagonal.
Eigenvalues of E_1: 1 ± √((p − 1)/p), and 1 (multiplicity p − 2).
Eigenvalues of E_2: 1 + (2/√p) cos(πk/(p + 1)), k = 1, …, p.
Both have ~3p non-zero entries, yet the extreme eigenvalues of E_1 approach 0 and 2 while all eigenvalues of E_2 tend to 1.
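The two eigenvalue formulas are easy to verify numerically. The sketch below assumes the reconstruction above (unit diagonal, 1/√p off-diagonal entries) and checks both closed forms against a dense eigensolver:

```python
import numpy as np

p = 400
s = 1.0 / np.sqrt(p)

# E1: 1 on the diagonal, 1/sqrt(p) on the first row and column
E1 = np.eye(p)
E1[0, 1:] = s
E1[1:, 0] = s

# E2: 1 on the diagonal, 1/sqrt(p) on the sub/super-diagonal
E2 = np.eye(p) + s * (np.eye(p, k=1) + np.eye(p, k=-1))

ev1 = np.linalg.eigvalsh(E1)
ev2 = np.linalg.eigvalsh(E2)

# Closed forms from the slide
pred1 = np.sort(np.r_[1 - np.sqrt((p - 1) / p),
                      np.ones(p - 2),
                      1 + np.sqrt((p - 1) / p)])
k = np.arange(1, p + 1)
pred2 = np.sort(1 + 2 * s * np.cos(np.pi * k / (p + 1)))

print(np.allclose(ev1, pred1), np.allclose(ev2, pred2))  # True True
```

Despite having comparable numbers of non-zero entries, E_1 keeps eigenvalues near 0 and 2, while E_2's spectrum collapses to 1 as p grows.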
16 Adjacency matrices
[Slide: E_1 and its adjacency matrix A_1, with entry 1 wherever the entry of E_1 is non-zero, 0 elsewhere.]
17 Adjacency matrices
[Slide: E_2 and its adjacency matrix A_2.]
18 Adjacency matrices and graphs: comparison
Closed walks of length k. A closed walk starts and finishes at the same vertex; the length of a walk is the number of edges it traverses.
19 Adjacency matrices and graphs: connection with the spectrum of covariance matrices
Closed walk (length k) γ: i_1 → i_2 → … → i_k → i_{k+1} = i_1.
Weight of γ: w_γ = σ(i_1, i_2) ⋯ σ(i_k, i_1).
[Figure: weighted graphs with edge and loop weights σ(i, j) illustrating closed walks.]
trace(Σ^k) = Σ_i λ_i^k(Σ) = Σ_{γ ∈ C_p(k)} w_γ
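The trace identity above can be checked by brute force on a small symmetric matrix: summing the weights w_γ over all closed index sequences of length k reproduces trace(Σ^k) exactly (walks using a zero entry contribute weight zero). A minimal sketch with hypothetical p = 5, k = 4:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
p, k = 5, 4

# A small symmetric matrix with some zero entries (sparse support graph);
# symmetry is all the identity needs, positive-definiteness is not required
Sigma = rng.standard_normal((p, p))
Sigma = (Sigma + Sigma.T) / 2
Sigma[np.abs(Sigma) < 0.4] = 0.0         # zero edges carry no walk weight

# Sum of walk weights w_gamma = sigma(i1,i2)...sigma(ik,i1) over closed walks
total = 0.0
for walk in itertools.product(range(p), repeat=k):
    w = 1.0
    for a, b in zip(walk, walk[1:] + walk[:1]):   # close the walk: i_{k+1} = i_1
        w *= Sigma[a, b]
    total += w

print(np.isclose(total, np.trace(np.linalg.matrix_power(Sigma, k))))  # True
```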
20 Notion of sparsity compatible with spectral analysis: a proposal
Given Σ_p, a p × p covariance matrix, compute the adjacency matrix A_p(i, j) = 1_{σ(i,j) ≠ 0} and associate a graph G_p to it.
Consider C_p(k) = {closed walks of length k on the graph with adjacency matrix A_p} and φ_p(k) = Card(C_p(k)) = trace(A_p^k).
Call the sequence Σ_p β-sparse if, for k ∈ 2N, φ_p(k) ≤ f(k) p^{β(k−1)+1}, where f(k) is independent of p and 0 ≤ β < 1.
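Computing φ_p(k) is immediate from the definition; a minimal sketch:

```python
import numpy as np

def phi(Sigma, k):
    """phi_p(k): number of closed walks of length k on the support graph of
    Sigma, computed as trace(A^k) with A the 0/1 adjacency matrix."""
    A = (Sigma != 0).astype(float)
    return np.trace(np.linalg.matrix_power(A, k))

# Diagonal matrix: the only closed walks are loops, so phi(k) = p for all k
print(phi(np.eye(5), 6))                 # 5.0
```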
21 Examples: computation of sparsity coefficients
Diagonal matrix: A_p = Id_p, so φ(k) = p for all k. Sparsity coefficient: 0.
Matrices with at most M non-zero elements in each row: φ(k) ≤ p M^{k−1}. Sparsity coefficient: 0.
Matrices with at most M p^α non-zero elements in each row: φ(k) ≤ M^{k−1} p^{α(k−1)+1}. Sparsity coefficient: α.
22 Assumptions underlying results
In all that follows:
1. The Σ_p(i, i) stay bounded.
2. The X_{i,j} have infinitely many moments.
3. The rows of the (n × p) data matrix X are i.i.d.
4. p/n → l ∈ (0, ∞).
23 Simple case: gap in the entries of the covariance matrix (Gaussian MLE, centered case)
X_{i,j} centered; cov(X_i) = Σ. Suppose Σ_p is β-sparse, with β = 1/2 − η and η > 0.
Theorem. Let S_p = (1/n) Σ_{i=1}^n X_i X_i' = X'X/n, and T_α(S_p)(i, j) = S_p(i, j) 1_{|S_p(i,j)| > C n^{−α}}.
T_α(S_p) is the thresholded version of S_p at level C n^{−α}. Then, if α = β + ɛ with ɛ < η/2, ||T_α(S_p) − Σ_p||_2 → 0 in probability.
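A minimal sketch of the thresholding estimator on a truly sparse (tridiagonal) population covariance. The constant C = 1 and exponent α = 0.3 below are ad hoc illustrative choices, not the calibrated values from the theorem:

```python
import numpy as np

def hard_threshold(S, n, alpha, C=1.0):
    """T_alpha(S)(i,j) = S(i,j) * 1{|S(i,j)| > C * n**(-alpha)} (entrywise)."""
    return np.where(np.abs(S) > C * n ** (-alpha), S, 0.0)

rng = np.random.default_rng(3)
n, p = 800, 200

# Sparse population covariance: tridiagonal Toeplitz, 1 on the diagonal,
# 0.4 on the sub/super-diagonal (positive definite)
Sigma = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))

L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((n, p)) @ L.T    # rows have covariance Sigma
S = X.T @ X / n                          # S_p = (1/n) X'X (centered case)

T = hard_threshold(S, n, alpha=0.3)
op = lambda M: np.linalg.norm(M, 2)      # spectral norm
print(op(S - Sigma), op(T - Sigma))      # thresholding reduces the spectral error
```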
24 Beyond truly sparse matrices: approximation by sparse matrices
How does thresholding perform on matrices approximated by sparse matrices?
Suppose T_{α_1}(Σ_p) = Σ̃_p is β-sparse, and ||Σ_p − Σ̃_p||_2 → 0.
Suppose there exist α_0 < α_1 < 1/2 − δ_0 such that the adjacency matrix of the (i, j)'s with C n^{−α_1} < |σ(i, j)| < C n^{−α_0} is γ-sparse, with γ ≤ α_0 − ζ_0, ζ_0 > 0.
Proposition. Then the conclusions of the theorem above apply: for α ∈ (α_0, α_1), ||T_α(S_p) − Σ_p||_2 → 0 in probability.
25 Complements
Theorems apply to the sample covariance matrix, i.e. (X − X̄)'(X − X̄)/(n − 1), not just the Gaussian MLE; also to correlation matrices.
A finite number of moments is possible: p = n^r for certain r depending on β.
The proof provides finite-(n, p) bounds on the deviation probabilities of ||T_α(S_p) − Σ_p||_2.
Example: Σ_p^{(1)}(i, j) = ρ^{|i−j|}, and Σ_p = P Σ_p^{(1)} P', P a random permutation matrix.
Can be approximated by Σ̃_p = T_{n^{−1/2+ɛ}}(Σ_p); Σ̃_p is asymptotically 0-sparse.
The proposition above applies when the moment conditions are satisfied.
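The permuted-Toeplitz example illustrates why permutation equivariance matters: thresholding commutes with any reordering of the variables, while banding does not. A sketch with hypothetical p = 300, ρ = 0.5, and an arbitrary threshold level 0.1:

```python
import numpy as np

rng = np.random.default_rng(4)
p, rho = 300, 0.5

# Sigma^(1)(i,j) = rho^|i-j|, then conjugate by a random permutation P
i, j = np.indices((p, p))
Sigma1 = rho ** np.abs(i - j)
perm = rng.permutation(p)
Sigma = Sigma1[np.ix_(perm, perm)]       # P Sigma^(1) P'

def T(M, level):                         # hard thresholding at a fixed level
    return np.where(np.abs(M) > level, M, 0.0)

# Thresholding the permuted matrix = permuting the thresholded matrix
lhs = T(Sigma, 0.1)
rhs = T(Sigma1, 0.1)[np.ix_(perm, perm)]
print(np.array_equal(lhs, rhs))          # True: permutation-equivariant

# ...whereas the spectrum of a banded version depends on the ordering
def band(M, k):
    return np.where(np.abs(i - j) <= k, M, 0.0)
print(np.allclose(np.linalg.eigvalsh(band(Sigma, 5)),
                  np.linalg.eigvalsh(band(Sigma1, 5))))  # typically False
```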
26 Sharpness of the 1/2-sparse assumption
Consider
Σ_p =
[ 1    α_2  α_3  …  α_p ]
[ α_2  1    0    …  0   ]
[ α_3  0    1    …  0   ]
[ ⋮              ⋱      ]
[ α_p  0    0    …  1   ]
with, e.g., α_i = 1/√p.
Σ_p is 1/2-sparse (eigenvalues of A_p: 1 ± √(p − 1), and 1's).
The oracle estimator O(Σ̂_p) of Σ_p is inconsistent in the Gaussian case:
||O(Σ̂_p) − Σ_p||_2² = Σ_{i=2}^p (α̂_i − α_i)² ↛ 0.
The 1/2 − η sparsity assumption is sharp.
27 Practical observations: Asymptopia?
Practicalities: implemented the thresholding technique using FDR at level 1/√p.
Theory: great results possible. Practice: good results (n, p from a few 100s to a few 1000s), though maybe not as good as hoped for.
Practice very good when there are few coefficients to estimate, far away from σ/√n (unsurprisingly). Making mistakes is OK.
Softer techniques (Huber-type) might be good alternatives, especially for non-sparse matrices.
Eigenvector practice somewhat disappointing: in hard situations it performs OK; in less hard situations, results are comparable to or a bit worse than sparse PCA.
28 Conclusions Highlighted some difficulties in covariance estimation in large n, large p setting Proposed notion of sparsity compatible with spectral analysis
29 Real world example. Data Analysis: Portfolio of Industry Indexes
Data: daily returns of 48 indexes, by industry (Agriculture, Toys, Beer, Soda, etc.), from Kenneth French's website at Dartmouth.
p = 48; 2 years of observations, n = 504.
Effect of shrinkage on the smallest eigenvalue: l_empirical → l_shrunk ≈ 0.27.
Through subsampling, get a (0.205, 0.303) 95% CI.
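A crude sketch of the subsampling step, with synthetic returns standing in for the (not bundled) French data; the subsample size, repetition count, and plain percentile interval below are illustrative choices, not the talk's exact adjustment:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 504, 48
X = rng.standard_normal((n, p)) * 0.01   # stand-in for daily index returns

def smallest_eig(data):
    S = np.cov(data, rowvar=False)
    return np.linalg.eigvalsh(S)[0]      # eigvalsh sorts ascending

# Subsample rows without replacement and recompute the smallest eigenvalue
b, reps = 252, 500                       # ad hoc subsample size and repetitions
stats = np.array([smallest_eig(X[rng.choice(n, size=b, replace=False)])
                  for _ in range(reps)])
lo, hi = np.percentile(stats, [2.5, 97.5])
print(lo, hi)                            # a 95% subsampling interval
```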
30 Data Analysis: Portfolio of Industry Indexes
[Figure: scree plot for the Industry index portfolio data; smallest eigenvalue highlighted.]
31 Data Analysis: Portfolio of Industry Indexes
[Figure: subsampling distribution of the adjusted estimator of the smallest eigenvalue; Industry indexes, n = 504, p = 48.]
32 (Easy) Example 1: Toeplitz(1, .3, .4), n = p = 500
[Figure: thresholded-matrix eigenvalues vs. population eigenvalues.]
33 (Easy) Example 1: Toeplitz(1, .3, .4), n = p = 500
[Figure: oracle eigenvalues vs. thresholded-matrix eigenvalues vs. population eigenvalues.]
34 (Easy) Example 1: Toeplitz(1, .3, .4), n = p = 500
[Figure: sample covariance eigenvalues vs. thresholded-matrix eigenvalues vs. population eigenvalues.]
35 Example 2: Toeplitz(2, .2, .3, 0, −.4), n = p =
[Figure: population spectrum vs. spectrum of the thresholded matrix.]
36 Example 2: Toeplitz(2, .2, .3, 0, −.4), n = p =
[Figure: population spectrum vs. sample covariance matrix spectrum vs. spectrum of the thresholded matrix.]
37 Example 2: Toeplitz(2, .2, .3, 0, −.4): independent simulation
[Figure: black line: population eigenvalues; blue: thresholded-matrix eigenvalues; green +: oracle banded matrix; red .: oracle matrix.]
38 Estimating the theoretical frontier: practical implementation
Data: from the Fama-French website. One year, n = 252 days; p = 48 industry indexes.
Orders of magnitude: α̂ = 0.040, β̂ = , γ̂ = , â = , b̂ = , ĉ = .563; also δ̂ = , d̂ = .0208.
39 Estimating the theoretical frontier: practical implementation (252 days, 47 assets; raw data)
[Figure: correction to the empirical frontier, one year of data; target returns vs. σ, comparing the naive plug-in estimator and the corrected estimator.]
40 Estimating the theoretical frontier: practical implementation (252 days, 47 assets; simulations)
[Figure: correction of the efficient frontier, simulation results; efficient frontier, mean/median and 95% CI of the corrected frontier, and mean and 95% CI of the uncorrected frontier.]
41 Eigenfaces: an interesting example of PCA
Question: compression of digitized pictures of human faces.
Idea: gather a database of faces. Example: the ORL face database, 10 pictures of 40 distinct subjects. Pictures are, say, 92 × 112 = 10304 pixels, 256 levels of gray. Code them as vectors; get a data matrix X.
Run Principal Component Analysis on the corresponding matrix. Eigenvectors corresponding to large eigenvalues are "eigenfaces".
Need only a few numbers to approximate a new face: the coefficients of its projection on the eigenfaces.
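The eigenface pipeline above can be sketched via the SVD of the centered data matrix (random vectors stand in for actual face images; the dimensions and the choice of 40 components are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n, d = 100, 10304                        # e.g. 100 images of 92*112 = 10304 pixels

# Stand-in for the face data: rows are vectorized (here synthetic) images
X = rng.standard_normal((n, d))
mean_face = X.mean(axis=0)

# PCA via SVD of the centered data: top right singular vectors = "eigenfaces"
U, s, Vt = np.linalg.svd(X - mean_face, full_matrices=False)
eigenfaces = Vt[:40]                     # keep the 40 leading components

# Approximate a new face with just 40 numbers: its projection coefficients
new_face = rng.standard_normal(d)
coeffs = eigenfaces @ (new_face - mean_face)
approx = mean_face + eigenfaces.T @ coeffs
print(coeffs.shape)                      # (40,)
```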
42 Some eigenfaces for the ORL face database
[Figure: eigenfaces. Data obtained from Wikipedia; copyright AT&T Laboratories Cambridge.]
More informationLecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26
Principal Component Analysis Brett Bernstein CDS at NYU April 25, 2017 Brett Bernstein (CDS at NYU) Lecture 13 April 25, 2017 1 / 26 Initial Question Intro Question Question Let S R n n be symmetric. 1
More informationTheory and Applications of High Dimensional Covariance Matrix Estimation
1 / 44 Theory and Applications of High Dimensional Covariance Matrix Estimation Yuan Liao Princeton University Joint work with Jianqing Fan and Martina Mincheva December 14, 2011 2 / 44 Outline 1 Applications
More informationReview (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology
Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna
More informationRobust Testing and Variable Selection for High-Dimensional Time Series
Robust Testing and Variable Selection for High-Dimensional Time Series Ruey S. Tsay Booth School of Business, University of Chicago May, 2017 Ruey S. Tsay HTS 1 / 36 Outline 1 Focus on high-dimensional
More informationNorms of Random Matrices & Low-Rank via Sampling
CS369M: Algorithms for Modern Massive Data Set Analysis Lecture 4-10/05/2009 Norms of Random Matrices & Low-Rank via Sampling Lecturer: Michael Mahoney Scribes: Jacob Bien and Noah Youngs *Unedited Notes
More informationBootstrapping high dimensional vector: interplay between dependence and dimensionality
Bootstrapping high dimensional vector: interplay between dependence and dimensionality Xianyang Zhang Joint work with Guang Cheng University of Missouri-Columbia LDHD: Transition Workshop, 2014 Xianyang
More informationGRAPH-BASED REGULARIZATION OF LARGE COVARIANCE MATRICES.
GRAPH-BASED REGULARIZATION OF LARGE COVARIANCE MATRICES. A Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University
More informationPrincipal Component Analysis for a Spiked Covariance Model with Largest Eigenvalues of the Same Asymptotic Order of Magnitude
Principal Component Analysis for a Spiked Covariance Model with Largest Eigenvalues of the Same Asymptotic Order of Magnitude Addy M. Boĺıvar Cimé Centro de Investigación en Matemáticas A.C. May 1, 2010
More informationThe Kadison-Singer Conjecture
The Kadison-Singer Conjecture John E. McCarthy April 8, 2006 Set-up We are given a large integer N, and a fixed basis {e i : i N} for the space H = C N. We are also given an N-by-N matrix H that has all
More informationAnnouncements (repeat) Principal Components Analysis
4/7/7 Announcements repeat Principal Components Analysis CS 5 Lecture #9 April 4 th, 7 PA4 is due Monday, April 7 th Test # will be Wednesday, April 9 th Test #3 is Monday, May 8 th at 8AM Just hour long
More informationOn corrections of classical multivariate tests for high-dimensional data
On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics
More informationHigh-dimensional Covariance Estimation Based On Gaussian Graphical Models
High-dimensional Covariance Estimation Based On Gaussian Graphical Models Shuheng Zhou, Philipp Rutimann, Min Xu and Peter Buhlmann February 3, 2012 Problem definition Want to estimate the covariance matrix
More informationFactor Analysis (10/2/13)
STA561: Probabilistic machine learning Factor Analysis (10/2/13) Lecturer: Barbara Engelhardt Scribes: Li Zhu, Fan Li, Ni Guan Factor Analysis Factor analysis is related to the mixture models we have studied.
More informationLecture 12: Introduction to Spectral Graph Theory, Cheeger s inequality
CSE 521: Design and Analysis of Algorithms I Spring 2016 Lecture 12: Introduction to Spectral Graph Theory, Cheeger s inequality Lecturer: Shayan Oveis Gharan May 4th Scribe: Gabriel Cadamuro Disclaimer:
More informationLecture 13 October 6, Covering Numbers and Maurey s Empirical Method
CS 395T: Sublinear Algorithms Fall 2016 Prof. Eric Price Lecture 13 October 6, 2016 Scribe: Kiyeon Jeon and Loc Hoang 1 Overview In the last lecture we covered the lower bound for p th moment (p > 2) and
More informationSupremum of simple stochastic processes
Subspace embeddings Daniel Hsu COMS 4772 1 Supremum of simple stochastic processes 2 Recap: JL lemma JL lemma. For any ε (0, 1/2), point set S R d of cardinality 16 ln n S = n, and k N such that k, there
More informationRandom matrices: Distribution of the least singular value (via Property Testing)
Random matrices: Distribution of the least singular value (via Property Testing) Van H. Vu Department of Mathematics Rutgers vanvu@math.rutgers.edu (joint work with T. Tao, UCLA) 1 Let ξ be a real or complex-valued
More informationRobust estimation of scale and covariance with P n and its application to precision matrix estimation
Robust estimation of scale and covariance with P n and its application to precision matrix estimation Garth Tarr, Samuel Müller and Neville Weber USYD 2013 School of Mathematics and Statistics THE UNIVERSITY
More informationFinding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October
Finding normalized and modularity cuts by spectral clustering Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu Ljubjana 2010, October Outline Find
More informationarxiv: v5 [math.na] 16 Nov 2017
RANDOM PERTURBATION OF LOW RANK MATRICES: IMPROVING CLASSICAL BOUNDS arxiv:3.657v5 [math.na] 6 Nov 07 SEAN O ROURKE, VAN VU, AND KE WANG Abstract. Matrix perturbation inequalities, such as Weyl s theorem
More informationApproximate Principal Components Analysis of Large Data Sets
Approximate Principal Components Analysis of Large Data Sets Daniel J. McDonald Department of Statistics Indiana University mypage.iu.edu/ dajmcdon April 27, 2016 Approximation-Regularization for Analysis
More informationEstimating Covariance Structure in High Dimensions
Estimating Covariance Structure in High Dimensions Ashwini Maurya Michigan State University East Lansing, MI, USA Thesis Director: Dr. Hira L. Koul Committee Members: Dr. Yuehua Cui, Dr. Hyokyoung Hong,
More informationA New Space for Comparing Graphs
A New Space for Comparing Graphs Anshumali Shrivastava and Ping Li Cornell University and Rutgers University August 18th 2014 Anshumali Shrivastava and Ping Li ASONAM 2014 August 18th 2014 1 / 38 Main
More information