Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem
1 Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem
Amit Singer
Princeton University, Department of Mathematics and Program in Applied and Computational Mathematics
July 25, 2014
Joint work with Gene Katsevich (Princeton) and Alexander Katsevich (UCF)
Amit Singer (Princeton University) July / 41
2 Single Particle Cryo-Electron Microscopy
[Figure: drawing of the imaging process.]
3 The X-ray Transform
(P_s φ)(x, y) = ∫_ℝ φ(R_s^T r) dz, where r = (x, y, z)^T, R_s ∈ SO(3), and φ ∈ L²(B) is the electric potential of the molecule.
[Figure: the electron source projects the molecule φ onto the image I_s; the rows R_s^1, R_s^2, R_s^3 of R_s ∈ SO(3) determine the viewing direction.]
4 The cryo-EM problem
If S : L²(D) → ℝ^q is discretization onto an N × N grid (q ≈ (π/4)N²), then the microscope produces
I_s = S P_s φ + ε_s, s = 1, ..., n,
where ε_s is noise.
Nyquist: one can reconstruct only up to a bandlimit W. Natural search space for φ: the space B of bandlimited functions. Suppose dim(B) = p.
Cryo-EM problem: given the I_s, estimate the R_s and find an approximation of φ in B.
5 The heterogeneity problem
What if the molecule has more than one possible structure?
(Image source: H. Liao and J. Frank, Classification by bootstrapping in single particle methods, Proceedings of the 2010 IEEE International Conference on Biomedical Imaging, 2010.)
6 The heterogeneity problem (discrete conformational space)
Problem (1): φ : Ω ⊂ ℝ³ → ℝ is random, with P[φ = φ_c] = p_c, c = 1, ..., C. φ_1, ..., φ_n are i.i.d. samples from φ. Let I_s = S P_s φ_s + ε_s, where P_1, ..., P_n are obtained from i.i.d. rotations R_s (independent of the φ_s) drawn from a distribution over SO(3), and ε_s ~ N(0, σ² I_q), independent of φ_s, P_s.
Given the R_s and I_s, find the number of classes C, approximations of the molecular structures φ_c in B, and their probabilities p_c.
7 Previous work
- Common lines (e.g., Shatsky et al., 2010)
- Maximum likelihood (e.g., Wang et al., 2013)
- Bootstrapping and PCA (e.g., Penczek et al., 2011)
8 The heterogeneity problem: finite-dimensional formulation
For the purposes of this talk, suppose that φ_c ∈ B. Note that P_s := S P_s |_B is a linear operator P_s : ℝ^p → ℝ^q. If X_s is the basis representation of φ_s, then I_s = P_s X_s + ε_s.
X_1, ..., X_n are i.i.d. samples of a random vector X in ℝ^p. When X has C classes, Σ = Var[X] has rank C − 1: after centering, the class means span at most a (C − 1)-dimensional subspace.
Penczek (2011) proposed a (suboptimal) method that uses the top C − 1 eigenvectors of Σ to solve the heterogeneity problem. How do we estimate Σ?
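The rank claim is easy to verify numerically: samples from a C-point distribution lie in the affine span of the C class means, so the covariance has rank at most C − 1. A toy check (the dimensions and centers below are arbitrary, not from the talk):

```python
import numpy as np

# Samples from a C-point distribution lie in the affine span of the C class
# means, so the covariance matrix has rank at most C - 1.
rng = np.random.default_rng(0)
p, C, n = 12, 3, 100_000
centers = rng.normal(size=(C, p))            # toy conformations in basis coordinates
X = centers[rng.integers(0, C, size=n)]      # i.i.d. samples, equal class weights

Sigma = np.cov(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(Sigma))[::-1]
# the top C - 1 = 2 eigenvalues are order one; the remaining ones vanish
```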
9 Covariance Estimation from Projected Data
Consider the following general statistical estimation problem:
Problem (2): X is a random vector on ℂ^p, with E[X] = µ and Var(X) = Σ. X_1, ..., X_n are i.i.d. samples from X. Let I_s = P_s X_s + ε_s, s = 1, ..., n, where P_1, ..., P_n : ℂ^p → ℂ^q are i.i.d. samples (independent of the X_s) from a distribution over ℂ^{q×p} (q ≤ p), and the ε_s are i.i.d. noises such that E[ε_s] = 0, Var[ε_s] = σ² I_q, independent of X_s, P_s.
Given I_s, P_s, and σ², estimate µ and Σ.
10 PCA from projected data: well-known special cases
I_s = P_s X_s + ε_s, s = 1, ..., n.
If p = q and P_s = I_p, we have the usual covariance estimation problem. When the P_s are coordinate-selection operators, we have a missing-data statistics problem (Little and Rubin, 1987). This is also a special case of the matrix sensing problem (Recht, Fazel, Parrilo, 2010).
Here we do not attempt to impute the missing entries or estimate the X_s assuming Σ has low rank. Instead we estimate Σ and its rank directly.
11 Obtaining an estimator
I_s = P_s X_s + ε_s. Note that E[I_s] = P_s µ and Var[I_s] = P_s Σ P_s^* + σ² I.
We construct estimators µ_n and Σ_n as solutions of least-squares problems:
µ_n = argmin_µ ∑_{s=1}^n ||I_s − P_s µ||²,
Σ_n = argmin_Σ ∑_{s=1}^n ||(I_s − P_s µ_n)(I_s − P_s µ_n)^* − (P_s Σ P_s^* + σ² I)||_F².
Σ_n is not constrained to be positive semidefinite: we are typically interested only in the few leading eigenvectors of Σ_n.
12 Resulting linear equations
These lead to linear equations for µ_n and Σ_n:
A_n µ_n := (1/n ∑_{s=1}^n P_s^* P_s) µ_n = 1/n ∑_{s=1}^n P_s^* I_s,
L_n Σ_n := 1/n ∑_{s=1}^n P_s^* P_s Σ_n P_s^* P_s = 1/n ∑_{s=1}^n P_s^* (I_s − P_s µ_n)(I_s − P_s µ_n)^* P_s − σ²/n ∑_{s=1}^n P_s^* P_s.
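In the coordinate-selection special case, A_n is diagonal and L_n acts entrywise (each entry of Σ is scaled by the number of samples observing that pair of coordinates), so the normal equations reduce to entrywise divisions. A minimal simulation sketch, with all sizes and the two-class setup made up for illustration:

```python
import numpy as np

# Covariance estimation from projected data, specialized to random
# coordinate-selection operators P_s (the missing-data setting). For masks,
# A_n is diagonal and L_n acts entrywise, giving available-case estimators.
rng = np.random.default_rng(0)
p, q, n, sigma = 8, 4, 20_000, 0.1
centers = 2 * rng.normal(size=(2, p))             # two toy classes -> rank-1 Sigma
X = centers[rng.integers(0, 2, size=n)]
obs = [rng.choice(p, size=q, replace=False) for _ in range(n)]
I = [X[s, idx] + sigma * rng.normal(size=q) for s, idx in enumerate(obs)]

# Mean: A_n mu_n = (1/n) sum_s P_s^* I_s, with A_n diagonal (coordinate counts).
counts = np.zeros(p)
a = np.zeros(p)
for idx, I_s in zip(obs, I):
    counts[idx] += 1
    a[idx] += I_s
mu_n = a / counts

# Covariance: (L_n Sigma)[i, j] = M[i, j] * Sigma[i, j], where M[i, j] counts
# the samples observing both coordinates i and j.
M = np.zeros((p, p))
B = np.zeros((p, p))
for idx, I_s in zip(obs, I):
    r = I_s - mu_n[idx]
    B[np.ix_(idx, idx)] += np.outer(r, r)
    B[idx, idx] -= sigma**2                       # noise bias correction
    M[np.ix_(idx, idx)] += 1
Sigma_n = B / M

eigvals = np.sort(np.linalg.eigvalsh(Sigma_n))[::-1]
# the top eigenvalue dominates (the true rank is 1); the rest is sampling noise
```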
13 Connection to missing-data statistics
We recover the available-case estimators of the mean and covariance, which are known to be consistent under the missing completely at random (MCAR) assumption.
14 PCA in high dimensions
For the sample covariance matrix Σ_n = 1/n ∑_{i=1}^n x_i x_i^T with x_i ~ N(0, I_p), the asymptotic regime p, n → ∞ with p/n → γ ("high-dimensional statistics") is by now well understood (Johnstone, ICM 2006): Marchenko-Pastur distribution, Tracy-Widom distribution, spiked model (Σ = I_p + σ² vv^T), phase transition.
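The P_s = I_p special case is easy to simulate directly; a quick sketch (sizes arbitrary) compares the top eigenvalue of a white Wishart matrix with the Marchenko-Pastur bulk edge (1 + √γ)²:

```python
import numpy as np

# Largest eigenvalue of a white Wishart matrix vs. the Marchenko-Pastur
# bulk edge (1 + sqrt(gamma))^2, gamma = p/n. Fluctuations around the edge
# are Tracy-Widom, of order n^(-2/3).
rng = np.random.default_rng(0)
p, n = 400, 1600
gamma = p / n
X = rng.normal(size=(n, p))
S = X.T @ X / n
lam_max = np.linalg.eigvalsh(S)[-1]
edge = (1 + np.sqrt(gamma)) ** 2      # = 2.25 for gamma = 1/4
```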
15 Connection to high-dimensional PCA
1/n ∑_{s=1}^n P_s^* P_s Σ_n P_s^* P_s = 1/n ∑_{s=1}^n P_s^* (I_s − P_s µ_n)(I_s − P_s µ_n)^* P_s − σ²/n ∑_{s=1}^n P_s^* P_s.
When P_s = I, we recover the sample covariance matrix. How does the spectrum of Σ_n behave as a function of n, p, q and the geometry of the P_s? Empirical spectral density? Distribution of the largest eigenvalue? Spiked model?
16 Theoretical results about Σ_n: preliminary study
Proposition [Asymptotic consistency, fixed p]. Suppose A_n → A and L_n → L, where A, L are invertible. Suppose P_s is uniformly bounded and that X has bounded fourth moments. Then
E[Σ_n] − Σ = O(1/n); Var[Σ_n] = O(1/n).
Proposition [Finite-sample bounds]. Suppose ||X|| ≤ C₁, ||ε_s|| ≤ C₂, ||P_s|| ≤ C₃ a.s. Assume that X is centered. If λ = λ_min(L_n) and B = C₃² C₁ + C₃ C₂, then
P{ ||Σ − Σ_n|| ≥ t } ≤ p exp( −(3nλ²t²/2) / (B⁴ + 2B²λt) ), t ≥ 0.
17 Regularization
Extreme example: q = 2, the P_s are random coordinate-selection operators. Coupon collector: n ≈ p² log p samples are needed for L_n to be invertible.
Regularization:
min_Σ ∑_{s=1}^n ||(I_s − P_s µ_n)(I_s − P_s µ_n)^* − (P_s Σ P_s^* + σ² I)||_F² + λ ||Σ||_F².
This sets unattainable entries of Σ_n to 0. Other regularization strategies are possible, e.g., penalizing WΣW^* to promote sparsity in some basis W, but they come at the price of increased computational complexity.
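In the coordinate-selection special case, L_n multiplies Σ entrywise by the pair-observation frequencies, so the Frobenius-regularized problem has an entrywise closed form: Σ_n[i, j] = B[i, j] / (M[i, j] + λ). A tiny hand-built illustration (the matrices M and B below are made up, not from the talk):

```python
import numpy as np

# For coordinate-selection P_s, (L_n Sigma)[i, j] = M[i, j] * Sigma[i, j],
# where M[i, j] is the fraction of samples observing both coordinates i and j.
# Adding lambda * ||Sigma||_F^2 gives Sigma_reg[i, j] = B[i, j] / (M[i, j] + lam),
# so never-observed pairs (M = 0) are set to 0 instead of being undefined.
lam = 0.1
M = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 1.0]])   # pair-observation frequencies (toy)
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 0.4],
              [0.0, 0.4, 1.5]])   # right-hand side of the normal equations (toy)
Sigma_reg = B / (M + lam)
```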
18 Back to heterogeneity: Fourier slice theorem
If P̂_s is the Fourier-domain version of P_s, then P̂_s simply restricts Fourier volumes to central planes perpendicular to the viewing direction.
19 The slice theorem and covariance matrix estimation
The Fourier slice theorem suggests working in the Fourier domain: φ̂_s = Fφ_s, Î_s = FI_s. Estimate µ̂, Σ̂ and then inverse transform to get µ, Σ.
P̂_s is a restriction to the central plane corresponding to R_s. In 3D, for any pair of frequencies we can find a central plane that passes through both, so the P̂_s act as coordinate-selection operators in the continuous limit.
By contrast, the covariance of 2D objects cannot be estimated from their 1D line projections: a generic pair of 2D frequencies does not lie on a common line through the origin.
20 The ramp filter
Â_n µ̂_n = (1/n ∑_{s=1}^n P̂_s^* P̂_s) µ̂_n.
In the continuous limit: replace P̂_s^* P̂_s with its average, an integral over SO(3) with respect to the Haar measure. Fourier slice theorem: (P̂_s^* P̂_s φ̂)(ρ) = φ̂(ρ) δ(ρ · R_s³), where R_s³ is the viewing direction.
(Â µ̂)(ρ) = µ̂(ρ) · (1/4π) ∫_{S²} δ(ρ · θ) dθ = µ̂(ρ) · (1/4π) ∫_{S²} δ(|ρ| (ρ/|ρ|) · θ) dθ = µ̂(ρ) / (2|ρ|).
Inverting Â therefore multiplies by 2|ρ|: the classical ramp filter.
21 The triangular area filter
L̂_n Σ̂_n := 1/n ∑_{s=1}^n P̂_s^* P̂_s Σ̂_n P̂_s^* P̂_s.
Fourier slice theorem: (P̂_s^* P̂_s φ̂)(ρ) = φ̂(ρ) δ(ρ · R_s³), so
(P̂_s^* P̂_s Σ̂ P̂_s^* P̂_s)(ρ₁, ρ₂) = Σ̂(ρ₁, ρ₂) δ(ρ₁ · R_s³) δ(ρ₂ · R_s³).
Integrating over R_s³: (L̂ Σ̂)(ρ₁, ρ₂) = Σ̂(ρ₁, ρ₂) K(ρ₁, ρ₂). L̂ is a diagonal operator, with
K(ρ₁, ρ₂) = (1/4π) ∫_{S²} δ(ρ₁ · θ) δ(ρ₂ · θ) dθ = (1/4π) · 2/|ρ₁ × ρ₂|.
K is the density of planes that go through the frequencies ρ₁, ρ₂. The filter 1/K is proportional to the area of the triangle with vertices at ρ₁, ρ₂, and the origin.
22 The triangular area filter
(L̂ Σ̂)(ρ₁, ρ₂) = Σ̂(ρ₁, ρ₂) K(ρ₁, ρ₂), with K(ρ₁, ρ₂) = (1/4π) ∫_{S²} δ(ρ₁ · θ) δ(ρ₂ · θ) dθ = (1/4π) · 2/|ρ₁ × ρ₂|.
[Figure: the triangular area filter 1/K, proportional to the area of the triangle spanned by ρ₁, ρ₂, and the origin.]
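Both kernels can be sanity-checked by Monte Carlo, replacing each delta function by an indicator of small width; a toy check (the sample size, smoothing width, and test frequencies are arbitrary):

```python
import numpy as np

# Monte Carlo check of the two back-projection kernels, smoothing each delta
# into an indicator of half-width eps:
#   ramp:     (1/4pi) int_{S^2} delta(rho . theta) dtheta            = 1/(2|rho|)
#   triangle: (1/4pi) int_{S^2} delta(rho1 . theta) delta(rho2 . theta) dtheta
#             = (1/4pi) * 2/|rho1 x rho2|
rng = np.random.default_rng(1)
n, eps = 2_000_000, 0.05
theta = rng.normal(size=(n, 3))
theta /= np.linalg.norm(theta, axis=1, keepdims=True)   # uniform on S^2

rho = np.array([2.0, 0.0, 0.0])
ramp = np.mean(np.abs(theta @ rho) < eps) / (2 * eps)   # approx 1/(2*2) = 0.25

rho1 = np.array([1.0, 0.0, 0.0])
rho2 = np.array([0.0, 1.0, 0.0])                        # |rho1 x rho2| = 1
hits = (np.abs(theta @ rho1) < eps) & (np.abs(theta @ rho2) < eps)
tri = np.mean(hits) / (4 * eps**2)                      # approx 1/(2*pi)
```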
23 Challenge: discretization
Now that we are in the Fourier domain, how do we discretize? If we use a Cartesian grid with nearest-neighbor interpolation, we incur a large discretization error. If the volumes are N × N × N voxel arrays, then Σ is of size N³ × N³. A low-pass-filter approximation might be unavoidable.
We need to respect rotations and restrictions to planes (i.e., spherical coordinates), but then L̂_n is no longer diagonal and we can easily run into computational problems when inverting L̂_n. We need a representation of volumes in which L̂_n is sparse.
24 Finding the right basis
Idea: find image and volume domains Î and V̂ (in Fourier space) so that P̂_s(V̂) ⊂ Î.
Construction: Î = span({ (1/√(2π)) f_k(r) e^{imϕ} }); V̂ = span({ f_k(r) Y_l^m(θ, ϕ) }).
Note: the resulting matrix P̂_s is block diagonal. This is a somewhat unusual construction, since typically the radial functions depend on the angular frequency l (e.g., spherical Bessel, 3D Slepian). We have to be very careful with our construction so that our basis approximately spans B.
25 Fourier linear equations
The linear equations in the Fourier domain are essentially the same:
Â_n µ̂_n := (∑_{s=1}^n P̂_s^* P̂_s) µ̂_n = â;  L̂_n Σ̂_n := ∑_{s=1}^n P̂_s^* P̂_s Σ̂_n P̂_s^* P̂_s = B̂.
Note: L̂_n is block diagonal (sparsity!), acting on the blocks Σ̂^{k₁,k₂} separately. Denote each block of L̂_n by L̂_n^{k₁,k₂}.
26 Challenge: size of L̂_n
We still must deal with the enormous size of L̂_n, where L̂_n Σ̂_n = ∑_{s=1}^n P̂_s^* P̂_s Σ̂_n P̂_s^* P̂_s.
27 Sum to integral
If the rotations are uniformly distributed, then almost surely
L̂_n Σ̂_n = 1/n ∑_{s=1}^n P̂_s^* P̂_s Σ̂_n P̂_s^* P̂_s → (1/4π) ∫_{S²} P̂(θ)^* P̂(θ) Σ̂_n P̂(θ)^* P̂(θ) dθ = L̂ Σ̂_n.
28 What is L̂?
Let us view Σ̂^{k₁,k₂} : S² × S² → ℂ as a bandlimited bilinear form. It turns out that
L̂^{k₁,k₂} Σ̂^{k₁,k₂} = LPF_{k₁,k₂}(Σ̂^{k₁,k₂} · K), where K(α, β) = (1/4π) · 2/|α × β|.
Thus L̂ is no longer diagonal... but L̂ turns out to be sparse! This can be shown using Clebsch-Gordan coefficients for products of spherical harmonics.
29 Sparsity of L̂
Only about 6% of the entries of L̂^{15,15} are nonzero.
[Figure: sparsity structure of a block of L̂.]
30 Condition number and computational complexity
1. L̂ is self-adjoint and positive semidefinite.
2. λ_min(L̂) ≥ 2 (implies consistency).
3. Conjecture: λ_max(L̂^{k₁,k₂}) grows linearly with min(k₁, k₂).
4. Conjugate gradient requires O(√κ(L̂^{k₁,k₂})) iterations, where the cost of each iteration is nnz(L̂^{k₁,k₂}).
5. Computational complexity (here N_res = (2/π) w_max): O(n N² N_res² + N_res^{9.5}), but all steps can be run in parallel (e.g., L̂ is block diagonal).
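Point 4 is the standard conjugate-gradient bound; a self-contained sketch (the matrix size and condition number below are made up) shows the iteration count staying well below the dimension for a moderately conditioned SPD system:

```python
import numpy as np

def cg(A, b, tol=1e-8, max_iter=10_000):
    """Plain conjugate gradient for SPD A; returns solution and iteration count."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for k in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * np.linalg.norm(b):
            return x, k
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, max_iter

# SPD test matrix with a prescribed condition number kappa
rng = np.random.default_rng(0)
m, kappa = 200, 100.0
Q, _ = np.linalg.qr(rng.normal(size=(m, m)))
A = (Q * np.linspace(1.0, kappa, m)) @ Q.T
b = rng.normal(size=m)
x, iters = cg(A, b)
# iterations scale like sqrt(kappa), well below the dimension m
```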
31 Numerical results: two classes
Synthetic dataset: simulate heterogeneity using two toy volumes.
[Figure: slices of the toy volumes.]
Note: the ideal covariance matrix has rank 1.
32 Numerical results: two classes
Noisy projection images: [Figure: (a) clean, (b) SNR = 0.05, (c) SNR = 0.01.]
33 Numerical results: two classes
Reconstructions from noisy data.
[Figure: reconstructions of the mean and top eigenvector for three different SNR values ((a) clean, (b) SNR = 0.01, (c), (d) further SNR values). The top row contains clean and reconstructed slices of the mean; the bottom row contains slices of the top eigenvector.]
34 Numerical results: two classes
[Figure: correlations of true and computed top eigenvectors for various SNRs.]
35 Numerical results: two classes
[Figure: corresponding eigenvalue histograms.]
36 Numerical results: three classes
The ideal Σ has rank 2.
[Figure: eigenvalue histograms of the reconstructed covariance matrix in the three-class case for three SNR values. Note that the noise distribution first engulfs the second eigenvalue, and eventually the top eigenvalue as well.]
37 Numerical results: three classes (clustering)
Least squares and a Gaussian mixture model: each volume is approximated as X̂_s ≈ µ̂_n + ∑_{c=1}^{C−1} α_{s,c} V̂_c, where the V̂_c are the top eigenvectors and the coordinates are fitted per image by least squares,
α_s = argmin_α || Î_s − P̂_s µ̂_n − ∑_{c=1}^{C−1} α_c P̂_s V̂_c ||²,
followed by a Gaussian mixture model on the α_s.
[Figure: the coordinates α_s for the three-class case, colored according to true class.]
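A minimal sketch of this clustering step in the coordinate-selection special case, with C = 2 so that a single coordinate α_s is fitted per image and the classes separate by its sign (all sizes, the assumed-known µ and v, and the sign threshold are made up for illustration, not taken from the talk's experiments):

```python
import numpy as np

# Per-image least squares for the class coordinate: with one eigenvector v,
#   alpha_s = <P_s v, I_s - P_s mu> / ||P_s v||^2,
# then cluster the alpha_s (here simply by sign, since the two classes sit
# symmetrically at alpha = -1 and alpha = +1).
rng = np.random.default_rng(2)
p, q, n, sigma = 30, 10, 2000, 0.2
v = rng.normal(size=p)
v /= np.linalg.norm(v)                       # toy top eigenvector (known here)
mu = rng.normal(size=p)                      # toy mean volume (known here)
labels = rng.integers(0, 2, size=n)          # true class of each image
alpha_true = np.where(labels == 0, -1.0, 1.0)

alpha_hat = np.empty(n)
for s in range(n):
    idx = rng.choice(p, size=q, replace=False)        # coordinate-selection P_s
    I_s = mu[idx] + alpha_true[s] * v[idx] + sigma * rng.normal(size=q)
    alpha_hat[s] = v[idx] @ (I_s - mu[idx]) / (v[idx] @ v[idx])

pred = (alpha_hat > 0).astype(int)
accuracy = np.mean(pred == labels)
```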
38 Numerical results: Continuous Variability
39 Summary
- It is possible to do PCA from projected data and estimate the rank without imputation.
- Open problems remain for the distribution of the eigenvalues of Σ_n in the large-n, large-p regime (one also needs to consider q and the distribution of the P_s).
- We generalized the ramp filter of classical computerized tomography to the triangular area filter for tomographic inversion of structural variability.
- A careful selection of bases and the discovery of a sparse structure allowed us to tractably solve the heterogeneity problem.
- We began the study of the projection covariance transform L, an operator of interest in tomographic problems involving variability.
40 Future research directions
- Random matrix theory investigation for high-dimensional PCA of projected data
- Compare the approach with matrix sensing/completion methods
- Study L̂ further
- Include the contrast transfer function (CTF) in the forward model
- Add and examine regularization schemes
- Relax the uniform-rotations assumption
- Test on real datasets
- Other applications (e.g., 4D tomography)
41 Thank You!
Funding: NIH/NIGMS R01GM, AFOSR FA, Simons Foundation LTR DTD, DARPA N.
G. Katsevich, A. Katsevich, A. Singer (submitted). Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem.
More informationNetwork Topology Inference from Non-stationary Graph Signals
Network Topology Inference from Non-stationary Graph Signals Rasoul Shafipour Dept. of Electrical and Computer Engineering University of Rochester rshafipo@ece.rochester.edu http://www.ece.rochester.edu/~rshafipo/
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationTractable Upper Bounds on the Restricted Isometry Constant
Tractable Upper Bounds on the Restricted Isometry Constant Alex d Aspremont, Francis Bach, Laurent El Ghaoui Princeton University, École Normale Supérieure, U.C. Berkeley. Support from NSF, DHS and Google.
More informationData Preprocessing Tasks
Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can
More informationDecompositions of frames and a new frame identity
Decompositions of frames and a new frame identity Radu Balan a, Peter G. Casazza b, Dan Edidin c and Gitta Kutyniok d a Siemens Corporate Research, 755 College Road East, Princeton, NJ 08540, USA; b Department
More informationGeometric Modeling Summer Semester 2010 Mathematical Tools (1)
Geometric Modeling Summer Semester 2010 Mathematical Tools (1) Recap: Linear Algebra Today... Topics: Mathematical Background Linear algebra Analysis & differential geometry Numerical techniques Geometric
More informationMethods for sparse analysis of high-dimensional data, II
Methods for sparse analysis of high-dimensional data, II Rachel Ward May 26, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 55 High dimensional
More informationDictionary Learning Using Tensor Methods
Dictionary Learning Using Tensor Methods Anima Anandkumar U.C. Irvine Joint work with Rong Ge, Majid Janzamin and Furong Huang. Feature learning as cornerstone of ML ML Practice Feature learning as cornerstone
More informationPerformance Limits on the Classification of Kronecker-structured Models
Performance Limits on the Classification of Kronecker-structured Models Ishan Jindal and Matthew Nokleby Electrical and Computer Engineering Wayne State University Motivation: Subspace models (Union of)
More informationCourse topics (tentative) The role of random effects
Course topics (tentative) random effects linear mixed models analysis of variance frequentist likelihood-based inference (MLE and REML) prediction Bayesian inference The role of random effects Rasmus Waagepetersen
More informationCross-Validation with Confidence
Cross-Validation with Confidence Jing Lei Department of Statistics, Carnegie Mellon University WHOA-PSI Workshop, St Louis, 2017 Quotes from Day 1 and Day 2 Good model or pure model? Occam s razor We really
More informationSparse PCA with applications in finance
Sparse PCA with applications in finance A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon 1 Introduction
More informationPerhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.
Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage
More informationInformation-theoretically Optimal Sparse PCA
Information-theoretically Optimal Sparse PCA Yash Deshpande Department of Electrical Engineering Stanford, CA. Andrea Montanari Departments of Electrical Engineering and Statistics Stanford, CA. Abstract
More informationMethods for sparse analysis of high-dimensional data, II
Methods for sparse analysis of high-dimensional data, II Rachel Ward May 23, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 47 High dimensional
More informationPrincipal Component Analysis
Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used
More informationSingular Value Decomposition and Principal Component Analysis (PCA) I
Singular Value Decomposition and Principal Component Analysis (PCA) I Prof Ned Wingreen MOL 40/50 Microarray review Data per array: 0000 genes, I (green) i,i (red) i 000 000+ data points! The expression
More informationHigh dimensional Ising model selection
High dimensional Ising model selection Pradeep Ravikumar UT Austin (based on work with John Lafferty, Martin Wainwright) Sparse Ising model US Senate 109th Congress Banerjee et al, 2008 Estimate a sparse
More informationLINEAR ALGEBRA 1, 2012-I PARTIAL EXAM 3 SOLUTIONS TO PRACTICE PROBLEMS
LINEAR ALGEBRA, -I PARTIAL EXAM SOLUTIONS TO PRACTICE PROBLEMS Problem (a) For each of the two matrices below, (i) determine whether it is diagonalizable, (ii) determine whether it is orthogonally diagonalizable,
More information8.1 Concentration inequality for Gaussian random matrix (cont d)
MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration
More informationComputational and Statistical Tradeoffs via Convex Relaxation
Computational and Statistical Tradeoffs via Convex Relaxation Venkat Chandrasekaran Caltech Joint work with Michael Jordan Time-constrained Inference o Require decision after a fixed (usually small) amount
More informationNIH Public Access Author Manuscript Ann Math. Author manuscript; available in PMC 2012 November 23.
NIH Public Access Author Manuscript Published in final edited form as: Ann Math. 2011 September 1; 174(2): 1219 1241. doi:10.4007/annals.2011.174.2.11. Representation theoretic patterns in three dimensional
More informationConfidence Intervals for Low-dimensional Parameters with High-dimensional Data
Confidence Intervals for Low-dimensional Parameters with High-dimensional Data Cun-Hui Zhang and Stephanie S. Zhang Rutgers University and Columbia University September 14, 2012 Outline Introduction Methodology
More informationRapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization
Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization Shuyang Ling Department of Mathematics, UC Davis Oct.18th, 2016 Shuyang Ling (UC Davis) 16w5136, Oaxaca, Mexico Oct.18th, 2016
More informationPrincipal Component Analysis (PCA) for Sparse High-Dimensional Data
AB Principal Component Analysis (PCA) for Sparse High-Dimensional Data Tapani Raiko, Alexander Ilin, and Juha Karhunen Helsinki University of Technology, Finland Adaptive Informatics Research Center Principal
More informationRandom Matrix Theory Lecture 1 Introduction, Ensembles and Basic Laws. Symeon Chatzinotas February 11, 2013 Luxembourg
Random Matrix Theory Lecture 1 Introduction, Ensembles and Basic Laws Symeon Chatzinotas February 11, 2013 Luxembourg Outline 1. Random Matrix Theory 1. Definition 2. Applications 3. Asymptotics 2. Ensembles
More informationPrincipal Components Theory Notes
Principal Components Theory Notes Charles J. Geyer August 29, 2007 1 Introduction These are class notes for Stat 5601 (nonparametrics) taught at the University of Minnesota, Spring 2006. This not a theory
More informationMIT Final Exam Solutions, Spring 2017
MIT 8.6 Final Exam Solutions, Spring 7 Problem : For some real matrix A, the following vectors form a basis for its column space and null space: C(A) = span,, N(A) = span,,. (a) What is the size m n of
More informationSketching for Large-Scale Learning of Mixture Models
Sketching for Large-Scale Learning of Mixture Models Nicolas Keriven Université Rennes 1, Inria Rennes Bretagne-atlantique Adv. Rémi Gribonval Outline Introduction Practical Approach Results Theoretical
More informationLarge sample covariance matrices and the T 2 statistic
Large sample covariance matrices and the T 2 statistic EURANDOM, the Netherlands Joint work with W. Zhou Outline 1 2 Basic setting Let {X ij }, i, j =, be i.i.d. r.v. Write n s j = (X 1j,, X pj ) T and
More informationLecture Notes 5: Multiresolution Analysis
Optimization-based data analysis Fall 2017 Lecture Notes 5: Multiresolution Analysis 1 Frames A frame is a generalization of an orthonormal basis. The inner products between the vectors in a frame and
More information