PCA from noisy linearly reduced measurements
1 PCA from noisy linearly reduced measurements
Amit Singer, Princeton University, Department of Mathematics and Program in Applied and Computational Mathematics. September 27, 2016.
2 Single particle reconstruction using cryo-EM
[Figure: schematic drawing of the imaging process; the cryo-EM problem.]
3 New detector technology: exciting times for cryo-EM
Excerpt from W. Kühlbrandt, "The Resolution Revolution," Science (2014): "Precise knowledge of the structure of macromolecules in the cell is essential for understanding how they function. Structures of large macromolecules can now be obtained at near-atomic resolution by averaging thousands of electron microscope images recorded before radiation damage accumulates. This is what Amunts et al. have done in their research article on page 1485 of this issue (1), reporting the structure of the large subunit of the mitochondrial ribosome at 3.2 Å resolution by electron cryo-microscopy (cryo-EM). Together with other recent high-resolution cryo-EM structures (2-4) (see the figure), this achievement heralds the beginning of a new era in molecular biology, where structures at near-atomic resolution are no longer the prerogative of x-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. Ribosomes are ancient, massive protein-RNA complexes that translate the linear genetic code into three-dimensional proteins. Mitochondria [...]"
[Figure: near-atomic resolution with cryo-EM. Advances in detector technology and image processing are yielding high-resolution cryo-EM structures of biomolecules. (A) The large subunit of the yeast mitochondrial ribosome at 3.2 Å, reported by Amunts et al.; in the detailed view, the base pairs of an RNA double helix and a magnesium ion (blue) are clearly resolved. (B) TRPV1 ion channel at 3.4 Å (2), with a detailed view of residues lining the ion pore on the four-fold axis of the tetrameric channel. (C) F420-reducing [NiFe] hydrogenase at 3.36 Å (3); the detail shows an α helix in the FrhA subunit with resolved side chains. The maps are not drawn to scale.]
4 Cryo-EM in the news...
March 31, 2016, Reports: "The 3.8 Å resolution cryo-EM structure of Zika virus." Cite as: Sirohi et al., Science /science.aaf5316 (2016). Devika Sirohi, Zhenguo Chen, Lei Sun, Thomas Klose, Theodore C. Pierson, Michael G. Rossmann, Richard J. Kuhn (Purdue University; Viral Pathogenesis Section, NIAID, NIH).
5 Method of the Year 2015
January 2016, Volume 13, No. 1: "Single-particle cryo-electron microscopy (cryo-EM) is our choice for Method of the Year 2015 for its newfound ability to solve protein structures at near-atomic resolution. Featured is the 2.2-Å cryo-EM structure of β-galactosidase as recently reported by Bartesaghi et al. (Science 348, 2015). Cover design by Erin Dewalt."
6 "Big movie" data, publicly available
7 Image formation model and inverse problem
[Schematic: electron source, molecule φ_k, rotation R_k ∈ SO(3), projection image y_k.]
Projection images: y_k = T_k P_k φ_k + ε_k, k = 1, 2, ..., n.
φ_k : R³ → R is the scattering potential of the k-th molecule.
P_k is the 2D tomographic projection, after rotating φ_k by R_k.
T_k is a "filtering" operator due to the microscope and detector.
Classical cryo-EM problem: estimate φ = φ_1 = φ_2 = ... = φ_n (and R_1, ..., R_n) given y_1, ..., y_n.
Heterogeneity problem: estimate φ_1, φ_2, ..., φ_n (and R_1, ..., R_n) given y_1, ..., y_n.
8 Toy example
9 E. coli 50S ribosomal subunit
27,000 particle images provided by Dr. Fred Sigworth, Yale Medical School. 3D reconstruction by S, Lanhui Wang, and Jane Zhao.
10 Algorithmic and computational challenges
1. Ab-initio modelling and orientation assignment
2. Heterogeneity: resolving structural variability
3. Denoising and 2D class averaging
4. Particle picking
5. Symmetry detection and reconstruction
6. Motion correction
7. Estimation of the contrast transfer function and noise statistics
8. Fast reconstruction and refinement
9. Validation and resolution determination
11 Principal component analysis (PCA)
Given i.i.d. samples x_1, ..., x_n of some x ∈ R^p, estimate the mean E[x] and covariance Cov[x]:
µ_n = (1/n) Σ_{k=1}^n x_k,   Σ_n = (1/n) Σ_{k=1}^n (x_k − µ_n)(x_k − µ_n)^T.
Eigenvectors and eigenvalues of Σ_n are used for dimensionality reduction, compression, and denoising.
Useful when most variability is captured by a few leading principal components: low-rank approximation.
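As a concrete reminder of the classical setting, here is a minimal NumPy sketch of sample mean, sample covariance, and low-rank projection (all dimensions are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r = 1000, 20, 3

# Samples of x in R^p whose variability lies in r directions (illustrative)
x = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))

# Sample mean and sample covariance, as on the slide
mu_n = x.mean(axis=0)
xc = x - mu_n
Sigma_n = xc.T @ xc / n

# Leading eigenvectors of Sigma_n are the principal components
eigvals, eigvecs = np.linalg.eigh(Sigma_n)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Low-rank approximation: project onto the top r components
coords = xc @ eigvecs[:, :r]                  # dimensionality reduction
x_lowrank = coords @ eigvecs[:, :r].T + mu_n  # reconstruction / denoising
```

Since the simulated data is exactly rank r (plus the mean), the top-r projection reconstructs it up to floating-point error.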
12 PCA from noisy linearly transformed measurements
Measure y_k = A_k x_k + e_k with known A_k : R^p → R^q and noise covariance Cov[e_k] = σ² I_q. The operators differ between measurements: A_k ≠ A_l in general.
Goal: estimate E[x] and Cov[x] from y_1, ..., y_n.
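This measurement model is easy to simulate; in the sketch below the A_k are random coordinate-subsampling operators, one hypothetical instance of varying A_k (not the cryo-EM projections of the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, n, sigma = 8, 5, 200, 0.1

# x_k with most variance in 3 coordinates (illustrative low-rank-ish signal)
x = rng.normal(size=(n, p)) @ np.diag([3.0, 2.0, 1.0] + [0.0] * (p - 3))

# A_k: random coordinate-subsampling operators R^p -> R^q, different per k
A = np.stack([np.eye(p)[rng.choice(p, q, replace=False)] for _ in range(n)])

# y_k = A_k x_k + e_k with Cov[e_k] = sigma^2 I_q
y = np.einsum('kqp,kp->kq', A, x) + sigma * rng.normal(size=(n, q))
```

Each A_k keeps q of the p coordinates, so no single measurement determines x_k; the estimators below pool information across all n measurements.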
13 Motivation: cryo-electron microscopy
Write the image y_k as y_k = T_k P_k φ_k + e_k, k = 1, ..., n, where φ_k is a three-dimensional molecular volume, P_k projects the volume into an image, T_k filters it, and e_k is measurement noise.
14 Application I: deconvolution in noise
Goal: estimate the unfiltered projection x_k = P_k φ_k from y_k = T_k x_k + e_k.
[Figure: x_k, then T_k x_k, then T_k x_k + e_k.]
The filtering T_k : R^p → R^p is singular or ill-conditioned.
Using a low-rank Cov[x], we can invert T_k and reduce the effect of the noise e_k.
20 Contrast transfer function: T_k
The CTF suppresses information and inverts contrast.
The CTF cannot be trivially inverted (zero crossings).
An additional envelope causes Gaussian decay of the high frequencies.
Information lost in one defocus group can be recovered from another group that has different zero crossings.
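For intuition, a one-dimensional CTF profile with zero crossings, contrast inversion, and a Gaussian envelope can be sketched as follows (all parameter values are illustrative toy numbers, not taken from the talk):

```python
import numpy as np

# Illustrative 1D CTF sketch: an oscillating phase-contrast factor with zero
# crossings and sign flips, damped by a Gaussian envelope at high frequency.
lam = 0.025      # electron wavelength (Angstrom), toy value
dz = 15000.0     # defocus (Angstrom), toy value
cs = 2.7e7       # spherical aberration (Angstrom), toy value
B = 200.0        # envelope decay (Angstrom^2), toy value

k = np.linspace(0.0, 0.4, 1000)   # spatial frequency (1/Angstrom)
gamma = np.pi * lam * dz * k**2 - 0.5 * np.pi * cs * lam**3 * k**4
ctf = -np.sin(gamma) * np.exp(-B * k**2)
```

The sign flips of `ctf` are the contrast inversions; at each zero crossing the corresponding frequency band is suppressed entirely, which is why images from a single defocus group cannot be deconvolved on their own.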
21 Application II: heterogeneity
How to classify images from different molecular structures? [Figure: Class A vs. Class B; images courtesy of Joachim Frank (Columbia).]
22 Application II: heterogeneity (cont.)
Goal: classify images y_k = T_k P_k φ_k + e_k according to the molecular state φ_k.
We cannot invert directly, since T_k P_k : R^p → R^q and q < p.
The volumes φ_k live in a low-dimensional space that can be estimated from the top eigenvectors of Cov[φ_k].
Restrict T_k P_k to this space, invert, cluster, and reconstruct (Penczek et al. 2009).
Aggregating large-scale datasets gives accurate estimates despite high noise.
23 Least-squares estimator (mean)
From y_k = A_k x_k + e_k, we have E[y_k] = A_k E[x]. On the other hand E[y_k] ≈ y_k, so y_k ≈ A_k E[x]. Form the least-squares estimator for E[x]:
µ_n = argmin_µ (1/n) Σ_{k=1}^n ‖y_k − A_k µ‖².
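The mean estimator reduces to a p × p linear solve (the normal equations). A sketch with random coordinate-subsampling operators A_k, an illustrative choice rather than cryo-EM projections:

```python
import numpy as np

rng = np.random.default_rng(2)
p, q, n, sigma = 8, 5, 5000, 0.1
mu_true = rng.normal(size=p)

# Accumulate ((1/n) sum A_k^T A_k) mu = (1/n) sum A_k^T y_k
lhs = np.zeros((p, p))
rhs = np.zeros(p)
for _ in range(n):
    A = np.eye(p)[rng.choice(p, q, replace=False)]   # random row-subsampling A_k
    x = mu_true + rng.normal(size=p)                 # x_k with mean mu_true
    y = A @ x + sigma * rng.normal(size=q)           # noisy partial measurement
    lhs += A.T @ A
    rhs += A.T @ y

mu_n = np.linalg.solve(lhs / n, rhs / n)
```

Each coordinate is observed only a fraction q/p of the time, yet pooling all n measurements recovers the mean accurately.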
24 Least-squares estimator (covariance)
Again, y_k = A_k x_k + e_k gives Cov[y_k] = A_k Cov[x] A_k^T + σ² I_q. On the other hand Cov[y_k] ≈ (y_k − A_k µ_n)(y_k − A_k µ_n)^T, so (y_k − A_k µ_n)(y_k − A_k µ_n)^T ≈ A_k Cov[x] A_k^T + σ² I_q. Form the least-squares estimator for Cov[x]:
Σ_n = argmin_Σ (1/n) Σ_{k=1}^n ‖(y_k − A_k µ_n)(y_k − A_k µ_n)^T − (A_k Σ A_k^T + σ² I_q)‖_F².
25 Normal equations
The mean estimator µ_n satisfies
((1/n) Σ_{k=1}^n A_k^T A_k) µ_n = (1/n) Σ_{k=1}^n A_k^T y_k.
For the covariance, L_n(Σ_n) = B_n, with L_n : R^{p×p} → R^{p×p} given by
L_n(Σ) = (1/n) Σ_{k=1}^n A_k^T A_k Σ A_k^T A_k
and
B_n = (1/n) Σ_{k=1}^n [A_k^T (y_k − A_k µ_n)(y_k − A_k µ_n)^T A_k − σ² A_k^T A_k].
The resulting linear system in p² variables can be solved efficiently using the conjugate gradient method.
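The system L_n(Σ_n) = B_n only requires applying L_n, which makes conjugate gradient natural. A sketch with diagonal masks D_k = A_k^T A_k (coordinate subsampling) and zero mean, both simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
p, q, n, sigma = 6, 4, 4000, 0.1

U = rng.normal(size=(p, 2))
Sigma_true = U @ U.T                    # rank-2 population covariance
x = rng.normal(size=(n, 2)) @ U.T       # samples with Cov[x] = Sigma_true

# Random coordinate-subsampling masks: D_k = A_k^T A_k is diagonal 0/1
masks = np.stack([np.isin(np.arange(p), rng.choice(p, q, replace=False))
                  for _ in range(n)]).astype(float)
y = masks * (x + sigma * rng.normal(size=(n, p)))   # zero-filled A_k^T y_k

# Right-hand side B_n (zero mean assumed for brevity)
B = (y[:, :, None] * y[:, None, :]).mean(axis=0) \
    - sigma**2 * np.diag(masks.mean(axis=0))

def L(S):
    # L_n(S) = (1/n) sum_k D_k S D_k with diagonal masks
    return np.einsum('ki,ij,kj->ij', masks, S, masks) / n

# Conjugate gradient in matrix space, using Frobenius inner products
S = np.zeros((p, p))
R = B - L(S)
P = R.copy()
for _ in range(500):
    if np.linalg.norm(R) < 1e-10:
        break
    LP = L(P)
    alpha = (R * R).sum() / (P * LP).sum()
    S = S + alpha * P
    R_new = R - alpha * LP
    P = R_new + ((R_new * R_new).sum() / (R * R).sum()) * P
    R = R_new
```

With these masks L_n acts entrywise, so CG converges quickly; for the talk's cryo-EM operators the same iteration runs with L_n applied as a convolution.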
26 Estimator properties
Taking A_k = I_p and σ² = 0 recovers the sample mean and covariance.
Does not rely on a particular distribution of x.
With prior information on x, one can regularize by adding terms to the least-squares objective.
Both µ_n and Σ_n are consistent estimators (Katsevich, Katsevich, S 2015): as n → ∞, µ_n → E[x] and Σ_n → Cov[x] almost surely.
27 High-dimensional PCA: Marčenko-Pastur
For pure Gaussian white noise of unit variance, y_k = e_k, the sample covariance is (1/n) Σ_{k=1}^n y_k y_k^T. For n ≫ p its spectrum concentrates (central limit theorem); for n ∝ p the eigenvalue distribution follows the Marčenko-Pastur law. [Figure: eigenvalue histograms in the two regimes.]
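A quick numerical check of the Marčenko-Pastur picture: for pure noise with p and n comparable, the sample-covariance eigenvalues spread over the bulk [(1 − √g)², (1 + √g)²] with g = p/n instead of concentrating at 1 (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 2000, 500                         # g = p/n = 0.25

Y = rng.normal(size=(n, p))              # pure unit-variance noise, y_k = e_k
evals = np.linalg.eigvalsh(Y.T @ Y / n)  # sample-covariance spectrum

g = p / n
lo, hi = (1 - np.sqrt(g))**2, (1 + np.sqrt(g))**2   # Marchenko-Pastur support
```

Here the bulk is [0.25, 2.25]: even with no signal at all, the largest sample eigenvalue is far above the population value 1.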
28 High-dimensional PCA: spiked covariance model
With both signal and noise, y_k = x_k + e_k. For n ≫ p the central limit theorem applies; for n ∝ p we get the spiked covariance model (Johnstone 2001). [Figure: eigenvalue histograms.] Signal eigenvalues pop out of the bulk when the SNR exceeds √(p/n).
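The spiked-model behavior can be checked numerically: planting one rank-one signal of strength ℓ above the detection threshold makes the top sample eigenvalue separate from the bulk, landing near the asymptotic value (1 + ℓ)(1 + g/ℓ) (a sketch; the sizes and single-spike setup are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, ell = 2000, 500, 3.0               # g = p/n = 0.25, spike strength ell

v = np.zeros(p)
v[0] = 1.0                               # planted signal direction
x = np.sqrt(ell) * rng.normal(size=(n, 1)) * v   # rank-1 signal, Cov = ell v v^T
Y = x + rng.normal(size=(n, p))                  # y_k = x_k + e_k

evals = np.linalg.eigvalsh(Y.T @ Y / n)
g = p / n
bulk_edge = (1 + np.sqrt(g))**2
predicted = (1 + ell) * (1 + g / ell)    # asymptotic top eigenvalue for ell > sqrt(g)
```

With ℓ = 3 and g = 0.25 the top eigenvalue sits near 4.33, well above the bulk edge 2.25, while the remaining eigenvalues stay inside the noise bulk.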
29 High-dimensional PCA: shrinkage
To estimate Cov[x] from y_k = x_k + e_k when n ∝ p, we apply shrinkage to the eigenvalues of the sample covariance (Donoho, Gavish, Johnstone 2013). [Figure: original vs. shrunk eigenvalues.]
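One concrete shrinker consistent with the spiked model is to zero out eigenvalues inside the noise bulk and invert the spike map λ = (1 + ℓ)(1 + g/ℓ) for those above the edge. This is a simplified rule assuming unit noise variance, not necessarily the shrinker used in the talk:

```python
import numpy as np

def shrink_eigs(evals, g):
    """Shrink sample-covariance eigenvalues toward the signal eigenvalues.

    Sketch of spiked-model shrinkage with unit noise variance: eigenvalues
    inside the bulk are set to zero; those above the bulk edge are inverted
    through the spike map lam = (1 + ell) * (1 + g / ell).
    """
    edge = (1 + np.sqrt(g))**2
    out = np.zeros_like(evals)
    above = evals > edge
    lam = evals[above]
    t = lam - 1 - g                      # t = ell + g/ell for a true spike
    out[above] = (t + np.sqrt(t**2 - 4 * g)) / 2
    return out
```

Inverting the quadratic ℓ² − tℓ + g = 0 recovers ℓ exactly from its asymptotic observed eigenvalue: feeding λ = (1 + 3)(1 + 0.25/3) = 13/3 back through the shrinker returns ℓ = 3.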
30 Shrinkage of B_n
In the clean case, y_k = A_k x_k and
B_n = (1/n) Σ_{k=1}^n A_k^T A_k (x_k − µ_n)(x_k − µ_n)^T A_k^T A_k,
so E[B_n] = (1/n) Σ_{k=1}^n Cov[A_k^T A_k x_k], which equals Cov[A_k^T A_k x_k] if the A_k matrices are realizations of random matrices. For large n we then have B_n ≈ Cov[A_k^T A_k x_k]. How do we estimate Cov[A_k^T A_k x_k] given the noisy y_k?
35 Shrinkage of B_n (cont.)
The backprojected measurements have colored noise with covariance σ² E[A_k^T A_k]:
A_k^T y_k = A_k^T A_k x_k + A_k^T e_k.
Whiten by calculating
S = (1/n) Σ_{k=1}^n A_k^T A_k ≈ E[A_k^T A_k]
and setting z_k = S^{−1/2} A_k^T (y_k − A_k µ_n). Calculate the sample covariance, shrink, and unwhiten:
B_n = S^{1/2} ρ((1/n) Σ_{k=1}^n z_k z_k^T) S^{1/2}.
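The whiten → sample covariance → shrink → unwhiten pipeline can be sketched end to end. Coordinate-subsampling A_k (so S is diagonal), zero mean, unit noise variance, and a crude subtract-the-bulk shrinker are all simplifying assumptions here, not the talk's choices:

```python
import numpy as np

rng = np.random.default_rng(6)
p, q, n = 50, 40, 2000

U, _ = np.linalg.qr(rng.normal(size=(p, 2)))
Sigma_true = 25.0 * U @ U.T                   # strong rank-2 signal
x = rng.normal(size=(n, 2)) @ (5.0 * U).T     # Cov[x] = Sigma_true

masks = np.stack([np.isin(np.arange(p), rng.choice(p, q, replace=False))
                  for _ in range(n)]).astype(float)
y = masks * (x + rng.normal(size=(n, p)))     # zero-filled A_k^T y_k, sigma = 1

# Whiten: S = (1/n) sum A_k^T A_k is diagonal for subsampling operators
s_diag = masks.mean(axis=0)
z = y / np.sqrt(s_diag)                       # z_k = S^(-1/2) A_k^T y_k

# Sample covariance of whitened data, then eigenvalue shrinkage
C = z.T @ z / n
evals, evecs = np.linalg.eigh(C)
g = p / n
edge = (1 + np.sqrt(g))**2
shrunk = np.where(evals > edge, evals - (1 + g), 0.0)   # crude shrinker sketch
C_shrunk = (evecs * shrunk) @ evecs.T

# Unwhiten to obtain the shrunken right-hand side B_n
B_shrunk = np.sqrt(s_diag)[:, None] * C_shrunk * np.sqrt(s_diag)[None, :]
```

Even with this crude shrinker, the leading eigenvectors of the shrunken B_n align with the planted two-dimensional signal subspace.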
41 Deconvolution in noise
Recall the noisy, filtered image y_k = T_k x_k + e_k given the clean projection x_k. Using the covariance estimation method, estimate Cov[x_k]. Construct the classical Wiener filter, inverting T_k to estimate x_k from y_k:
x̂_k = Σ_n T_k^T (T_k Σ_n T_k^T + σ² I_q)^{−1} (y_k − T_k µ_n) + µ_n.
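The Wiener estimator above can be written as a small helper. This is a sketch: `wiener_denoise` and its argument names are illustrative, and the estimates Σ_n, µ_n, σ² are taken as given:

```python
import numpy as np

def wiener_denoise(y, T, Sigma, mu, sigma2):
    """Wiener estimate of x from y = T x + e (minimal sketch).

    Sigma, mu: (estimated) covariance and mean of x; sigma2: noise variance.
    Implements Sigma T^T (T Sigma T^T + sigma2 I)^(-1) (y - T mu) + mu.
    """
    q = T.shape[0]
    gain = Sigma @ T.T @ np.linalg.inv(T @ Sigma @ T.T + sigma2 * np.eye(q))
    return gain @ (y - T @ mu) + mu
```

Sanity check of the limiting behavior: for a well-conditioned T and negligible noise, the filter reduces to T^{-1}, and the clean projection is recovered exactly.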
42 Effect of shrinkage on simulated data
[Figure: log₁₀ relative MSE vs. number of images (up to 10⁴), with and without shrinkage, at SNR = 1/20, 1/40, 1/60, and 1/100.]
Applying shrinkage to the right-hand side B_n in L_n(Σ_n) = B_n results in a significant increase in accuracy for the covariance estimation.
43 Experimental results: TRPV1 (EMPIAR-10005)
Dataset of projection images, 256-by-256 pixels (Liao et al., Nature 2013; Bhamre, Zhang, S, JSB 2016). [Figure: input, filtered, and closest projection images.]
46 Experimental data: TRPV1
[Figure: raw images, closest projections, TWF, and CWF.] K2 direct electron detector; motion-corrected, picked particle images.
47 Experimental data: 80S ribosome
[Figure: raw images, closest projections, TWF, and CWF.] FALCON II 4k×4k direct electron detector; motion-corrected, picked particle images.
48 Experimental data: IP3R1
[Figure: raw images, closest projections, TWF, and CWF.] Gatan 4k×4k CCD; picked particle images.
49 2D covariance estimation: further remarks
Steerable PCA: the covariance matrix commutes with in-plane rotation and is block diagonalized in a Fourier-Bessel basis; computational cost O(nN³) (Zhao, Shkolnisky, S, IEEE TCI 2016).
Wiener filtering: takes into account that the estimated covariance differs from the population covariance (S and Wu, SIAM Imaging 2013).
Visualization of underlying particles without class averaging.
Outlier detection of picked particles.
Improved class averages.
50 Heterogeneity problem
Given images y_k = A_k x_k + e_k with A_k = T_k P_k, classify the y_k according to the molecular state x_k. We use the proposed method to estimate Cov[x_k], reducing computational and memory costs by writing L_n as a convolution and using circulant preconditioning. The dominant eigenvectors of Σ_n define the low-dimensional subspace where the volumes x_k live; for C classes, rank Cov[x_k] = C − 1. We can invert T_k P_k to find coordinates in the subspace and cluster.
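As a toy version of this classification pipeline with full observations (A_k = I, a simplifying assumption), two molecular states give a rank C − 1 = 1 covariance whose top eigenvector separates the images:

```python
import numpy as np

rng = np.random.default_rng(7)
p, n, sigma = 100, 1000, 1.0

phi_a, phi_b = rng.normal(size=p), rng.normal(size=p)   # two hypothetical states
labels = rng.integers(0, 2, size=n)
x = np.where(labels[:, None] == 0, phi_a, phi_b)        # x_k is one of the states
y = x + sigma * rng.normal(size=(n, p))                 # noisy observations

mu = y.mean(axis=0)
yc = y - mu
Sigma_n = yc.T @ yc / n - sigma**2 * np.eye(p)   # noise-corrected covariance

# C = 2 classes => rank Cov[x] = C - 1 = 1: the top eigenvector separates them
_, vecs = np.linalg.eigh(Sigma_n)
coords = yc @ vecs[:, -1]                        # coordinate along top eigenvector
pred = (coords > 0).astype(int)
acc = max((pred == labels).mean(), 1.0 - (pred == labels).mean())
```

Thresholding the one-dimensional coordinates classifies nearly all images correctly; with partial observations, the same idea applies after inverting T_k P_k restricted to the estimated subspace.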
51 Experimental results: 70S ribosome
Dataset of images (130-by-130), 2 classes, courtesy of Joachim Frank (Columbia University), downsampled. [Figure: largest 32 eigenvalues; coordinate histogram.]
Clustering accuracy 89% with respect to the dataset labeling. Took 26 min. on a 2.7 GHz, 12-core CPU.
52 Conclusions
The least-squares approach to covariance estimation from noisy partial measurements is a simple but powerful method to characterize the distribution of the underlying data.
In the high-dimensional regime, eigenvalue shrinkage can significantly improve the performance of the estimator.
Covariance information can be leveraged to great effect in denoising and classification tasks such as those encountered in cryo-EM.
53 Current and future work
Incorporating constraints on the covariance Σ_n, such as sparsity.
Replacing the least-squares cost with a robust cost function.
Theoretical understanding of the bulk noise distribution in Σ_n, not just in B_n.
Application to other tasks, such as embryo development, magnetic resonance imaging, and matrix completion.
54 ASPIRE: Algorithms for Single Particle Reconstruction
Open source toolbox, publicly available.
55 References
A. Singer, H.-T. Wu, "Two-dimensional tomography from noisy projections taken at unknown random directions," SIAM Journal on Imaging Sciences, 6 (1), 2013.
E. Katsevich, A. Katsevich, A. Singer, "Covariance matrix estimation for the cryo-EM heterogeneity problem," SIAM Journal on Imaging Sciences, 8 (1), 2015.
J. Andén, E. Katsevich, A. Singer, "Covariance estimation using conjugate gradient for 3D classification in cryo-EM," IEEE 12th International Symposium on Biomedical Imaging (ISBI 2015), April 2015.
T. Bhamre, T. Zhang, A. Singer, "Denoising and covariance estimation of single particle cryo-EM images," Journal of Structural Biology, 195 (1), 2016.
More informationSparsity and Morphological Diversity in Source Separation. Jérôme Bobin IRFU/SEDI-Service d Astrophysique CEA Saclay - France
Sparsity and Morphological Diversity in Source Separation Jérôme Bobin IRFU/SEDI-Service d Astrophysique CEA Saclay - France Collaborators - Yassir Moudden - CEA Saclay, France - Jean-Luc Starck - CEA
More informationSelf-Calibration and Biconvex Compressive Sensing
Self-Calibration and Biconvex Compressive Sensing Shuyang Ling Department of Mathematics, UC Davis July 12, 2017 Shuyang Ling (UC Davis) SIAM Annual Meeting, 2017, Pittsburgh July 12, 2017 1 / 22 Acknowledgements
More informationParM filament images were extracted and from the electron micrographs and
Supplemental methods Outline of the EM reconstruction: ParM filament images were extracted and from the electron micrographs and straightened. The digitized images were corrected for the phase of the Contrast
More informationPCA, Kernel PCA, ICA
PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per
More informationAdvanced Introduction to Machine Learning CMU-10715
Advanced Introduction to Machine Learning CMU-10715 Independent Component Analysis Barnabás Póczos Independent Component Analysis 2 Independent Component Analysis Model original signals Observations (Mixtures)
More informationWavelet Based Image Restoration Using Cross-Band Operators
1 Wavelet Based Image Restoration Using Cross-Band Operators Erez Cohen Electrical Engineering Department Technion - Israel Institute of Technology Supervised by Prof. Israel Cohen 2 Layout Introduction
More informationIntroduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin
1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)
More informationIndependent Component Analysis. Contents
Contents Preface xvii 1 Introduction 1 1.1 Linear representation of multivariate data 1 1.1.1 The general statistical setting 1 1.1.2 Dimension reduction methods 2 1.1.3 Independence as a guiding principle
More informationApplication of Principal Component Analysis to TES data
Application of Principal Component Analysis to TES data Clive D Rodgers Clarendon Laboratory University of Oxford Madison, Wisconsin, 27th April 2006 1 My take on the PCA business 2/41 What is the best
More informationPCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given
More informationDimensionality Reduction: PCA. Nicholas Ruozzi University of Texas at Dallas
Dimensionality Reduction: PCA Nicholas Ruozzi University of Texas at Dallas Eigenvalues λ is an eigenvalue of a matrix A R n n if the linear system Ax = λx has at least one non-zero solution If Ax = λx
More informationFace Detection and Recognition
Face Detection and Recognition Face Recognition Problem Reading: Chapter 18.10 and, optionally, Face Recognition using Eigenfaces by M. Turk and A. Pentland Queryimage face query database Face Verification
More informationSignal Denoising with Wavelets
Signal Denoising with Wavelets Selin Aviyente Department of Electrical and Computer Engineering Michigan State University March 30, 2010 Introduction Assume an additive noise model: x[n] = f [n] + w[n]
More informationPrincipal Component Analysis CS498
Principal Component Analysis CS498 Today s lecture Adaptive Feature Extraction Principal Component Analysis How, why, when, which A dual goal Find a good representation The features part Reduce redundancy
More informationImproving cryo-em maps: focused 2D & 3D classifications and focused refinements
Improving cryo-em maps: focused 2D & 3D classifications and focused refinements 3rd International Symposium on Cryo-3D Image Analysis 23.3.2018, Tahoe Bruno Klaholz Centre for Integrative Biology, IGBMC,
More informationTruncation Strategy of Tensor Compressive Sensing for Noisy Video Sequences
Journal of Information Hiding and Multimedia Signal Processing c 2016 ISSN 207-4212 Ubiquitous International Volume 7, Number 5, September 2016 Truncation Strategy of Tensor Compressive Sensing for Noisy
More informationUnsupervised Learning
2018 EE448, Big Data Mining, Lecture 7 Unsupervised Learning Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html ML Problem Setting First build and
More informationData Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection
More informationStructure, Dynamics and Function of Membrane Proteins using 4D Electron Microscopy
Structure, Dynamics and Function of Membrane Proteins using 4D Electron Microscopy Anthony W. P. Fitzpatrick Department of Chemistry, Lensfield Road, Cambridge, CB2 1EW My research aims to understand the
More information15 Singular Value Decomposition
15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationHST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007
MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationData Preprocessing Tasks
Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can
More informationIntroduction to Cryo Microscopy. Nikolaus Grigorieff
Introduction to Cryo Microscopy Nikolaus Grigorieff Early TEM and Biology 1906-1988 Bacterial culture in negative stain. Friedrich Krause, 1937 Nobel Prize in Physics 1986 Early design, 1931 Decades of
More informationDetermining Protein Structure BIBC 100
Determining Protein Structure BIBC 100 Determining Protein Structure X-Ray Diffraction Interactions of x-rays with electrons in molecules in a crystal NMR- Nuclear Magnetic Resonance Interactions of magnetic
More informationNoise, Image Reconstruction with Noise!
Noise, Image Reconstruction with Noise! EE367/CS448I: Computational Imaging and Display! stanford.edu/class/ee367! Lecture 10! Gordon Wetzstein! Stanford University! What s a Pixel?! photon to electron
More information2.3. Clustering or vector quantization 57
Multivariate Statistics non-negative matrix factorisation and sparse dictionary learning The PCA decomposition is by construction optimal solution to argmin A R n q,h R q p X AH 2 2 under constraint :
More information4 Bias-Variance for Ridge Regression (24 points)
Implement Ridge Regression with λ = 0.00001. Plot the Squared Euclidean test error for the following values of k (the dimensions you reduce to): k = {0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500,
More informationData Mining Techniques
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 12 Jan-Willem van de Meent (credit: Yijun Zhao, Percy Liang) DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Linear Dimensionality
More informationImage Noise: Detection, Measurement and Removal Techniques. Zhifei Zhang
Image Noise: Detection, Measurement and Removal Techniques Zhifei Zhang Outline Noise measurement Filter-based Block-based Wavelet-based Noise removal Spatial domain Transform domain Non-local methods
More informationStatistical Geometry Processing Winter Semester 2011/2012
Statistical Geometry Processing Winter Semester 2011/2012 Linear Algebra, Function Spaces & Inverse Problems Vector and Function Spaces 3 Vectors vectors are arrows in space classically: 2 or 3 dim. Euclidian
More informationMid-year Report Linear and Non-linear Dimentionality. Reduction. applied to gene expression data of cancer tissue samples
Mid-year Report Linear and Non-linear Dimentionality applied to gene expression data of cancer tissue samples Franck Olivier Ndjakou Njeunje Applied Mathematics, Statistics, and Scientific Computation
More informationImage Analysis. PCA and Eigenfaces
Image Analysis PCA and Eigenfaces Christophoros Nikou cnikou@cs.uoi.gr Images taken from: D. Forsyth and J. Ponce. Computer Vision: A Modern Approach, Prentice Hall, 2003. Computer Vision course by Svetlana
More informationDeriving Principal Component Analysis (PCA)
-0 Mathematical Foundations for Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Deriving Principal Component Analysis (PCA) Matt Gormley Lecture 11 Oct.
More informationElectron microscopy in molecular cell biology II
Electron microscopy in molecular cell biology II Cryo-EM and image processing Werner Kühlbrandt Max Planck Institute of Biophysics Sample preparation for cryo-em Preparation laboratory Specimen preparation
More informationMULTICHANNEL SIGNAL PROCESSING USING SPATIAL RANK COVARIANCE MATRICES
MULTICHANNEL SIGNAL PROCESSING USING SPATIAL RANK COVARIANCE MATRICES S. Visuri 1 H. Oja V. Koivunen 1 1 Signal Processing Lab. Dept. of Statistics Tampere Univ. of Technology University of Jyväskylä P.O.
More informationFast and Robust Phase Retrieval
Fast and Robust Phase Retrieval Aditya Viswanathan aditya@math.msu.edu CCAM Lunch Seminar Purdue University April 18 2014 0 / 27 Joint work with Yang Wang Mark Iwen Research supported in part by National
More informationCS4495/6495 Introduction to Computer Vision. 8B-L2 Principle Component Analysis (and its use in Computer Vision)
CS4495/6495 Introduction to Computer Vision 8B-L2 Principle Component Analysis (and its use in Computer Vision) Wavelength 2 Wavelength 2 Principal Components Principal components are all about the directions
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationPCA and admixture models
PCA and admixture models CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar, Alkes Price PCA and admixture models 1 / 57 Announcements HW1
More informationSignal Modeling Techniques in Speech Recognition. Hassan A. Kingravi
Signal Modeling Techniques in Speech Recognition Hassan A. Kingravi Outline Introduction Spectral Shaping Spectral Analysis Parameter Transforms Statistical Modeling Discussion Conclusions 1: Introduction
More information12.2 Dimensionality Reduction
510 Chapter 12 of this dimensionality problem, regularization techniques such as SVD are almost always needed to perform the covariance matrix inversion. Because it appears to be a fundamental property
More information9. Nuclear Magnetic Resonance
9. Nuclear Magnetic Resonance Nuclear Magnetic Resonance (NMR) is a method that can be used to find structures of proteins. NMR spectroscopy is the observation of spins of atoms and electrons in a molecule
More informationAnalysis of Spectral Kernel Design based Semi-supervised Learning
Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,
More informationDecember 20, MAA704, Multivariate analysis. Christopher Engström. Multivariate. analysis. Principal component analysis
.. December 20, 2013 Todays lecture. (PCA) (PLS-R) (LDA) . (PCA) is a method often used to reduce the dimension of a large dataset to one of a more manageble size. The new dataset can then be used to make
More informationECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction
ECE 521 Lecture 11 (not on midterm material) 13 February 2017 K-means clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview K-means clustering
More information