PCA from noisy linearly reduced measurements
1 PCA from noisy linearly reduced measurements
Amit Singer, Princeton University, Department of Mathematics and Program in Applied and Computational Mathematics. September 27, 2016.
2 Single particle reconstruction using cryo-EM
[Figure: schematic drawing of the imaging process; the cryo-EM problem.]
3 New detector technology: exciting times for cryo-EM
Excerpt from W. Kühlbrandt, "The Resolution Revolution," Science (2014): "Precise knowledge of the structure of macromolecules in the cell is essential for understanding how they function. Structures of large macromolecules can now be obtained at near-atomic resolution by averaging thousands of electron microscope images recorded before radiation damage accumulates. This is what Amunts et al. have done in their research article on page 1485 of this issue (1), reporting the structure of the large subunit of the mitochondrial ribosome at 3.2 Å resolution by electron cryo-microscopy (cryo-EM). Together with other recent high-resolution cryo-EM structures (2-4) (see the figure), this achievement heralds the beginning of a new era in molecular biology, where structures at near-atomic resolution are no longer the prerogative of x-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. Ribosomes are ancient, massive protein-RNA complexes that translate the linear genetic code into three-dimensional proteins. Mitochondria [...]"
[Figure: near-atomic resolution with cryo-EM. Advances in detector technology and image processing are yielding high-resolution cryo-EM structures of biomolecules. (A) The large subunit of the yeast mitochondrial ribosome at 3.2 Å, reported by Amunts et al.; in the detailed view, the base pairs of an RNA double helix and a magnesium ion (blue) are clearly resolved. (B) TRPV1 ion channel at 3.4 Å (2), with a detailed view of residues lining the ion pore on the four-fold axis of the tetrameric channel. (C) F420-reducing [NiFe] hydrogenase at 3.36 Å (3); the detail shows an α helix in the FrhA subunit with resolved side chains. The maps are not drawn to scale.]
4 Cryo-EM in the news...
March 31, 2016, Reports: "The 3.8 Å resolution cryo-EM structure of Zika virus." Cite as: Sirohi et al., Science /science.aaf5316 (2016). Devika Sirohi, Zhenguo Chen, Lei Sun, Thomas Klose, Theodore C. Pierson, Michael G. Rossmann, Richard J. Kuhn (Purdue University; Viral Pathogenesis Section, NIAID, NIH).
5 Method of the Year 2015
January 2016, Volume 13, No. 1: "Single-particle cryo-electron microscopy (cryo-EM) is our choice for Method of the Year 2015 for its newfound ability to solve protein structures at near-atomic resolution. Featured is the 2.2-Å cryo-EM structure of β-galactosidase as recently reported by Bartesaghi et al. (Science 348, 2015). Cover design by Erin Dewalt."
6 "Big movie" data, publicly available
7 Image formation model and inverse problem
[Schematic: electron source, molecule φ_k, rotation R_k ∈ SO(3), projection image y_k.]
Projection images: y_k = T_k P_k φ_k + ε_k, k = 1, 2, ..., n.
φ_k : R³ → R is the scattering potential of the k-th molecule.
P_k is the 2D tomographic projection, after rotating φ_k by R_k.
T_k is a "filtering" operator due to the microscope and detector.
Classical cryo-EM problem: estimate φ = φ_1 = φ_2 = ... = φ_n (and R_1, ..., R_n) given y_1, ..., y_n.
Heterogeneity problem: estimate φ_1, φ_2, ..., φ_n (and R_1, ..., R_n) given y_1, ..., y_n.
8 Toy example
9 E. coli 50S ribosomal subunit
27,000 particle images provided by Dr. Fred Sigworth, Yale Medical School. 3D reconstruction by S, Lanhui Wang, and Jane Zhao.
10 Algorithmic and computational challenges
1. Ab-initio modelling and orientation assignment
2. Heterogeneity: resolving structural variability
3. Denoising and 2D class averaging
4. Particle picking
5. Symmetry detection and reconstruction
6. Motion correction
7. Estimation of the contrast transfer function and noise statistics
8. Fast reconstruction and refinement
9. Validation and resolution determination
11 Principal component analysis (PCA)
Given i.i.d. samples x_1, ..., x_n of some x ∈ R^p, estimate the mean E[x] and covariance Cov[x]:
µ_n = (1/n) Σ_{k=1}^n x_k,   Σ_n = (1/n) Σ_{k=1}^n (x_k − µ_n)(x_k − µ_n)^T.
Eigenvectors and eigenvalues of Σ_n are used for dimensionality reduction, compression, and denoising.
Useful when most variability is captured by a few leading principal components: low-rank approximation.
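As a concrete reminder of the classical setting, here is a minimal NumPy sketch of sample mean, sample covariance, and low-rank projection (all dimensions are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r = 1000, 20, 3

# Samples of x in R^p whose variability lies in r directions (illustrative)
x = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))

# Sample mean and sample covariance, as on the slide
mu_n = x.mean(axis=0)
xc = x - mu_n
Sigma_n = xc.T @ xc / n

# Leading eigenvectors of Sigma_n are the principal components
eigvals, eigvecs = np.linalg.eigh(Sigma_n)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Low-rank approximation: project onto the top r components
coords = xc @ eigvecs[:, :r]                  # dimensionality reduction
x_lowrank = coords @ eigvecs[:, :r].T + mu_n  # reconstruction / denoising
```

Since the simulated data is exactly rank r (plus the mean), the top-r projection reconstructs it up to floating-point error.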
12 PCA from noisy linearly transformed measurements
Measure y_k = A_k x_k + e_k with known A_k : R^p → R^q and noise covariance Cov[e_k] = σ² I_q. The operators differ between measurements: A_k ≠ A_l in general.
Goal: estimate E[x] and Cov[x] from y_1, ..., y_n.
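This measurement model is easy to simulate; in the sketch below the A_k are random coordinate-subsampling operators, one hypothetical instance of varying A_k (not the cryo-EM projections of the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, n, sigma = 8, 5, 200, 0.1

# x_k with most variance in 3 coordinates (illustrative low-rank-ish signal)
x = rng.normal(size=(n, p)) @ np.diag([3.0, 2.0, 1.0] + [0.0] * (p - 3))

# A_k: random coordinate-subsampling operators R^p -> R^q, different per k
A = np.stack([np.eye(p)[rng.choice(p, q, replace=False)] for _ in range(n)])

# y_k = A_k x_k + e_k with Cov[e_k] = sigma^2 I_q
y = np.einsum('kqp,kp->kq', A, x) + sigma * rng.normal(size=(n, q))
```

Each A_k keeps q of the p coordinates, so no single measurement determines x_k; the estimators below pool information across all n measurements.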
13 Motivation: cryo-electron microscopy
Write the image y_k as y_k = T_k P_k φ_k + e_k, k = 1, ..., n, where φ_k is a three-dimensional molecular volume, P_k projects the volume into an image, T_k filters it, and e_k is measurement noise.
14 Application I: deconvolution in noise
Goal: estimate the unfiltered projection x_k = P_k φ_k from y_k = T_k x_k + e_k.
[Figure: x_k, then T_k x_k, then T_k x_k + e_k.]
The filtering T_k : R^p → R^p is singular or ill-conditioned.
Using a low-rank Cov[x], we can invert T_k and reduce the effect of the noise e_k.
20 Contrast transfer function: T_k
The CTF suppresses information and inverts contrast.
The CTF cannot be trivially inverted (zero crossings).
An additional envelope causes Gaussian decay of the high frequencies.
Information lost in one defocus group can be recovered from another group that has different zero crossings.
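For intuition, a one-dimensional CTF profile with zero crossings, contrast inversion, and a Gaussian envelope can be sketched as follows (all parameter values are illustrative toy numbers, not taken from the talk):

```python
import numpy as np

# Illustrative 1D CTF sketch: an oscillating phase-contrast factor with zero
# crossings and sign flips, damped by a Gaussian envelope at high frequency.
lam = 0.025      # electron wavelength (Angstrom), toy value
dz = 15000.0     # defocus (Angstrom), toy value
cs = 2.7e7       # spherical aberration (Angstrom), toy value
B = 200.0        # envelope decay (Angstrom^2), toy value

k = np.linspace(0.0, 0.4, 1000)   # spatial frequency (1/Angstrom)
gamma = np.pi * lam * dz * k**2 - 0.5 * np.pi * cs * lam**3 * k**4
ctf = -np.sin(gamma) * np.exp(-B * k**2)
```

The sign flips of `ctf` are the contrast inversions; at each zero crossing the corresponding frequency band is suppressed entirely, which is why images from a single defocus group cannot be deconvolved on their own.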
21 Application II: heterogeneity
How to classify images from different molecular structures? [Figure: Class A vs. Class B; images courtesy of Joachim Frank (Columbia).]
22 Application II: heterogeneity (cont.)
Goal: classify images y_k = T_k P_k φ_k + e_k according to the molecular state φ_k.
We cannot invert directly, since T_k P_k : R^p → R^q and q < p.
The volumes φ_k live in a low-dimensional space that can be estimated from the top eigenvectors of Cov[φ_k].
Restrict T_k P_k to this space, invert, cluster, and reconstruct (Penczek et al. 2009).
Aggregating large-scale datasets gives accurate estimates despite high noise.
23 Least-squares estimator (mean)
From y_k = A_k x_k + e_k, we have E[y_k] = A_k E[x]. On the other hand E[y_k] ≈ y_k, so y_k ≈ A_k E[x]. Form the least-squares estimator for E[x]:
µ_n = argmin_µ (1/n) Σ_{k=1}^n ‖y_k − A_k µ‖².
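The mean estimator reduces to a p × p linear solve (the normal equations). A sketch with random coordinate-subsampling operators A_k, an illustrative choice rather than cryo-EM projections:

```python
import numpy as np

rng = np.random.default_rng(2)
p, q, n, sigma = 8, 5, 5000, 0.1
mu_true = rng.normal(size=p)

# Accumulate ((1/n) sum A_k^T A_k) mu = (1/n) sum A_k^T y_k
lhs = np.zeros((p, p))
rhs = np.zeros(p)
for _ in range(n):
    A = np.eye(p)[rng.choice(p, q, replace=False)]   # random row-subsampling A_k
    x = mu_true + rng.normal(size=p)                 # x_k with mean mu_true
    y = A @ x + sigma * rng.normal(size=q)           # noisy partial measurement
    lhs += A.T @ A
    rhs += A.T @ y

mu_n = np.linalg.solve(lhs / n, rhs / n)
```

Each coordinate is observed only a fraction q/p of the time, yet pooling all n measurements recovers the mean accurately.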
24 Least-squares estimator (covariance)
Again, y_k = A_k x_k + e_k gives Cov[y_k] = A_k Cov[x] A_k^T + σ² I_q. On the other hand Cov[y_k] ≈ (y_k − A_k µ_n)(y_k − A_k µ_n)^T, so (y_k − A_k µ_n)(y_k − A_k µ_n)^T ≈ A_k Cov[x] A_k^T + σ² I_q. Form the least-squares estimator for Cov[x]:
Σ_n = argmin_Σ (1/n) Σ_{k=1}^n ‖(y_k − A_k µ_n)(y_k − A_k µ_n)^T − (A_k Σ A_k^T + σ² I_q)‖_F².
25 Normal equations
The mean estimator µ_n satisfies
((1/n) Σ_{k=1}^n A_k^T A_k) µ_n = (1/n) Σ_{k=1}^n A_k^T y_k.
For the covariance, L_n(Σ_n) = B_n, with L_n : R^{p×p} → R^{p×p} given by
L_n(Σ) = (1/n) Σ_{k=1}^n A_k^T A_k Σ A_k^T A_k
and
B_n = (1/n) Σ_{k=1}^n [A_k^T (y_k − A_k µ_n)(y_k − A_k µ_n)^T A_k − σ² A_k^T A_k].
The resulting linear system in p² variables can be solved efficiently using the conjugate gradient method.
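The system L_n(Σ_n) = B_n only requires applying L_n, which makes conjugate gradient natural. A sketch with diagonal masks D_k = A_k^T A_k (coordinate subsampling) and zero mean, both simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
p, q, n, sigma = 6, 4, 4000, 0.1

U = rng.normal(size=(p, 2))
Sigma_true = U @ U.T                    # rank-2 population covariance
x = rng.normal(size=(n, 2)) @ U.T       # samples with Cov[x] = Sigma_true

# Random coordinate-subsampling masks: D_k = A_k^T A_k is diagonal 0/1
masks = np.stack([np.isin(np.arange(p), rng.choice(p, q, replace=False))
                  for _ in range(n)]).astype(float)
y = masks * (x + sigma * rng.normal(size=(n, p)))   # zero-filled A_k^T y_k

# Right-hand side B_n (zero mean assumed for brevity)
B = (y[:, :, None] * y[:, None, :]).mean(axis=0) \
    - sigma**2 * np.diag(masks.mean(axis=0))

def L(S):
    # L_n(S) = (1/n) sum_k D_k S D_k with diagonal masks
    return np.einsum('ki,ij,kj->ij', masks, S, masks) / n

# Conjugate gradient in matrix space, using Frobenius inner products
S = np.zeros((p, p))
R = B - L(S)
P = R.copy()
for _ in range(500):
    if np.linalg.norm(R) < 1e-10:
        break
    LP = L(P)
    alpha = (R * R).sum() / (P * LP).sum()
    S = S + alpha * P
    R_new = R - alpha * LP
    P = R_new + ((R_new * R_new).sum() / (R * R).sum()) * P
    R = R_new
```

With these masks L_n acts entrywise, so CG converges quickly; for the talk's cryo-EM operators the same iteration runs with L_n applied as a convolution.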
26 Estimator properties
Taking A_k = I_p and σ² = 0 recovers the sample mean and covariance.
Does not rely on a particular distribution of x.
With prior information on x, one can regularize by adding terms to the least-squares objective.
Both µ_n and Σ_n are consistent estimators (Katsevich, Katsevich, S 2015): as n → ∞, µ_n → E[x] and Σ_n → Cov[x] almost surely.
27 High-dimensional PCA: Marčenko-Pastur
For pure Gaussian white noise of unit variance, y_k = e_k, the sample covariance is (1/n) Σ_{k=1}^n y_k y_k^T. For n ≫ p its spectrum concentrates (central limit theorem); for n ∝ p the eigenvalue distribution follows the Marčenko-Pastur law. [Figure: eigenvalue histograms in the two regimes.]
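A quick numerical check of the Marčenko-Pastur picture: for pure noise with p and n comparable, the sample-covariance eigenvalues spread over the bulk [(1 − √g)², (1 + √g)²] with g = p/n instead of concentrating at 1 (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 2000, 500                         # g = p/n = 0.25

Y = rng.normal(size=(n, p))              # pure unit-variance noise, y_k = e_k
evals = np.linalg.eigvalsh(Y.T @ Y / n)  # sample-covariance spectrum

g = p / n
lo, hi = (1 - np.sqrt(g))**2, (1 + np.sqrt(g))**2   # Marchenko-Pastur support
```

Here the bulk is [0.25, 2.25]: even with no signal at all, the largest sample eigenvalue is far above the population value 1.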
28 High-dimensional PCA: spiked covariance model
With both signal and noise, y_k = x_k + e_k. For n ≫ p the central limit theorem applies; for n ∝ p we get the spiked covariance model (Johnstone 2001). [Figure: eigenvalue histograms.] Signal eigenvalues pop out of the bulk when the SNR exceeds √(p/n).
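The spiked-model behavior can be checked numerically: planting one rank-one signal of strength ℓ above the detection threshold makes the top sample eigenvalue separate from the bulk, landing near the asymptotic value (1 + ℓ)(1 + g/ℓ) (a sketch; the sizes and single-spike setup are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, ell = 2000, 500, 3.0               # g = p/n = 0.25, spike strength ell

v = np.zeros(p)
v[0] = 1.0                               # planted signal direction
x = np.sqrt(ell) * rng.normal(size=(n, 1)) * v   # rank-1 signal, Cov = ell v v^T
Y = x + rng.normal(size=(n, p))                  # y_k = x_k + e_k

evals = np.linalg.eigvalsh(Y.T @ Y / n)
g = p / n
bulk_edge = (1 + np.sqrt(g))**2
predicted = (1 + ell) * (1 + g / ell)    # asymptotic top eigenvalue for ell > sqrt(g)
```

With ℓ = 3 and g = 0.25 the top eigenvalue sits near 4.33, well above the bulk edge 2.25, while the remaining eigenvalues stay inside the noise bulk.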
29 High-dimensional PCA: shrinkage
To estimate Cov[x] from y_k = x_k + e_k when n ∝ p, we apply shrinkage to the eigenvalues of the sample covariance (Donoho, Gavish, Johnstone 2013). [Figure: original vs. shrunk eigenvalues.]
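One concrete shrinker consistent with the spiked model is to zero out eigenvalues inside the noise bulk and invert the spike map λ = (1 + ℓ)(1 + g/ℓ) for those above the edge. This is a simplified rule assuming unit noise variance, not necessarily the shrinker used in the talk:

```python
import numpy as np

def shrink_eigs(evals, g):
    """Shrink sample-covariance eigenvalues toward the signal eigenvalues.

    Sketch of spiked-model shrinkage with unit noise variance: eigenvalues
    inside the bulk are set to zero; those above the bulk edge are inverted
    through the spike map lam = (1 + ell) * (1 + g / ell).
    """
    edge = (1 + np.sqrt(g))**2
    out = np.zeros_like(evals)
    above = evals > edge
    lam = evals[above]
    t = lam - 1 - g                      # t = ell + g/ell for a true spike
    out[above] = (t + np.sqrt(t**2 - 4 * g)) / 2
    return out
```

Inverting the quadratic ℓ² − tℓ + g = 0 recovers ℓ exactly from its asymptotic observed eigenvalue: feeding λ = (1 + 3)(1 + 0.25/3) = 13/3 back through the shrinker returns ℓ = 3.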
30 Shrinkage of B_n
In the clean case, y_k = A_k x_k and
B_n = (1/n) Σ_{k=1}^n A_k^T A_k (x_k − µ_n)(x_k − µ_n)^T A_k^T A_k,
so E[B_n] = (1/n) Σ_{k=1}^n Cov[A_k^T A_k x_k], which equals Cov[A_k^T A_k x_k] if the A_k matrices are realizations of random matrices. For large n we then have B_n ≈ Cov[A_k^T A_k x_k]. How do we estimate Cov[A_k^T A_k x_k] given the noisy y_k?
35 Shrinkage of B_n (cont.)
The backprojected measurements have colored noise with covariance σ² E[A_k^T A_k]:
A_k^T y_k = A_k^T A_k x_k + A_k^T e_k.
Whiten by calculating
S = (1/n) Σ_{k=1}^n A_k^T A_k ≈ E[A_k^T A_k]
and setting z_k = S^{−1/2} A_k^T (y_k − A_k µ_n). Calculate the sample covariance, shrink, and unwhiten:
B_n = S^{1/2} ρ((1/n) Σ_{k=1}^n z_k z_k^T) S^{1/2}.
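The whiten → sample covariance → shrink → unwhiten pipeline can be sketched end to end. Coordinate-subsampling A_k (so S is diagonal), zero mean, unit noise variance, and a crude subtract-the-bulk shrinker are all simplifying assumptions here, not the talk's choices:

```python
import numpy as np

rng = np.random.default_rng(6)
p, q, n = 50, 40, 2000

U, _ = np.linalg.qr(rng.normal(size=(p, 2)))
Sigma_true = 25.0 * U @ U.T                   # strong rank-2 signal
x = rng.normal(size=(n, 2)) @ (5.0 * U).T     # Cov[x] = Sigma_true

masks = np.stack([np.isin(np.arange(p), rng.choice(p, q, replace=False))
                  for _ in range(n)]).astype(float)
y = masks * (x + rng.normal(size=(n, p)))     # zero-filled A_k^T y_k, sigma = 1

# Whiten: S = (1/n) sum A_k^T A_k is diagonal for subsampling operators
s_diag = masks.mean(axis=0)
z = y / np.sqrt(s_diag)                       # z_k = S^(-1/2) A_k^T y_k

# Sample covariance of whitened data, then eigenvalue shrinkage
C = z.T @ z / n
evals, evecs = np.linalg.eigh(C)
g = p / n
edge = (1 + np.sqrt(g))**2
shrunk = np.where(evals > edge, evals - (1 + g), 0.0)   # crude shrinker sketch
C_shrunk = (evecs * shrunk) @ evecs.T

# Unwhiten to obtain the shrunken right-hand side B_n
B_shrunk = np.sqrt(s_diag)[:, None] * C_shrunk * np.sqrt(s_diag)[None, :]
```

Even with this crude shrinker, the leading eigenvectors of the shrunken B_n align with the planted two-dimensional signal subspace.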
41 Deconvolution in noise
Recall the noisy, filtered image y_k = T_k x_k + e_k given the clean projection x_k. Using the covariance estimation method, estimate Cov[x_k]. Construct the classical Wiener filter, inverting T_k to estimate x_k from y_k:
x̂_k = Σ_n T_k^T (T_k Σ_n T_k^T + σ² I_q)^{−1} (y_k − T_k µ_n) + µ_n.
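The Wiener estimator above can be written as a small helper. This is a sketch: `wiener_denoise` and its argument names are illustrative, and the estimates Σ_n, µ_n, σ² are taken as given:

```python
import numpy as np

def wiener_denoise(y, T, Sigma, mu, sigma2):
    """Wiener estimate of x from y = T x + e (minimal sketch).

    Sigma, mu: (estimated) covariance and mean of x; sigma2: noise variance.
    Implements Sigma T^T (T Sigma T^T + sigma2 I)^(-1) (y - T mu) + mu.
    """
    q = T.shape[0]
    gain = Sigma @ T.T @ np.linalg.inv(T @ Sigma @ T.T + sigma2 * np.eye(q))
    return gain @ (y - T @ mu) + mu
```

Sanity check of the limiting behavior: for a well-conditioned T and negligible noise, the filter reduces to T^{-1}, and the clean projection is recovered exactly.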
42 Effect of shrinkage on simulated data
[Figure: log₁₀ relative MSE vs. number of images (up to 10⁴), with and without shrinkage, at SNR = 1/20, 1/40, 1/60, and 1/100.]
Applying shrinkage to the right-hand side B_n in L_n(Σ_n) = B_n results in a significant increase in accuracy for the covariance estimation.
43 Experimental results: TRPV1 (EMPIAR-10005)
Dataset of projection images, 256-by-256 pixels (Liao et al., Nature 2013; Bhamre, Zhang, S, JSB 2016). [Figure: input, filtered, and closest projection images.]
46 Experimental data: TRPV1
[Figure: raw images, closest projections, TWF, and CWF.] K2 direct electron detector; motion-corrected, picked particle images.
47 Experimental data: 80S ribosome
[Figure: raw images, closest projections, TWF, and CWF.] FALCON II 4k×4k direct electron detector; motion-corrected, picked particle images.
48 Experimental data: IP3R1
[Figure: raw images, closest projections, TWF, and CWF.] Gatan 4k×4k CCD; picked particle images.
49 2D covariance estimation: further remarks
Steerable PCA: the covariance matrix commutes with in-plane rotation and is block diagonalized in a Fourier-Bessel basis; computational cost O(nN³) (Zhao, Shkolnisky, S, IEEE TCI 2016).
Wiener filtering: takes into account that the estimated covariance differs from the population covariance (S and Wu, SIAM Imaging 2013).
Visualization of underlying particles without class averaging.
Outlier detection of picked particles.
Improved class averages.
50 Heterogeneity problem
Given images y_k = A_k x_k + e_k with A_k = T_k P_k, classify the y_k according to the molecular state x_k. We use the proposed method to estimate Cov[x_k], reducing computational and memory costs by writing L_n as a convolution and using circulant preconditioning. The dominant eigenvectors of Σ_n define the low-dimensional subspace where the volumes x_k live; for C classes, rank Cov[x_k] = C − 1. We can invert T_k P_k to find coordinates in the subspace and cluster.
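As a toy version of this classification pipeline with full observations (A_k = I, a simplifying assumption), two molecular states give a rank C − 1 = 1 covariance whose top eigenvector separates the images:

```python
import numpy as np

rng = np.random.default_rng(7)
p, n, sigma = 100, 1000, 1.0

phi_a, phi_b = rng.normal(size=p), rng.normal(size=p)   # two hypothetical states
labels = rng.integers(0, 2, size=n)
x = np.where(labels[:, None] == 0, phi_a, phi_b)        # x_k is one of the states
y = x + sigma * rng.normal(size=(n, p))                 # noisy observations

mu = y.mean(axis=0)
yc = y - mu
Sigma_n = yc.T @ yc / n - sigma**2 * np.eye(p)   # noise-corrected covariance

# C = 2 classes => rank Cov[x] = C - 1 = 1: the top eigenvector separates them
_, vecs = np.linalg.eigh(Sigma_n)
coords = yc @ vecs[:, -1]                        # coordinate along top eigenvector
pred = (coords > 0).astype(int)
acc = max((pred == labels).mean(), 1.0 - (pred == labels).mean())
```

Thresholding the one-dimensional coordinates classifies nearly all images correctly; with partial observations, the same idea applies after inverting T_k P_k restricted to the estimated subspace.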
51 Experimental results: 70S ribosome
Dataset of images (130-by-130), 2 classes, courtesy of Joachim Frank (Columbia University), downsampled. [Figure: largest 32 eigenvalues; coordinate histogram.]
Clustering accuracy 89% with respect to the dataset labeling. Took 26 min. on a 2.7 GHz, 12-core CPU.
52 Conclusions
The least-squares approach to covariance estimation from noisy partial measurements is a simple but powerful method to characterize the distribution of the underlying data.
In the high-dimensional regime, eigenvalue shrinkage can significantly improve the performance of the estimator.
Covariance information can be leveraged to great effect in denoising and classification tasks such as those encountered in cryo-EM.
53 Current and future work
Incorporating constraints on the covariance Σ_n, such as sparsity.
Replacing the least-squares cost with a robust cost function.
Theoretical understanding of the bulk noise distribution in Σ_n, not just in B_n.
Application to other tasks, such as embryo development, magnetic resonance imaging, and matrix completion.
54 ASPIRE: Algorithms for Single Particle Reconstruction
Open source toolbox, publicly available.
55 References
A. Singer, H.-T. Wu, "Two-dimensional tomography from noisy projections taken at unknown random directions," SIAM Journal on Imaging Sciences, 6 (1), 2013.
E. Katsevich, A. Katsevich, A. Singer, "Covariance matrix estimation for the cryo-EM heterogeneity problem," SIAM Journal on Imaging Sciences, 8 (1), 2015.
J. Andén, E. Katsevich, A. Singer, "Covariance estimation using conjugate gradient for 3D classification in cryo-EM," IEEE 12th International Symposium on Biomedical Imaging (ISBI 2015), April 2015.
T. Bhamre, T. Zhang, A. Singer, "Denoising and covariance estimation of single particle cryo-EM images," Journal of Structural Biology, 195 (1), 2016.
More informationSparsity and Morphological Diversity in Source Separation. Jérôme Bobin IRFU/SEDI-Service d Astrophysique CEA Saclay - France
Sparsity and Morphological Diversity in Source Separation Jérôme Bobin IRFU/SEDI-Service d Astrophysique CEA Saclay - France Collaborators - Yassir Moudden - CEA Saclay, France - Jean-Luc Starck - CEA
More informationSelf-Calibration and Biconvex Compressive Sensing
Self-Calibration and Biconvex Compressive Sensing Shuyang Ling Department of Mathematics, UC Davis July 12, 2017 Shuyang Ling (UC Davis) SIAM Annual Meeting, 2017, Pittsburgh July 12, 2017 1 / 22 Acknowledgements
More informationParM filament images were extracted and from the electron micrographs and
Supplemental methods Outline of the EM reconstruction: ParM filament images were extracted and from the electron micrographs and straightened. The digitized images were corrected for the phase of the Contrast
More informationPCA, Kernel PCA, ICA
PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per
More informationAdvanced Introduction to Machine Learning CMU-10715
Advanced Introduction to Machine Learning CMU-10715 Independent Component Analysis Barnabás Póczos Independent Component Analysis 2 Independent Component Analysis Model original signals Observations (Mixtures)
More informationWavelet Based Image Restoration Using Cross-Band Operators
1 Wavelet Based Image Restoration Using Cross-Band Operators Erez Cohen Electrical Engineering Department Technion - Israel Institute of Technology Supervised by Prof. Israel Cohen 2 Layout Introduction
More informationIntroduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin
1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)
More informationIndependent Component Analysis. Contents
Contents Preface xvii 1 Introduction 1 1.1 Linear representation of multivariate data 1 1.1.1 The general statistical setting 1 1.1.2 Dimension reduction methods 2 1.1.3 Independence as a guiding principle
More informationApplication of Principal Component Analysis to TES data
Application of Principal Component Analysis to TES data Clive D Rodgers Clarendon Laboratory University of Oxford Madison, Wisconsin, 27th April 2006 1 My take on the PCA business 2/41 What is the best
More informationPCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given
More informationDimensionality Reduction: PCA. Nicholas Ruozzi University of Texas at Dallas
Dimensionality Reduction: PCA Nicholas Ruozzi University of Texas at Dallas Eigenvalues λ is an eigenvalue of a matrix A R n n if the linear system Ax = λx has at least one non-zero solution If Ax = λx
More informationFace Detection and Recognition
Face Detection and Recognition Face Recognition Problem Reading: Chapter 18.10 and, optionally, Face Recognition using Eigenfaces by M. Turk and A. Pentland Queryimage face query database Face Verification
More informationSignal Denoising with Wavelets
Signal Denoising with Wavelets Selin Aviyente Department of Electrical and Computer Engineering Michigan State University March 30, 2010 Introduction Assume an additive noise model: x[n] = f [n] + w[n]
More informationPrincipal Component Analysis CS498
Principal Component Analysis CS498 Today s lecture Adaptive Feature Extraction Principal Component Analysis How, why, when, which A dual goal Find a good representation The features part Reduce redundancy
More informationImproving cryo-em maps: focused 2D & 3D classifications and focused refinements
Improving cryo-em maps: focused 2D & 3D classifications and focused refinements 3rd International Symposium on Cryo-3D Image Analysis 23.3.2018, Tahoe Bruno Klaholz Centre for Integrative Biology, IGBMC,
More informationTruncation Strategy of Tensor Compressive Sensing for Noisy Video Sequences
Journal of Information Hiding and Multimedia Signal Processing c 2016 ISSN 207-4212 Ubiquitous International Volume 7, Number 5, September 2016 Truncation Strategy of Tensor Compressive Sensing for Noisy
More informationUnsupervised Learning
2018 EE448, Big Data Mining, Lecture 7 Unsupervised Learning Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html ML Problem Setting First build and
More informationData Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection
More informationStructure, Dynamics and Function of Membrane Proteins using 4D Electron Microscopy
Structure, Dynamics and Function of Membrane Proteins using 4D Electron Microscopy Anthony W. P. Fitzpatrick Department of Chemistry, Lensfield Road, Cambridge, CB2 1EW My research aims to understand the
More information15 Singular Value Decomposition
15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationHST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007
MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationData Preprocessing Tasks
Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can
More informationIntroduction to Cryo Microscopy. Nikolaus Grigorieff
Introduction to Cryo Microscopy Nikolaus Grigorieff Early TEM and Biology 1906-1988 Bacterial culture in negative stain. Friedrich Krause, 1937 Nobel Prize in Physics 1986 Early design, 1931 Decades of
More informationDetermining Protein Structure BIBC 100
Determining Protein Structure BIBC 100 Determining Protein Structure X-Ray Diffraction Interactions of x-rays with electrons in molecules in a crystal NMR- Nuclear Magnetic Resonance Interactions of magnetic
More informationNoise, Image Reconstruction with Noise!
Noise, Image Reconstruction with Noise! EE367/CS448I: Computational Imaging and Display! stanford.edu/class/ee367! Lecture 10! Gordon Wetzstein! Stanford University! What s a Pixel?! photon to electron
More information2.3. Clustering or vector quantization 57
Multivariate Statistics non-negative matrix factorisation and sparse dictionary learning The PCA decomposition is by construction optimal solution to argmin A R n q,h R q p X AH 2 2 under constraint :
More information4 Bias-Variance for Ridge Regression (24 points)
Implement Ridge Regression with λ = 0.00001. Plot the Squared Euclidean test error for the following values of k (the dimensions you reduce to): k = {0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500,
More informationData Mining Techniques
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 12 Jan-Willem van de Meent (credit: Yijun Zhao, Percy Liang) DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Linear Dimensionality
More informationImage Noise: Detection, Measurement and Removal Techniques. Zhifei Zhang
Image Noise: Detection, Measurement and Removal Techniques Zhifei Zhang Outline Noise measurement Filter-based Block-based Wavelet-based Noise removal Spatial domain Transform domain Non-local methods
More informationStatistical Geometry Processing Winter Semester 2011/2012
Statistical Geometry Processing Winter Semester 2011/2012 Linear Algebra, Function Spaces & Inverse Problems Vector and Function Spaces 3 Vectors vectors are arrows in space classically: 2 or 3 dim. Euclidian
More informationMid-year Report Linear and Non-linear Dimentionality. Reduction. applied to gene expression data of cancer tissue samples
Mid-year Report Linear and Non-linear Dimentionality applied to gene expression data of cancer tissue samples Franck Olivier Ndjakou Njeunje Applied Mathematics, Statistics, and Scientific Computation
More informationImage Analysis. PCA and Eigenfaces
Image Analysis PCA and Eigenfaces Christophoros Nikou cnikou@cs.uoi.gr Images taken from: D. Forsyth and J. Ponce. Computer Vision: A Modern Approach, Prentice Hall, 2003. Computer Vision course by Svetlana
More informationDeriving Principal Component Analysis (PCA)
-0 Mathematical Foundations for Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Deriving Principal Component Analysis (PCA) Matt Gormley Lecture 11 Oct.
More informationElectron microscopy in molecular cell biology II
Electron microscopy in molecular cell biology II Cryo-EM and image processing Werner Kühlbrandt Max Planck Institute of Biophysics Sample preparation for cryo-em Preparation laboratory Specimen preparation
More informationMULTICHANNEL SIGNAL PROCESSING USING SPATIAL RANK COVARIANCE MATRICES
MULTICHANNEL SIGNAL PROCESSING USING SPATIAL RANK COVARIANCE MATRICES S. Visuri 1 H. Oja V. Koivunen 1 1 Signal Processing Lab. Dept. of Statistics Tampere Univ. of Technology University of Jyväskylä P.O.
More informationFast and Robust Phase Retrieval
Fast and Robust Phase Retrieval Aditya Viswanathan aditya@math.msu.edu CCAM Lunch Seminar Purdue University April 18 2014 0 / 27 Joint work with Yang Wang Mark Iwen Research supported in part by National
More informationCS4495/6495 Introduction to Computer Vision. 8B-L2 Principle Component Analysis (and its use in Computer Vision)
CS4495/6495 Introduction to Computer Vision 8B-L2 Principle Component Analysis (and its use in Computer Vision) Wavelength 2 Wavelength 2 Principal Components Principal components are all about the directions
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationPCA and admixture models
PCA and admixture models CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar, Alkes Price PCA and admixture models 1 / 57 Announcements HW1
More informationSignal Modeling Techniques in Speech Recognition. Hassan A. Kingravi
Signal Modeling Techniques in Speech Recognition Hassan A. Kingravi Outline Introduction Spectral Shaping Spectral Analysis Parameter Transforms Statistical Modeling Discussion Conclusions 1: Introduction
More information12.2 Dimensionality Reduction
510 Chapter 12 of this dimensionality problem, regularization techniques such as SVD are almost always needed to perform the covariance matrix inversion. Because it appears to be a fundamental property
More information9. Nuclear Magnetic Resonance
9. Nuclear Magnetic Resonance Nuclear Magnetic Resonance (NMR) is a method that can be used to find structures of proteins. NMR spectroscopy is the observation of spins of atoms and electrons in a molecule
More informationAnalysis of Spectral Kernel Design based Semi-supervised Learning
Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,
More informationDecember 20, MAA704, Multivariate analysis. Christopher Engström. Multivariate. analysis. Principal component analysis
.. December 20, 2013 Todays lecture. (PCA) (PLS-R) (LDA) . (PCA) is a method often used to reduce the dimension of a large dataset to one of a more manageble size. The new dataset can then be used to make
More informationECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction
ECE 521 Lecture 11 (not on midterm material) 13 February 2017 K-means clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview K-means clustering
More information