Class Averaging in Cryo-Electron Microscopy

Similar documents
Beyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian

Introduction to Three Dimensional Structure Determination of Macromolecules by Cryo-Electron Microscopy

Fast Steerable Principal Component Analysis

Some Optimization and Statistical Learning Problems in Structural Biology

3D Structure Determination using Cryo-Electron Microscopy Computational Challenges

PCA from noisy linearly reduced measurements

Three-Dimensional Electron Microscopy of Macromolecular Assemblies

Three-dimensional structure determination of molecules without crystallization

Non-unique Games over Compact Groups and Orientation Estimation in Cryo-EM

Computing Steerable Principal Components of a Large Set of Images and Their Rotations Colin Ponce and Amit Singer

Global Positioning from Local Distances

Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem

Filtering via a Reference Set. A.Haddad, D. Kushnir, R.R. Coifman Technical Report YALEU/DCS/TR-1441 February 21, 2011

Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices

Artificial Intelligence Module 2. Feature Selection. Andrea Torsello

Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees

Dimension Reduction Techniques. Presented by Jie (Jerry) Yu

Multireference Alignment, Cryo-EM, and XFEL

Probabilistic & Unsupervised Learning

Fast and Robust Phase Retrieval

Multiresolution Analysis

Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization

Statistical Pattern Recognition

Statistical Machine Learning

Introduction to Machine Learning

Data dependent operators for the spatial-spectral fusion problem

EECS490: Digital Image Processing. Lecture #26

Learning representations

Orientation Assignment in Cryo-EM

Lecture Notes 2: Matrices

Phase Retrieval from Local Measurements in Two Dimensions

An Introduction to Wavelets and some Applications

Methods for sparse analysis of high-dimensional data, II

Methods for sparse analysis of high-dimensional data, II

Data Mining Techniques

Anouncements. Assignment 3 has been posted!

arxiv: v3 [cs.cv] 6 Apr 2016

Unsupervised Learning: Dimensionality Reduction

Lecture Notes 5: Multiresolution Analysis

Applied and Computational Harmonic Analysis

Isotropic Multiresolution Analysis: Theory and Applications

Feature detectors and descriptors. Fei-Fei Li

Principal Component Analysis

LECTURE NOTE #10 PROF. ALAN YUILLE

Data Preprocessing Tasks

Advanced Introduction to Machine Learning CMU-10715

Approximate Principal Components Analysis of Large Data Sets

7. Variable extraction and dimensionality reduction

Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation)

Fast Angular Synchronization for Phase Retrieval via Incomplete Information

Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA

Machine Learning. CUNY Graduate Center, Spring Lectures 11-12: Unsupervised Learning 1. Professor Liang Huang.

Diffusion/Inference geometries of data features, situational awareness and visualization. Ronald R Coifman Mathematics Yale University

Lecture 24: Principal Component Analysis. Aykut Erdem May 2016 Hacettepe University

Doug Cochran. 3 October 2011

Lecture Notes 1: Vector spaces

A Tour of the Lanczos Algorithm and its Convergence Guarantees through the Decades

A Tutorial on Wavelets and their Applications. Martin J. Mohlenkamp

Singular Value Decomposition and Principal Component Analysis (PCA) I

Lecture 6 Scattering theory Partial Wave Analysis. SS2011: Introduction to Nuclear and Particle Physics, Part 2 2

Normalized power iterations for the computation of SVD

Principal Component Analysis

Machine Learning - MT & 14. PCA and MDS

Unsupervised Machine Learning and Data Mining. DS 5230 / DS Fall Lecture 7. Jan-Willem van de Meent

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Lecture: Face Recognition and Feature Reduction

Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations.

PARAMETERIZATION OF NON-LINEAR MANIFOLDS

Covariance to PCA. CS 510 Lecture #14 February 23, 2018

Analysis of Spectral Kernel Design based Semi-supervised Learning

Perturbation of the Eigenvectors of the Graph Laplacian: Application to Image Denoising

Robot Image Credit: Viktoriya Sukhanova 123RF.com. Dimensionality Reduction

A Second-order Finite Difference Scheme For The Wave Equation on a Reduced Polar Grid DRAFT

AMS526: Numerical Analysis I (Numerical Linear Algebra)

Image Assessment San Diego, November 2005

Recurrence Relations and Fast Algorithms

Image Analysis. PCA and Eigenfaces

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen PCA. Tobias Scheffer

Main matrix factorizations

Homework 1. Yuan Yao. September 18, 2011

Second-Order Inference for Gaussian Random Curves

Advanced Machine Learning & Perception

Estimation of the Optimum Rotational Parameter for the Fractional Fourier Transform Using Domain Decomposition

Sparse linear models

CS281 Section 4: Factor Analysis and PCA

Image Noise: Detection, Measurement and Removal Techniques. Zhifei Zhang

Non-positivity of the semigroup generated by the Dirichlet-to-Neumann operator

Practical Linear Algebra: A Geometry Toolbox

Probabilistic Graphical Models

L11: Pattern recognition principles

Recovery of Compactly Supported Functions from Spectrogram Measurements via Lifting

Lecture: Face Recognition and Feature Reduction

Designing Information Devices and Systems II Fall 2015 Note 5

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA

Efficient Solvers for Stochastic Finite Element Saddle Point Problems

SPHERICAL NEAR-FIELD ANTENNA MEASUREMENTS

5. Discriminant analysis

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013

Factor Analysis (10/2/13)

Corners, Blobs & Descriptors. With slides from S. Lazebnik & S. Seitz, D. Lowe, A. Efros

Transcription:

Class Averaging in Cryo-Electron Microscopy Zhizhen Jane Zhao Courant Institute of Mathematical Sciences, NYU Mathematics in Data Sciences ICERM, Brown University July 30 2015

Single Particle Reconstruction (SPR) using cryo-em

Challenges in Cryo-EM SPR Extremely low signal to noise ratio because of the low electron dose. Need to process a large number of images for desired resolution. With higher resolution camera, the number of pixels in each image becomes larger. Requires efficient and accurate algorithms.

Geometry of the Projection Image To each projection image I there corresponds a 3 3 unknown rotation matrix R describing its orientation. R = R 1 R 2 R 3 satisfying RR T = R T R = I, det R = 1. The projection image lies on a tangent plane to the two dimensional unit sphere S 2 at the viewing angle v = v(r) = R 3.

Algorithmic Procedure for SPR Particle selection: manual, experimental image segmentation. Class averaging: classify images with similar viewing directions, register and average to improve their signal-to-noise ratio (SNR). Clean I i I j Average Orientation estimation: use common lines to estimate the rotation matrix R associated with each projection image (or class average). 3D Reconstruction: a 3D volume is generated by a tomographic inversion algorithm. Iterative refinement

Clustering Method (Penczek, Zhu, Frank 1996) Suppose we have n images I 1,...,I n. Rotationally Invariant Distances (RID) for each pair of images is d RID (i, j) = min α [0,2π) I i R α (I j ). Cluster the images using K-means. Problems: 1. Computationally expensive to compute all pairs of d RID. Naïve implementation requires O(n 2 ) rotational alignment of images. Rotational invariant image representation, fast algorithms for dimensionality reduction and nearest neighbor search. 2. Noise and outliers: At low SNR images with completely different views have relatively small d RID (noise aligns well, instead of underlying signal). Fast steerable PCA, Wiener filtering, and vector diffusion maps.

Hairy Ball Theorem and Global Alignment There is no non-vanishing continuous tangent vector field on the sphere. Cannot findα i such that α ij = α i α j. No global in-plane rotational alignment of all images.

Our Class Averaging Procedure (Z., Singer 2014)

Comparison with Existing Methods 10 4 simulated projection images of 70S ribosome at different signal to noise ratio. 50 nearest neighbors. Proportion of the nearest neighbors within a small spherical cap of 18.2. RFA/K-means IMAGIC Relion EMAN2 Xmipp ASPIRE SNR= 1/50 0.45 0.97 0.79 0.74 0.83 1.00 SNR= 1/100 0.09 0.87 0.70 0.45 0.68 0.99 SNR= 1/150 0.07 0.67 0.52 0.13 0.48 0.90 Timing (hrs) 1.5 7.5 16 12 42 0.5 (a) IMAGIC (b) Relion (c) EMAN2 (d) Xmipp (e) ASPIRE (f) Reference

Ab initio Reconstruction of Experimental Data Sets 40, 778 images of 70S ribosome (Courtesy Dr Joachim Frank) Image size: 250 250 pixels. Pixel size: 1.5 Å ******************************************* Movie of 70S ribosome ******************************************* Ab initio model resolution: 11.5Å.

Rotational Invariant Representation: Bispectrum Rotational invariant representation of images: bispectrum. For a 1D periodic discrete signal f(x), bispectrum is defined as b(k 1, k 2 ) =ˆf(k 1 )ˆf(k 2 )ˆf( (k 1 + k 2 )). Bispectrum is shift invariant, complete and unbiased.

Rotational Invariant Representation: Bispectrum Steerable basis: u k,q (r,φ) = g q (r)e ıkφ. I(r,φ) = k,q a k,qu k,q (r,φ). Image rotation by α accounts for phase shift a k,q a k,q e ıkα. Rotational invariant image representation: b k1, k 2,q 1, q 2,q 3 = a k1,q 1 a k2,,q 2 a (k1 +k 2),q 3. The dimensionality of the feature vectors b is reduced by a randomized algorithm for low rank matrix approximation (Rokhlin, Liberty, Tygert, Martinsson, Halko, Tropp, Szlam,... ). Find k nearest neighbors in nearly linear time using a randomized algorithm for approximated nearest neighbors search (Jones, Osipov, Rokhlin 2011). Find the associated optimal alignment.

Small World Graph on S 2 (Singer, Z., Shkolnisky, Hadani 2011, Singer, Wu 2012) Define graph G = (V, E) by{i, j} E if i, j are nearest neighbors. Optimal rotation angles α ij gives d RID. Triplet consistency: α ij +α jk +α ki = 0 mod 2π. Vector diffusion maps explores the consistency in optimal rotations systematically. Non-local means with rotations.

Affinity Based on Consistency of Transformations In the VDM framework, we define the affinity between i and l by considering all paths of length t connecting them, but instead of just summing the weights of all paths, we sum the transformations. a) Higher affinity b) lower affinity Every path from i to l may result in a different transformation (like parallel transport due to curvature). When adding transformations of different paths, cancellations may happen.

Initial vs VDM Classification Noisy (SNR= 1/50) centered 10 4 simulated projection images of 70S ribosome. k = 200 N 6 x 105 5 4 3 2 1 0 0 60 120 180 acos v i,v j Initial Classification N 10 x 104 8 6 4 2 0 0 5 10 15 20 25 30 acos v i,v j VDM

Steerable PCA (Z., Singer 2013, Z., Shkolnisky, Singer 2015) Inclusion of the rotated and reflected images for PCA in some cases is advantageous. Many efficient algorithms transform images onto polar grid. However, non-unitary transformation from Cartesian to polar grid changes the noise statistics. A digital image I is obtained by sampling a squared-integrable bandlimited function f on a Cartesian grid of size L L with bandlimit c. The underlying clean image is essentially compactly supported in a disk of radius R. Suppose we have n images, we introduce an algorithm with computational complexity O(nL 3 + L 4 ) (O(nL 4 + L 6 ) for PCA).

Steerable Principal Components in Real Domain λ = 163.8 λ = 163.8 λ = 160.0 λ = 160.0 λ = 158.1 λ = 158.1 λ = 153.9 λ = 153.9 λ = 153.4 λ = 153.4 λ = 153.2 λ = 153.2 λ = 150.3 λ = 150.3 λ = 144.7 λ = 144.7

Running Time R PCA FFBsPCA 30 41.7 18.9 60 1460.6 23.8 90 1.1 10 4 34.5 120 5.0 10 4 48.2 150 1.2 10 5 96.7 180 111.0 210 139.2 240 158.6 Table: Running time for different PCA methods (in seconds), n = 10 3, c = 1/2, and R varies from 30 to 240. L = 2R.

Application: Denoising Images (a) clean (b) noisy (c) PCA (d) FFBsPCA Figure: Denoising 10 5 simulated projection images from human mitochondrial large ribosomal subunit. 240 240 pixels. The columns show (a) clean projection images, (b) noisy projection images with SNR= 1/30, (c) denoised projection images using traditional PCA, and (d) denoised images using FFBsPCA.

Outline of the Approach Truncated Fourier-Bessel expansion by a sampling criterion (asymptotics of Bessel functions). Efficient and accurate evaluation of the expansion coefficients (NUFFT and a Gaussian quadrature rule). Averaging over all rotations gives block diagonal structure in the sample covariance matrix and reduces computational complexity.

Truncated Fourier-Bessel Expansion The scaled Fourier-Bessel functions are the eigenfunctions of the Laplacian in a disk of radius c with Dirichlet boundary condition and they are given by ψ k,q c (ξ,θ) = { Nk,q J k ( R k,q ξ c ) e ıkθ, ξ c, 0, ξ > c, To avoid spurious information from noise outside of the compact support R in real space, we select k s and q s that satisfy R k,(q+1) 2πcR. F(f) is approximated by the finite expansion P c,r F(f)(ξ,θ) = 0 k max p k k= k max q=1 a k,q ψc k,q (ξ,θ). The Fourier-Bessel expansion coefficients are given by c ( ) ξ 2π a k,q = N k,q J k R k,q ξ dξ F(f)(ξ,θ)e ıkθ dθ. c 0

Evaluation of Fourier-Bessel Expansion Coefficients n ξ = 4cR and n θ = 16cR. Evaluate Fourier coefficients on a polar grid using NUFFT. O(L 2 log L+n ξ n θ ). The angular integration is sped up by 1D FFT on the concentric circles. O(n ξ n θ log n θ ). It is followed by a numerical evaluation of the radial integral with a Gaussian quadrature rule. O(n ξ n θ ). n ξ ( ) ξ j a k,q N k,q J k R k,q F(I)(ξ j, k)ξ j w(ξ j ). c j=1 Total computational cost is O(L 2 log L).

Spectrum of the T T The expansion coefficients vector a = T I. T T is almost unitary. 1.02 λi 1 0.98 0.96 0.94 0.92 λi 1 0.96 0.92 0.9 0 1000 2000 3000 4000 i 0.88 0 3000 6000 9000 i R = 60, c = 1/3, L = 121 R = 90, c = 1/3, L = 181 These are also the spectra of the population covariance matrices of transformed white noise images.

Sample Covariance Matrix In the Fourier-Bessel basis the covariance matrix C of the data with all rotations and reflection is given by C (k,q),(k,q ) = 1 4πn n i=1 1 = δ k,k 2n 2π 0 [a i k,q aik,q + a i k,q ai k,q ] e ı(k k )α dα n (a i k,q ai k,q + a i k,q ai k,q ). i=1 We project the images onto k max orthogonal subspaces with different angular Fourier modes, C = k max k=0 C(k). The block size p k decreases as the angular frequency k increases. Computational complexity: O(nL 3 + L 4 ).

Selecting Steerable PCA components Steerable PCA basis and the associated expansion coefficients c k,l are linear combinations of the Fourier-Bessel basis and the associated expansion coefficients using the eigenvectors of C (k) s. For images corrupted by additive white Gaussian noise, given a good estimation of the noise variance ˆσ 2, we select the components by the following criteria: for the(k)th block of the sample covariance matrix eigenvalues λ (k) l, where γ k = p k (2 δ k,0 )n. λ (k) l > ˆσ 2 (1+ γ k ) 2 The steerable PCA expansion coefficients are denoised by Wiener-type filtering (Singer, Wu 2013).

Summary Our class averaging procedure is an efficient and accurate computational framework for large cryo-em image data set. The whole pipeline is almost linear with the number of images. Fast steerable PCA is used to efficiently compress and denoise images. Rotationally invariant image representation and efficient dimensionality reduction method generate features for fast nearest neighbor search. VDM uses the consistency of linear transformations to get better estimation of nearest neighbors and alignment parameters. Methods described here can be applied to many other applications, such as X-ray free electron Laser (XFEL) single particle reconstruction.

References Z. Zhao, Y. Shkolnisky, A. Singer, Fast Steerable Principal Component Analysis, Submitted, Available at http://arxiv.org/abs/1412.0781. Z. Zhao, A. Singer, Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM, Journal of Structural Biology, 186 (1), pp. 153-166 (2014). Z. Zhao, A. Singer, Fourier-Bessel Rotational Invariant Eigenimages, The Journal of the Optical Society of America A, 30 (5), pp. 871-877 (2013). A. Singer, H.-T. Wu, Vector Diffusion Maps and the Connection Laplacian, Communications on Pure and Applied Mathematics, 65 (8), pp. 1067-1144 (2012). A. Singer, Z. Zhao, Y. Shkolnisky, R. Hadani, Viewing Angle Classification of Cryo-Electron Microscopy Images using Eigenvectors, SIAM Journal on Imaging Sciences, 4 (2), pp. 723-759 (2011).

ASPIRE: Algorithms for Single Particle Reconstruction Open source toolbox: spr.math.princeton.edu

Thank You!