Beyond Scalar Affinities for Network Analysis or Vector Diffusion Maps and the Connection Laplacian


1 Beyond Scalar Affinities for Network Analysis, or Vector Diffusion Maps and the Connection Laplacian
Amit Singer, Princeton University, Department of Mathematics and Program in Applied and Computational Mathematics
May 28, 2015

2 Single Particle Reconstruction using cryo-EM
[Figure: schematic drawing of the imaging process; the cryo-EM problem.]

3 Main Algorithmic Challenges
1. Orientation assignment
2. Heterogeneity (resolving structural variability)
3. 2D class averaging (de-noising)
4. Symmetry detection
5. Motion correction
6. Particle picking

4 Class Averaging in Cryo-EM: Improve SNR

5 Image denoising by vector diffusion maps
- Generalization of Laplacian eigenmaps (Belkin, Niyogi 2003) and diffusion maps (Coifman, Lafon 2006).
- Introduced the graph connection Laplacian: S, Zhao, Shkolnisky, Hadani (SIIMS 2011); Hadani, S (FoCM 2011); S, Wu (Comm. Pure Appl. Math. 2012); Zhao, S (J. Struct. Biol. 2014).
[Figure: experimental images (70S), courtesy of Dr. Joachim Frank (Columbia), and class averages by vector diffusion maps (averaging with 20 nearest neighbors).]

6 Rotation Invariant Distances
- Projection images $I_1, I_2, \ldots, I_n$ with unknown rotations $R_1, R_2, \ldots, R_n \in SO(3)$.
- Rotationally Invariant Distances (RID): $d_{RID}(i,j) = \min_{O \in SO(2)} \|I_i - O I_j\|$ (see the sketch below).
- Cluster the images using K-means.
- Images are not centered; it is also possible to include translations and to optimize over the special Euclidean group.
- Problem with this approach: outliers. At low SNR, images with completely different viewing directions may have relatively small $d_{RID}$ (the noise aligns well, instead of the underlying signal).
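A minimal sketch of $d_{RID}$, approximating the minimum over $SO(2)$ by a brute-force search over a grid of in-plane rotation angles. The 1-degree grid and the use of scipy.ndimage.rotate are illustrative assumptions, not choices made in the talk.

```python
import numpy as np
from scipy.ndimage import rotate

def d_rid(img_i, img_j, angle_step=1.0):
    """Approximate d_RID(i,j) = min_{O in SO(2)} ||I_i - O I_j||
    by scanning a grid of rotation angles (in degrees)."""
    best = np.inf
    for angle in np.arange(0.0, 360.0, angle_step):
        rotated = rotate(img_j, angle, reshape=False)  # O I_j on the pixel grid
        best = min(best, np.linalg.norm(img_i - rotated))
    return best
```

In practice one would align in a polar or Fourier representation rather than re-rotating pixel grids, but the brute-force version makes the definition concrete.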

7 Outliers: Small World Graph on $S^2$
- Define the graph $G = (V,E)$ by $\{i,j\} \in E \iff d_{RID}(i,j) \le \epsilon$.
[Figure: local frames $R_i^1, R_i^2, R_i^3$ on the sphere.]
- Optimal rotation angles: $O_{ij} = \operatorname{argmin}_{O \in SO(2)} \|I_i - O I_j\|$, for $i,j = 1,\ldots,n$.
- Triplet consistency relation (good triangles): $O_{ij} O_{jk} O_{ki} \approx I_{2\times 2}$ (see the sketch below).
- How to use the information in the optimal rotations in a systematic way? Vector diffusion maps; non-local means with rotations.
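A small sketch of the triplet-consistency check: for a good triangle $(i,j,k)$, the composed in-plane alignments should approximately close up to the identity. Angles are in radians; the tolerance is an illustrative assumption.

```python
import numpy as np

def rot2(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def good_triangle(theta_ij, theta_jk, theta_ki, tol=0.1):
    """Check the consistency relation O_ij O_jk O_ki ~ I_{2x2}."""
    closure = rot2(theta_ij) @ rot2(theta_jk) @ rot2(theta_ki)
    return np.linalg.norm(closure - np.eye(2)) < tol
```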

8 Vector Diffusion Maps: Setup
[Figure: an edge between nodes $i$ and $j$, carrying a weight $w_{ij}$ and a transformation $O_{ij}$.]
In VDM, the relationships between data points (e.g., cryo-EM images) are represented as a weighted graph, where the weights $w_{ij}$ describing affinities between data points are accompanied by linear orthogonal transformations $O_{ij}$.

9 Manifold Learning: Point cloud in $\mathbb{R}^p$
- $x_1, x_2, \ldots, x_n \in \mathbb{R}^p$.
- Manifold assumption: $x_1, \ldots, x_n \in \mathcal{M}^d$, with $d \ll p$.
- Local Principal Component Analysis (PCA) gives an approximate orthonormal basis $O_i$ for the tangent space $T_{x_i}\mathcal{M}$. $O_i$ is a $p \times d$ matrix with orthonormal columns: $O_i^T O_i = I_{d\times d}$.
- Alignment: $O_{ij} = \operatorname{argmin}_{O \in O(d)} \|O - O_i^T O_j\|_{HS}$ (computed through the singular value decomposition of $O_i^T O_j$; see the sketch below).
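A minimal sketch of both steps for a point cloud X of shape n x p; the neighborhood radius and the function names are illustrative assumptions. The alignment is the orthogonal Procrustes solution read off from the SVD.

```python
import numpy as np

def local_pca_basis(X, i, radius, d):
    """Approximate orthonormal basis O_i (p x d) for T_{x_i}M via local PCA."""
    diffs = X - X[i]
    nbrs = diffs[np.linalg.norm(diffs, axis=1) < radius]
    # The top-d right singular vectors of the centered neighborhood
    # span the estimated tangent space.
    _, _, Vt = np.linalg.svd(nbrs - nbrs.mean(axis=0), full_matrices=False)
    return Vt[:d].T  # p x d with orthonormal columns

def align(O_i, O_j):
    """O_ij = argmin_{O in O(d)} ||O - O_i^T O_j||_HS, via the SVD of O_i^T O_j."""
    U, _, Vt = np.linalg.svd(O_i.T @ O_j)
    return U @ Vt
```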

10 Parallel Transport
$O_{ij}$ approximates the parallel transport operator $P_{x_i,x_j} : T_{x_j}\mathcal{M} \to T_{x_i}\mathcal{M}$.

11 Laplacian Eigenmap (Belkin and Niyogi 2003) and Diffusion Map (Coifman and Lafon 2006)
- Symmetric $n \times n$ matrix $W_0$: $W_0(i,j) = w_{ij}$ if $(i,j) \in E$, and $0$ if $(i,j) \notin E$.
- Diagonal matrix $D_0$ of the same size: $D_0(i,i) = \deg(i) = \sum_{j:(i,j)\in E} w_{ij}$.
- Graph Laplacian, normalized graph Laplacian, and random walk matrix: $L_0 = D_0 - W_0$, $\mathcal{L}_0 = I - D_0^{-1/2} W_0 D_0^{-1/2}$, $A_0 = D_0^{-1} W_0$.
- The diffusion map $\Phi_t$ is defined in terms of the eigenvectors of $A_0$: $A_0 \phi_l = \lambda_l \phi_l$ for $l = 1,\ldots,n$, and $\Phi_t : i \mapsto (\lambda_l^t \phi_l(i))_{l=1}^n$ (see the sketch below).
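A minimal sketch of the diffusion map from a symmetric weight matrix W0, assuming nonnegative weights and positive degrees; the spectrum of $A_0 = D_0^{-1} W_0$ is computed through the similar symmetric matrix $D_0^{-1/2} W_0 D_0^{-1/2}$.

```python
import numpy as np

def diffusion_map(W0, t=1, n_components=2):
    """Coordinates (lambda_l^t phi_l(i)) for l = 1..n_components,
    skipping the trivial top eigenvector of A0 = D0^{-1} W0."""
    deg = W0.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    S = D_inv_sqrt @ W0 @ D_inv_sqrt       # symmetric, same spectrum as A0
    lam, psi = np.linalg.eigh(S)           # ascending eigenvalues
    lam, psi = lam[::-1], psi[:, ::-1]     # reorder to descending
    phi = D_inv_sqrt @ psi                 # right eigenvectors of A0
    return (lam[1:n_components + 1] ** t) * phi[:, 1:n_components + 1]
```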

12 Vector diffusion mapping: $W_1$ and $D_1$
- Symmetric $nd \times nd$ matrix $W_1$: $W_1(i,j) = w_{ij} O_{ij}$ if $(i,j) \in E$, and $0_{d\times d}$ if $(i,j) \notin E$; $n \times n$ blocks, each of which is of size $d \times d$.
- Diagonal matrix $D_1$ of the same size, where the diagonal $d \times d$ blocks are scalar matrices with the weighted degrees: $D_1(i,i) = \deg(i) I_{d\times d}$, and $\deg(i) = \sum_{j:(i,j)\in E} w_{ij}$.

13 $A_1 = D_1^{-1} W_1$ is an averaging operator for vector fields
The matrix $A_1$ can be applied to vectors $v$ of length $nd$, which we regard as $n$ vectors of length $d$, such that $v(i)$ is a vector in $\mathbb{R}^d$ viewed as a vector in $T_{x_i}\mathcal{M}$. The matrix $A_1 = D_1^{-1} W_1$ is an averaging operator for vector fields, since
$$(A_1 v)(i) = \frac{1}{\deg(i)} \sum_{j:(i,j)\in E} w_{ij}\, O_{ij}\, v(j).$$
This implies that the operator $A_1$ transports vectors from the tangent spaces $T_{x_j}\mathcal{M}$ (that are nearby to $T_{x_i}\mathcal{M}$) to $T_{x_i}\mathcal{M}$ and then averages the transported vectors in $T_{x_i}\mathcal{M}$.
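A minimal sketch that assembles $W_1$ and $D_1$ from an edge list and applies $A_1$ to a vector field; the `edges` dictionary, mapping pairs $(i,j)$ with $i < j$ to $(w_{ij}, O_{ij})$, is an assumed input format, and the symmetry $W_1(j,i) = w_{ij} O_{ij}^T$ fills the lower blocks.

```python
import numpy as np

def assemble_W1_D1(n, d, edges):
    """Build the nd x nd block matrices W1 and D1 from {(i, j): (w_ij, O_ij)}."""
    W1 = np.zeros((n * d, n * d))
    deg = np.zeros(n)
    for (i, j), (w, O_ij) in edges.items():
        W1[i*d:(i+1)*d, j*d:(j+1)*d] = w * O_ij
        W1[j*d:(j+1)*d, i*d:(i+1)*d] = w * O_ij.T  # O_ji = O_ij^T
        deg[i] += w
        deg[j] += w
    D1 = np.kron(np.diag(deg), np.eye(d))          # deg(i) I_{dxd} blocks
    return W1, D1

def apply_A1(W1, D1, v):
    """(A1 v)(i) = deg(i)^{-1} sum_j w_ij O_ij v(j), as one matrix product."""
    return np.linalg.solve(D1, W1 @ v)
```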

14 Affinity between nodes based on consistency of transformations
- In the VDM framework, we define the affinity between $i$ and $j$ by considering all paths of length $t$ connecting them, but instead of just summing the weights of all paths, we sum the transformations.
- Every path from $j$ to $i$ may result in a different transformation (like parallel transport due to curvature). When adding transformations of different paths, cancellations may happen. We define the affinity between $i$ and $j$ as the consistency between these transformations.
- $A_1 = D_1^{-1} W_1$ is similar to the symmetric matrix $\tilde{W}_1 = D_1^{-1/2} W_1 D_1^{-1/2}$.
- We define the affinity between $i$ and $j$ as (see the sketch below)
$$\|\tilde{W}_1^{2t}(i,j)\|_{HS}^2 = \frac{\deg(i)}{\deg(j)}\, \|(D_1^{-1} W_1)^{2t}(i,j)\|_{HS}^2.$$
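A direct (non-spectral) sketch of this affinity via matrix powers of $\tilde{W}_1$, assuming a dense array; useful mainly as a check against the spectral formula on the next slide.

```python
import numpy as np

def vdm_affinity(W_tilde, i, j, d, t=1):
    """||W_tilde^{2t}(i, j)||_HS^2 for the d x d block (i, j)."""
    P = np.linalg.matrix_power(W_tilde, 2 * t)
    block = P[i*d:(i+1)*d, j*d:(j+1)*d]
    return np.sum(block ** 2)              # squared Hilbert-Schmidt norm
```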

15 Embedding into a Hilbert Space
Since $\tilde{W}_1$ is symmetric, it has a complete set of eigenvectors $\{v_l\}_{l=1}^{nd}$ and eigenvalues $\{\lambda_l\}_{l=1}^{nd}$ (ordered as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{nd}$). Spectral decompositions of $\tilde{W}_1$ and $\tilde{W}_1^{2t}$:
$$\tilde{W}_1(i,j) = \sum_{l=1}^{nd} \lambda_l\, v_l(i) v_l(j)^T, \quad \text{and} \quad \tilde{W}_1^{2t}(i,j) = \sum_{l=1}^{nd} \lambda_l^{2t}\, v_l(i) v_l(j)^T,$$
where $v_l(i) \in \mathbb{R}^d$ for $i = 1,\ldots,n$ and $l = 1,\ldots,nd$. The HS norm of $\tilde{W}_1^{2t}(i,j)$ is calculated using the trace:
$$\|\tilde{W}_1^{2t}(i,j)\|_{HS}^2 = \sum_{l,r=1}^{nd} (\lambda_l \lambda_r)^{2t} \langle v_l(i), v_r(i)\rangle \langle v_l(j), v_r(j)\rangle.$$
The affinity $\|\tilde{W}_1^{2t}(i,j)\|_{HS}^2 = \langle V_t(i), V_t(j)\rangle$ is an inner product for the finite dimensional Hilbert space $\mathbb{R}^{(nd)^2}$ via the mapping $V_t$:
$$V_t : i \mapsto \Big( (\lambda_l \lambda_r)^t \langle v_l(i), v_r(i)\rangle \Big)_{l,r=1}^{nd}.$$

16 Vector Diffusion Distance
- The vector diffusion mapping is defined as $V_t : i \mapsto \big( (\lambda_l \lambda_r)^t \langle v_l(i), v_r(i)\rangle \big)_{l,r=1}^{nd}$ (computed in the sketch below).
- The vector diffusion distance between nodes $i$ and $j$ is denoted $d_{VDM,t}(i,j)$ and is defined as
$$d_{VDM,t}^2(i,j) = \langle V_t(i), V_t(i)\rangle + \langle V_t(j), V_t(j)\rangle - 2\langle V_t(i), V_t(j)\rangle.$$
- Other normalizations of the matrix $W_1$ are possible and lead to slightly different embeddings and distances (similar to diffusion maps).
- The matrices $I - \tilde{W}_1$ and $I + \tilde{W}_1$ are positive semidefinite, because
$$v^T\big(I \pm D_1^{-1/2} W_1 D_1^{-1/2}\big)v = \sum_{(i,j)\in E} w_{ij} \left\| \frac{v(i)}{\sqrt{\deg(i)}} \pm \frac{O_{ij}\, v(j)}{\sqrt{\deg(j)}} \right\|^2 \ge 0$$
for any $v \in \mathbb{R}^{nd}$. Therefore, $\lambda_l \in [-1,1]$. As a result, the vector diffusion mapping and distances can be well approximated by using only the few largest eigenvalues and their corresponding eigenvectors.
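A minimal sketch of the truncated vector diffusion mapping and distance, computed from a dense $nd \times nd$ array for $\tilde{W}_1$; the rank-$k$ truncation and integer $t$ are illustrative assumptions.

```python
import numpy as np

def vdm_embedding(W_tilde, n, d, t=1, k=None):
    """Return V_t as an n x k^2 array built from the top-k eigenpairs of W_tilde."""
    lam, v = np.linalg.eigh(W_tilde)
    order = np.argsort(-np.abs(lam))       # largest |lambda_l| first
    k = k if k is not None else n * d
    lam, v = lam[order[:k]], v[:, order[:k]]
    V = np.empty((n, k * k))
    for i in range(n):
        vi = v[i*d:(i+1)*d, :]             # d x k block holding the v_l(i)
        gram = vi.T @ vi                   # gram[l, r] = <v_l(i), v_r(i)>
        V[i] = (np.outer(lam, lam) ** t * gram).ravel()
    return V

def d_vdm(V, i, j):
    """Vector diffusion distance d_{VDM,t}(i, j) = ||V_t(i) - V_t(j)||."""
    return np.linalg.norm(V[i] - V[j])
```

With $k = nd$ (no truncation), $\langle V[i], V[j]\rangle$ reproduces the affinity $\|\tilde{W}_1^{2t}(i,j)\|_{HS}^2$ exactly.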

17 Application to the class averaging problem in Cryo-EM (S, Zhao, Shkolnisky, Hadani 2011)
[Figure: SNR = 1/64. Histograms of the angles (x-axis, in degrees) between the viewing direction of each image (out of 40,000) and its 40 neighboring images. (a) Neighbors identified using the original rotationally invariant distances $d_{RID}$. (b) Neighbors identified using the vector diffusion distances $d_{VDM,t=2}$.]

18 Computational Aspects (Zhao, S, J. Struct. Biol. 2014)
- A naïve implementation requires $O(n^2)$ rotational alignments of images.
- Rotationally invariant representation of images: the bispectrum.
- Dimensionality reduction using a randomized algorithm for PCA (Rokhlin, Liberty, Tygert, Martinsson, Halko, Tropp, Szlam, ...); see the sketch below.
- Randomized approximate nearest-neighbor search in nearly linear time (Jones, Osipov, Rokhlin 2011).
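A minimal sketch of the randomized range-finder approach to PCA in the spirit of the cited works (Halko, Martinsson, Tropp, et al.); the oversampling parameter is an illustrative assumption, and this is not the specific implementation used in the cryo-EM pipeline.

```python
import numpy as np

def randomized_pca(A, k, oversample=10, seed=0):
    """Approximate top-k principal directions of the rows of A (n x p)."""
    rng = np.random.default_rng(seed)
    A_c = A - A.mean(axis=0)                      # center the data
    G = rng.standard_normal((A_c.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A_c @ G)                  # orthonormal basis for the range
    B = Q.T @ A_c                                 # small (k + oversample) x p matrix
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:k], s[:k]                          # directions and singular values
```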

19 The Hairy Ball Theorem
- There is no non-vanishing continuous tangent vector field on the sphere.
- Cannot find $O_i$ such that $O_{ij} = O_i O_j^{-1}$.
- No global rotational alignment of all images.

20 Convergence Theorem to the Connection-Laplacian (S, Wu)
Let $\iota : \mathcal{M} \to \mathbb{R}^p$ be a smooth $d$-dimensional closed Riemannian manifold embedded in $\mathbb{R}^p$, with metric $g$ induced from the canonical metric on $\mathbb{R}^p$, and let the data set $\{x_i\}_{i=1,\ldots,n}$ be independently uniformly distributed over $\mathcal{M}$. Let $K \in C^2(\mathbb{R}^+)$ be a positive kernel function decaying exponentially, that is, there exist $T > 0$ and $C > 0$ such that $K(t) \le C e^{-t}$ when $t > T$. For $\epsilon > 0$, let $K_\epsilon(x_i, x_j) = K\!\left( \frac{\|\iota(x_i) - \iota(x_j)\|_{\mathbb{R}^p}}{\sqrt{\epsilon}} \right)$. Then, for $X \in C^3(T\mathcal{M})$ and for all $x_i$, almost surely we have
$$\lim_{\epsilon \to 0} \frac{1}{\epsilon} \left[ \frac{\sum_{j=1}^n K_\epsilon(x_i,x_j)\, O_{ij} X_j}{\sum_{j=1}^n K_\epsilon(x_i,x_j)} - X_i \right] = \frac{m_2}{2d\, m_0} \left( \langle \iota_* \nabla^2 X(x_i), e_l \rangle \right)_{l=1}^d,$$
where $\nabla^2$ is the connection Laplacian, $X_i \equiv (\langle \iota_* X(x_i), e_l \rangle)_{l=1}^d \in \mathbb{R}^d$ for all $i$, $\{e_l(x_i)\}_{l=1,\ldots,d}$ is an orthonormal basis of $\iota_* T_{x_i}\mathcal{M}$, $m_l = \int_{\mathbb{R}^d} \|x\|^l K(\|x\|)\, dx$, and $O_{ij}$ is the optimal orthogonal transformation determined by the algorithm in the alignment step.

21 Example: Connection-Laplacian for $S^d$ embedded in $\mathbb{R}^{d+1}$
The connection-Laplacian commutes with rotations, and the eigenvalues and eigen-vector-fields are calculated using representation theory: $S^2$: 6, 10, 14, ...; $S^3$: 4, 6, 9, 16, 16, ...; $S^4$: 5, 10, 14, ...; $S^5$: 6, 15, 20, ...
[Figure: bar plots of the largest 30 eigenvalues of $A_1$ for $n = 8000$ points uniformly distributed over spheres of different dimensions; panels (a) $S^2$, (b) $S^3$, (c) $S^4$, (d) $S^5$.]

22 Spectral graph theory with vector fields
- The graph connection Laplacian: $L_1 = D_1 - W_1$.
- The normalized graph connection Laplacian: $\mathcal{L}_1 = I - D_1^{-1/2} W_1 D_1^{-1/2} = I - \tilde{W}_1$.
- Averaging operator / random walk matrix for vector diffusion: $A_1 = D_1^{-1} W_1$.
- A Cheeger inequality for the graph connection Laplacian (Bandeira, S, Spielman, SIAM J. Matrix Analysis 2013).

23 More applications of VDM: Orientability from a point cloud
- Encode the information about reflections in a symmetric $n \times n$ matrix $Z$ with entries $Z_{ij} = \det O_{ij}$ if $(i,j) \in E$, and $Z_{ij} = 0$ if $(i,j) \notin E$. That is, $Z_{ij} = 1$ if no reflection is needed, $Z_{ij} = -1$ if a reflection is needed, and $Z_{ij} = 0$ if the points are not nearby.
- Normalize $Z$ by the node degrees (see the sketch below).
[Figure: histograms of the values of the top eigenvector of $D_0^{-1} Z$; panels (a) $S^2$, (b) Klein bottle, (c) $\mathbb{RP}^2$.]
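A minimal sketch of the orientability test, assuming an `edges` dictionary mapping pairs $(i,j)$ with $i < j$ to the $d \times d$ alignments $O_{ij}$ (every node is assumed to have at least one neighbor); histograms like the ones above are obtained by plotting the returned entries.

```python
import numpy as np

def orientability_scores(n, edges):
    """Top eigenvector of D0^{-1} Z, where Z_ij = det(O_ij) on edges."""
    Z = np.zeros((n, n))
    for (i, j), O_ij in edges.items():
        Z[i, j] = Z[j, i] = np.sign(np.linalg.det(O_ij))
    deg = np.abs(Z).sum(axis=1)
    A = Z / deg[:, None]                   # normalize by node degrees
    lam, vecs = np.linalg.eig(A)           # A is similar to a symmetric matrix,
    top = vecs[:, np.argmax(lam.real)]     # so its spectrum is real
    return top.real
```

Consistent with the histograms above, for an orientable manifold the entries separate into two clusters of opposite sign (the two consistent global orientations), while for a non-orientable one no such separation appears.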

24 Orientable Double Covering
Embedding obtained using the eigenvectors of the (normalized) matrix
$$\begin{bmatrix} Z & -Z \\ -Z & Z \end{bmatrix} = Z \otimes \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}.$$
[Figure: left, the orientable double covering of $\mathbb{RP}^2$, which is $S^2$; middle, the orientable double covering of the Klein bottle, which is $T^2$; right, the orientable double covering of the Möbius strip, which is a cylinder.]

25 Registration and Ordering of Drosophila Embryogenesis Images (Dsilva, Lim, Lu, S, Kevrekidis, Shvartsman, Development 2015)

26 Registered and Ordered using VDM

27 References
- A. Singer, Z. Zhao, Y. Shkolnisky, R. Hadani, Viewing Angle Classification of Cryo-Electron Microscopy Images using Eigenvectors, SIAM Journal on Imaging Sciences, 4(2) (2011).
- A. Singer, H.-T. Wu, Orientability and Diffusion Maps, Applied and Computational Harmonic Analysis, 31(1) (2011).
- R. Hadani, A. Singer, Representation theoretic patterns in three dimensional Cryo-Electron Microscopy II: The class averaging problem, Foundations of Computational Mathematics, 11(5) (2011).
- A. Singer, H.-T. Wu, Vector Diffusion Maps and the Connection Laplacian, Communications on Pure and Applied Mathematics, 65(8) (2012).
- A. S. Bandeira, A. Singer, D. A. Spielman, A Cheeger inequality for the graph connection Laplacian, SIAM Journal on Matrix Analysis and Applications, 34(4) (2013).
- Z. Zhao, A. Singer, Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM, Journal of Structural Biology, 186(1) (2014).
- C. J. Dsilva, B. Lim, H. Lu, A. Singer, I. G. Kevrekidis, S. Y. Shvartsman, Temporal ordering and registration of images in studies of developmental dynamics, Development, 142 (2015).
