Global Positioning from Local Distances
Amit Singer
Yale University, Department of Mathematics, Program in Applied Mathematics
SIAM Annual Meeting, July 2008
Joint work with...
Ronald Coifman (Yale University, Applied Mathematics)
Yoel Shkolnisky (Yale University, Applied Mathematics)
Fred Sigworth (Yale School of Medicine, Cellular & Molecular Physiology)
Yuval Kluger (NYU, Department of Cell Biology)
David Cowburn (New York Structural Biology Center)
Yosi Keller (Bar Ilan University, Electrical Engineering)
Three Dimensional Puzzle
Cryo Electron Microscopy: Projection Images
[Figure: a molecule with potential φ, rotated by g ∈ SO(3), and its projection image obtained as a line integral along the z axis.]
The projection image is P_g(x, y) = ∫ φ_g(x, y, z) dz.
φ(r) is the electric potential of the molecule, and φ_g(r) = φ(g^{-1} r).
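As a quick numerical illustration of the projection formula (not part of the slides), the sketch below builds a toy potential φ out of two Gaussian "atoms", rotates it by a hypothetical g ∈ SO(3), and integrates along z to obtain P_g; the atom centers, grid size, and rotation angle are all made-up values.

```python
import numpy as np

# Toy potential: two Gaussian "atoms" (hypothetical centers and width).
centers = np.array([[0.3, 0.0, 0.2], [-0.2, 0.1, -0.3]])

def phi(r):  # r has shape (..., 3)
    d2 = ((r[..., None, :] - centers) ** 2).sum(axis=-1)
    return np.exp(-d2 / 0.02).sum(axis=-1)

# A rotation g about the y axis by 30 degrees (an arbitrary viewing direction).
t = np.deg2rad(30.0)
g = np.array([[ np.cos(t), 0.0, np.sin(t)],
              [ 0.0,       1.0, 0.0      ],
              [-np.sin(t), 0.0, np.cos(t)]])

# Sample phi_g(r) = phi(g^{-1} r) on a grid; for row vectors, g^{-1} r = r @ g.
n = 64
ax = np.linspace(-1.0, 1.0, n)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
grid = np.stack([X, Y, Z], axis=-1)
phi_g = phi(grid @ g)

# P_g(x, y) = integral of phi_g along z, approximated by a Riemann sum.
P_g = phi_g.sum(axis=2) * (ax[1] - ax[0])
```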
Projection Images: Toy Example
Fourier projection-slice theorem
[Figure: projections P_1 and P_2, their 2-D Fourier transforms P̂_1 and P̂_2 viewed as central slices of 3-D Fourier space, and their common line Λ_{k_1,l_1} = Λ_{k_2,l_2}.]
The Geometry of the slice theorem
Every image is a great circle on S^2.
Any pair of images has a common line, or, equivalently, any pair of great circles meets at two antipodal points.
Three Dimensional Puzzle
The radial lines are the puzzle pieces.
Every image is a circular chain of pieces.
Common line: meeting point.
Spiders: It's the Network
K projection images, L radial lines per image.
We build a weighted directed graph G = (V, E, W).
The vertices are the radial lines (|V| = KL):
V = {(k, l) : 1 ≤ k ≤ K, 0 ≤ l ≤ L − 1}
E = {((k_1, l_1), (k_2, l_2)) : (k_1, l_1) points to (k_2, l_2)}
W_{(k_1,l_1),(k_2,l_2)} = 1 if ((k_1,l_1),(k_2,l_2)) ∈ E, and 0 otherwise.
W is a sparse weight matrix of size KL × KL.
Spider: first pair of legs
The blue vertex (k_1, l_1) is the head of the spider.
Link (k_1, l_1) with (k_1, l_1 + l) for −d ≤ l ≤ d (radial lines of the same image).
Weights: W_{(k_1,l_1),(k_1,l_1+l)} = 1.
Spider: remaining legs
(k_1, l_1) and (k_2, l_2) are the common radial lines of two different images.
Links: ((k_1, l_1), (k_2, l_2 + l)) ∈ E for −d ≤ l ≤ d.
Weights: W_{(k_1,l_1),(k_2,l_2+l)} = 1.
Averaging operator
Row-stochastic normalization of W: A = D^{-1} W.
D is a diagonal matrix, D_ii = Σ_{j=1}^{N} W_ij.
The matrix A is an averaging operator:
(Af)(k_1, l_1) = (1 / |{(k_2, l_2) : ((k_1,l_1),(k_2,l_2)) ∈ E}|) Σ_{((k_1,l_1),(k_2,l_2)) ∈ E} f(k_2, l_2).
A assigns to the head of each spider the average of f over its legs.
A1 = 1: trivial eigenvector ψ_0 = 1, with λ_0 = 1.
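The sketch below (not from the slides) assembles the spider weight matrix W and the averaging operator A, assuming a hypothetical array common_lines[k1, k2] that stores the index of the radial line image k1 shares with image k2; here it is filled with random stand-ins just to make the code runnable, and the constants are smaller than the K = 200, L = 100, d = 10 used later in the talk.

```python
import numpy as np
from scipy.sparse import lil_matrix, diags

K, L, d = 50, 60, 5                       # images, radial lines per image, leg half-width
rng = np.random.default_rng(0)
# Hypothetical common-line table: common_lines[k1, k2] = radial line of image k1
# shared with image k2 (in practice detected from the Fourier transforms).
common_lines = rng.integers(0, L, size=(K, K))

def vid(k, l):                            # vertex id of radial line l of image k
    return k * L + (l % L)

W = lil_matrix((K * L, K * L))
for k1 in range(K):
    # First pair of legs: neighboring radial lines within the same image.
    for l1 in range(L):
        for l in range(-d, d + 1):
            W[vid(k1, l1), vid(k1, l1 + l)] = 1.0
    # Remaining legs: radial lines of other images around the common line.
    for k2 in range(K):
        if k2 == k1:
            continue
        l1, l2 = common_lines[k1, k2], common_lines[k2, k1]
        for l in range(-d, d + 1):
            W[vid(k1, l1), vid(k2, l2 + l)] = 1.0

W = W.tocsr()
deg = np.asarray(W.sum(axis=1)).ravel()   # D_ii
A = diags(1.0 / deg) @ W                  # A = D^{-1} W, row stochastic
```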
Coordinate Eigenvectors
The coordinate vectors x, y and z are eigenvectors: Ax = λx, Ay = λy, Az = λz.
The center of mass of every spider lies beneath the spider's head:
any pair of opposite legs balance each other because the weights are symmetric.
Embedding and algorithm
Find the common lines for all pairs of images.
Construct the averaging operator A.
Compute eigenvectors Aψ_i = λ_i ψ_i.
Embed the data into the eigenspace (ψ_1, ψ_2, ψ_3): (k, l) ↦ (ψ_1(k, l), ψ_2(k, l), ψ_3(k, l)).
Reveals the molecule orientations up to rotation and reflection.
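Continuing the sketch above, the top eigenvectors of A give the embedding; this is only an illustration of the pipeline (with the random stand-in common lines it will not produce a meaningful sphere), and it assumes the sparse matrix A built earlier.

```python
import numpy as np
from scipy.sparse.linalg import eigs

# Leading eigenpairs of the (non-symmetric) averaging operator A.
vals, vecs = eigs(A, k=4, which="LR")          # k eigenvalues of largest real part
order = np.argsort(-vals.real)
vals, vecs = vals[order].real, vecs[:, order].real

# vecs[:, 0] is the trivial all-ones eigenvector (lambda_0 = 1); the next three
# columns embed radial line (k, l) at row vid(k, l).
embedding = vecs[:, 1:4]
directions = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
```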
Numerical Spectrum
[Figure: the numerical spectrum λ_i of A for K = 200, L = 100, d = 10.]
Toy Example
E. coli ribosome
E. coli ribosome
Beyond CryoEM: Center of mass averaging operator
The coordinates x, y, z are eigenfunctions of the averaging operator: Ax = λx, Ay = λy, Az = λz.
The sphere has constant curvature. Is there a generalization to Euclidean spaces?
Global Positioning from Local Distances
Problem setup: N points r_i ∈ R^p (p = 2, 3).
Find the coordinates r_i = (x^1_i, ..., x^p_i),
given noisy neighboring distances δ_ij = ‖r_i − r_j‖ + noise.
Solution: build an operator whose eigenfunctions are the global coordinates.
Efficient eigenvector computation of a sparse matrix.
Applications:
Sensor networks.
Protein structure determination from NMR spectroscopy (the spin-spin interaction between hydrogen atoms decays as 1/r^6).
Surface reconstruction and PDE solvers. More?
Global Positioning from Local Distances: History
Multidimensional Scaling (MDS), if all d_ij = ‖r_i − r_j‖ are given: law of cosines + SVD of the inner product matrix.
Optimization: minimizing a variety of loss functions, e.g.
min_{r_1,...,r_N} Σ_{i∼j} [ ‖r_i − r_j‖^2 − δ_ij^2 ]^2
many variables, not convex, local minima.
Semidefinite programming (SDP): slow.
Graph Laplacian regularization (Weinberger, Sha, Zhu & Saul, NIPS 2006).
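For reference, classical MDS itself is only a few lines; this sketch (not from the slides) recovers coordinates, up to a rigid motion, from a full distance matrix by double-centering the squared distances and taking the top eigenvectors of the resulting inner product (Gram) matrix. The random test points at the end are made up.

```python
import numpy as np

def classical_mds(D, p=2):
    """Classical MDS: recover coordinates (up to rotation, reflection and
    translation) from the full matrix D of pairwise Euclidean distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered squared distances = Gram matrix
    w, V = np.linalg.eigh(B)                   # B is symmetric and PSD up to noise
    idx = np.argsort(w)[::-1][:p]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Tiny check on random planar points (hypothetical data, not from the talk):
rng = np.random.default_rng(1)
R = rng.standard_normal((10, 2))
D = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
X = classical_mds(D, p=2)                      # equals R up to a rigid motion
```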
Rigidity
Locally Rigid Embedding
Assume local rigidity. For each point, embed its k-NN locally (using MDS or otherwise).
This gives N local coordinate systems, mutually rotated and possibly reflected.
How to glue the different coordinate systems together?
Center of Mass
Consider the point r_i and its k-NN r_{i_1}, r_{i_2}, ..., r_{i_k}.
Find weights such that r_i is the center of mass of its neighbors:
Σ_{j=1}^{k} W_{i,i_j} r_{i_j} = r_i   and   Σ_{j=1}^{k} W_{i,i_j} = 1.
A system of p + 1 linear equations in k variables; underdetermined for k > p + 1.
The weights are invariant to rigid transformations: rotation, translation, reflection.
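A minimal sketch of the weight computation for a single point (not from the slides): since the weights are invariant to rigid motions, they can be computed from the local MDS coordinates of the point and its neighbors; np.linalg.lstsq returns the minimum-norm solution of the underdetermined system, which is one convenient (but not the only possible) choice.

```python
import numpy as np

def center_of_mass_weights(xi, nbrs):
    """Weights w of length k with sum(w) = 1 and nbrs.T @ w = xi.

    xi:   (p,)   local coordinates of the point r_i
    nbrs: (k, p) local coordinates of its k nearest neighbors
    The p + 1 equations are underdetermined for k > p + 1; lstsq picks
    the minimum-norm solution."""
    k, p = nbrs.shape
    A = np.vstack([nbrs.T, np.ones((1, k))])   # (p + 1, k) coefficient matrix
    b = np.append(xi, 1.0)                     # right-hand side (r_i ; 1)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w
```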
Eigenvectors of W
The N × N weight matrix W is sparse: every row has at most k non-zero elements.
W is not symmetric, and it must have negative weights for points on the boundary.
By construction, Σ_{j=1}^{N} W_ij = 1 and Σ_{j=1}^{N} W_ij r_j = r_i, hence
W1 = 1, Wx^1 = x^1, ..., Wx^p = x^p.
We are practically done: the eigenvectors of W with λ = 1 are the desired coordinates.
A little linear algebra is needed to deal with multiplicity and noise.
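As an illustration (not from the slides), once W has been assembled row by row from the center-of-mass weights, the coordinates can be read off from the eigenvectors whose eigenvalues are closest to 1; the dense eigendecomposition below is only for clarity, and for large N a sparse eigensolver would be used instead.

```python
import numpy as np

def coordinate_eigenvectors(W, p):
    """Return the p + 1 eigenpairs of W whose eigenvalues are closest to 1.
    Their span contains the constant vector and the coordinate vectors
    x^1, ..., x^p (up to the mixing resolved on the multiplicity slide)."""
    vals, vecs = np.linalg.eig(W)              # W is not symmetric
    idx = np.argsort(np.abs(vals - 1.0))[: p + 1]
    return vals[idx].real, vecs[:, idx].real
```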
1097 US Cities, k = 18 NN
[Figure: the original city coordinates (x, y) and the locally rigid embedding (φ_1, φ_2) computed with no noise, 1% noise, and 10% noise.]
US Cities: Numerical Spectrum
[Figure: numerical spectrum of W for different levels of noise: clean distances (left), 1% noise (center), and 10% noise (right).]
519 hydrogen atoms of 2GGR: 1% noise
[Figure: reconstructed 3-D coordinates (x, y, z) of the hydrogen atoms.]
2GGR: 5% noise
[Figure: reconstructed 3-D coordinates (x, y, z) of the hydrogen atoms.]
2GGR: Numerical Spectrum
[Figure: numerical spectrum of W at the different noise levels.]
Dealing with the multiplicity
The eigenvalue λ = 1 is degenerate with multiplicity p + 1: Wφ_i = φ_i, i = 0, 1, ..., p.
The computed eigenvectors φ_0, φ_1, ..., φ_p are linear combinations of 1, x^1, ..., x^p.
We may assume φ_0 = 1 and ⟨φ_j, 1⟩ = 0 for j = 1, ..., p.
We look for a p × p matrix A that maps the eigenmap Φ_i = (φ^1_i, ..., φ^p_i) to the original coordinates r_i = (x^1_i, ..., x^p_i): r_i = AΦ_i, for i = 1, ..., N.
The squared distance between r_i and r_j is d^2_ij = ‖r_i − r_j‖^2 = (Φ_i − Φ_j)^T A^T A (Φ_i − Φ_j).
This is an overdetermined system of linear equations for the elements of A^T A.
Least squares gives A^T A, whose Cholesky decomposition yields A.
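A sketch of this step (not from the slides): each measured distance contributes one linear equation in the entries of S = A^T A, least squares solves for S, and a Cholesky factor of S recovers A up to an orthogonal transformation. The triplet format (i, j, δ_ij) for the measured distances is an assumed representation.

```python
import numpy as np

def unmix_eigenmap(Phi, dist_triplets, p):
    """Phi: (N, p) eigenmap with rows Phi_i; dist_triplets: iterable of (i, j, delta_ij).
    Solve (Phi_i - Phi_j)^T S (Phi_i - Phi_j) = delta_ij^2 for S = A^T A by least
    squares, then recover coordinates r_i = A Phi_i."""
    rows, rhs = [], []
    for i, j, delta in dist_triplets:
        v = Phi[i] - Phi[j]
        rows.append(np.outer(v, v).ravel())    # v^T S v is linear in the entries of S
        rhs.append(delta ** 2)
    s, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    S = s.reshape(p, p)
    S = 0.5 * (S + S.T)                        # enforce symmetry
    A = np.linalg.cholesky(S).T                # A^T A = S (S must be positive definite)
    return Phi @ A.T                           # rows are the recovered r_i
```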
Dealing with noisy distances δ_ij
Noise breaks the degeneracy and may lead to crossings of eigenvalues.
The coordinate vectors are approximated as linear combinations of m non-trivial eigenvectors, Φ_i = (φ^1_i, ..., φ^m_i), with m > p (still m ≪ N).
r_i = AΦ_i, with A being p × m instead of p × p.
Replace d^2_ij = ‖r_i − r_j‖^2 = (Φ_i − Φ_j)^T A^T A (Φ_i − Φ_j) with the constrained minimization problem for the m × m positive semidefinite matrix P = A^T A:
min Σ_{i∼j} [ (Φ_i − Φ_j)^T P (Φ_i − Φ_j) − δ_ij^2 ]^2, such that P ⪰ 0.
A small SDP (the formulation uses the Schur complement lemma).
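One possible way to write this small semidefinite program down (an assumption on my part, the slides do not prescribe a solver) is with cvxpy, which handles the P ⪰ 0 constraint directly; this is the objective stated above in its raw convex form, while the slides' Schur-complement reformulation into a standard SDP is not shown here.

```python
import numpy as np
import cvxpy as cp

def fit_metric(Phi, dist_triplets, m):
    """Phi: (N, m) matrix of m non-trivial eigenvectors; dist_triplets: (i, j, delta_ij).
    Find the positive semidefinite P = A^T A that best fits the measured distances."""
    P = cp.Variable((m, m), PSD=True)
    residuals = []
    for i, j, delta in dist_triplets:
        v = Phi[i] - Phi[j]
        residuals.append(cp.quad_form(v, P) - delta ** 2)   # affine in P since v is constant
    problem = cp.Problem(cp.Minimize(cp.sum_squares(cp.hstack(residuals))))
    problem.solve()                                          # needs an SDP-capable solver, e.g. SCS
    return P.value
```

A rank-p factor A of the fitted P (for example, from its top p eigenvectors) then gives the coordinates r_i = AΦ_i.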
Comparison to LLE and Graph Laplacian regularization
Locally Linear Embedding (LLE) is a non-linear dimensionality reduction method.
LLE: construct the weights by solving an overdetermined system
min_W ‖ Σ_j W_ij X_j − X_i ‖^2, where X_i ∈ R^n.
Here we have a preprocessing step to reveal the vectors (the X_i are not given).
Graph Laplacian regularization: approximate the coordinates by the eigenvectors of W_ij = exp(−d_ij^2/ε) for i ∼ j, W_ij = 0 elsewhere.
LRE requires locally rigid subgraphs; graph neighbors can be physically distant.
Thank You!