Space-Variant Computer Vision: A Graph Theoretic Approach


p.1/65 Space-Variant Computer Vision: A Graph Theoretic Approach Leo Grady Cognitive and Neural Systems Boston University

p.2/65 Outline of talk Space-variant vision - Why and how of graph theory Anisotropic interpolation Isoperimetric segmentation Topology and numerical efficiency Multiresolution segmentation Conclusion - Future work

p.3/65 Biological visual sampling Why has evolution driven this process? (Hughes, 1977)

Biological visual sampling Visual architecture satisfies the needs of the system p.4/65

p.5/65 Biological visual sampling The Terrain Theory of Hughes (1977) (a) Tree Kangaroo (b) Ground Kangaroo

p.6/65 Biological visual sampling Problem: How to apply computer vision techniques to space-variant images? (a) 4-Connected (b) 8-Connected (c) Kangaroo

p.7/65 Biological visual sampling Approach: Graph theory Nodes correspond to pixels and node density corresponds to resolution Naturally extends to 3D or data clustering Applies to any field defined on nodes (e.g., image intensity, coordinates)

p.8/65 Advantages of graph theory Why should we be interested in graph-theoretic approaches to vision? Recent success using graph theory in computer vision Global-local interactions Dimension independent Analogies between graphs, matrices, vector calculus and circuit theory allow the transfer of ideas and intuition

p.9/65 Graph theory How to use concepts from calculus on a graph? A graph is a pair G = (V, E) with vertices (nodes) v ∈ V and edges e ∈ E ⊆ V × V. An edge, e, spanning two vertices, v_i and v_j, is denoted by e_ij. A weighted graph has a value (typically nonnegative and real) assigned to each edge, e_ij, denoted by w(e_ij) or w_ij.

Incidence matrix: A_{e_ij, v_k} = +1 if i = k, -1 if j = k, 0 otherwise.

Constitutive matrix: C_{e_ij, e_ks} = w(e_ij) if i = k and j = s, 0 otherwise.

Laplacian matrix: L_{v_i, v_j} = d_i if i = j, -w(e_ij) if e_ij ∈ E, 0 otherwise, where the degree d_i = Σ_{e_ij ∈ E} w(e_ij).

These matrices satisfy L = A^T C A.
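
The definitions above can be checked numerically. A minimal sketch (my own toy graph, in Python/NumPy rather than the talk's MATLAB toolbox) assembles A, C, and L for a weighted triangle and verifies L = A^T C A:

```python
import numpy as np

# Toy graph (my example): a weighted triangle on vertices {0, 1, 2}.
edges = [(0, 1), (1, 2), (0, 2)]       # each e_ij oriented with i < j
weights = np.array([2.0, 3.0, 4.0])    # w(e_ij) for the edges above

n, m = 3, len(edges)

# Incidence matrix: A[e_ij, v_k] = +1 if i = k, -1 if j = k, 0 otherwise
A = np.zeros((m, n))
for k, (i, j) in enumerate(edges):
    A[k, i], A[k, j] = 1.0, -1.0

# Constitutive matrix: diagonal matrix of edge weights
C = np.diag(weights)

# Laplacian assembled directly from its definition
L = np.zeros((n, n))
for (i, j), w in zip(edges, weights):
    L[i, i] += w; L[j, j] += w
    L[i, j] -= w; L[j, i] -= w

assert np.allclose(L, A.T @ C @ A)     # the identity L = A^T C A
```

The edge orientation is arbitrary: flipping any row of A leaves A^T C A unchanged.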

p.10/65 Graph theory - Circuit analogy A^T y = f (Kirchhoff's Current Law), Cp = y (Ohm's Law), p = Ax (Kirchhoff's Voltage Law)

p.11/65 Graph theory Allows straightforward translation of computer vision concepts (a) Structure (b) Image (c) Edge-detection

p.12/65 Outline of talk Space-variant vision - Why and how of graph theory Anisotropic interpolation Isoperimetric segmentation Topology and numerical efficiency Multiresolution segmentation Conclusion - Future work

Graph interpolation Problem: How to interpolate unknown nodal values between known nodal values? Answer: Solve the combinatorial Dirichlet Problem using Dirichlet boundary conditions at the known values (a) Graph (b) Circuit p.13/65

The Dirichlet Problem Continuous Dirichlet integral D[u] = (1/2) ∫_Ω |∇u|² dV, Laplace equation ∇²u = 0. Combinatorial Dirichlet integral D[x] = (1/2) x^T A^T C A x, Laplace equation A^T C A x = Lx = 0 p.14/65

p.15/65 Dirichlet Problem How are the boundary conditions actually incorporated? D[x] = (1/2) x^T L x. Partition x into fixed boundary potentials x_b and free interior potentials x_i, and decompose L conformally:

D[x_i] = (1/2) [x_b^T x_i^T] [L_b R; R^T L_i] [x_b; x_i] = (1/2) (x_b^T L_b x_b + 2 x_i^T R^T x_b + x_i^T L_i x_i)

Minimizing with respect to x_i yields L_i x_i = -R^T x_b
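
A minimal sketch of the boundary-conditioned solve (my own example, not from the talk): a 5-node unit-weight path with endpoint potentials fixed at 0 and 1, solving the reduced system for the interior potentials:

```python
import numpy as np

# Toy example (mine): the combinatorial Dirichlet problem on a 5-node
# unit-weight path, with the endpoint potentials fixed at 0 and 1.
n = 5
L = np.zeros((n, n))
for i in range(n - 1):
    L[i, i] += 1; L[i+1, i+1] += 1
    L[i, i+1] -= 1; L[i+1, i] -= 1

boundary, interior = [0, 4], [1, 2, 3]
x_b = np.array([0.0, 1.0])              # Dirichlet boundary values

L_i = L[np.ix_(interior, interior)]     # free-node block of L
R = L[np.ix_(boundary, interior)]       # boundary/interior coupling block

x_i = np.linalg.solve(L_i, -R.T @ x_b)  # L_i x_i = -R^T x_b
print(x_i)                              # linear ramp: [0.25, 0.5, 0.75]
```

With unit weights the interpolant is the linear ramp between the clamped endpoints, as expected from the continuous Dirichlet problem in one dimension.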

Does it work? p.16/65

p.17/65 Dirichlet Problem - So what? Connection with anisotropic diffusion (Perona and Malik, 1990) suggests use for early vision processing. Diffusion equation dx/dt = -Lx, Laplace equation 0 = Lx

p.18/65 Dirichlet Problem Anisotropic Laplacian smoothing (a) Original (b) Samples (c) Interpolation

Space-variant sampling p.19/65

p.20/65 Laplace vs. Diffusion Conclusion Steady state vs. transient. Diffusion requires a stopping parameter (i.e., time). Sampling with the Laplace equation allows more or less smoothing in different regions
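
The steady-state-vs-transient point can be illustrated directly (a toy example of my own): iterating the diffusion update with the boundary values re-clamped at each step converges to the solution of the Laplace equation, i.e., the Dirichlet interpolant.

```python
import numpy as np

# Toy example (mine): a 6-node unit-weight path with endpoints clamped to
# 0 and 1. Iterating the diffusion update x <- x - t L x while re-imposing
# the boundary values converges to the steady state Lx = 0 at the interior
# nodes (here a linear ramp).
n = 6
L = np.zeros((n, n))
for i in range(n - 1):
    L[i, i] += 1; L[i+1, i+1] += 1
    L[i, i+1] -= 1; L[i+1, i] -= 1

x = np.zeros(n); x[-1] = 1.0        # clamped boundary values
t = 0.25                            # step size, stable for this Laplacian
for _ in range(1000):
    x = x - t * (L @ x)
    x[0], x[-1] = 0.0, 1.0          # re-clamp the Dirichlet boundary
print(x)                            # ≈ [0, 0.2, 0.4, 0.6, 0.8, 1]
```

Stopping the loop early gives a partially smoothed transient, which is exactly the stopping-parameter dependence the slide contrasts with the direct Laplace solve.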

p.21/65 Outline of talk Space-variant vision - Why and how of graph theory Anisotropic interpolation Isoperimetric segmentation Topology and numerical efficiency Multiresolution segmentation Conclusion - Future work

Isoperimetric Problem Problem: For a given volume, what shape has the smallest perimeter? Enclosing a volume with a boundary may be considered as a separation of the space. Segmentation: View image as a discrete geometry where the image values specify the metric and find the partitions that satisfy the isoperimetric problem. p.22/65

p.23/65 Isoperimetric Problem For an arbitrary space, the problem is much more difficult. Formally, the isoperimetric constant for a space is given as h = inf_S |∂S| / Vol_S, for any subset of points, S.

p.24/65 Isoperimetric Problem How to define the problem for a discrete geometry (graph)? Instead of points, S is a set of nodes. (Examples: S = {4, 5}; S = {1, 2, 3}; S = {d, e})

p.25/65 Graph formulation Define an indicator vector x_i = 0 if v_i ∉ S, 1 if v_i ∈ S. Note that specification of x defines a partition. How to define perimeter? |∂S| = x^T L x

p.26/65 Graph formulation How to define volume? Nodal volume Vol_S = x^T r, normalized volume Vol_S = x^T d, where r is the vector of all ones and d is the vector of node degrees.

p.27/65 Graph formulation As a measure of partition quality, define the isoperimetric ratio as h(x) = x^T L x / x^T d. Strategy: Relax x to take real values and use a Lagrange multiplier to perform a constrained minimization of the perimeter with respect to a constant volume. Namely, minimize x^T L x subject to the constraint x^T d = k.
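
A small numerical illustration of the ratio (my own two-triangle toy graph, not from the talk): the partition that cuts only the bridge edge achieves a much lower isoperimetric ratio than one that cuts through a cluster.

```python
import numpy as np

# My toy graph: two unit-weight triangles {0,1,2} and {3,4,5} joined by a
# single bridge edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1
d = np.diag(L)

def ratio(S):
    x = np.zeros(n); x[list(S)] = 1.0   # indicator vector of S
    return (x @ L @ x) / (x @ d)        # perimeter / normalized volume

print(ratio({0, 1, 2}))   # cuts only the bridge: 1/7
print(ratio({0, 1}))      # cuts through a triangle: 2/4 = 0.5
```

Note that x^T L x = Σ w(e_ij)(x_i - x_j)² counts exactly the weight of the edges crossing the partition, which is why the indicator vector recovers the perimeter.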

p.28/65 Graph formulation Define the function Q(x) as Q(x) = x^T L x - Λ(x^T d - k). Since L is positive semi-definite (Biggs, 1974), Q(x) takes minima at critical points satisfying 2Lx = Λd. Problem: the equation is singular - for a connected graph, L has a nullspace spanned by r.

p.29/65 Circuit analogy Same equation occurs in circuit theory (Branin Jr., 1966) for a resistive network powered by current sources. Must ground the circuit in order to find potentials: Lx = d becomes L_0 x_0 = d_0

Potentials When L_0 x_0 = d_0 is solved, a potential is assigned to each node, and the potentials must be thresholded in order to produce a partition. Lemma: For any threshold, the set of nodes with potentials lower than the threshold must be connected. Proof: Follows from the mean value theorem with positive sources. p.30/65

p.31/65 Partitioning algorithm recap Steps of the partitioning algorithm:
1. Generate edge weights based on image/coordinate/sensor data.
2. Choose a ground node, v_g, and form L_0, d_0.
3. Solve L_0 x_0 = d_0 for x_0.
4. Choose a threshold and divide the nodes into the sets with potentials above and below the threshold.
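
The four steps can be sketched end-to-end on a toy 1-D "image" with a step edge (my own data; the value of beta and the median threshold rule are illustrative choices, not the talk's):

```python
import numpy as np

# A toy end-to-end run of the partitioning steps (my own 1-D "image").
I = np.array([0.1, 0.12, 0.11, 0.9, 0.88, 0.91])
beta = 30.0
n = len(I)

# 1. Edge weights from the image data: w_ij = exp(-beta * |I_i - I_j|)
L = np.zeros((n, n))
for i in range(n - 1):
    w = np.exp(-beta * abs(I[i] - I[i + 1]))
    L[i, i] += w; L[i+1, i+1] += w
    L[i, i+1] -= w; L[i+1, i] -= w
d = np.diag(L).copy()

# 2. Choose a ground node (here: the node of highest degree)
g = int(np.argmax(d))
keep = [i for i in range(n) if i != g]

# 3. Solve the grounded system L_0 x_0 = d_0
x = np.zeros(n)
x[keep] = np.linalg.solve(L[np.ix_(keep, keep)], d[keep])

# 4. Threshold the potentials to obtain the two segments
labels = (x > np.median(x)).astype(int)
print(labels)  # the segments split at the intensity step
```

Because the weight across the intensity step is nearly zero, the potentials on the far side of the step are far larger than those near the ground, and any reasonable threshold separates the two regions.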

p.32/65 Spectral partitioning Spectral vs. Isoperimetric: an eigenvector problem vs. a system of linear equations. If the eigenvalue's algebraic multiplicity is greater than unity, any vector in the subspace spanned by its eigenvectors is a valid solution.

p.33/65 Data clustering The Gestalt clustering challenges of Zahn

p.34/65 Image processing Why should this work well? Image Potentials Segmentation

p.35/65 Image processing What is the effect of choosing ground?

Image processing How to choose weights? w_ij = exp(-β|I_i - I_j|). We term β the scale (a) β = 30 (b) β = 50 p.36/65

p.37/65 Recursive bipartitioning How to apply isoperimetric partitioning to (unsupervised) image segmentation? Recursively apply bipartitioning until the isoperimetric ratio of the cut (i.e., the partition quality) fails to satisfy a pre-specified threshold.

p.38/65 Recursive bipartitioning No user interaction - How to automatically choose ground? Anderson and Morley (1971) proved that the spectral radius of L, ρ(L), satisfies ρ(L) ≤ 2d_max. Therefore, grounding the node of highest degree may have the most beneficial effect on the numerical solution of L_0 x_0 = d_0.

Recursive bipartitioning p.39/65

Noise analysis Segmentation of white circle on black background (a) Additive (b) Multiplicative (c) Shot p.40/65

Noise analysis (Table: image, isoperimetric result, and Ncuts result under (a) additive, (b) multiplicative, and (c) shot noise) p.41/65

p.42/65 Other graphs Because of the flexibility of graph theory, the isoperimetric segmentation algorithm applies to general graphs. (a) 3D graph (b) Space-variant

p.43/65 Outline of talk Space-variant vision - Why and how of graph theory Anisotropic interpolation Isoperimetric segmentation Topology and numerical efficiency Multiresolution segmentation Conclusion - Future work

p.44/65 Krylov subspaces Krylov subspace methods are the methods of choice for solving both the system of linear equations (conjugate gradients) and the eigenvector problem (Lanczos method). K(A; x_0; k) = span(x_0, Ax_0, A²x_0, ..., A^(k-1) x_0). The solution found at each iteration, i, of conjugate gradients is the solution to Ax = b projected onto the Krylov subspace K(A, Ax_0 - b, i) (Dongarra et al., 1991).

p.45/65 Krylov subspaces K(A; x_0; k) = span(x_0, Ax_0, A²x_0, ..., A^(k-1) x_0). Connection with diffusion: x_{i+1} = x_i - t L x_i. Convergence is directly related to graph diameter.

Krylov subspaces Idea: Use small world graphs to dramatically lower graph diameter with the addition of a minimal number of edges. (a) Lattice (b) Small world p.46/65
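
The effect of shortcut edges on solver convergence can be sketched with a hand-rolled conjugate-gradients loop (my own construction; the graph sizes and shortcut placement are arbitrary choices):

```python
import numpy as np

# My construction: compare conjugate-gradient iteration counts on a grounded
# path-graph Laplacian before and after adding a few long "shortcut" edges.
def laplacian(n, edges):
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    return L

def cg_iters(A, b, tol=1e-8):
    """Plain conjugate gradients; returns the iteration count at convergence."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    k = 0
    while np.linalg.norm(r) > tol * np.linalg.norm(b):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
        k += 1
    return k

n = 200
path = [(i, i + 1) for i in range(n - 1)]
shortcuts = [(i, i + 50) for i in range(0, n - 50, 50)]  # a few long edges

counts = []
for edges in (path, path + shortcuts):
    L = laplacian(n, edges)
    counts.append(cg_iters(L[1:, 1:], np.diag(L)[1:].copy()))
print(counts)  # the lower-diameter graph converges in fewer iterations
```

The shortcut edges shrink the graph diameter, and the grounded system with the augmented Laplacian reaches the same tolerance in fewer CG iterations.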

p.47/65 Krylov subspaces Convergence with new edges added

p.48/65 Krylov subspaces Results are largely unaffected. (a) Original (b) Lattice (c) Small World

p.49/65 Outline of talk Space-variant vision - Why and how of graph theory Anisotropic interpolation Isoperimetric segmentation Topology and numerical efficiency Multiresolution segmentation Conclusion - Future work

p.50/65 Multiresolution analysis Common approach: Image → Filter → Process → Backproject solution. Used for speed and resilience to noise

p.51/65 Multiresolution analysis Idea: Apply a graph-based analysis algorithm to the whole pyramid as a separate graph.

p.52/65 Multiresolution analysis Long range connections improve performance on blurry images (Figure: blurred image segmented on the lattice vs. the pyramid)

Multiresolution segmentation p.53/65

p.54/65 Multiresolution analysis Doesn't the speed suffer with the additional nodes? Computations are nearly the same since the graph diameter is small.

p.55/65 Outline of talk Space-variant vision - Why and how of graph theory Anisotropic interpolation Isoperimetric segmentation Topology and numerical efficiency Multiresolution segmentation Conclusion - Future work

Conclusion Created the Graph Analysis Toolbox for MATLAB, to be publicly released:
- Implementation of tools to allow processing of data associated with graphs (e.g., filtering, edge detection, Ncuts)
- Implementation of new algorithms developed in my doctoral work
- Tools for transferring image data from Cartesian images to graphs of varying resolution
- Tools for visualizing data on graphs
- Space-variant graph data for over 20 species
- Scripts to generate all the figures in my dissertation and papers
p.56/65

p.57/65 Conclusion Final message: The analogy between graph theory, circuit theory, linear algebra and vector calculus provides:
- Established principles that drive new algorithm design, with an intuition about their functionality
- An ability to transfer standard computer vision techniques to more general domains, e.g., space-variant sensor data, higher dimensions, abstract feature spaces
- The potential for alternate methods of computation via circuit construction

p.58/65 Future directions General directions:
- Use of nodes to represent image regions instead of pixels
- Incorporation of training/learning into image analysis algorithms
- Pursue interactive foreground/background segmentation
- Object recognition
- Explore the relationship with statistical image analysis methods

p.59/65 Graph theory Taubin et al. (1996) frames standard filtering in the same context (a) Image (b) Low-pass filter (c) High-pass filter

p.60/65 Graph theory Coordinate filtering also possible (e.g., a ring graph) (a) Original (b) Low-pass filter (c) High-pass filter

p.61/65 Dirichlet Problem Same technique works for simple graph drawing (a) Original (b) Interpolated

p.62/65 Isoperimetric Problem The isoperimetric constant quantifies the separability of a space

Spectral partitioning Spectral partitioning (Donath and Hoffman, 1972; Pothen et al., 1990) adopts a different approach to minimizing the isoperimetric ratio. Isoperimetric ratio: h(x) = x^T L x / x^T d. Rayleigh quotient: R(x) = x^T L x / x^T x. Results by Fiedler (1973, 1975a,b), Alon (1986) and Cheeger (1970) characterize the relationship between the isoperimetric constant of a graph and the second smallest eigenvalue of L (the Fiedler value), whose eigenvector is used for partitioning. The Normalized Cuts image segmentation algorithm of Shi and Malik (2000) adopts this approach with a different representation of the Laplacian matrix. p.63/65
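
A minimal sketch of spectral bipartitioning (my own two-cluster toy graph, not from the talk): compute the eigenvector paired with the second smallest eigenvalue of L and threshold it at zero.

```python
import numpy as np

# My toy example: two unit-weight triangles {0,1,2} and {3,4,5} joined by
# the bridge edge (2,3). Spectral bipartitioning thresholds the eigenvector
# of the second smallest eigenvalue of L (the Fiedler value) at zero.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1

vals, vecs = np.linalg.eigh(L)   # eigenvalues returned in ascending order
fiedler = vecs[:, 1]             # eigenvector paired with vals[1]
part = fiedler > 0               # threshold at zero

print(part)                      # the two triangles receive opposite labels
```

Note the contrast with the isoperimetric algorithm: this requires an eigenvector computation rather than a single linear solve, which is the trade-off the slides emphasize.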

p.64/65 Spectral partitioning Guattery and Miller (1998) demonstrated that spectral partitioning fails to perform well on certain families of graphs (a) Spectral (b) Isoperimetric

p.65/65 Image processing No edge information present - The Kanizsa illusion (a) Potentials (b) Partition

References

Alon, N. (1986). Eigenvalues and expanders. Combinatorica, 6:83-96.

Anderson, W. N. J. and Morley, T. D. (1971). Eigenvalues of the Laplacian of a graph. Technical Report TR 71-45, University of Maryland.

Biggs, N. (1974). Algebraic Graph Theory. Number 67 in Cambridge Tracts in Mathematics. Cambridge University Press.

Branin Jr., F. H. (1966). The algebraic-topological basis for network analogies and the vector calculus. In Generalized Networks, Proceedings, pages 453-491, Brooklyn, N.Y.

Cheeger, J. (1970). A lower bound for the smallest eigenvalue of the Laplacian. In Gunning, R., editor, Problems in Analysis, pages 195-199. Princeton University Press, Princeton, NJ.

Donath, W. and Hoffman, A. (1972). Algorithms for partitioning of graphs and computer logic based on eigenvectors of connection matrices. IBM Technical Disclosure Bulletin, 15:938-944.

Dongarra, J. J., Duff, I. S., Sorenson, D. C., and van der Vorst, H. A. (1991). Solving Linear Systems on Vector and Shared Memory Computers. Society for Industrial and Applied Mathematics, Philadelphia.

Fiedler, M. (1973). Algebraic connectivity of graphs. Czechoslovak Mathematical Journal, 23(98):298-305.

Fiedler, M. (1975a). Eigenvalues of acyclic matrices. Czechoslovak Mathematical Journal, 25(100):607-618.

Fiedler, M. (1975b). A property of eigenvalues of nonnegative symmetric matrices and its applications to graph theory. Czechoslovak Mathematical Journal, 25(100):619-633.

Guattery, S. and Miller, G. (1998). On the quality of spectral separators. SIAM Journal on Matrix Analysis and Applications, 19(3):701-719.

Hughes, A. (1977). The topography of vision in mammals of contrasting life style: Comparative optics and retinal organization. In The Handbook of Sensory Physiology, volume 7. Academic Press, New York.

Perona, P. and Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7):629-639.

Pothen, A., Simon, H., and Liou, K.-P. (1990). Partitioning sparse matrices with eigenvectors of graphs. SIAM Journal on Matrix Analysis and Applications, 11(3):430-452.

Shi, J. and Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888-905.

Taubin, G., Zhang, T., and Golub, G. (1996). Optimal surface smoothing as filter design. Technical Report RC-20404, IBM.