Learning Spectral Graph Segmentation (AISTATS 2005)
Timothée Cour, Jianbo Shi: Computer and Information Science Department, University of Pennsylvania. Nicolas Gogin: Computer Science, École Polytechnique.
Graph-based Image Segmentation
Weighted graph G = (V, W): V = vertices (pixels i), W_ij = similarity between pixels i and j, with W_ij = W_ji ≥ 0. Segmentation = graph partition of the pixels.
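As a minimal sketch of how such an affinity matrix can be built from a single cue: the toy image, the intensity-based Gaussian similarity, and the choices of sigma and neighborhood radius below are all illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

# Hypothetical toy 4x4 grayscale "image": bright right half, dark left half.
I = np.zeros((4, 4))
I[:, 2:] = 1.0
n = I.size
flat = I.ravel()
yy, xx = np.divmod(np.arange(n), 4)          # row, column of each pixel

# W_ij = exp(-|I_i - I_j|^2 / sigma_I^2) for spatially nearby pixels, else 0.
sigma_I, radius = 0.5, 1.5                   # assumed cue and neighborhood scales
dist2 = (xx[:, None] - xx[None, :]) ** 2 + (yy[:, None] - yy[None, :]) ** 2
W = np.exp(-(flat[:, None] - flat[None, :]) ** 2 / sigma_I ** 2) * (dist2 <= radius ** 2)
```

By construction W is symmetric and non-negative, as the graph formulation requires; neighbors inside one intensity region get weight 1, neighbors across the boundary get a much smaller weight.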
Spectral Graph Segmentation
Pipeline: Image I → (intensity, color, edges, texture) → graph affinities W = W(I, Θ) → eigenvector X(W) → discretisation.
NCut(A, B) = cut(A, B)/Vol(A) + cut(A, B)/Vol(B)
The continuous relaxation is the generalized eigenvector W X = λ D X, discretised toward the indicator X_A(i) = 1 if i ∈ A, 0 if i ∉ A.
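A small runnable sketch of the relaxation-plus-discretisation step, on hypothetical 1-D toy data with an assumed Gaussian affinity (scipy's generalized symmetric eigensolver stands in for whatever solver the authors used):

```python
import numpy as np
from scipy.linalg import eigh

# Two well-separated 1-D point clusters (hypothetical toy data).
rng = np.random.default_rng(0)
pts = np.concatenate([rng.normal(0.0, 1.0, 10), rng.normal(10.0, 1.0, 10)])

# Gaussian affinity: symmetric and non-negative, as required of W.
sigma = 2.0
W = np.exp(-(pts[:, None] - pts[None, :]) ** 2 / sigma ** 2)
D = np.diag(W.sum(axis=1))

# Relaxation: generalized eigenproblem W X = lambda D X.
# eigh returns ascending eigenvalues; the top eigenpair (lambda = 1, constant X)
# is trivial, so the 2-way NCut relaxation lives in the second-largest eigenvector.
vals, vecs = eigh(W, D)
X = vecs[:, -2]

# Discretisation: threshold the continuous eigenvector into a partition.
labels = (X > np.median(X)).astype(int)
```

On this data the thresholded eigenvector recovers the two clusters exactly.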
Spectral Graph Segmentation, reversed
Image I → intensity + color + edges + texture → graph affinities W = W(I, Θ) → eigenvector X(W). Reverse the pipeline: given a target segmentation, learn the graph parameters Θ.
Cues: intensity, edge, and color cues [Shi & Malik, 97]; multiscale cues [Yu, 04; Fowlkes, 04]; texture cues.
Where is Waldo? Do you use edge cues? Color cues? Texture cues? That's not enough: you need shape cues and high-level object priors.
Back to the pipeline: Image I → graph affinities W = W(I, Θ) → eigenvector X(W); learn the graph parameters Θ from a target segmentation.
Prior work defines the error criterion on the affinity matrix W itself, comparing W(I, Θ) to a target P* ~ W* built from X*: [1] Meila and Shi (2001); [2] Fowlkes, Martin, Malik (2004).
But constraining W_ij is overkill: many different W share the same segmentation. W has O(n^2) parameters, while the segments themselves have only O(n).
Our approach: an error criterion on the partition vector X only, against a target X*.
Energy function for segmenting one image I: E_I(Θ) = ||X(W(I, Θ)) − X*(I)||^2. Learning minimizes over the training images: min_Θ E(Θ) = Σ_I ||X(W(I, Θ)) − X*(I)||^2.
Minimizing E(Θ) = Σ_I ||X(W(I, Θ)) − X*(I)||^2 by gradient descent requires differentiating X(W), but X(W) is only implicitly defined by (W(I, Θ) − λD) X = 0: there is no closed form X = f(W, Θ), so the gradient cannot be backpropagated easily.
Gradient descent: ΔΘ = −η ∂E/∂Θ = −η (∂E/∂X) (∂X/∂W) (∂W/∂Θ).
E(X) is quadratic, and ∂W/∂Θ depends on the parameterization W(Θ); the remaining factor, ∂X/∂W, is hard because X(W) is implicit.
The missing factor is ∂X/∂W.
Theorem (derivative of NCut eigenvectors): the map W → (X, λ) is C^∞ over Ω, and its derivatives along any C^1 path W(t) are
dλ/dt = X^T (W' − λ D') X / (X^T D X)
dX/dt = −(W − λD)^+ (W' − λ D' − (dλ/dt) D) X
Feasible set in the space of graph weight matrices: W ∈ Ω iff 1) W is an n × n symmetric matrix, 2) W·1 > 0, 3) λ is a simple eigenvalue of W X = λ D X.
[Meila, Shortreed, Xu, 05]
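The theorem can be checked numerically. This sketch uses a hypothetical random symmetric graph (not the paper's data): it evaluates dλ/dt with the formula above and verifies it against a finite difference along W(t) = W + t·W', and obtains dX/dt from the pseudoinverse, which by construction satisfies the defining equation (W − λD) X' = −(W' − λD' − λ'D) X.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

def ncut_pair(W):
    """Second-largest generalized eigenpair of W X = lambda D X, with X^T D X = 1."""
    d = W.sum(axis=1)
    # Equivalent symmetric problem: D^{-1/2} W D^{-1/2} v = lambda v, X = D^{-1/2} v.
    S = W / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    return vals[-2], vecs[:, -2] / np.sqrt(d)

W = rng.random((n, n))
W = W + W.T + n * np.eye(n)        # symmetric, positive degrees: W is in Omega
Wdot = rng.random((n, n))
Wdot = Wdot + Wdot.T               # path direction W'(0)

lam, X = ncut_pair(W)
D = np.diag(W.sum(axis=1))
Ddot = np.diag(Wdot.sum(axis=1))   # D' = diag(W' 1)

# Theorem: dlambda/dt = X^T (W' - lambda D') X / (X^T D X)
lam_dot = X @ (Wdot - lam * Ddot) @ X / (X @ D @ X)

# Theorem: dX/dt = -(W - lambda D)^+ (W' - lambda D' - lambda' D) X
r = (Wdot - lam * Ddot - lam_dot * D) @ X
X_dot = -np.linalg.pinv(W - lam * D) @ r

# Finite-difference check of dlambda/dt.
h = 1e-6
lam_h, _ = ncut_pair(W + h * Wdot)
```

Note that X^T r = 0 by the definition of dλ/dt, so r lies in the range of the singular matrix W − λD and the pseudoinverse solve is exact.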
Task 1: learning 2D point segmentation
100 points at (x_i, y_i); W_ij = exp(−σ_x |x_i − x_j|^2 − σ_y |y_i − y_j|^2); parameters Θ = (σ_x, σ_y).
Learning 2D point segmentation
Learn W_ij = exp(−σ_x (x_i − x_j)^2 − σ_y (y_i − y_j)^2) by minimizing E(σ_x, σ_y) = ||X(W(σ_x, σ_y)) − X*||^2. Figure: ground-truth segmentation X* and learned X(W).
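A runnable sketch of Task 1 on hypothetical data (not the paper's 100-point set): four tight point groups whose ground-truth split X* is along x only, so an x-sensitive affinity (large σ_x, small σ_y) recovers X* while a y-sensitive one does not. This shows why E(σ_x, σ_y) discriminates between parameter settings, without running the gradient descent itself.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)

# 20 hypothetical points in 4 tight groups; the ground truth splits on x only.
x = np.repeat([0.0, 0.0, 5.0, 5.0], 5) + 0.1 * rng.standard_normal(20)
y = np.repeat([0.0, 5.0, 0.0, 5.0], 5) + 0.1 * rng.standard_normal(20)
target = np.repeat([0, 0, 1, 1], 5)            # X*: the x-coordinate split

def ncut_labels(sx, sy):
    """Discretised 2nd generalized eigenvector of W X = lambda D X."""
    W = np.exp(-sx * (x[:, None] - x[None, :]) ** 2
               - sy * (y[:, None] - y[None, :]) ** 2)
    D = np.diag(W.sum(axis=1))
    _, vecs = eigh(W, D)
    X = vecs[:, -2]
    return (X > np.median(X)).astype(int)

def error(labels):
    """Segmentation error against the target, invariant to label swap."""
    e = np.mean(labels != target)
    return min(e, 1.0 - e)
```

With (σ_x, σ_y) = (1, 0.01) the partition matches the ground truth; with (0.01, 1) the eigenvector splits on y and the error is large, which is exactly the signal the learning exploits.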
Proposition (exponential convergence): under the PDE dW/dt = −∂E/∂W, either W converges to a global minimum W∞, with E(W(t)) → 0 exponentially, or W(t) escapes any compact K ⊂ Ω; the latter happens when 1) λ_2(t) → 1 or λ_2(t) − λ_3(t) → 0, or 2) D(i, i) → 0 for some i.
Is there a ground truth segmentation?
Graph nodes = { e_i : edge at (x_i, y_i) with angle α_i }
Task 2: learning rectangular shape
W(e_i, e_j) = f(x_i, y_i, α_i; x_j, y_j, α_j), with edge location (x_i, y_i) and edge orientation α_i; an edge pair is convex, concave, or impossible.
With 10 × 10 edge locations and 4 edge angles, the full pairwise table W has (n_x n_y n_α)^2 = 160,000 parameters.
Invariance through parameterization: W(e_i, e_j) = f(x_j − x_i, y_j − y_i, α_i, α_j), leaving 20 × 20 × 4 × 4 = 6,400 parameters.
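The parameter reduction can be sketched as a shared lookup table: one free weight per relative offset and angle pair, indexed for every node pair. The θ table below is a hypothetical random stand-in for learned parameters; the node enumeration and the offset shift are illustrative choices.

```python
import numpy as np

nx, ny, na = 10, 10, 4                 # 10 x 10 edge locations, 4 edge angles
n = nx * ny * na                       # 400 graph nodes, so n^2 = 160,000 pairs

# Translation-invariant table theta[dx, dy, a_i, a_j]:
# 20 * 20 * 4 * 4 = 6,400 free parameters shared across all pairs.
theta = np.random.default_rng(0).random((2 * nx, 2 * ny, na, na))

# Enumerate nodes x-major: node(x, y, a) has flat index x*ny*na + y*na + a.
xs, ys, aa = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(na), indexing="ij")
xs, ys, aa = xs.ravel(), ys.ravel(), aa.ravel()

# W(e_i, e_j) = f(x_j - x_i, y_j - y_i, alpha_i, alpha_j): index the shared table.
dx = xs[None, :] - xs[:, None] + nx    # shift offsets from [-9, 9] into the table
dy = ys[None, :] - ys[:, None] + ny
W = theta[dx, dy, aa[:, None], aa[None, :]]
W = 0.5 * (W + W.T)                    # symmetrize so W is a valid affinity
```

Any two node pairs with the same relative offset and angle pair now share one entry of θ, which is what makes gradient descent over 6,400 parameters tractable where 160,000 would overfit.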
Synthetic case
Training: input edges, NCut eigenvector after training, target. Testing: input edges, NCut eigenvector.
Testing on real images: learning with precise edges.
Comparison of the spectral graph learning schemes
Meila and Shi; Fowlkes, Martin and Malik: argmin_W ||D^{-1} W − D*^{-1} W(X*)||
Bach and Jordan: argmin_W J(W, X*)
Our method: argmin ||X(W) − X*||^2
Bach and Jordan (2003), Learning spectral clustering, use a differentiable approximation of the eigenvectors; our work uses the exact analytical form, with a computationally efficient solution.
Conclusion
Supervise the unsupervised spectral clustering. Exact and efficient learning of the graph weights, using derivatives of eigenvectors.