CS 664 Segmentation (2) Daniel Huttenlocher

Similar documents
CS 556: Computer Vision. Lecture 13

K-Means, Expectation Maximization and Segmentation. D.A. Forsyth, CS543

Spectral Clustering. Spectral Clustering? Two Moons Data. Spectral Clustering Algorithm: Bipartioning. Spectral methods

Spectral Clustering. Guokun Lai 2016/10

Introduction to Spectral Graph Theory and Graph Clustering

Grouping with Bias. Stella X. Yu 1,2 Jianbo Shi 1. Robotics Institute 1 Carnegie Mellon University Center for the Neural Basis of Cognition 2

Spectral Clustering. Zitao Liu

Machine Learning for Data Science (CS4786) Lecture 11

Spectral Clustering on Handwritten Digits Database

Self-Tuning Spectral Clustering

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Spectral clustering. Two ideal clusters, with two points each. Spectral clustering algorithms

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices)

Learning Spectral Graph Segmentation

CS 534: Computer Vision Segmentation III Statistical Nonparametric Methods for Segmentation

Supervised object recognition, unsupervised object recognition then Perceptual organization

Australian National University WORKSHOP ON SYSTEMS AND CONTROL

Spectral Clustering on Handwritten Digits Database Mid-Year Pr

Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University)

Iterative Methods for Solving A x = b

Section 4.4 Reduction to Symmetric Tridiagonal Form

STA141C: Big Data & High Performance Statistical Computing

Energy minimization via graph-cuts

Unsupervised Image Segmentation Using Comparative Reasoning and Random Walks

SOS Boosting of Image Denoising Algorithms

LINEAR SYSTEMS (11) Intensive Computation

Joint Optimization of Segmentation and Appearance Models

MLCC Clustering. Lorenzo Rosasco UNIGE-MIT-IIT

ECS231: Spectral Partitioning. Based on Berkeley s CS267 lecture on graph partition

Spectral Algorithms I. Slides based on Spectral Mesh Processing Siggraph 2010 course

Lecture 10: Dimension Reduction Techniques

Clustering compiled by Alvin Wan from Professor Benjamin Recht s lecture, Samaneh s discussion

Lecture 1: Systems of linear equations and their solutions

Scientific Computing WS 2018/2019. Lecture 9. Jürgen Fuhrmann Lecture 9 Slide 1

Announce Statistical Motivation Properties Spectral Theorem Other ODE Theory Spectral Embedding. Eigenproblems I

MATH 567: Mathematical Techniques in Data Science Clustering II

Energy Minimization via Graph Cuts

Lab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018

SPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS

CSE 291. Assignment Spectral clustering versus k-means. Out: Wed May 23 Due: Wed Jun 13

j=1 [We will show that the triangle inequality holds for each p-norm in Chapter 3 Section 6.] The 1-norm is A F = tr(a H A).

Optical flow. Subhransu Maji. CMPSCI 670: Computer Vision. October 20, 2016

Lecture 14: SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Lecturer: Sanjeev Arora

Data Analysis and Manifold Learning Lecture 7: Spectral Clustering

Lecture 7. Econ August 18

1 Inner Product and Orthogonality

Lecture 8 : Eigenvalues and Eigenvectors

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS

ORIE 6334 Spectral Graph Theory October 13, Lecture 15

Shared Segmentation of Natural Scenes. Dependent Pitman-Yor Processes

CS 231A Section 1: Linear Algebra & Probability Review. Kevin Tang

Quadratic forms. Here. Thus symmetric matrices are diagonalizable, and the diagonalization can be performed by means of an orthogonal matrix.

CS 231A Section 1: Linear Algebra & Probability Review

The Singular Value Decomposition

Diffuse interface methods on graphs: Data clustering and Gamma-limits

Statistical Machine Learning

Digital Image Processing Chapter 11 Representation and Description

Applied Mathematics 205. Unit II: Numerical Linear Algebra. Lecturer: Dr. David Knezevic

Markov Chains and Spectral Clustering

CSC Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

Majorizations for the Eigenvectors of Graph-Adjacency Matrices: A Tool for Complex Network Design

Section 3.9. Matrix Norm

The Eigenvalue Problem: Perturbation Theory

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5

Grouping with Directed Relationships

Diffusion/Inference geometries of data features, situational awareness and visualization. Ronald R Coifman Mathematics Yale University

Optical Flow, Motion Segmentation, Feature Tracking

COMPSCI 514: Algorithms for Data Science

Robust Motion Segmentation by Spectral Clustering

Linear Algebra Review

Announce Statistical Motivation ODE Theory Spectral Embedding Properties Spectral Theorem Other. Eigenproblems I

On Distributed Coordination of Mobile Agents with Changing Nearest Neighbors

An indicator for the number of clusters using a linear map to simplex structure

MAT 242 CHAPTER 4: SUBSPACES OF R n

Spectral Theory of Unsigned and Signed Graphs Applications to Graph Clustering. Some Slides

CS Finding Corner Points

CS 246 Review of Linear Algebra 01/17/19

CS6220: DATA MINING TECHNIQUES

REVIEW OF DIFFERENTIAL CALCULUS

Matrices and Vectors

j=1 u 1jv 1j. 1/ 2 Lemma 1. An orthogonal set of vectors must be linearly independent.

Spectral Graph Theory and You: Matrix Tree Theorem and Centrality Metrics

Additional Homework Problems

Distances and similarities Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining

Computational Methods. Eigenvalues and Singular Values

Spectral Clustering using Multilinear SVD

Systems of Algebraic Equations and Systems of Differential Equations

Notes for CS542G (Iterative Solvers for Linear Systems)

Lecture 13: Spectral Graph Theory

LECTURE NOTE #11 PROF. ALAN YUILLE

Chapter 6: Orthogonality

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

17.1 Directed Graphs, Undirected Graphs, Incidence Matrices, Adjacency Matrices, Weighted Graphs

Nonlinear Dimensionality Reduction

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A =

Conceptual Questions for Review

Econ Slides from Lecture 7

Transcription:

CS 664 Segmentation (2) Daniel Huttenlocher

Recap
- Last time: covered perceptual organization more broadly, then focused on pixel-wise segmentation
- Covered local graph-based methods such as MST and the Felzenszwalb-Huttenlocher method
Today
- Cut-based methods such as GrabCut and normalized cuts
- Iterative local update methods such as mean shift

Cut Based Techniques
- For edge costs, natural to consider minimum cost cuts: removing the edges of smallest total cost that separate the graph into two parts
- Graph has only finite-weight edges
- Manually assisted techniques: foreground vs. background
- General segmentation: recursively cut the resulting components; question of when to stop

Image Segmentation & Minimum Cut
[Diagram: image pixels, a pixel neighborhood, similarity measure w on the links, and the minimum cut]

Segmentation by Min (s-t) Cut [Boykov 01]
- Manually select a few foreground (fg) and background (bg) pixels
- Infinite-cost link from each bg pixel to the t node, and from each fg pixel to the s node
- Compute the min cut that separates s from t

GrabCut [Rother et al., SIGGRAPH 2004]

Automatic Cut-Based Segmentation
- Fully-connected graph: a node for every pixel, a link between every pair of pixels p, q
- Cost c_pq for each link measures similarity
[Diagram: nodes p and q joined by a link with cost c_pq]

Drawbacks of Minimum Cut
- Weight of the cut is proportional to the number of edges cut, giving a preference for small regions
- Motivation for Shi-Malik normalized cuts
[Figure: an ideal cut vs. cuts of lesser weight that isolate small regions]
* Slide from Khurram Hassan-Shafique, CAP5415 Computer Vision 2003

Normalized Cuts
- A number of normalization criteria have been proposed
- One that is commonly used [Shi & Malik]:
  Ncut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)
- Where cut(A,B) is the standard definition Σ_{i∈A, j∈B} w_ij
- And assoc(A,V) = Σ_{i∈A, j∈V} w_ij
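
As a concrete check of these definitions, here is a small sketch (not from the slides; it assumes a dense numpy affinity matrix and a boolean mask selecting A) that evaluates Ncut for a given bipartition:

import numpy as np

def ncut_value(W, in_A):
    """in_A: boolean mask marking the nodes of A; B is the complement."""
    in_B = ~in_A
    cut_AB = W[np.ix_(in_A, in_B)].sum()   # cut(A,B) = sum_{i in A, j in B} w_ij
    assoc_AV = W[in_A, :].sum()            # assoc(A,V) = sum_{i in A, j in V} w_ij
    assoc_BV = W[in_B, :].sum()
    return cut_AB / assoc_AV + cut_AB / assoc_BV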

Computing Normalized Cuts
- It has been shown that this is equivalent to an integer programming problem: minimize
  y^T (D - W) y / (y^T D y)
- Subject to the constraints y_i ∈ {1, -b} and y^T D 1 = 0
- Where 1 is the vector of all 1's, W is the affinity matrix, and D is the (diagonal) degree matrix with D(i,i) = Σ_j w_ij

Approximating Normalized Cuts
- The integer programming problem is NP-hard
- Instead simply solve the continuous (real-valued) relaxation
- This corresponds to finding the second smallest generalized eigenvector of (D - W) y_i = λ_i D y_i
- Widely used method; works well in practice
- Large eigenvector problem, but the matrices are sparse
- Images are often resolution-reduced, e.g., to 100x100
- But no longer clearly related to the cut problem
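
A minimal sketch of this relaxation for a small grayscale image, using dense linear algebra (a real implementation would use sparse eigensolvers). The Gaussian affinity on brightness and position, the sigma values, and the median threshold are illustrative assumptions, not the exact choices of Shi-Malik:

import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(img, sigma_b=0.1, sigma_x=4.0):
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = img.ravel().astype(float)
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)

    # Affinity: close in brightness AND close in the image plane
    db = (feats[:, None] - feats[None, :]) ** 2
    dx = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    W = np.exp(-db / sigma_b**2) * np.exp(-dx / sigma_x**2)

    D = np.diag(W.sum(axis=1))
    L = D - W

    # Generalized eigenproblem (D - W) y = lambda D y;
    # the eigenvector of the second smallest eigenvalue is the relaxed ncut indicator
    vals, vecs = eigh(L, D)
    y = vecs[:, 1]

    # Threshold (here at the median, an assumed choice) to get a discrete bipartition
    return (y > np.median(y)).reshape(h, w)

For real images the matrices are kept sparse by linking only nearby pixels, which is what makes the large eigenvector problem tractable.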

Normalized Cut Examples
[Example segmentation results]

Another Look [Weiss 99]
- Consider eigen-analysis of the affinity matrix W = [w_ij]
- Note W is symmetric; for images w_ij = w_ji
- W is also essentially block diagonal: with a suitable rearrangement of rows/cols so that vertices with higher affinity have nearer indices, entries far from the diagonal are small (though not exactly zero)
- Eigenvectors of W: recall that for a real, symmetric matrix they form an orthogonal basis, with axes of decreasing importance

Structure of W
- The eigenvectors of a block diagonal matrix consist of the eigenvectors of the blocks, padded with zeroes
- Note the rearrangement so that clusters lie near the diagonal is only conceptual: the eigenvectors of the permuted matrix are a permutation of the original eigenvectors
- Can think of the eigenvectors with large eigenvalues as being associated with high-affinity clusters (approximately the case)

Structure of W
- Consider the case of a point set where affinities are w_ij = exp(-(y_i - y_j)^2 / σ^2), with two clusters
- Points indexed to respect the clusters, for clarity
- Block-diagonal form of W: within-cluster affinities A, B for the two clusters, between-cluster affinity C
  W = [ A   C^T
        C   B  ]

First Eigenvector of W
- Recall the eigenvectors x_i satisfy W x_i = λ_i x_i; consider them ordered by eigenvalue λ_i
- The first eigenvector x_1 has the largest eigenvalue λ_1
- The elements of the first eigenvector serve as an index vector [Perona, Freeman]: the magnitude of the elements selects the elements of the highest-affinity cluster
[Figure: points in the plane, the matrix W, and the elements of x_1]
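
A small sketch of this index-vector idea on made-up 2D points (the cluster locations, sizes, σ, and the 0.5 threshold are arbitrary test values, not from the slides):

import numpy as np

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 0.3, (20, 2)),   # cluster 1
                 rng.normal(4.0, 0.3, (25, 2))])  # cluster 2

d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 1.0**2)                # affinity w_ij = exp(-(y_i - y_j)^2 / sigma^2)

vals, vecs = np.linalg.eigh(W)          # eigenvalues in ascending order
x1 = vecs[:, -1]                        # eigenvector with the largest eigenvalue

# Large-magnitude entries of x1 pick out the highest-affinity cluster;
# entries for the other cluster are near zero
in_cluster = np.abs(x1) > 0.5 * np.abs(x1).max()
print(in_cluster.astype(int))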

Clustering
- The first eigenvector of W has been suggested as a clustering or segmentation criterion: select the most significant segment, then recursively segment the remainder
- Problematic when the off-diagonal blocks are nonzero (clusters of similar affinity)
[Figure: points in the plane, the matrix W, and the elements of x_1]

Understanding Normalized Cuts
- An intractable discrete graph problem is used to motivate a continuous (real-valued) problem: find the second smallest generalized eigenvector of (D - W) x_i = λ_i D x_i, where D is the (diagonal) degree matrix with d_ii = Σ_j w_ij
- Can be viewed in terms of the first two eigenvectors of a normalized affinity matrix
- Let N = D^{-1/2} W D^{-1/2}; note n_ij = w_ij / √(d_ii d_jj), the affinity normalized by the degrees of the two nodes

Normalized Affinities
- It can be shown that if x is an eigenvector of N with eigenvalue λ, then D^{-1/2} x is a generalized eigenvector of W with eigenvalue 1 - λ
- The vector D^{1/2} 1 is an eigenvector of N with eigenvalue 1
- It follows that the second smallest generalized eigenvector of W is the (element-wise) ratio of the first two eigenvectors of N
- So ncut uses the normalized affinity matrix N and its first two eigenvectors, rather than the affinity matrix W and its first eigenvector
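
This relationship is easy to check numerically; the following sketch (with a random symmetric affinity matrix as made-up input) verifies that an eigenvector of N maps to a generalized eigenvector of (D - W, D):

import numpy as np

rng = np.random.default_rng(1)
A = rng.random((6, 6))
W = (A + A.T) / 2                      # symmetric affinities
np.fill_diagonal(W, 1.0)

d = W.sum(axis=1)
D = np.diag(d)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))

N = D_inv_sqrt @ W @ D_inv_sqrt
lam, X = np.linalg.eigh(N)             # eigenpairs of N, ascending

x = X[:, -2]                           # second-largest eigenvector of N
y = D_inv_sqrt @ x                     # claimed generalized eigenvector of (D - W, D)

lhs = (D - W) @ y
rhs = (1 - lam[-2]) * (D @ y)
print(np.allclose(lhs, rhs))           # True: the relationship holds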

Contrasting W and N
- Three simple point clustering examples
[Figure: for each example, W, the first eigenvector of W, and the ratio of the first two eigenvectors of N (the generalized eigenvector of W)]

Image Segmentation
- Considering W and N for segmentation
- Affinity is a negative exponential based on distance in (x, y, brightness) space
- Eigenvectors of N are more correlated with regions
[Figure: first 4 eigenvectors of W vs. first 4 eigenvectors of N]

Using More Eigenvectors
- Based on the k largest eigenvectors
- Construct a matrix Q such that (ideally) q_ij = 1 if i and j are in the same cluster, 0 otherwise
- Let V be the matrix whose columns are the first k eigenvectors of W
- Normalize the rows of V to have unit Euclidean norm; ideally each node (row) lies in one cluster (column)
- Let Q = V V^T: each entry is the inner product of two unit vectors
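
A minimal sketch of this construction (the two ideal blocks of ones used as test input are an assumption for illustration):

import numpy as np

def q_matrix(W, k):
    vals, vecs = np.linalg.eigh(W)        # ascending eigenvalues
    V = vecs[:, -k:]                      # first k eigenvectors (largest eigenvalues)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)  # unit-norm rows
    return V @ V.T                        # q_ij = <v_i, v_j>, ideally 1 or 0

# Two ideal clusters: within-cluster affinity 1, between-cluster affinity 0
W = np.block([[np.ones((3, 3)), np.zeros((3, 2))],
              [np.zeros((2, 3)), np.ones((2, 2))]])
print(np.round(q_matrix(W, k=2), 2))      # block of 1's per cluster, 0 elsewhere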

Normalization and k Eigenvectors
- Normalized affinities help correct for variations in the overall degree of affinity, so compute Q for N instead of W
- Contrasting a row of Q with the ratio of the first two eigenvectors of N (the ncut criterion): Q more clearly selects the most significant region
- Using k = 6 eigenvectors
[Figure: row of the Q matrix vs. ratio of eigenvectors of N, for two examples]

Spectral Methods
- Eigenvectors of affinity and normalized affinity matrices
- Widely used outside computer vision for graph-based clustering, e.g., the link structure of web pages or the citation structure of scientific papers
- Often directed rather than undirected graphs

Iterative Clustering Methods
- Techniques such as k-means, but for image segmentation we generally have no idea about the number of regions
- Mean shift: a nonparametric method

Finding Modes in a Histogram
- How many modes are there? Easy to see, less easy to compute

Mean Shift [Comaniciu & Meer]
Iterative mode search:
1. Initialize a random seed point and window W
2. Calculate the center of gravity (the "mean") of W, shift the window there, and repeat until convergence

Mean Shift
- Used both for segmentation and for edge-preserving filtering
- Operates on a collection of points X = {x_1, ..., x_n} in R^d
- Replaces each point with a value derived from the mean shift procedure
- Searches for a local density maximum by repeatedly shifting a d-dimensional hypersphere of fixed radius h
- Differs from most clustering methods, such as k-means, in that there is no fixed number of clusters

Mean Shift Procedure
- For a given point x ∈ X, let y_1, ..., y_T denote the successive locations of that point, with y_1 = x and
  y_{k+1} = (1/|S(y_k)|) Σ_{x ∈ S(y_k)} x
- Where S(y_k) is the subset of X contained in a hypersphere of radius h centered at y_k
- The radius h is a fixed parameter of the method
- For a point set X, the mean shift procedure is applied separately to every point
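
A minimal sketch of this flat-kernel procedure (the convergence tolerance and iteration cap are assumed parameters, not from the slides):

import numpy as np

def mean_shift_point(x, X, h, max_iter=100, tol=1e-5):
    """Run the mean shift procedure starting from point x over data X."""
    y = x.copy()
    for _ in range(max_iter):
        in_sphere = np.linalg.norm(X - y, axis=1) <= h   # S(y_k)
        y_next = X[in_sphere].mean(axis=0)               # mean of points in S(y_k)
        if np.linalg.norm(y_next - y) < tol:             # converged to a mode
            return y_next
        y = y_next
    return y

def mean_shift(X, h):
    """Apply the procedure separately to every point, as on the slide."""
    return np.array([mean_shift_point(x, X, h) for x in X])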

Mean Shift
- Initialize a window around each point
- Where it shifts to determines which region it is in
- Multiple points will shift to the same region

Mean Shift Image Filtering
- Map each image pixel to a point in (u, v, b) space: x_i = (u_i, v_i, b_i/σ)
- Analogous for color images, with three intensity values instead of one
- The scale factor σ normalizes intensity vs. the spatial dimensions
- Perform mean shift for each point; let y_i = (U_i, V_i, B_i) denote the mean-shifted value
- Assign the result z_i = (u_i, v_i, B_i): original spatial coordinates, mean-shifted intensity
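
A sketch of this filtering scheme for a grayscale image. The σ, h, tolerance, and iteration cap are illustrative assumptions, and the naive search over all points makes it quadratic in the number of pixels, so it is only for small test images:

import numpy as np

def mean_shift_filter(img, sigma=10.0, h=8.0, max_iter=50, tol=1e-3):
    hgt, wid = img.shape
    vs, us = np.mgrid[0:hgt, 0:wid]
    # x_i = (u_i, v_i, b_i / sigma): joint spatial-intensity points
    X = np.stack([us.ravel().astype(float), vs.ravel().astype(float),
                  img.ravel().astype(float) / sigma], axis=1)

    out = np.empty(len(X))
    for i in range(len(X)):
        y = X[i].copy()
        for _ in range(max_iter):                       # mean shift procedure
            S = X[np.linalg.norm(X - y, axis=1) <= h]   # points within radius h
            y_next = S.mean(axis=0)
            if np.linalg.norm(y_next - y) < tol:
                break
            y = y_next
        out[i] = y[2] * sigma       # keep original coords, mean-shifted intensity
    return out.reshape(hgt, wid)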

Mean Shift Example
[Example result images]

Mean Shift Example
[Example result images]

Edge Preserving Filtering
- Mean shift tends to preserve edges
- Edges are where intensity is changing rapidly
- Rapid changes in intensity produce lower-density regions of the joint spatial-intensity space
- Mean shift finds local density maxima, so shifted points tend not to cross edges

Mean Shift Clustering
- Run the mean shift procedure for each point
- Cluster the resulting convergence points that are closer than some small constant
- Assign each point the label of its cluster
- Analogous to filtering, but with the added step of merging clusters that are nearby in the joint spatial-intensity domain
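
A sketch of this merging step (the distance threshold eps is an assumed parameter; the modes would come from running the mean shift procedure sketched earlier on every point):

import numpy as np

def label_modes(modes, eps=1.0):
    labels = -np.ones(len(modes), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for c, center in enumerate(centers):
            if np.linalg.norm(m - center) < eps:   # close to an existing mode
                labels[i] = c
                break
        else:
            labels[i] = len(centers)               # start a new cluster
            centers.append(m)
    return labels

# modes = mean_shift(X, h)      # convergence point for every input point
# labels = label_modes(modes)   # each point gets the label of its cluster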

About Mean Shift
- Converges to a local density maximum, where "local" is determined by the sphere radius
- Consider a simple point set: over a wide range of sphere radii we end up with two clusters
- Relationship to MST-based clustering