Joint Optimization of Segmentation and Appearance Models

1 Joint Optimization of Segmentation and Appearance Models. David Mandle, Sameep Tandon (Stanford). April 29, 2013.

2 Overview: (1) Recap: Image Segmentation, (2) Optimization Strategy, (3) Experimental Results.

3 Recap: Image Segmentation Problem. Segment an image into foreground and background. Figure: left, input image; middle, segmentation by EM (GrabCut); right, segmentation by the method covered today.

4 Recap: Image Segmentation as Energy Optimization. Recall the grid-structured Markov Random Field: latent variables x_i ∈ {0, 1} corresponding to foreground/background; observations z_i, taken to be RGB pixel values; node and edge potentials Φ(x_i, z_i) and Ψ(x_i, x_j).

5 Recap: Image Segmentation as Energy Optimization. The graphical model encodes the (unnormalized) probability distribution P(x, z) ∝ ∏_i Φ(x_i, z_i) ∏_{(i,j)} Ψ(x_i, x_j).

6-9 Recap: Image Segmentation as Energy Optimization. Goal: find x to maximize P(x, z) (z is observed). Taking negative logs gives an energy to minimize:
E(x, z) = Σ_i φ(x_i, z_i) + Σ_{(i,j)} ψ(x_i, x_j).
The unary potential φ(x_i, z_i) encodes how likely a pixel or patch z_i is to take segmentation label x_i. The pairwise potential ψ(x_i, x_j) encodes neighborhood information about pixel/patch segmentation labels.

10-11 Recap: GrabCut Model. Unary potentials: negative log of a Gaussian Mixture Model. For tractability, each pixel i is additionally assigned to a single GMM component k_i:
φ(x_i, k_i, θ; z_i) = −log π(x_i, k_i) − log N(z_i; µ(k_i), Σ(k_i)).
Pairwise potentials:
ψ(x_i, x_j; z_i, z_j) = [x_i ≠ x_j] exp(−β⁻¹ ‖z_i − z_j‖²), where β = 2 · avg(‖z_i − z_j‖²).
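As a concrete illustration of the pairwise term, here is a minimal NumPy sketch (not from the talk; img and contrast_weights are illustrative names) that computes the contrast-sensitive weights exp(−β⁻¹‖z_i − z_j‖²) for horizontally adjacent pixels of an RGB image, with β = 2·avg(‖z_i − z_j‖²); vertical neighbours are handled the same way.

import numpy as np

def contrast_weights(img):
    """Contrast-sensitive weights w = exp(-||z_i - z_j||^2 / beta) for each
    pixel and its right-hand neighbour, with beta = 2 * avg(||z_i - z_j||^2)."""
    img = img.astype(np.float64)
    # Squared colour difference between each pixel and its right neighbour.
    diff2 = np.sum((img[:, 1:, :] - img[:, :-1, :]) ** 2, axis=2)
    beta = 2.0 * diff2.mean()        # beta = 2 * avg(||z_i - z_j||^2)
    return np.exp(-diff2 / beta)     # applied only where x_i != x_j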

12-16 Recap: GrabCut Optimization Strategy. GrabCut EM algorithm:
1. Initialize the mixture models.
2. Assign GMM components: k_i = arg min_{k_i} φ(x_i, k_i, θ; z_i).
3. Fit GMM parameters: θ = arg min_θ Σ_i φ(x_i, k_i, θ; z_i).
4. Perform segmentation using the reduction to min-cut: x = arg min_x E(x, z; k, θ).
5. Iterate from step 2 until converged.
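The loop above can be sketched compactly in Python. This is a simplified sketch rather than the authors' implementation: scikit-learn's GaussianMixture refits each colour model with its own soft EM instead of the hard component assignments of steps 2-3, and graph_cut is a hypothetical placeholder for the min-cut reduction of step 4 (pixels, seg, and all other names are illustrative).

import numpy as np
from sklearn.mixture import GaussianMixture

def grabcut_em(pixels, seg, graph_cut, n_components=5, n_iters=5):
    """GrabCut-style alternation: refit fg/bg colour models, then re-segment.

    pixels    : (N, 3) array of RGB values z_i
    seg       : (N,) initial binary labels x_i (e.g. from a user-drawn box)
    graph_cut : callable(unary_fg, unary_bg) -> new labels; placeholder for
                the min-cut step, which also uses the pairwise weights.
    """
    for _ in range(n_iters):
        # Steps 2-3: component assignment and parameter fitting (done here by
        # GaussianMixture's own EM, a simplification of the hard assignment).
        fg = GaussianMixture(n_components=n_components).fit(pixels[seg == 1])
        bg = GaussianMixture(n_components=n_components).fit(pixels[seg == 0])
        # Unary potentials phi = -log p(z_i | theta): lower means more likely.
        unary_fg = -fg.score_samples(pixels)
        unary_bg = -bg.score_samples(pixels)
        # Step 4: segmentation via min-cut over unary + pairwise terms.
        seg = graph_cut(unary_fg, unary_bg)
    return seg

The contrast weights from the previous sketch would be computed once and wired into graph_cut.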

17-22 New Model. Let's consider a simpler model; this will be useful soon.
Unary terms: histograms. K bins; b_i is the bin of pixel z_i. θ^0, θ^1 ∈ [0, 1]^K represent color models (distributions) over foreground/background:
φ(x_i, b_i, θ) = −log θ^{x_i}_{b_i}.
Pairwise potentials:
ψ(x_i, x_j) = w_ij |x_i − x_j|.
We will define w_ij later; for now, take the pairwise term equal to GrabCut's.
Total energy:
E(x, θ^0, θ^1) = −Σ_{p∈V} log P(z_p | θ^{x_p}) + Σ_{(p,q)∈N} w_pq |x_p − x_q|, where P(z_p | θ^{x_p}) = θ^{x_p}_{b_p}.
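The histogram unaries are simple enough to write out directly. A minimal sketch with illustrative names (not from the talk): bin counts give the normalised colour models θ^0, θ^1, and the unary term is a table lookup φ(x_i, b_i, θ) = −log θ^{x_i}_{b_i}.

import numpy as np

def fit_histograms(bins, seg, K, eps=1e-8):
    """Normalised colour histograms theta^0 (label 0) and theta^1 (label 1)
    from bin indices b_i in {0, ..., K-1} and binary labels x_i."""
    theta = np.zeros((2, K))
    for label in (0, 1):
        counts = np.bincount(bins[seg == label], minlength=K)
        theta[label] = (counts + eps) / (counts.sum() + K * eps)
    return theta

def histogram_unary(bins, theta):
    """phi(x_i = label, b_i, theta) = -log theta^label_{b_i}, for both labels;
    returns an array of shape (2, N)."""
    return -np.log(theta[:, bins])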

23-26 EM under the new model:
1. Initialize histograms θ^0, θ^1.
2. Fix θ. Perform segmentation using the reduction to min-cut: x = arg min_x E(x, θ^0, θ^1).
3. Fix x. Recompute θ^0, θ^1 (via standard parameter fitting).
4. Iterate from step 2 until converged.
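Putting the pieces together, a sketch of this alternation, reusing fit_histograms and histogram_unary from the sketch above and the same hypothetical graph_cut placeholder as before (label 1 is taken to be foreground, an assumption for illustration):

def histogram_em(bins, seg, graph_cut, K, n_iters=5):
    """Block-coordinate descent on E(x, theta^0, theta^1): alternate the
    min-cut step over x with histogram refitting over theta."""
    theta = fit_histograms(bins, seg, K)        # step 1: initial histograms
    for _ in range(n_iters):
        unary = histogram_unary(bins, theta)    # phi = -log theta^x_{b_i}
        seg = graph_cut(unary[1], unary[0])     # step 2: min-cut over x
        theta = fit_histograms(bins, seg, K)    # step 3: refit theta from x
    return seg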

27-29 Optimization. Goal: minimize the energy
E(x) = E_1(x) + E_2(x), with
E_1(x) = Σ_k h_k(n_k^1) + Σ_{(p,q)∈N} w_pq |x_p − x_q| and E_2(x) = h(n^1),
where n_k^1 = Σ_{p∈V_k} x_p counts the foreground pixels whose color falls in bin k (V_k = {p : b_p = k}) and n^1 = Σ_{p∈V} x_p counts all foreground pixels. The color models θ^0, θ^1 have been minimized out for each fixed x, which is what produces the functions h_k and h of these counts.
This is hard! But there are efficient strategies for optimizing E_1(x) and E_2(x) separately.
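The slides use h_k and h without spelling them out. Following the standard construction in the joint-optimization approach (a reconstruction, not something shown on the slide): for fixed x the optimal histograms are the empirical frequencies θ^1_k = n_k^1 / n^1 and θ^0_k = n_k^0 / n^0 (with n_k^0 and n^0 the corresponding background counts), and substituting them into the unary part of E(x, θ^0, θ^1) yields h_k(n_k^1) = −n_k^1 log n_k^1 − n_k^0 log n_k^0 (concave) and h(n^1) = n^1 log n^1 + n^0 log n^0 (convex). The short check below verifies this identity numerically on random data; all names are illustrative.

import numpy as np

def xlogx(t):
    """t * log(t) elementwise, with the convention 0 * log 0 = 0."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    mask = t > 0
    out[mask] = t[mask] * np.log(t[mask])
    return out

rng = np.random.default_rng(0)
N, K = 1000, 16
bins = rng.integers(0, K, size=N)     # b_p: colour bin of each pixel
seg = rng.integers(0, 2, size=N)      # x_p: an arbitrary segmentation

n1, n0 = seg.sum(), N - seg.sum()
nk1 = np.bincount(bins[seg == 1], minlength=K)   # n_k^1
nk0 = np.bincount(bins[seg == 0], minlength=K)   # n_k^0

# Unary energy with the maximum-likelihood histograms substituted in ...
theta = np.stack([nk0 / n0, nk1 / n1])           # theta^0, theta^1
direct = -np.log(theta[seg, bins]).sum()
# ... equals the h_k / h decomposition used on this slide.
h_k = -xlogx(nk1) - xlogx(nk0)                   # concave in n_k^1
h = xlogx(n1) + xlogx(n0)                        # convex in n^1
print(np.isclose(direct, h_k.sum() + h))         # True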

30-33 Optimization via Dual Decomposition. Consider an optimization of the form min_x f_1(x) + f_2(x), where optimizing f(x) = f_1(x) + f_2(x) jointly is hard, but min_x f_1(x) and min_x f_2(x) are easy problems. Dual decomposition idea: optimize f_1(x) and f_2(x) separately and combine the results in a principled way.

34-36 Optimization via Dual Decomposition. Original problem: min_x f_1(x) + f_2(x). Introduce local copies of the variables: min_{x_1, x_2} f_1(x_1) + f_2(x_2) s.t. x_1 = x_2. Equivalent problem: min_{x_1, x_2} f_1(x_1) + f_2(x_2) s.t. x_2 − x_1 = 0.

37-39 Optimization via Dual Decomposition. Primal problem: min_{x_1, x_2} f_1(x_1) + f_2(x_2) s.t. x_2 − x_1 = 0. Lagrangian dual:
g(y) = min_{x_1, x_2} f_1(x_1) + f_2(x_2) + yᵀ(x_2 − x_1).
The dual decomposes:
g(y) = (min_{x_1} f_1(x_1) − yᵀx_1) + (min_{x_2} f_2(x_2) + yᵀx_2).

40-43 Optimization via Dual Decomposition.
g(y) = (min_{x_1} f_1(x_1) − yᵀx_1) + (min_{x_2} f_2(x_2) + yᵀx_2) = g_1(y) + g_2(y).
For all y, g(y) is a lower bound on the optimum of the primal problem. Maximize g(y) w.r.t. y to get the tightest bound. Further, g(y) is concave in y, so many techniques for concave maximization apply (subgradient ascent, etc.). Ideally we have fast optimization strategies for g_1(y) and g_2(y).
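A toy numeric example of this machinery may help before returning to segmentation. In the sketch below (all names illustrative, not from the talk) both f_1 and f_2 are linear over x ∈ {0, 1}^3, so each subproblem is a per-coordinate threshold and the joint problem is easy too; the point is only the subgradient-ascent mechanics and the fact that every g(y) is a valid lower bound, which brute force confirms is tight on this instance.

import itertools
import numpy as np

def dual_decomposition(min_f1, min_f2, f1, f2, n, iters=200):
    """Subgradient ascent on the concave dual
    g(y) = min_x1 [f1(x1) - y.x1] + min_x2 [f2(x2) + y.x2]."""
    y = np.zeros(n)
    best = -np.inf
    for t in range(1, iters + 1):
        x1 = min_f1(-y)                    # easy subproblem: argmin f1(x) - y.x
        x2 = min_f2(+y)                    # easy subproblem: argmin f2(x) + y.x
        g = f1(x1) - y @ x1 + f2(x2) + y @ x2
        best = max(best, g)                # every g(y) lower-bounds the primal optimum
        y += (1.0 / t) * (x2 - x1)         # supergradient of g at y is (x2 - x1)
    return best

# Toy instance: linear f1, f2 over x in {0,1}^3, so each argmin is a threshold.
c1 = np.array([1.0, -2.0, 0.5])
c2 = np.array([-0.5, 1.0, -1.5])
lin_argmin = lambda c: lambda w: ((c + w) < 0).astype(float)  # argmin_x (c + w).x
bound = dual_decomposition(lin_argmin(c1), lin_argmin(c2),
                           lambda x: c1 @ x, lambda x: c2 @ x, n=3)
exact = min((c1 + c2) @ np.array(b) for b in itertools.product([0, 1], repeat=3))
print(round(bound, 3), round(exact, 3))    # both -2.0: the bound is tight here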

44-46 Optimization via Dual Decomposition. Back to our segmentation problem. The energy is
E(x) = Σ_k h_k(n_k^1) + Σ_{(p,q)∈N} w_pq |x_p − x_q| + h(n^1), where n_k^1 = Σ_{p∈V_k} x_p and n^1 = Σ_{p∈V} x_p.
Group terms as E(x) = E_1(x) + E_2(x), with E_1(x) = Σ_k h_k(n_k^1) + Σ_{(p,q)∈N} w_pq |x_p − x_q| and E_2(x) = h(n^1).

47-51 Optimization via Dual Decomposition. Dual decomposition on E(x) = E_1(x) + E_2(x):
Φ(y) = min_{x_1} (E_1(x_1) − yᵀx_1) + min_{x_2} (E_2(x_2) + yᵀx_2) = Φ_1(y) + Φ_2(y).
Maximize Φ(y) w.r.t. y to get the tightest lower bound. Φ_1(y) can be computed efficiently via reduction to min s-t cut; Φ_2(y) via convex minimization (several strategies). One could now apply any favorite concave-maximization strategy to Φ(y).

52-59 Optimization via Dual Decomposition. It turns out Φ(y) can be maximized over y in polynomial time using a parametric max-flow technique. Sketch:
- Theorem: given Φ_1(y) and Φ_2(y) as described, the optimal y has the form y = s·1.
- Implication: it suffices to maximize Φ(s·1) over the scalar s.
- Φ_1(s·1) is piecewise-linear and concave, with at most |V| breakpoints computed by parametric max-flow; parametric max-flow also returns between 2 and |V| + 1 candidate segmentations x, one per linear piece.
- Φ_2(s·1) is piecewise-linear and concave, with at most |V| breakpoints that can be enumerated from h(·).
- Hence Φ(s·1) is piecewise-linear and concave with at most 2|V| breakpoints, so finding its maximum is easy: it is attained at one of the breakpoints.
- Given the maximizing breakpoint, return the segmentation x with minimum energy from the set produced by parametric max-flow.
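The final maximization is mechanically simple once the breakpoints are known: Φ(s·1) is piecewise-linear and concave, so it suffices to evaluate it at the candidate breakpoints and keep the best. A minimal sketch (illustrative names; it assumes the breakpoints of Φ_1 and Φ_2 and a callable Φ are already supplied, e.g. by a parametric max-flow routine, which is not implemented here):

import numpy as np

def maximize_over_breakpoints(phi, breakpoints_1, breakpoints_2):
    """Phi(s) = Phi_1(s) + Phi_2(s) is piecewise-linear and concave, so its
    maximum over s is attained at one of the (at most 2|V|) breakpoints."""
    candidates = np.union1d(breakpoints_1, breakpoints_2)
    values = np.array([phi(s) for s in candidates])
    i = int(values.argmax())
    return candidates[i], values[i]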

60 References. MRF review slides: spring1213/slides/segmentation.p. Dual decomposition slides. Petter Strandmark and Fredrik Kahl, Parallel and Distributed Graph Cuts by Dual Decomposition.
