Pushmeet Kohli. Microsoft Research Cambridge. IbPRIA 2011
1 Pushmeet Kohli Microsoft Research Cambridge IbPRIA 2011
2 2:30 4:30 Labelling Problems Graphical Models Message Passing 4:30 5:00 - Coffee break 5:00 7:00 - Graph Cuts Move Making Algorithms Speed and Efficiency
3 Infer the values of hidden random variables Object Segmentation Pose Estimation Geometry Estimation Sky Tree Building Grass Orientation Joint angles
4
5 (n = number of pixels) z ∈ (R,G,B)^n, x ∈ {0,1}^n. n random variables: x_1, …, x_n. Joint probability: P(x_1, …, x_n). Conditional probability: P(x_1, …, x_n | z_1, …, z_n).
6 (n = number of pixels) z ∈ (R,G,B)^n, x ∈ {0,1}^n
7 (n = number of pixels) z ∈ (R,G,B)^n, x ∈ {0,1}^n. Posterior probability = Likelihood (data-dependent) × Prior (data-independent): P(x|z) = P(z|x) P(x) / P(z) ∝ P(z|x) P(x). MAP solution: x* = arg max_x P(x|z) = arg min_x E(x).
8 Posterior ∝ Likelihood × Prior: P(x|z) ∝ P(z|x) P(x), with the likelihood factorizing over pixels: P(z|x) = Π_i P(z_i|x_i).
9 (foreground/background colour models: red vs. green histograms) P(x|z) ∝ P(z|x) P(x), with the likelihood modelled by a Gaussian mixture: P(z|x) = F_GMM(z,x).
10 P(x|z) ∝ P(z|x) P(x). Log-likelihood ratio: log [P(z_i|x_i=0) / P(z_i|x_i=1)]. MAP solution using the likelihood alone: x* = arg max_x P(z|x) = arg max_x Π_i P(z_i|x_i).
11 Posterior ∝ Likelihood × Prior: P(x|z) ∝ P(z|x) P(x). The prior encourages consistency between the labellings of adjacent pixels: Π_{i,j} f(x_i,x_j).
12 P(x|z) ∝ P(z|x) P(x). MRF Ising prior: P(x) = Π_{i,j ∈ N} f_ij(x_i,x_j) = Π_{i,j ∈ N} exp{-|x_i - x_j|}.
13 Posterior probability: P(x|z) ∝ Π_i P(z_i|x_i) · Π_{i,j} P(x_i,x_j). Taking the negative log gives the energy: E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j,z_i,z_j).
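The MAP-as-energy-minimization view above can be sketched on a tiny 1-D "image" small enough for exhaustive search; the unary cost |x_i - z_i| and the sample values of z and w are illustrative choices, not from the slides:

```python
from itertools import product

def map_solution(z, w):
    """Brute-force MAP for a 1-D chain with Ising pairwise terms.
    E(x) = sum_i |x_i - z_i| + w * sum_i |x_i - x_{i+1}|."""
    n = len(z)
    best_x, best_e = None, float("inf")
    for x in product((0, 1), repeat=n):
        unary = sum(abs(xi - zi) for xi, zi in zip(x, z))
        pairwise = w * sum(abs(x[i] - x[i + 1]) for i in range(n - 1))
        e = unary + pairwise
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e

# Noisy observations: the middle pixel disagrees with its neighbours.
z = [0.9, 0.2, 0.8]
print(map_solution(z, w=0.0))  # w=0: MAP follows the unary terms -> (1, 0, 1)
print(map_solution(z, w=2.0))  # strong Ising prior smooths it   -> (1, 1, 1)
```

With w = 0 the labelling is pixel-independent; increasing w trades data fidelity for smoothness, exactly the likelihood/prior balance of the slides.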
14 E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j). Image → likelihood-only solution; MRF solution (Ising prior).
15 P(x|z) = Π_i P(z_i|x_i) · Π_{i,j} P(x_i,x_j)
16 P(x|z) = Π_i P(z_i|x_i) · Π_{i,j} P(x_i,x_j,z_i,z_j). Negative log: E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j,z_i,z_j). [Boykov and Jolly 01] [Blake et al. 04] [Rother, Kolmogorov and Blake 04]
17 E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j,z_i,z_j). The pairwise cost θ_ij decreases with the colour contrast |z_i - z_j|². [Boykov and Jolly 01] [Blake et al. 04] [Rother, Kolmogorov and Blake 04]
18 E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j,z_i,z_j). Pairwise cost; global minimum (x*). [Boykov and Jolly 01] [Blake et al. 04] [Rother, Kolmogorov and Blake 04]
19 Demo
20 Factor graphs and the order of a posterior distribution (or energy function)
21 Write probability distributions as a graphical model (Gibbs distribution): P(x) ∝ exp{-E(x)}, with E(x) = θ(x_1,x_2,x_3) + θ(x_2,x_4) + θ(x_3,x_4) + θ(x_3,x_5): 4 factors over the unobserved variables x_1, …, x_5; variables appearing in the same term belong to the same factor. References on factor graphs: Pattern Recognition and Machine Learning [Bishop 08, book, chapter 8]; several lectures at the Machine Learning Summer School 2009 (see video lectures).
22 Definition of order: the arity (number of variables) of the largest factor. E(x) = θ(x_1,x_2,x_3) + θ(x_2,x_4) + θ(x_3,x_4) + θ(x_3,x_5) has one factor of arity 3 and three of arity 2, so this factor graph has order 3.
23 4-connected pairwise MRF: E(x) = Σ_{i,j ∈ N4} θ_ij(x_i,x_j) (order 2). Higher(8)-connected pairwise MRF: E(x) = Σ_{i,j ∈ N8} θ_ij(x_i,x_j) (order 2). Higher-order RF: E(x) = Σ_{i,j ∈ N4} θ_ij(x_i,x_j) + θ(x_1,…,x_n) (order n). Pairwise energy vs. higher-order energy.
24 1. Image segmentation 2. Stereo 3. Semantic Segmentation
25 1. Image segmentation 2. Stereo 3. Semantic Segmentation
26 P(x|z) ∝ exp{-E(x)}, E(x) = Σ_i θ_i(x_i,z_i) + Σ_{i,j ∈ N4} θ_ij(x_i,x_j). Observed variable z_i; unobserved (latent) variable x_i. Factor graph.
27 Goal: detect and segment the object in a test image. Use a large set of example segmentations, indexed by an exemplar w = (template, position, scale, orientation): T(1), T(2), T(3), … E(x,w): {0,1}^n × {Exemplar} → R, E(x,w) = Σ_i |T(w)_i - x_i| + Σ_{i,j ∈ N4} θ_ij(x_i,x_j), i.e. a Hamming distance to the exemplar plus a pairwise term. [Kohli et al. IJCV 08, Lempitsky et al. ECCV 08]
28 UIUC dataset; 98.8% accuracy. [Kohli et al. IJCV 08, Lempitsky et al. ECCV 08]
29
30 Training Image Pairwise Energy P(x) Test Image Test Image (60% Noise) Minimized using st-mincut or max-product message passing Result
31 Higher Order Structure not Preserved Training Image Pairwise Energy P(x) Test Image Test Image (60% Noise) Minimized using st-mincut or max-product message passing Result
32 Minimize E(X) = P(X) + Σ_c h_c(X_c), where P(X) is the pairwise part and h_c: {0,1}^|c| → R is a higher-order function over a patch c (here |c| = 10×10 = 100) that assigns a cost to each possible labelling (patterns p_1, p_2, p_3). Exploit the structure of the function to transform it into a pairwise function. Rother and Kohli, MSR Tech Report [2010]
33 Training Image Learned Patterns Test Image Test Image (60% Noise) Pairwise Result Higher-Order Result Rother and Kohli, MSR Tech Report [2010]
34 1. Image segmentation 2. Stereo 3. Semantic Segmentation
35 d=0 … d=4. Image left (a), image right (b), ground-truth depth. Images are rectified; ignore occlusion for now. Energy E(d): {0,…,D-1}^n → R over labels d_i (depth/shift).
36 Energy: E(d): {0,…,D-1}^n → R, E(d) = Σ_i θ_i(d_i) + Σ_{i,j ∈ N4} θ_ij(d_i,d_j). Unary: θ_i(d_i) = |l_i - r_{i-d_i}| (SAD, sum of absolute differences; many others possible: NCC, …); left pixel i is matched to right pixel i - d_i. Pairwise: θ_ij(d_i,d_j) = g(|d_i - d_j|).
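The SAD unary cost can be sketched on synthetic rectified scanlines; the single-pixel matching window and the sample intensities are illustrative assumptions (a real implementation would aggregate SAD over a window and add the pairwise smoothing term):

```python
def sad_cost(left, right, i, d):
    """Unary stereo cost theta_i(d) = |l_i - r_{i-d}| for a single pixel."""
    return abs(left[i] - right[i - d])

def wta_disparity(left, right, max_d):
    """Winner-take-all: per pixel, pick the disparity with the lowest unary
    cost (no pairwise smoothing -- the 'No MRF' baseline of the slides)."""
    disp = []
    for i in range(max_d, len(left)):          # need i - d >= 0
        costs = [sad_cost(left, right, i, d) for d in range(max_d + 1)]
        disp.append(min(range(max_d + 1), key=costs.__getitem__))
    return disp

# Synthetic rectified scanlines: the left image is the right one shifted by 2.
right = [0, 10, 20, 30, 40, 50, 60, 70]
left = [0, 0] + right[:-2]                     # left[i] = right[i - 2]
print(wta_disparity(left, right, max_d=3))     # recovers disparity 2 everywhere
```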
37 θ_ij(d_i,d_j) = g(|d_i - d_j|): cost grows with |d_i - d_j|; no truncation (global minimum computable). [Olga Veksler PhD thesis, Daniel Cremers et al.]
38 θ_ij(d_i,d_j) = g(|d_i - d_j|) with truncation: discontinuity-preserving potentials [Blake & Zisserman 83, 87]. No truncation: global minimum; with truncation: NP-hard optimization. [Olga Veksler PhD thesis, Daniel Cremers et al.]
39 θ_ij(d_i,d_j) = g(|d_i - d_j|): the Potts model gives smooth, piecewise-constant disparities. [Olga Veksler PhD thesis]
40 Comparison: no MRF (pixel-independent, winner-take-all); no horizontal links (efficient, since the chains are independent); pairwise MRF [Boykov et al. 01]; ground truth.
41 1. Image segmentation 2. Stereo 3. Semantic/Object Segmentation
42 Assign a label to each pixel (or Voxel in 3D) Object Segmentation Organ Detection water boat tree chair road
43 (1) No relationships between pixels. (2) Pixel groups, no relationships between groups. (3) Pairwise relationships between pixel groups. (4) Pairwise relationships between pixels. (5) Higher-order relationships between pixels.
44 Image window W, pixel to be classified P. (P, W) → pixel classifier: boosting [Shotton et al, 2006] or random decision forests. The output is the cost for assigning the label "cow".
45 Image (MSRC-21) Pairwise CRF Higher order CRF Ground Truth Grass Sheep
46 [Unwrap Mosaic, Rav-Acha, Kohli, Fitzgibbon, Rother, Siggraph 08]
47
48 E(x) = Σ_i f_i(x_i) + Σ_ij g_ij(x_i,x_j) + Σ_c h_c(x_c): unary + pairwise + higher-order terms; x takes discrete values. How to minimize E(x)?
49 NP-Hard MAXCUT CSP Space of Problems
50 Which functions are exactly solvable? Approximate solutions of NP-hard problems Scalability and Efficiency
51 Which functions are exactly solvable? Boros Hammer [1965], Kolmogorov Zabih [ECCV 2002, PAMI 2004], Ishikawa [PAMI 2003], Schlesinger [EMMCVPR 2007], Kohli Kumar Torr [CVPR 2007, PAMI 2008], Ramalingam Kohli Alahari Torr [CVPR 2008], Kohli Ladicky Torr [CVPR 2008, IJCV 2009], Zivny Jeavons [CP 2008]. Approximate solutions of NP-hard problems: Schlesinger [1976], Kleinberg and Tardos [FOCS 99], Chekuri et al. [2001], Boykov et al. [PAMI 2001], Wainwright et al. [NIPS 2001], Werner [PAMI 2007], Komodakis [PAMI 2005], Lempitsky et al. [ICCV 2007], Kumar et al. [NIPS 2007], Kumar et al. [ICML 2008], Sontag and Jaakkola [NIPS 2007], Kohli et al. [ICML 2008], Kohli et al. [CVPR 2008, IJCV 2009], Rother et al. [2009]. Scalability and efficiency: Kohli Torr [ICCV 2005, PAMI 2007], Juan and Boykov [CVPR 2006], Alahari Kohli Torr [CVPR 2008], Delong and Boykov [CVPR 2008].
52 Message passing: dynamic programming (DP), belief propagation (BP), tree-reweighted (TRW), max-sum diffusion. Combinatorial algorithms: graph cut (GC), branch & bound. Relaxation methods: convex optimization (linear programming).
53 NP-Hard MAXCUT CSP Tree Structured Space of Problems
54
55 E(x) = Σ_i f_i(x_i) + Σ_ij g_ij(x_i,x_j). Iteratively pass messages between nodes p, q, … Message update rule? Variants: belief propagation (BP), tree-reweighted belief propagation (TRW). Max-product: minimizing an energy function (MAP estimation), x* = arg max P(x) = arg min E(x), with E(x) = -ln P(x) + K. Sum-product: computing marginal probabilities, P(x_i) = Σ_{x_{V∖{i}}} P(x).
56 Chain p–q–r with costs f(x_p) + g_pq(x_p,x_q), Potts model g_pq = 2·[x_p ≠ x_q]. Message for label L_1: M_{p→q}(L_1) = min_{x_p} [f(x_p) + g_pq(x_p,L_1)] = min(5+0, 1+2, 2+2) = 3. Full message: M_{p→q} = (M(L_1), M(L_2), M(L_3)) = (3,1,2).
57 Chain p–q–r with costs f(x_p) + g_pq(x_p,x_q), Potts model g_pq = 2·[x_p ≠ x_q].
58 At node q: M_{p→q} + f(x_q) + g_qr(x_q,x_r); at the root r: M_{q→r} + f(x_r) = min_x {E(x) | x_r = j} (the min-marginal). Select the minimum-cost label at the root, then trace back the path to get the minimum-cost labelling: global minimum in linear time.
59 Chain: leaf p – q – r root. Dynamic programming: global minimum in linear time. BP: inward pass (dynamic programming) + outward pass, which gives the min-marginals.
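The chain computation above can be sketched as min-sum dynamic programming; the numbers reproduce slide 56 (f_p = (5,1,2), Potts cost 2):

```python
def potts(i, j, lam=2.0):
    """Potts pairwise cost g_pq = lam * [x_p != x_q]."""
    return lam if i != j else 0.0

def message(f, g, incoming=None):
    """Min-sum message M_{p->q}(j) = min_i [incoming(i) + f(i) + g(i, j)]."""
    K = len(f)
    inc = incoming if incoming is not None else [0.0] * K
    return [min(inc[i] + f[i] + g(i, j) for i in range(K)) for j in range(K)]

def chain_min_energy(unaries, g):
    """Forward (inward) pass on a chain: after the last message, the minimum
    over the root's costs is the global minimum energy, in linear time."""
    m = None
    for f in unaries[:-1]:
        m = message(f, g, m)
    return min(mi + fi for mi, fi in zip(m, unaries[-1]))

print(message([5, 1, 2], potts))   # -> [3.0, 1.0, 2.0], as on slide 56
```

A backward pass storing the argmins would recover the full minimum-cost labelling, and an outward pass would give min-marginals for every node.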
60-68 (animation: messages passed along the chain p–q–r with unary potentials θ_p(x_p) and pairwise potentials θ_pq(x_p,x_q); the final frame shows the min-marginal for node q and label j)
69 On graphs with loops, pass messages using the same rules (loopy BP). Empirically this often works quite well, but it may not converge and yields only pseudo min-marginals. It gives a local minimum in the tree neighbourhood [Weiss & Freeman 01], [Wainwright et al. 04], under the assumptions that BP has converged and that there are no ties in the pseudo min-marginals.
70 Naïve implementation of a message update: O(K²) [K = number of labels]. This can often be improved to O(K), e.g. for Potts interactions, truncated linear, truncated quadratic, …
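For Potts interactions the O(K) speed-up follows because every label i ≠ j pays the same penalty λ, so M(j) = min(h(j), min_i h(i) + λ), with h(i) = incoming(i) + f(i). A sketch comparing it against the naïve O(K²) update:

```python
def potts_message_naive(h, lam):
    """O(K^2) update: M(j) = min_i [h(i) + lam * (i != j)]."""
    K = len(h)
    return [min(h[i] + (lam if i != j else 0.0) for i in range(K))
            for j in range(K)]

def potts_message_fast(h, lam):
    """O(K) update for Potts: every i != j pays the same lam, so
    M(j) = min(h(j), min_i h(i) + lam). Including i = j in the second term
    is harmless because h(j) <= h(j) + lam."""
    base = min(h) + lam
    return [min(hj, base) for hj in h]

h = [5.0, 1.0, 2.0]
print(potts_message_fast(h, 2.0))   # [3.0, 1.0, 2.0], same as the naive update
```

Truncated linear and truncated quadratic costs admit a similar O(K) trick via lower-envelope (distance-transform) computations.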
71 TRW uses slightly different messages: it tries to solve an LP relaxation of the MAP problem and provides a lower bound: Lower bound ≤ E(x*) ≤ E(x̂). [Kolmogorov, Wainwright et al.]
72 Message passing is related to dynamic programming; exact on trees (hyper-trees); connections to linear programming. Popular methods: BP, TRW-S, diffusion. Message schedules have a significant effect on the solution. For more details: [Kolmogorov 06, Wainwright et al. 05]
73 n = number of variables; segmentation energy. Space of problems: NP-hard in general (MAXCUT, CSP); tree-structured: linear time; pairwise submodular: O(n³); general submodular functions: O(n⁶).
74
75 A pseudo-boolean function f: {0,1}^n → R is submodular if f(A) + f(B) ≥ f(A∨B) + f(A∧B) (elementwise OR and AND) for all A, B ∈ {0,1}^n. Example: n = 2, A = [1,0], B = [0,1]: f([1,0]) + f([0,1]) ≥ f([1,1]) + f([0,0]). Property: a sum of submodular functions is submodular. The binary image segmentation energy is submodular: E(x) = Σ_i c_i x_i + Σ_{i,j} d_ij |x_i - x_j| with d_ij ≥ 0.
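The lattice inequality can be checked by brute force for small n; a sketch (the particular coefficients c and d are illustrative):

```python
from itertools import product

def meet(a, b):
    """Elementwise AND of two binary labellings."""
    return tuple(x & y for x, y in zip(a, b))

def join(a, b):
    """Elementwise OR of two binary labellings."""
    return tuple(x | y for x, y in zip(a, b))

def is_submodular(f, n):
    """Check f(A) + f(B) >= f(A or B) + f(A and B) over all pairs A, B."""
    pts = list(product((0, 1), repeat=n))
    return all(f(a) + f(b) >= f(join(a, b)) + f(meet(a, b)) - 1e-9
               for a in pts for b in pts)

def seg_energy(x, c=(1.0, -2.0, 0.5), d=2.0):
    """Binary segmentation energy: sum c_i x_i + d * sum |x_i - x_{i+1}|."""
    unary = sum(ci * xi for ci, xi in zip(c, x))
    return unary + d * sum(abs(x[i] - x[i + 1]) for i in range(len(x) - 1))

print(is_submodular(seg_energy, 3))                    # True
print(is_submodular(lambda x: -abs(x[0] - x[1]), 2))   # False: rewards disagreement
```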
76 Discrete Analogues of Concave Functions [Lovasz, 83] Widely applied in Operations Research Applications in Machine Learning MAP Inference in Markov Random Fields Clustering [Narasimhan, Jojic, & Bilmes, NIPS 2005] Structure Learning [Narasimhan & Bilmes, NIPS 2006] Maximizing the spread of influence through a social network [Kempe, Kleinberg & Tardos, KDD 2003]
77 Polynomial-time algorithms exist for general submodular minimization: the ellipsoid algorithm [Grötschel, Lovász & Schrijver 81]; the first strongly polynomial algorithms [Iwata et al. 00] [A. Schrijver 00]; current best O(n⁵ Q + n⁶), where Q is the function-evaluation time [Orlin 07]. Symmetric functions (E(x) = E(1-x)) can be minimized in O(n³). Pairwise submodular functions E(X) = Σ_i f_i(x_i) + Σ_ij g_ij(x_i,x_j) can be transformed to st-mincut/max-flow [Hammer, 1965], with very low empirical running time, ~O(n).
78 Graph (V, E, C): vertices V = {v_1, v_2, …, v_n}, edges E = {(v_1, v_2), …}, costs C = {c_(1,2), …}, plus two special vertices, the source and the sink.
79 What is an st-cut?
80 What is an st-cut? An st-cut (S,T) divides the nodes between source and sink. What is the cost of an st-cut? The sum of the costs of all edges going from S to T (in the example: 15).
81 What is an st-cut? An st-cut (S,T) divides the nodes between source and sink. What is the cost of an st-cut? The sum of the costs of all edges going from S to T (in the example: 8). What is the st-mincut? The st-cut with the minimum cost.
82 Construct a graph such that: 1. any st-cut corresponds to an assignment of x, and 2. the cost of the cut equals the energy of x, E(x). Then the st-mincut gives the minimum-energy solution. [Hammer, 1965] [Kolmogorov and Zabih, 2002]
83 E(x) = Σ_i θ_i(x_i) + Σ_{i,j} θ_ij(x_i,x_j) is (pairwise) submodular if for all ij: θ_ij(0,1) + θ_ij(1,0) ≥ θ_ij(0,0) + θ_ij(1,1). It is then equivalent (transformable) to E(x) = Σ_i c_i x_i + Σ_{i,j} c_ij x_i(1-x_j) with c_ij ≥ 0.
84 E(a 1,a 2 ) Source (0) a 1 a 2 Sink (1)
85 E(a 1,a 2 ) = 2a 1 Source (0) 2 a 1 a 2 Sink (1)
86 E(a 1,a 2 ) = 2a 1 + 5ā 1 Source (0) 2 a 1 a 2 5 Sink (1)
87 E(a 1,a 2 ) = 2a 1 + 5ā 1 + 9a 2 + 4ā 2 Source (0) 2 9 a 1 a Sink (1)
88 E(a 1,a 2 ) = 2a 1 + 5ā 1 + 9a 2 + 4ā 2 + 2a 1 ā 2 Source (0) 2 9 a 1 a Sink (1)
89 E(a 1,a 2 ) = 2a 1 + 5ā 1 + 9a 2 + 4ā 2 + 2a 1 ā 2 + ā 1 a Source (0) a 1 a Sink (1)
90 E(a 1,a 2 ) = 2a 1 + 5ā 1 + 9a 2 + 4ā 2 + 2a 1 ā 2 + ā 1 a Source (0) a 1 a Sink (1)
91 E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2. Cost of the cut shown = 11: a_1 = 1, a_2 = 1, E(1,1) = 11.
92 E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2. st-mincut cost = 8: a_1 = 1, a_2 = 0, E(1,0) = 8.
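The example energy from these slides can be checked by enumerating all four assignments, confirming that the st-mincut value 8 at (a_1, a_2) = (1, 0) is indeed the minimum:

```python
from itertools import product

def E(a1, a2):
    """Slides' example: E = 2a1 + 5~a1 + 9a2 + 4~a2 + 2 a1~a2 + ~a1 a2."""
    na1, na2 = 1 - a1, 1 - a2
    return 2*a1 + 5*na1 + 9*a2 + 4*na2 + 2*a1*na2 + na1*a2

table = {x: E(*x) for x in product((0, 1), repeat=2)}
print(table)              # {(0,0): 9, (0,1): 15, (1,0): 8, (1,1): 11}
best = min(table, key=table.get)
print(best, table[best])  # (1, 0) with energy 8 = st-mincut cost
```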
93 Solve the dual maximum-flow problem: compute the maximum flow between source and sink subject to edges: flow ≤ capacity; nodes: flow in = flow out. Min-cut/max-flow theorem: in every network, the maximum flow equals the cost of the st-mincut (assuming non-negative capacities).
94 Flow = 0 Source Augmenting Path Based Algorithms v 1 v Sink
95 Flow = 0 Source Augmenting Path Based Algorithms Find path from source to sink with positive capacity v 1 v Sink
96 Source v 1 v Flow = Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path Sink
97 Source v 1 v 2 3 Flow = Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path Sink
98 Source v 1 v 2 3 Flow = 2 2 Sink 4 Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path 3. Repeat until no path can be found
99 Source v 1 v 2 3 Flow = 2 2 Sink 4 Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path 3. Repeat until no path can be found
100 Source v 1 v 2 3 Flow = Sink 0 Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path 3. Repeat until no path can be found
101 Source v 1 v 2 3 Flow = 6 2 Sink 0 Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path 3. Repeat until no path can be found
102 Source v 1 v 2 3 Flow = 6 2 Sink 0 Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path 3. Repeat until no path can be found
103 Source v 1 v 2 1 Flow = Sink 0 Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path 3. Repeat until no path can be found
104 Flow = 8 Source Augmenting Path Based Algorithms Find path from source to sink with positive capacity v 1 v Sink 0 2. Push maximum possible flow through this path 3. Repeat until no path can be found
105 Source v 1 v 2 2 Flow = 8 3 Sink 0 Augmenting Path Based Algorithms 1. Find path from source to sink with positive capacity 2. Push maximum possible flow through this path 3. Repeat until no path can be found
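The augmenting-path procedure above can be sketched as Edmonds-Karp (BFS picks a shortest augmenting path). The graph below encodes the slides' energy E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2 using the standard construction (my edge orientation for the pairwise terms: a term c·x_i(1-x_j) becomes an edge j→i with capacity c), and the max-flow value matches the st-mincut cost of 8:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly push flow along a shortest augmenting path
    (found by BFS) until no s-t path with positive residual capacity remains."""
    n = len(cap)
    res = [row[:] for row in cap]              # residual capacities
    flow = 0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:                    # no augmenting path left
            return flow
        bottleneck, v = float("inf"), t        # bottleneck along the path
        while v != s:
            u = parent[v]
            bottleneck = min(bottleneck, res[u][v])
            v = u
        v = t
        while v != s:                          # push flow, update residuals
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck

# nodes: 0 = source, 1 = a1, 2 = a2, 3 = sink
cap = [[0, 2, 9, 0],    # s -> a1 (2a1), s -> a2 (9a2)
       [0, 0, 1, 5],    # a1 -> a2 (~a1 a2), a1 -> t (5~a1)
       [0, 2, 0, 4],    # a2 -> a1 (2 a1~a2), a2 -> t (4~a2)
       [0, 0, 0, 0]]
print(max_flow(cap, 0, 3))   # 8 = st-mincut cost = minimum energy
```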
106 E(a 1,a 2 ) = 2a 1 + 5ā 1 + 9a 2 + 4ā 2 + 2a 1 ā 2 + ā 1 a Source (0) a 1 a Sink (1)
107 E(a 1,a 2 ) = 2a 1 + 5ā 1 + 9a 2 + 4ā 2 + 2a 1 ā 2 + ā 1 a 2 Source (0) a 1 a a 1 + 5ā 1 = 2(a 1 +ā 1 ) + 3ā 1 = 2 + 3ā 1 Sink (1)
108 E(a 1,a 2 ) = 2 + 3ā 1 + 9a 2 + 4ā 2 + 2a 1 ā 2 + ā 1 a 2 Source (0) a 1 a a 1 + 5ā 1 = 2(a 1 +ā 1 ) + 3ā 1 = 2 + 3ā 1 Sink (1)
109 E(a 1,a 2 ) = 2 + 3ā 1 + 9a 2 + 4ā 2 + 2a 1 ā 2 + ā 1 a 2 Source (0) a 1 a a 2 + 4ā 2 = 4(a 2 +ā 2 ) + 5ā 2 = 4 + 5ā 2 Sink (1)
110 E(a 1,a 2 ) = 2 + 3ā 1 + 5a a 1 ā 2 + ā 1 a 2 Source (0) a 1 a a 2 + 4ā 2 = 4(a 2 +ā 2 ) + 5ā 2 = 4 + 5ā 2 Sink (1)
111 E(a 1,a 2 ) = 6 + 3ā 1 + 5a 2 + 2a 1 ā 2 + ā 1 a Source (0) a 1 a Sink (1)
112 E(a 1,a 2 ) = 6 + 3ā 1 + 5a 2 + 2a 1 ā 2 + ā 1 a Source (0) a 1 a Sink (1)
113 E(a_1,a_2) = 6 + 3ā_1 + 5a_2 + 2a_1ā_2 + ā_1a_2. Reparametrize: 3ā_1 + 5a_2 + 2a_1ā_2 = 2(ā_1 + a_2 + a_1ā_2) + ā_1 + 3a_2 = 2(1 + ā_1a_2) + ā_1 + 3a_2, using the identity F1 = ā_1 + a_2 + a_1ā_2 = 1 + ā_1a_2 = F2 (check the truth table over a_1, a_2).
114 E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2, obtained from 3ā_1 + 5a_2 + 2a_1ā_2 = 2(1 + ā_1a_2) + ā_1 + 3a_2 with F1 = ā_1 + a_2 + a_1ā_2 = 1 + ā_1a_2 = F2.
115 E(a 1,a 2 ) = 8 + ā 1 + 3a 2 + 3ā 1 a 2 Source (0) a 1 a No more augmenting paths possible Sink (1)
116 E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2. Residual graph (positive coefficients). The total flow is a lower bound on the energy of the optimal solution; a tight bound makes inference of the optimal solution trivial.
117 E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2. Residual graph (positive coefficients). The total flow is a lower bound on the energy of the optimal solution. st-mincut cost = 8: a_1 = 1, a_2 = 0, E(1,0) = 8. Tight bound → inference of the optimal solution becomes trivial.
118 Augmenting Path and Push-Relabel n: #nodes m: #edges U: maximum edge weight Algorithms assume nonnegative edge weights [Slide credit: Andrew Goldberg]
119 Augmenting Path and Push-Relabel n: #nodes m: #edges U: maximum edge weight Algorithms assume nonnegative edge weights [Slide credit: Andrew Goldberg]
120 Ford Fulkerson: Choose any augmenting path Source a 1 a Sink
121 Ford Fulkerson: Choose any augmenting path Source a 1 a 2 0 Good Augmenting Paths Sink
122 Ford Fulkerson: Choose any augmenting path Source a 1 a 2 0 Bad Augmenting Path Sink
123 Ford Fulkerson: Choose any augmenting path Source a 1 a Sink
124 Ford Fulkerson: Choose any augmenting path n: #nodes m: #edges Source a 1 a Sink We will have to perform 2000 augmentations! Worst case complexity: O (m x Total_Flow) (Pseudo-polynomial bound: depends on flow)
125 Dinic: Choose shortest augmenting path n: #nodes m: #edges Source a 1 a Sink Worst case Complexity: O (m n 2 )
126 Specialized algorithms for vision problems Grid graphs Low connectivity (m ~ O(n)) Dual search tree augmenting path algorithm [Boykov and Kolmogorov PAMI 2004] Finds approximate shortest augmenting paths efficiently High worst-case time complexity Empirically outperforms other algorithms on vision problems Efficient code available on the web
127 E(x) = c i x i + d ij x i -x j i i,j E: {0,1} n R 0 fg 1 bg n = number of pixels x* = arg min E(x) x How to minimize E(x)? Global Minimum (x*)
128 Graph *g; For all pixels p /* Add a node to the graph */ nodeid(p) = g->add_node(); Source (0) /* Set cost of terminal edges */ set_weights(nodeid(p), fgcost(p), bgcost(p)); end for all adjacent pixels p,q add_weights(nodeid(p), nodeid(q), cost(p,q)); end g->compute_maxflow(); Sink (1) label_p = g->is_connected_to_source(nodeid(p)); // is the label of pixel p (0 or 1)
129 Graph *g; For all pixels p end /* Add a node to the graph */ nodeid(p) = g->add_node(); /* Set cost of terminal edges */ set_weights(nodeid(p), fgcost(p), bgcost(p)); Source (0) bgcost(a 1 ) bgcost(a 2 ) a 1 a 2 for all adjacent pixels p,q add_weights(nodeid(p), nodeid(q), cost(p,q)); end g->compute_maxflow(); label_p = g->is_connected_to_source(nodeid(p)); // is the label of pixel p (0 or 1) fgcost(a 1 ) fgcost(a 2 ) Sink (1)
130 Graph *g; For all pixels p end /* Add a node to the graph */ nodeid(p) = g->add_node(); /* Set cost of terminal edges */ set_weights(nodeid(p), fgcost(p), bgcost(p)); Source (0) bgcost(a 1 ) bgcost(a 2 ) cost(p,q) a 1 a 2 for all adjacent pixels p,q add_weights(nodeid(p), nodeid(q), cost(p,q)); end g->compute_maxflow(); label_p = g->is_connected_to_source(nodeid(p)); // is the label of pixel p (0 or 1) fgcost(a 1 ) fgcost(a 2 ) Sink (1)
131 Graph *g; For all pixels p end /* Add a node to the graph */ nodeid(p) = g->add_node(); /* Set cost of terminal edges */ set_weights(nodeid(p), fgcost(p), bgcost(p)); Source (0) bgcost(a 1 ) bgcost(a 2 ) cost(p,q) a 1 a 2 for all adjacent pixels p,q add_weights(nodeid(p), nodeid(q), cost(p,q)); end g->compute_maxflow(); fgcost(a 1 ) fgcost(a 2 ) Sink (1) label_p = g->is_connected_to_source(nodeid(p)); // is the label of pixel p (0 or 1) a 1 = bg a 2 = fg
132
133 Mixed (Real-Integer) Problems Higher Order Energy Functions Multi-label Problems Ordered Labels Stereo (depth labels) Unordered Labels Object segmentation ( car, `road, `person )
134 Mixed (Real-Integer) Problems Higher Order Energy Functions Multi-label Problems Ordered Labels Stereo (depth labels) Unordered Labels Object segmentation ( car, `road, `person )
135 We have seen several of them in the intro. x: binary image segmentation (x_i ∈ {0,1}); ω: non-local parameter (lives in some large set Ω), e.g. (template, position, scale, orientation). E(x,ω) = C(ω) + Σ_i θ_i(ω,x_i) + Σ_{i,j} θ_ij(ω,x_i,x_j): constant + unary potentials + pairwise potentials.
136 x: binary image segmentation (x_i ∈ {0,1}); ω: non-local parameter (lives in some large set Ω). E(x,ω) = C(ω) + Σ_i θ_i(ω,x_i) + Σ_{i,j} θ_ij(ω,x_i,x_j). {x*,ω*} = arg min_{x,ω} E(x,ω): a standard graph-cut energy if ω is fixed. [Kohli et al, 06,08] [Lempitsky et al, 08]
137 Local Method: Gradient Descent over ω ω * = arg min min E (x,ω) x ω Submodular ω* [Kohli et al, 06,08]
138 Local Method: Gradient Descent over ω ω * = arg min min E (x,ω) x ω Submodular E (x,ω 1 ) E (x,ω 2 ) Similar Energy Functions Dynamic Graph Cuts time speedup! [Kohli et al, 06,08]
139 Global Method: Branch and Mincut Produces the global optimal solution Exhaustively explores all ω in Ω in the worst case 30,000,000 shapes [Lempitsky et al, 08]
140 Mixed (Real-Integer) Problems Higher Order Energy Functions Multi-label Problems Ordered Labels Stereo (depth labels) Unordered Labels Object segmentation ( car, `road, `person )
141 n = number of pixels E: {0,1} n R 0 fg, 1 bg E(X) = c i x i + d ij x i -x j i i,j Image Unary Cost Segmentation [Boykov and Jolly 01] [Blake et al. 04] [Rother, Kolmogorov and Blake `04]
142 [Kohli et al. 07] Patch dictionary (tree). Higher-order potential over a patch p: h(x_p) = C_1 if x_i = 0 for all i ∈ p, and C_max otherwise.
143 n = number of pixels, E: {0,1}^n → R, 0 = fg, 1 = bg. E(X) = Σ_i c_i x_i + Σ_{i,j} d_ij |x_i - x_j| + Σ_p h_p(X_p), with h(x_p) = C_1 if x_i = 0 for all i ∈ p, and C_max otherwise. [Kohli et al. 07]
144 n = number of pixels, E: {0,1}^n → R, 0 = fg, 1 = bg. E(X) = Σ_i c_i x_i + Σ_{i,j} d_ij |x_i - x_j| + Σ_p h_p(X_p). Image, pairwise segmentation, final segmentation. [Kohli et al. 07]
145 Higher Order Submodular Functions Exact Transformation? Pairwise Submodular Function Billionnet and M. Minoux [DAM 1985] Kolmogorov & Zabih [PAMI 2004] Freedman & Drineas [CVPR2005] Kohli Kumar Torr [CVPR2007, PAMI 2008] Kohli Ladicky Torr [CVPR 2008, IJCV 2009] Ramalingam Kohli Alahari Torr [CVPR 2008] Zivny et al. [CP 2008] S T st-mincut
146 Higher Order Function Pairwise Submodular Function Identified transformable families of higher order function s.t. 1. Constant or polynomial number of auxiliary variables (a) added 2. All pairwise functions (g) are submodular
147 Example: H (X) = F ( x i ) concave H (X) 0 x i [Kohli et al. 08]
148 Simple example using auxiliary variables: f(x) = 0 if all x_i = 0, and C_1 otherwise, for x ∈ L = {0,1}^n (a higher-order submodular function). Then min_x f(x) = min_{x, a ∈ {0,1}} C_1·a + C_1·ā·Σ_i x_i (a quadratic submodular function): if Σ_i x_i < 1 then a = 0 (ā = 1) gives f(x) = 0; if Σ_i x_i ≥ 1 then a = 1 gives f(x) = C_1.
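The auxiliary-variable construction can be verified exhaustively for small n; a sketch (C_1 = 4 is an arbitrary choice):

```python
from itertools import product

C1 = 4.0

def f(x):
    """Higher-order potential: 0 if all x_i = 0, C1 otherwise."""
    return 0.0 if not any(x) else C1

def f_pairwise(x):
    """Transformed form: min over the auxiliary variable a of
    C1*a + C1*(1-a)*sum(x_i). Each term involves a and at most one x_i,
    so the minimization is over a quadratic (pairwise) function."""
    s = sum(x)
    return min(C1 * a + C1 * (1 - a) * s for a in (0, 1))

assert all(f(x) == f_pairwise(x) for x in product((0, 1), repeat=4))
print("transformation exact on all 2^4 labellings")
```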
149 min_x f(x) = min_{x,a} C_1·a + C_1·ā·Σ_i x_i: plotted against Σ_i x_i, the function is the lower envelope of the two linear functions C_1 and C_1·Σ_i x_i. [Kohli et al. 08]
150 a = 0 selects C_1·Σ_i x_i and a = 1 selects C_1; the lower envelope of concave (here linear) functions is concave. [Kohli et al. 08]
151 General case: min_x f(x) = min_{x, a ∈ {0,1}} f_1(x)·a + f_2(x)·ā (a quadratic submodular function); the lower envelope of concave functions is concave. [Kohli et al. 08]
152 a = 0 selects f_2(x), a = 1 selects f_1(x); the lower envelope of concave functions is concave. [Kohli et al. 08]
153 min_x max_{a ∈ {0,1}} f_1(x)·a + f_2(x)·ā: the upper envelope of linear functions is convex, which makes this a very hard problem! [Kohli and Kumar, CVPR 2010]
154 Silhouette constraints in 3D reconstruction (rays must touch the silhouette at least once) [Sinha et al. 05, Cremers et al. 08]; size/volume priors in binary segmentation (a prior on the size of the object, say n/2) [Woodford et al. 2009]. [Kohli and Kumar, CVPR 2010]
155 Mixed (Real-Integer) Problems Higher Order Energy Functions Multi-label Problems Ordered Labels Stereo (depth labels) Unordered Labels Object segmentation ( car, `road, `person )
156 min_y E(y) = Σ_i f_i(y_i) + Σ_ij g_ij(y_i,y_j), y_i ∈ L = {l_1, l_2, …, l_k}. Two approaches: exact transformation to a quadratic pseudo-boolean function (QPBF) [Roy and Cox 98] [Ishikawa 03] [Schlesinger & Flach 06] [Ramalingam, Alahari, Kohli, and Torr 08], or move-making algorithms.
157 So what is the problem? Encode the multi-label problem E_m(y_1,…,y_n), y_i ∈ L = {l_1,…,l_k}, as a binary-label problem E_b(x_1,…,x_m), x_i ∈ {0,1}, such that, with Y and X the sets of feasible solutions: 1. there is a one-one encoding function T: X → Y; 2. arg min E_m(y) = T(arg min E_b(x)).
158 Popular encoding scheme [Roy and Cox 98, Ishikawa 03, Schlesinger & Flach 06] # Nodes = n * k # Edges = m * k 2
159 Popular encoding scheme [Roy and Cox 98, Ishikawa 03, Schlesinger & Flach 06]; #nodes = n·k, #edges = m·k². Ishikawa's result: exact for E(y) = Σ_i θ_i(y_i) + Σ_ij θ_ij(y_i,y_j), y ∈ L = {l_1,…,l_k}, when θ_ij(y_i,y_j) = g(|y_i - y_j|) with g a convex function.
160 Popular encoding scheme [Roy and Cox 98, Ishikawa 03, Schlesinger & Flach 06]; #nodes = n·k, #edges = m·k². Schlesinger & Flach 06: exact whenever θ_ij(l_{i+1},l_j) + θ_ij(l_i,l_{j+1}) ≥ θ_ij(l_i,l_j) + θ_ij(l_{i+1},l_{j+1}).
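The Schlesinger & Flach condition is easy to test numerically; a sketch showing that a convex pairwise cost satisfies it while a truncated linear cost does not (the label count and truncation threshold are illustrative):

```python
def submodular_multilabel(theta, k):
    """Schlesinger & Flach condition over labels {0, ..., k-1}:
    theta(i+1, j) + theta(i, j+1) >= theta(i, j) + theta(i+1, j+1)."""
    return all(theta(i + 1, j) + theta(i, j + 1)
               >= theta(i, j) + theta(i + 1, j + 1)
               for i in range(k - 1) for j in range(k - 1))

k, T = 6, 2
convex = lambda i, j: (i - j) ** 2              # convex g(|yi - yj|)
truncated = lambda i, j: min(abs(i - j), T)     # truncated linear (robust)

print(submodular_multilabel(convex, k))     # True: exact construction exists
print(submodular_multilabel(truncated, k))  # False: truncation breaks it
```

For θ(i,j) = g(i-j) the condition reduces to g(d+1) + g(d-1) ≥ 2g(d), i.e. convexity of g, which is why truncation (and hence robust, discontinuity-preserving costs) falls outside the exactly solvable class.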
161 Image MAP Solution Scanline algorithm [Roy and Cox, 98]
162 Applicability: cannot handle truncated costs θ_ij(y_i,y_j) = g(|y_i - y_j|) with truncation (so no robust, discontinuity-preserving potentials [Blake & Zisserman 83, 87]). Computational cost: very high, since the problem size = variables × labels; grey-level image denoising on a 1 MPixel image needs ~2.5 × 10⁸ graph nodes.
163 Other less-known algorithms (T(a,b) = complexity of max-flow with a nodes and b edges): Ishikawa transformation [03]: arbitrary unary, convex and symmetric pairwise, T(nk, mk²); Schlesinger transformation [06]: arbitrary unary, submodular pairwise, T(nk, mk²); Hochbaum [01]: linear unary, convex and symmetric pairwise, T(n, m) + n log k; Hochbaum [01]: convex unary, convex and symmetric pairwise, O(mn log n log nk).
164 Min y E(y) = f i (y i ) + g ij (y i,y j ) i i,j y ϵ Labels L = {l 1, l 2,, l k } Exact Transformation to QPBF Move making algorithms [Boykov, Veksler and Zabih 2001] [Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008] [Veksler, 2008] [Kohli, Ladicky, Torr 2008]
165 Energy Solution Space
166 Energy Current Solution Search Neighbourhood Optimal Move Solution Space
167 Energy landscape: current solution, search neighbourhood, optimal move x^c(t) within the move space. Key property: a bigger move space gives better solutions, but finding the optimal move becomes harder.
168 Minimizing pairwise functions [Boykov, Veksler and Zabih, PAMI 2001]: a series of locally optimal moves, each of which reduces the energy; the optimal move is found by minimizing a submodular (binary) function. Move space (t): 2^n; space of solutions (x): L^n, where n = number of variables and L = number of labels. Extended to minimize higher-order functions [Kohli et al. 07, 08, 09].
169 New solution x = t·x_1 + (1-t)·x_2 from the current solution x_1 and a second solution x_2. Move energy: E_m(t) = E(t·x_1 + (1-t)·x_2); minimize over the binary move variables t. For certain x_1 and x_2, the move energy is a submodular QPBF. [Boykov, Veksler and Zabih 2001]
170 Swap move: variables labelled α, β can swap their labels. [Boykov, Veksler and Zabih 2001]
171 Variables labelled α, β can swap their labels. Example: swap "Sky" and "House" (other labels: Tree, Ground). [Boykov, Veksler and Zabih 2001]
172 The swap-move energy is submodular if: unary potentials are arbitrary; pairwise potentials form a semi-metric: θ_ij(l_a,l_b) ≥ 0 and θ_ij(l_a,l_b) = 0 ⇔ a = b. Examples: Potts model, truncated convex. [Boykov, Veksler and Zabih 2001]
173 Expansion move: variables either take the label α or retain their current label. [Boykov, Veksler and Zabih 2001]
174 Variables take the label α or retain their current label. Status: initialize with Ground; expand Sky, House and Tree in turn. [Boykov, Veksler and Zabih 2001]
175 The expansion-move energy is submodular if: unary potentials are arbitrary; pairwise potentials form a metric (semi-metric + triangle inequality): θ_ij(l_a,l_b) + θ_ij(l_b,l_c) ≥ θ_ij(l_a,l_c). Examples: Potts model, truncated linear (but not truncated quadratic). [Boykov, Veksler and Zabih 2001]
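A sketch of α-expansion on a toy Potts problem; for illustration the optimal binary move is found by enumeration rather than by the graph cut used in practice, and the unary costs are made up:

```python
from itertools import product

def energy(x, unary, lam):
    """Potts chain energy: sum_i unary[i][x_i] + lam * sum [x_i != x_{i+1}]."""
    e = sum(unary[i][xi] for i, xi in enumerate(x))
    return e + lam * sum(x[i] != x[i + 1] for i in range(len(x) - 1))

def alpha_expansion(unary, lam, labels, sweeps=3):
    """Each move lets every variable either keep its label or switch to
    alpha. The optimal binary move (over t in {0,1}^n) is found here by
    enumeration; every accepted move strictly decreases the energy."""
    n = len(unary)
    x = [0] * n
    for _ in range(sweeps):
        for alpha in labels:
            best, ebest = x, energy(x, unary, lam)
            for t in product((0, 1), repeat=n):      # t_i = 1: take alpha
                y = [alpha if ti else xi for ti, xi in zip(t, x)]
                e = energy(y, unary, lam)
                if e < ebest:
                    best, ebest = y, e
            x = best
    return x

unary = [[0, 5, 5], [5, 0, 5], [5, 1, 5], [5, 5, 0]]
x = alpha_expansion(unary, lam=2, labels=range(3))
print(x, energy(x, unary, 2))   # [0, 1, 1, 2] with energy 5
```

On this instance the moves reach the global minimum; in general α-expansion only guarantees a bound relative to the optimum (next slide).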
176 Expansion and swap can be derived as a primal-dual scheme: the solution of the dual problem gives a lower bound on the energy of the solution. Weak guarantee: E(x̂) ≤ 2(d_max/d_min)·E(x*), where d_max and d_min are the largest and smallest nonzero values of θ_ij(l_i,l_j) = g(|l_i - l_j|). [Komodakis et al 05, 07]
177 x = t·x_1 + (1-t)·x_2: new solution from a first and a second solution; minimize over the move variables t. Move types: expansion (first solution = old solution, second solution = all-α; guarantee for metrics); fusion (any solution, any solution; no guarantee). Fusion-move functions can be non-submodular!
178 x = t·x_1 + (1-t)·x_2, where x_1, x_2 can be continuous. Optical-flow example: fuse the solution x_1 from method 1 with the solution x_2 from method 2 to obtain the final solution. [Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008]
179 Range moves: the move variables can be multi-label, x = (t==1)·x_1 + (t==2)·x_2 + … + (t==k)·x_k. The optimal move is found using the Ishikawa transform. Useful for minimizing energies with truncated convex pairwise potentials θ_ij(y_i,y_j) = min(|y_i - y_j|², T). [Kumar and Torr, 2008] [Veksler, 2008]
180 Image Noisy Image Expansion Move Range Moves [Veksler, 2008]
181
182 3,600,000,000 Pixels Created from about MegaPixel Images [Kopf et al. (MSR Redmond) SIGGRAPH 2007 ]
183 [Kopf et al. (MSR Redmond) SIGGRAPH 2007 ]
184 Processing Videos 1 minute video of 1M pixel resolution 3.6 B pixels 3D reconstruction [500 x 500 x 500 =.125B voxels]
185 Segment the first frame; then segment the second frame. Can we do better? [Kohli & Torr, ICCV05, PAMI07]
186 Image, flow, segmentation. [Kohli & Torr, ICCV05, PAMI07]
187 Frame 1: minimize E_A to get S_A. Reuse computation: the differences between A and B yield a simpler, fast energy E_B*. Frame 2: minimize E_B to get S_B. Reparametrization gives a large speedup. [Kohli & Torr, ICCV05, PAMI07] [Komodakis & Paragios, CVPR07]
188 Original energy: E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2. Reparametrized energy: E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2. New energy: E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 7a_1ā_2 + ā_1a_2. New reparametrized energy: E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2 + 5a_1ā_2. [Kohli & Torr, ICCV05, PAMI07] [Komodakis & Paragios, CVPR07]
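The reparametrization claim is checkable by enumeration: the original and reparametrized energies from slide 188 agree on every assignment, which is exactly what makes flow-based reuse across frames sound:

```python
from itertools import product

def E_orig(a1, a2):
    """Original energy: 2a1 + 5~a1 + 9a2 + 4~a2 + 2 a1~a2 + ~a1 a2."""
    na1, na2 = 1 - a1, 1 - a2
    return 2*a1 + 5*na1 + 9*a2 + 4*na2 + 2*a1*na2 + na1*a2

def E_reparam(a1, a2):
    """Reparametrized form: 8 + ~a1 + 3a2 + 3~a1 a2."""
    na1 = 1 - a1
    return 8 + na1 + 3*a2 + 3*na1*a2

# A reparametrization changes the coefficients but not the energy of any
# labelling, so both forms agree everywhere:
assert all(E_orig(*x) == E_reparam(*x) for x in product((0, 1), repeat=2))
print([E_orig(*x) for x in product((0, 1), repeat=2)])   # [9, 15, 8, 11]
```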
189 Original Problem (Large) Approximation algorithm (Slow) Approximate Solution Fast partially optimal algorithm [ Alahari Kohli & Torr CVPR 08]
190 Original Problem (Large) Approximation algorithm (Slow) Approximate Solution Fast partially optimal algorithm [Kovtun 03] Fast partially optimal algorithm [ Kohli et al. 09] Reduced Problem Solved Problem (Global Optima) Approximation algorithm (Fast) Approximate Solution [ Alahari Kohli & Torr CVPR 08]
191 3-100 times speedup. Pipeline: original problem (large) → fast partially optimal algorithms [Kovtun 03] [Kohli et al. 09] → reduced problem plus a solved part (global optima) → approximation algorithm (fast) → approximate solution. Example labelling (sky, building, airplane, grass, tree): tree-reweighted message passing takes 9.89 sec on the original problem vs. 0.30 sec total with the reduction. [Alahari Kohli & Torr CVPR 08]
192 Message passing algorithms: Belief Propagation, TRW. Exact on tree-structured graphs; connections to linear programming relaxations. Graph cuts and submodular functions. Move-making algorithms: expansion, swap, fusion. Minimizing higher-order functions.
193 Going beyond small connectivity. Representation, inference and learning of higher-order models. Topology constraints: connectivity of objects. Parallelization: GPUs. Linear programming formulations.
194
195 Past presenters: Pawan, Carsten and Andrew Blake. Special thanks to the IbPRIA team, in particular: Joost Van de Weijer, Jordi Gonzàlez, Mario Hernández Tejera.
196
197
198 Set of label assignments and associated costs p_1, p_2, p_3. Introduce a pattern variable z with (t+1) states to get a pairwise function: the unary term assigns the cost of the selected label assignment z; the pairwise terms ensure x is consistent with the pattern z.
199 E(x) = Σ_i θ_i(x_i, z_i) + Σ_{i,j Є N_4} θ_ij(x_i,x_j,z_i,z_j), with θ_ij(x_i,x_j,z_i,z_j) = |x_i - x_j| exp{-ß ||z_i - z_j||} and ß = 2(Mean(||z_i - z_j||²))⁻¹. Conditional Random Field (CRF): no pure prior, since the pairwise terms depend on the observed data. z_i: observed variable; x_i: unobserved (latent) variable. Factor graph: MRF vs CRF.
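The contrast-sensitive pairwise strength can be computed from the image in one pass. A sketch on a toy grayscale image, assuming the slide's ß = 2(Mean ||z_i - z_j||²)⁻¹ over the 4-neighbourhood (the helper name is ours):

```python
import numpy as np

def contrast_weights(z):
    """Edge strengths exp(-beta * |z_i - z_j|) for the 4-neighbourhood,
    with beta = 2 / mean(|z_i - z_j|^2), as on the slide."""
    dh2 = (z[:, 1:] - z[:, :-1]) ** 2   # squared horizontal differences
    dv2 = (z[1:, :] - z[:-1, :]) ** 2   # squared vertical differences
    beta = 2.0 / np.concatenate([dh2.ravel(), dv2.ravel()]).mean()
    return np.exp(-beta * np.sqrt(dh2)), np.exp(-beta * np.sqrt(dv2))

# A 2x3 image with an intensity step between the last two columns.
z = np.array([[0.0, 0.0, 10.0],
              [0.0, 0.0, 10.0]])
w_h, w_v = contrast_weights(z)
# Edges inside uniform regions get strength 1 (expensive to cut);
# edges across the step get lower strength, so placing the
# segmentation boundary there is cheap.
```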
200 Unary and pairwise: E(x) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j), with x_i Є R. Convex unary and pairwise terms: θ_ij(x_i,x_j) = g(|x_i - x_j|), θ_i(x_i,z_i) = |x_i - z_i|. Can be solved globally optimally. TRW-S (discrete labels) vs HBF [Szeliski 06] (continuous labels), ~15 times faster.
201 Non-submodular energy functions. Mixed (real-integer) problems. Higher-order energy functions. Multi-label problems: ordered labels, e.g. stereo (depth labels); unordered labels, e.g. object segmentation ('car', 'road', 'person').
202 E(x) = Σ_i θ_i(x_i) + Σ_{i,j} θ_ij(x_i,x_j), with θ_ij(0,1) + θ_ij(1,0) < θ_ij(0,0) + θ_ij(1,1) for some ij. Minimizing general non-submodular functions is NP-hard. A commonly used method is to solve a relaxation of the problem. [Boros and Hammer, 02]
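The condition above is easy to check mechanically: a binary pairwise energy is submodular iff every edge table satisfies θ_ij(0,0) + θ_ij(1,1) ≤ θ_ij(0,1) + θ_ij(1,0). A minimal sketch:

```python
def is_submodular(theta_p):
    """theta_p maps an edge (i, j) to a 2x2 cost table t[x_i][x_j].
    Submodular iff t(0,0) + t(1,1) <= t(0,1) + t(1,0) for every edge."""
    return all(t[0][0] + t[1][1] <= t[0][1] + t[1][0]
               for t in theta_p.values())

ising = {(0, 1): [[0, 1], [1, 0]]}        # penalises disagreement
frustrated = {(0, 1): [[1, 0], [0, 1]]}   # rewards disagreement
print(is_submodular(ising))       # → True  (graph cuts apply)
print(is_submodular(frustrated))  # → False (a relaxation is needed)
```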
203
204 Integer Program
205 Linear Program Feasibility Region (LOCAL Polytope)
206 Formulate the program, solve the LP, and obtain an integer solution by rounding. The integral solution is not guaranteed to be optimal; however, we can obtain multiplicative bounds on the error. Feasibility Region (LOCAL Polytope)
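The relax-and-round scheme can be made concrete on a toy two-variable binary MRF; a sketch using SciPy's linprog over the LOCAL polytope (the costs are hypothetical; on this tree-structured instance the relaxation is tight, so rounding the node marginals recovers the exact MAP labelling):

```python
import numpy as np
from scipy.optimize import linprog

# Unary and pairwise costs for two binary variables joined by one edge.
theta0 = [0.0, 2.0]
theta1 = [3.0, 0.0]
theta01 = [[0.0, 1.0], [1.0, 0.0]]        # Ising: penalise disagreement

# Variable order: mu0(0), mu0(1), mu1(0), mu1(1),
#                 mu01(0,0), mu01(0,1), mu01(1,0), mu01(1,1)
c = theta0 + theta1 + [theta01[a][b] for a in (0, 1) for b in (0, 1)]
A_eq = [
    [1, 1, 0, 0,  0,  0,  0,  0],   # mu0 sums to 1
    [0, 0, 1, 1,  0,  0,  0,  0],   # mu1 sums to 1
    [1, 0, 0, 0, -1, -1,  0,  0],   # sum_b mu01(0,b) = mu0(0)
    [0, 1, 0, 0,  0,  0, -1, -1],   # sum_b mu01(1,b) = mu0(1)
    [0, 0, 1, 0, -1,  0, -1,  0],   # sum_a mu01(a,0) = mu1(0)
    [0, 0, 0, 1,  0, -1,  0, -1],   # sum_a mu01(a,1) = mu1(1)
]
b_eq = [1, 1, 0, 0, 0, 0]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1), method="highs")
labelling = [round(res.x[1]), round(res.x[3])]   # round node marginals
print(res.fun, labelling)   # optimal value 1.0, labelling [0, 1]
```

Brute force confirms the result: the four labellings cost 3, 1, 6 and 2, and the LP attains the minimum 1 at the integral vertex (0, 1).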
207 Traditional LP solvers (Simplex, Interior Point Methods) [Bertsimas and Tsitsiklis, 1997] [Boyd and Vandenberghe, 2004] are unable to handle large instances (>10⁷ variables). Exploit the structure of the problem: special message passing solvers. Message passing algorithms have weak convergence properties [Kolmogorov and Wainwright] [Ravikumar et al., ICML 2008]. Feasibility Region (LOCAL Polytope)
208 A new, weaker relaxation: roof-dual [Boros and Hammer, 2002] [Kohli et al., ICML 2008] [Kolmogorov and Rother, PAMI 2006], which can be solved exactly by minimizing a pairwise submodular function. Obtain partially optimal solutions: x = [00*10**1111*], x_OPT = [ ]. Feasibility Region (LOCAL Polytope)
209 [Figure: the pairwise terms θ̃_pq(0,0), θ̃_pq(1,1), θ̃_pq(0,1), θ̃_pq(1,0) are split into a unary part, a pairwise submodular part, and a pairwise non-submodular part.] [Boros and Hammer, 02]
210 Double the number of variables: x_p → (x_p, x̄_p). [Boros and Hammer, 02]
211 Double the number of variables: x_p → (x_p, x̄_p), with x̄_p = 1 - x_p. Ignore the constraint and solve. [Boros and Hammer, 02]
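Why doubling the variables helps can be seen on a single pairwise term (a minimal illustration, not the full roof-dual construction): complementing one argument turns a non-submodular table into a submodular one, and as long as the dropped constraint x̄_q = 1 - x_q holds, the transformed term represents exactly the original costs.

```python
from itertools import product

def submodular(t):
    """t[x_p][x_q] is submodular iff t(0,0) + t(1,1) <= t(0,1) + t(1,0)."""
    return t[0][0] + t[1][1] <= t[0][1] + t[1][0]

theta = [[1, 0], [0, 1]]     # rewards disagreement: non-submodular
assert not submodular(theta)

# Replace x_q by its complement: theta_c(x_p, xbar_q) = theta(x_p, 1 - xbar_q).
theta_c = [[theta[a][1 - b] for b in (0, 1)] for a in (0, 1)]
assert submodular(theta_c)

# Under the constraint xbar_q = 1 - x_q the costs are unchanged.
for xp, xq in product((0, 1), repeat=2):
    assert theta_c[xp][1 - xq] == theta[xp][xq]
print("submodular after complementing:", theta_c)  # → [[0, 1], [1, 0]]
```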
212 Double the number of variables: x_p → (x_p, x̄_p), with x̄_p = 1 - x_p. Local optimality. [Boros and Hammer, 02]