Pushmeet Kohli. Microsoft Research Cambridge. IbPRIA 2011

1 Pushmeet Kohli Microsoft Research Cambridge IbPRIA 2011

2 Schedule: 2:30-4:30 Labelling Problems, Graphical Models, Message Passing; 4:30-5:00 Coffee break; 5:00-7:00 Graph Cuts, Move Making Algorithms, Speed and Efficiency.

3 Infer the values of hidden random variables. Examples: object segmentation (labels: sky, tree, building, grass), geometry estimation (orientation), pose estimation (joint angles).

4

5 (n = number of pixels) z ∈ (R,G,B)^n, x ∈ {0,1}^n. n random variables: x_1, …, x_n. Joint probability: P(x_1, …, x_n). Conditional probability: P(x_1, …, x_n | z_1, …, z_n).

6 (n = number of pixels) z ∈ (R,G,B)^n, x ∈ {0,1}^n.

7 (n = number of pixels) z ∈ (R,G,B)^n, x ∈ {0,1}^n. Bayes' rule: P(x|z) = P(z|x) P(x) / P(z) ∝ P(z|x) P(x): posterior probability = likelihood (data-dependent) × prior (data-independent). MAP solution: x* = arg max_x P(x|z) = arg min_x E(x).

8 Posterior = likelihood × prior: P(x|z) ∝ P(z|x) P(x), with the likelihood factorizing as P(z|x) = ∏_i P(z_i|x_i).

9 P(x|z) ∝ P(z|x) P(x); the likelihood is modelled with a Gaussian mixture over colour, P(z|x) = F_GMM(z,x). [Figure: foreground/background colour distributions over the red-green plane]

10 P(x|z) ∝ P(z|x) P(x). Unary terms: log P(z_i|x_i=0) vs log P(z_i|x_i=1). Ignoring the prior, the MAP solution is x* = arg max_x P(z|x) = arg max_x ∏_i P(z_i|x_i), which decomposes over pixels.
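
As an aside, this unary-only MAP step reduces to an independent per-pixel decision. A minimal sketch in Python (the log-likelihood values below are made-up illustrations, not from the slides):

import numpy as np

# Hypothetical per-pixel log-likelihoods for a 4-pixel image.
log_p_bg = np.array([-0.1, -2.0, -1.5, -0.2])  # log P(z_i | x_i = 0)
log_p_fg = np.array([-3.0, -0.3, -0.4, -2.5])  # log P(z_i | x_i = 1)

# With no prior the MAP labelling decomposes: x_i* = argmax_{x_i} log P(z_i | x_i)
x_star = (log_p_fg > log_p_bg).astype(int)
print(x_star)  # [0 1 1 0]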

11 Posterior = likelihood × prior: P(x|z) ∝ P(z|x) P(x). The prior ∏_{i,j} f(x_i,x_j) encourages consistency between the labellings of adjacent pixels.

12 P(x|z) ∝ P(z|x) P(x), with the MRF Ising prior P(x) = ∏_{i,j ∈ N} f_ij(x_i,x_j) = ∏_{i,j ∈ N} exp{-|x_i - x_j|}.

13 Posterior probability: P(x|z) = ∏_i P(z_i|x_i) ∏_{i,j} P(x_i,x_j). Taking the negative log gives the energy E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j,z_i,z_j).

14 E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j). [Figure: image; likelihood-only solution; MRF solution with Ising prior]

15 P(x|z) = ∏_i P(z_i|x_i) ∏_{i,j} P(x_i,x_j).

16 P(x|z) = ∏_i P(z_i|x_i) ∏_{i,j} P(x_i,x_j,z_i,z_j). Taking the negative log: E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j,z_i,z_j). [Boykov and Jolly 01] [Blake et al. 04] [Rother, Kolmogorov and Blake 04]

17 E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j,z_i,z_j). [Figure: pairwise cost θ_ij plotted against ‖z_i - z_j‖²] [Boykov and Jolly 01] [Blake et al. 04] [Rother, Kolmogorov and Blake 04]

18 E(x,z,w) = Σ_i θ_i(x_i,z_i) + w Σ_{i,j} θ_ij(x_i,x_j,z_i,z_j). [Figure: image and the global minimum x*] [Boykov and Jolly 01] [Blake et al. 04] [Rother, Kolmogorov and Blake 04]

19 Demo

20 Factor graphs & the order of a posterior distribution (or energy function)

21 Write probability distributions as a graphical model: P(x) ∝ exp{-E(x)} (Gibbs distribution), with E(x) = θ(x_1,x_2,x_3) + θ(x_2,x_4) + θ(x_3,x_4) + θ(x_3,x_5): 4 factors over the unobserved variables x_1, …, x_5; a factor graph connects each factor to the unobserved variables it contains. References on factor graphs: Pattern Recognition and Machine Learning [Bishop 08, book, chapter 8]; several lectures at the Machine Learning Summer School 2009 (see video lectures).

22 Definition of Order: the arity (number of variables) of the largest factor. For E(x) = θ(x_1,x_2,x_3) + θ(x_2,x_4) + θ(x_3,x_4) + θ(x_3,x_5), the first factor has arity 3 and the others arity 2, so this factor graph has order 3.

23 4-connected pairwise MRF: E(x) = Σ_{i,j ∈ N4} θ_ij(x_i,x_j), order 2. Higher(8)-connected pairwise MRF: E(x) = Σ_{i,j ∈ N8} θ_ij(x_i,x_j), order 2. Higher-order RF: E(x) = Σ_{i,j ∈ N4} θ_ij(x_i,x_j) + θ(x_1,…,x_n), order n: a pairwise energy plus a higher-order energy.

24 1. Image segmentation 2. Stereo 3. Semantic Segmentation

25 1. Image segmentation 2. Stereo 3. Semantic Segmentation

26 P(x|z) ∝ exp{-E(x)}, E(x) = Σ_i θ_i(x_i, z_i) + Σ_{i,j ∈ N4} θ_ij(x_i,x_j). [Factor graph: observed variables z_i, unobserved (latent) variables x_i, x_j]

27 Goal: detect and segment a test image given a large set of example segmentations. Exemplars w = (Template, Position, Scale, Orientation): T(1), T(2), T(3), … E(x,w): {0,1}^n × {Exemplar} → R, E(x,w) = Σ_i |T(w)_i - x_i| + Σ_{i,j ∈ N4} θ_ij(x_i,x_j), where the first term is the Hamming distance to the exemplar mask. [Kohli et al. IJCV 08, Lempitsky et al. ECCV 08]

28 UIUC dataset; 98.8% accuracy. [Kohli et al. IJCV 08, Lempitsky et al. ECCV 08]

29

30 [Figure: training image; pairwise energy P(x); test image; test image with 60% noise; result] Minimized using st-mincut or max-product message passing.

31 Higher-order structure is not preserved. [Figure: training image; pairwise energy P(x); test image; test image with 60% noise; result] Minimized using st-mincut or max-product message passing.

32 Minimize E(x) = P(x) + Σ_c h_c(x_c), where h_c: {0,1}^|c| → R is a higher-order function (|c| = 10×10 = 100) that assigns a cost to each possible labelling of the clique. Exploit the structure of the function to transform it to a pairwise function. Patterns p_1, p_2, p_3. [Rother and Kohli, MSR Tech Report 2010]

33 [Figure: training image; learned patterns; test image; test image with 60% noise; pairwise result; higher-order result] [Rother and Kohli, MSR Tech Report 2010]

34 1. Image segmentation 2. Stereo 3. Semantic Segmentation

35 Stereo: left image (a), right image (b), ground-truth depth; disparities d = 0, …, 4. Images are rectified; ignore occlusion for now. Energy E(d): {0,…,D-1}^n → R; labels d_i (depth/shift).

36 Energy E(d): {0,…,D-1}^n → R, E(d) = Σ_i θ_i(d_i) + Σ_{i,j ∈ N4} θ_ij(d_i,d_j). Unary: θ_i(d_i) = |l_i - r_{i-d_i}|, the SAD (sum of absolute differences) matching cost; many others are possible (NCC, …). Pairwise: θ_ij(d_i,d_j) = g(|d_i - d_j|). [Figure: left and right images; pixel i matches pixel i-2 when d_i = 2]
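
A small numpy sketch of this SAD unary term, computing a cost volume for a rectified pair (the function name and toy data are illustrative assumptions):

import numpy as np

def sad_cost_volume(left, right, D):
    # theta_i(d) = |l_i - r_{i-d}| for d = 0..D-1, row-wise on rectified images
    H, W = left.shape
    cost = np.full((D, H, W), np.inf)   # out-of-image matches stay infinite
    for d in range(D):
        # pixel (y, x) in the left image matches (y, x - d) in the right image
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :W - d])
    return cost

left = np.random.rand(4, 8)
right = np.roll(left, -2, axis=1)                 # toy pair: true disparity 2
wta = sad_cost_volume(left, right, 4).argmin(0)   # winner-take-all depth map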

37 θ_ij(d_i,d_j) = g(|d_i - d_j|). [Figure: cost plotted against |d_i - d_j|] Without truncation (convex cost): global minimum attainable. [Olga Veksler PhD thesis, Daniel Cremers et al.]

38 θ_ij(d_i,d_j) = g(|d_i - d_j|): discontinuity-preserving potentials [Blake & Zisserman 83, 87]. Without truncation: global minimum attainable; with truncation: NP-hard optimization. [Olga Veksler PhD thesis, Daniel Cremers et al.]

39 θ_ij(d_i,d_j) = g(|d_i - d_j|): Potts model. [Figure: left image; Potts-model result with smooth disparities] [Olga Veksler PhD thesis]

40 Comparison: no MRF, pixel-independent (winner-take-all); no horizontal links (efficient, since the chains are independent); pairwise MRF [Boykov et al. 01]; ground truth.

41 1. Image segmentation 2. Stereo 3. Semantic/Object Segmentation

42 Assign a label to each pixel (or voxel in 3D): object segmentation, organ detection. Example labels: water, boat, tree, chair, road.

43 (1) No relationships between pixels. (2) Pixel groups, no relationships between groups. (3) Pairwise relationships between pixel groups. (4) Pairwise relationships between pixels. (5) Higher-order relationships between pixels.

44 Pixel classifier: given the image, a window W, and the pixel P to be classified, a classifier (boosting [Shotton et al, 2006], random decision forests) outputs the cost for assigning a label, e.g. 'cow'.

45 [Figure: image (MSRC-21); pairwise CRF result; higher-order CRF result; ground truth. Labels: grass, sheep]

46 [Unwrap Mosaic, Rav-Acha, Kohli, Fitzgibbon, Rother, Siggraph 08]

47

48 E(x) = Σ_i f_i(x_i) + Σ_{ij} g_ij(x_i,x_j) + Σ_c h_c(x_c): unary, pairwise, and higher-order terms; x takes discrete values. How to minimize E(x)?

49 [Figure: the space of problems. NP-hard in general; contains MAXCUT and CSP]

50 Which functions are exactly solvable? Approximate solutions of NP-hard problems Scalability and Efficiency

51 Which functions are exactly solvable? Boros Hammer [1965], Kolmogorov Zabih [ECCV 2002, PAMI 2004], Ishikawa [PAMI 2003], Schlesinger [EMMCVPR 2007], Kohli Kumar Torr [CVPR 2007, PAMI 2008], Ramalingam Kohli Alahari Torr [CVPR 2008], Kohli Ladicky Torr [CVPR 2008, IJCV 2009], Zivny Jeavons [CP 2008]. Approximate solutions of NP-hard problems: Schlesinger [1976], Kleinberg and Tardos [FOCS 99], Chekuri et al. [2001], Boykov et al. [PAMI 2001], Wainwright et al. [NIPS 2001], Werner [PAMI 2007], Komodakis [PAMI 2005], Lempitsky et al. [ICCV 2007], Kumar et al. [NIPS 2007], Kumar et al. [ICML 2008], Sontag and Jaakkola [NIPS 2007], Kohli et al. [ICML 2008], Kohli et al. [CVPR 2008, IJCV 2009], Rother et al. [2009]. Scalability and efficiency: Kohli Torr [ICCV 2005, PAMI 2007], Juan and Boykov [CVPR 2006], Alahari Kohli Torr [CVPR 2008], Delong and Boykov [CVPR 2008].

52 Message passing: Dynamic Programming (DP), Belief Propagation (BP), Tree-Reweighted (TRW), Max-sum Diffusion. Combinatorial algorithms: Graph Cut (GC), Branch & Bound. Relaxation methods: Convex Optimization (Linear Programming).

53 [Figure: the space of problems. NP-hard in general (MAXCUT, CSP); tree-structured problems are tractable]

54

55 E(x) = Σ_i f_i(x_i) + Σ_{ij} g_ij(x_i,x_j). Iteratively pass messages between nodes p, q, … What is the message update rule? Variants: belief propagation (BP), tree-reweighted belief propagation (TRW). Max-product (minimizing an energy function, i.e. MAP estimation): x* = arg max P(x) = arg min E(x), with E(x) = -ln P(x) + K. Sum-product (computing marginal probabilities): P(x_i) = Σ_{x_{V∖{i}}} P(x).

56 Chain p, q, r with terms f(x_p) + g_pq(x_p,x_q); Potts model g_pq = 2·[x_p ≠ x_q]. Message from p to q for label L_1: M_{p→q}(L_1) = min_{x_p} [f(x_p) + g_pq(x_p, L_1)] = min(5+0, 1+2, 2+2) = 3. Full message: M_{p→q}(L_1,L_2,L_3) = (3,1,2).
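
This message computation can be checked in a few lines of Python, assuming unary costs f(x_p) = (5, 1, 2), which match the arithmetic above:

f_p = [5, 1, 2]                       # unary costs f(x_p) for labels L1, L2, L3
g = lambda a, b: 0 if a == b else 2   # Potts model: cost 2 if labels differ

# Min-sum message: M_p->q(x_q) = min over x_p of f(x_p) + g(x_p, x_q)
M_pq = [min(f_p[xp] + g(xp, xq) for xp in range(3)) for xq in range(3)]
print(M_pq)  # [3, 1, 2], i.e. M_p->q(L1, L2, L3) = (3, 1, 2)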

57 Chain p, q, r with terms f(x_p) + g_pq(x_p,x_q); Potts model g_pq = 2·[x_p ≠ x_q].

58 At node q accumulate M_{p→q} + f(x_q) + g_qr(x_q,x_r); at node r, M_{q→r} + f(x_r) = min_x {E(x) | x_r = j}, the min-marginal. Select the minimum-cost label at r, then trace back the path to get the minimum-cost labelling: global minimum in linear time.
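
A runnable sketch of the whole procedure, min-sum dynamic programming with traceback, on a toy 3-node chain (the unary costs are assumed for illustration; the pairwise cost is the Potts model from the previous slides):

import itertools

unaries = [[5, 1, 2], [3, 0, 4], [2, 2, 0]]   # assumed costs per node and label
g = lambda a, b: 0 if a == b else 2           # Potts model
K = 3

msg = [0] * K            # message entering the first node
back = []                # argmin pointers for traceback
for f_prev in unaries[:-1]:
    new, ptr = [], []
    for xq in range(K):
        cands = [msg[xp] + f_prev[xp] + g(xp, xq) for xp in range(K)]
        new.append(min(cands))
        ptr.append(cands.index(min(cands)))
    msg, back = new, back + [ptr]

# Select the minimum-cost label at the root, then trace back the path:
# global minimum in linear time.
root = min(range(K), key=lambda xr: msg[xr] + unaries[-1][xr])
labels = [root]
for ptr in reversed(back):
    labels.append(ptr[labels[-1]])
labels.reverse()

best = msg[root] + unaries[-1][root]
assert best == min(                            # brute-force check on the toy chain
    sum(u[x] for u, x in zip(unaries, xs)) +
    sum(g(a, b) for a, b in zip(xs, xs[1:]))
    for xs in itertools.product(range(K), repeat=3))
print(labels, best)  # [1, 1, 1] 3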

59 On a tree (leaves p, q; root r), dynamic programming gives the global minimum in linear time. BP: inward pass (dynamic programming), then outward pass; gives min-marginals at every node.

60-63 [Animation: passing messages with potentials θ_p(x_p) and θ_pq(x_p,x_q) along the chain p, q, r, for label j]

64-68 [Animation continued: messages passed back along p, q, r; min-marginal for node q and label j]

69 On graphs with loops, pass messages using the same rules. Empirically this often works quite well, but it may not converge and yields only pseudo min-marginals. It gives a local minimum in the tree neighborhood [Weiss & Freeman 01], [Wainwright et al. 04], under the assumptions that BP has converged and there are no ties in the pseudo min-marginals.

70 Computing a message: the naïve implementation is O(K²) [K = number of labels]; this can often be improved to O(K), e.g. for Potts interactions, truncated linear, truncated quadratic, …
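
For the Potts model the O(K) update is particularly simple, since M(j) = min(h(j), min_k h(k) + λ): one global minimum serves all output labels. A sketch:

# h(a) = incoming message plus unary at the sending node; lam = Potts cost.
def potts_message(h, lam):
    floor = min(h) + lam            # cost of switching labels, computed once
    return [min(hj, floor) for hj in h]

print(potts_message([6, 1, 6], 2))  # [3, 1, 3], in O(K) instead of O(K^2)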

71 TRW: slightly different messages; it tries to solve an LP relaxation of the MAP problem and provides a lower bound: Lower Bound ≤ E(x*) ≤ E(x̂). [Kolmogorov, Wainwright et al.]

72 Message passing is related to dynamic programming: exact on trees (hyper-trees), with connections to linear programming. Popular methods: BP, TRW-S, Diffusion. Schedules have a significant effect on the solution. For more details: [Kolmogorov 06, Wainwright et al. 05]

73 [Figure: the space of problems, n = number of variables. NP-hard in general (MAXCUT, CSP); tractable islands: tree-structured problems; submodular functions, O(n⁶); pairwise submodular functions (e.g. the segmentation energy), O(n³)]

74

75 A pseudo-Boolean function f: {0,1}^n → R is submodular if f(A) + f(B) ≥ f(A ∨ B) + f(A ∧ B) (element-wise OR and AND) for all A, B ∈ {0,1}^n. Example: n = 2, A = [1,0], B = [0,1]: f([1,0]) + f([0,1]) ≥ f([1,1]) + f([0,0]). Property: a sum of submodular functions is submodular. The binary image segmentation energy E(x) = Σ_i c_i x_i + Σ_{i,j} d_ij |x_i - x_j| is submodular (for d_ij ≥ 0).
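
The submodularity inequality can be verified by brute force on a small instance of this segmentation energy (the coefficients below are arbitrary toy values; d_ij ≥ 0 is what makes it work):

from itertools import product

c = [1.0, -2.0, 0.5]              # toy unary coefficients c_i
d = {(0, 1): 1.0, (1, 2): 2.0}    # toy pairwise coefficients d_ij >= 0

def E(x):  # E(x) = sum_i c_i x_i + sum_ij d_ij |x_i - x_j|
    return (sum(ci * xi for ci, xi in zip(c, x)) +
            sum(dij * abs(x[i] - x[j]) for (i, j), dij in d.items()))

# Check f(A) + f(B) >= f(A OR B) + f(A AND B) for all A, B in {0,1}^3.
for A, B in product(product((0, 1), repeat=3), repeat=2):
    AB_or = tuple(a | b for a, b in zip(A, B))
    AB_and = tuple(a & b for a, b in zip(A, B))
    assert E(A) + E(B) >= E(AB_or) + E(AB_and) - 1e-12
print("submodular on all 64 pairs")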

76 Submodular functions are discrete analogues of concave functions [Lovász, 83], widely applied in operations research. Applications in machine learning: MAP inference in Markov random fields; clustering [Narasimhan, Jojic & Bilmes, NIPS 2005]; structure learning [Narasimhan & Bilmes, NIPS 2006]; maximizing the spread of influence through a social network [Kempe, Kleinberg & Tardos, KDD 2003].

77 Polynomial-time algorithms: ellipsoid algorithm [Grötschel, Lovász & Schrijver 81]; first strongly polynomial algorithms [Iwata et al. 00] [A. Schrijver 00]; current best O(n⁵Q + n⁶), where Q is the function evaluation time [Orlin 07]. Symmetric functions, E(x) = E(1-x), can be minimized in O(n³). Pairwise submodular functions E(x) = Σ_i f_i(x_i) + Σ_{ij} g_ij(x_i,x_j) can be transformed to st-mincut/max-flow [Hammer, 1965], with very low empirical running time, ~O(n).

78 Graph (V, E, C): vertices V = {v_1, v_2, …, v_n}, edges E = {(v_1, v_2), …}, costs C = {c_(1,2), …}. [Figure: example graph with source, sink, v_1, v_2]

79 What is an st-cut? [Figure: example graph with source, sink, v_1, v_2 and edge costs]

80 What is an st-cut? An st-cut (S,T) divides the nodes between source and sink. What is the cost of an st-cut? The sum of the costs of all edges going from S to T. [Figure: example cut of cost 15]

81 What is the st-mincut? The st-cut with the minimum cost. [Figure: example cut of cost 8]

82 Construct a graph such that: 1. any st-cut corresponds to an assignment of x, and 2. the cost of the cut is equal to the energy E(x) of that assignment. The st-mincut then gives the minimum-energy solution. [Hammer, 1965] [Kolmogorov and Zabih, 2002]

83 E(x) = Σ_i θ_i(x_i) + Σ_{i,j} θ_ij(x_i,x_j) is submodular if, for all ij, θ_ij(0,1) + θ_ij(1,0) ≥ θ_ij(0,0) + θ_ij(1,1). It is then equivalent (transformable) to E(x) = Σ_i c_i x_i + Σ_{i,j} c_ij x_i(1-x_j) with c_ij ≥ 0.

84 E(a_1,a_2) [Figure: graph with source (0), nodes a_1, a_2, sink (1); edges are added term by term below]

85 E(a_1,a_2) = 2a_1

86 E(a_1,a_2) = 2a_1 + 5ā_1

87 E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2

88 E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2

89-90 E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2 [Figure: completed graph; each term corresponds to an edge with the matching capacity]

91 E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2. [Figure: a cut of cost 11, corresponding to a_1 = 1, a_2 = 1: E(1,1) = 11]

92 E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2. The st-mincut has cost 8: a_1 = 1, a_2 = 0, E(1,0) = 8.

93 Solve the dual maximum-flow problem: compute the maximum flow between source and sink such that Edges: flow ≤ capacity, and Nodes: flow in = flow out. Min-cut/max-flow theorem: in every network with non-negative capacities, the maximum flow equals the cost of the st-mincut.

94-105 [Animation: augmenting-path max-flow on the example graph; the total flow grows 0 → 2 → 6 → 8]

Augmenting-path-based algorithms: 1. Find a path from source to sink with positive capacity. 2. Push the maximum possible flow through this path. 3. Repeat until no path can be found.
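
The scheme above is easy to implement; the sketch below is Edmonds-Karp (shortest augmenting paths via BFS) run on the two-variable example graph from the preceding slides. The capacity assignment follows one standard energy-to-graph convention and is meant as an illustration, not as the slides' exact construction:

from collections import deque

def max_flow(cap, s, t):
    # Edmonds-Karp: repeatedly push flow along a shortest augmenting path.
    n, flow = len(cap), 0
    while True:
        # 1. Find a source-sink path with positive residual capacity (BFS).
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow                      # 3. no augmenting path remains
        # 2. Push the maximum possible flow through this path.
        path, v = [], t
        while v != s:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck          # add residual (reverse) capacity
        flow += bottleneck

# Nodes: 0 = source, 1 = a1, 2 = a2, 3 = sink. Capacities encode
# E(a1,a2) = 2a1 + 5(1-a1) + 9a2 + 4(1-a2) + 2a1(1-a2) + (1-a1)a2
# under the convention that the source side takes label 0.
cap = [[0, 2, 9, 0],
       [0, 0, 1, 5],
       [0, 2, 0, 4],
       [0, 0, 0, 0]]
print(max_flow(cap, 0, 3))  # 8, the st-mincut cost; E(1,0) = 8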

106 E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2. Pushing flow can be read as reparametrizing the energy:

107-108 2a_1 + 5ā_1 = 2(a_1 + ā_1) + 3ā_1 = 2 + 3ā_1, so E(a_1,a_2) = 2 + 3ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2.

109-110 9a_2 + 4ā_2 = 4(a_2 + ā_2) + 5a_2 = 4 + 5a_2, so E(a_1,a_2) = 6 + 3ā_1 + 5a_2 + 2a_1ā_2 + ā_1a_2.

111-112 E(a_1,a_2) = 6 + 3ā_1 + 5a_2 + 2a_1ā_2 + ā_1a_2.

113-114 3ā_1 + 5a_2 + 2a_1ā_2 = 2(ā_1 + a_2 + a_1ā_2) + ā_1 + 3a_2 = 2(1 + ā_1a_2) + ā_1 + 3a_2 (the truth tables of F1 = ā_1 + a_2 + a_1ā_2 and F2 = 1 + ā_1a_2 agree on all four assignments), so E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2.

115 E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2. No more augmenting paths are possible.

116 In the residual graph (positive coefficients), the total flow is a lower bound on the energy of the optimal solution; a tight bound makes inference of the optimal solution trivial.

117 E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2; st-mincut cost = 8: a_1 = 1, a_2 = 0, E(1,0) = 8.
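
The whole chain of reparametrizations can be sanity-checked by brute force over the four assignments:

# E and its fully reparametrized form must agree on every assignment,
# and the constant 8 equals the maximum flow / st-mincut cost.
def E(a1, a2):
    return 2*a1 + 5*(1-a1) + 9*a2 + 4*(1-a2) + 2*a1*(1-a2) + (1-a1)*a2

def E_reparam(a1, a2):
    return 8 + (1-a1) + 3*a2 + 3*(1-a1)*a2

for a1 in (0, 1):
    for a2 in (0, 1):
        assert E(a1, a2) == E_reparam(a1, a2)
print(min((E(a1, a2), a1, a2) for a1 in (0, 1) for a2 in (0, 1)))  # (8, 1, 0)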

118-119 [Table: worst-case complexities of augmenting-path and push-relabel max-flow algorithms; n = #nodes, m = #edges, U = maximum edge weight. All algorithms assume nonnegative edge weights. Slide credit: Andrew Goldberg]

120-123 Ford-Fulkerson: choose any augmenting path. [Figure: graph with source, a_1, a_2, sink; two high-capacity terminal paths joined by a capacity-1 edge. Good augmenting paths use the terminal paths directly; a bad augmenting path crosses the capacity-1 edge and gains only one unit of flow per iteration]

124 On this example we would have to perform 2000 augmentations! Worst-case complexity: O(m × Total_Flow), a pseudo-polynomial bound that depends on the flow value.

125 Dinic: choose the shortest augmenting path. Worst-case complexity: O(mn²), polynomial and independent of the flow value.

126 Specialized algorithms for vision problems: grid graphs with low connectivity (m ~ O(n)). The dual-search-tree augmenting-path algorithm [Boykov and Kolmogorov PAMI 2004] finds approximate shortest augmenting paths efficiently; it has a high worst-case time complexity but empirically outperforms other algorithms on vision problems. Efficient code is available on the web.

127 E(x) = Σ_i c_i x_i + Σ_{i,j} d_ij |x_i - x_j|, E: {0,1}^n → R (0 = fg, 1 = bg, n = number of pixels). x* = arg min_x E(x): how to minimize E(x)? [Figure: image and global minimum x*]

128-131 Constructing and solving the graph in code (pseudocode over a maxflow library):

Graph *g;

for all pixels p {
    /* Add a node to the graph */
    nodeID(p) = g->add_node();
    /* Set cost of terminal edges */
    set_weights(nodeID(p), fgCost(p), bgCost(p));
}

for all adjacent pixels p,q {
    add_weights(nodeID(p), nodeID(q), cost(p,q));
}

g->compute_maxflow();

label_p = g->is_connected_to_source(nodeID(p));
/* label_p is the label of pixel p (0 or 1) */

[Figure: corresponding graph, with terminal edges fgCost(a_1), fgCost(a_2), bgCost(a_1), bgCost(a_2) and pairwise edge cost(p,q); result: a_1 = bg, a_2 = fg]

132

133 Mixed (real-integer) problems; higher-order energy functions; multi-label problems: ordered labels (stereo depth labels) and unordered labels (object segmentation: 'car', 'road', 'person').

134 Mixed (real-integer) problems; higher-order energy functions; multi-label problems: ordered labels (stereo depth labels) and unordered labels (object segmentation: 'car', 'road', 'person').

135 We have seen several of them in the intro: x, a binary image segmentation (x_i ∈ {0,1}); ω, a non-local parameter living in some large set Ω, e.g. (Template, Position, Scale, Orientation). E(x,ω) = C(ω) + Σ_i θ_i(ω, x_i) + Σ_{i,j} θ_ij(ω, x_i, x_j): a constant, unary potentials, and pairwise potentials.

136 x: binary image segmentation (x_i ∈ {0,1}); ω: non-local parameter in Ω. E(x,ω) = C(ω) + Σ_i θ_i(ω, x_i) + Σ_{i,j} θ_ij(ω, x_i, x_j). Solve {x*, ω*} = arg min_{x,ω} E(x,ω); for fixed ω this is a standard graph-cut energy. [Kohli et al, 06, 08] [Lempitsky et al, 08]

137 Local method: gradient descent over ω, ω* = arg min_ω min_x E(x,ω); the inner minimization over x is submodular. [Kohli et al, 06, 08]

138 Local method: gradient descent over ω. Successive energies E(x,ω_1) and E(x,ω_2) are similar, so dynamic graph cuts give a large speedup. [Kohli et al, 06, 08]

139 Global method: Branch-and-Mincut. Produces the globally optimal solution; in the worst case it exhaustively explores all ω in Ω (30,000,000 shapes). [Lempitsky et al, 08]

140 Mixed (real-integer) problems; higher-order energy functions; multi-label problems: ordered labels (stereo depth labels) and unordered labels (object segmentation: 'car', 'road', 'person').

141 E: {0,1}^n → R, 0 = fg, 1 = bg, n = number of pixels: E(x) = Σ_i c_i x_i + Σ_{i,j} d_ij |x_i - x_j|. [Figure: image; unary cost; segmentation] [Boykov and Jolly 01] [Blake et al. 04] [Rother, Kolmogorov and Blake 04]

142 Patch dictionary (tree): h(x_p) = C_1 if x_i = 0 for all i ∈ p, and C_max otherwise. [Kohli et al. 07]

143 E: {0,1}^n → R, 0 = fg, 1 = bg, n = number of pixels: E(x) = Σ_i c_i x_i + Σ_{i,j} d_ij |x_i - x_j| + Σ_p h_p(x_p), where h_p(x_p) = C_1 if x_i = 0 for all i ∈ p, and C_max otherwise. [Kohli et al. 07]

144 E(x) = Σ_i c_i x_i + Σ_{i,j} d_ij |x_i - x_j| + Σ_p h_p(x_p). [Figure: image; pairwise segmentation; final segmentation] [Kohli et al. 07]

145 Higher-order submodular functions: is there an exact transformation to a pairwise submodular function (and hence to st-mincut)? Billionnet and Minoux [DAM 1985], Kolmogorov & Zabih [PAMI 2004], Freedman & Drineas [CVPR 2005], Kohli Kumar Torr [CVPR 2007, PAMI 2008], Kohli Ladicky Torr [CVPR 2008, IJCV 2009], Ramalingam Kohli Alahari Torr [CVPR 2008], Zivny et al. [CP 2008]

146 Higher-order function → pairwise submodular function: identified transformable families of higher-order functions such that 1. only a constant or polynomial number of auxiliary variables (a) is added, and 2. all resulting pairwise functions (g) are submodular.

147 Example: H(x) = F(Σ_i x_i) with F concave. [Figure: concave H(x) plotted against Σ_i x_i] [Kohli et al. 08]

148 Simple example using auxiliary variables: f(x) = 0 if all x_i = 0, and C_1 otherwise, for x ∈ {0,1}^n. Then min_x f(x) = min_{x, a ∈ {0,1}} C_1 a + C_1 ā Σ_i x_i: the higher-order submodular function becomes a quadratic submodular one. If Σ_i x_i < 1, then a = 0 (ā = 1) and the cost is 0 = f(x); if Σ_i x_i ≥ 1, then a = 1 (ā = 0) and the cost is C_1 = f(x).
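
This transformation is small enough to verify exhaustively in Python (n and C_1 below are arbitrary choices):

from itertools import product

C1, n = 4, 3
for x in product((0, 1), repeat=n):
    s = sum(x)
    f = 0 if s == 0 else C1                          # higher-order function
    quad = min(C1*a + C1*(1 - a)*s for a in (0, 1))  # min over auxiliary a
    assert f == quad
print("transformation verified on all", 2**n, "assignments")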

149 min_x f(x) = min_{x,a ∈ {0,1}} C_1 a + C_1 ā Σ_i x_i. [Figure: cost plotted against Σ_i x_i, capped at C_1] [Kohli et al. 08]

150 min_x f(x) = min_{x,a ∈ {0,1}} C_1 a + C_1 ā Σ_i x_i: the two branches a = 0 and a = 1 are linear in Σ_i x_i, and their lower envelope gives the capped cost; the lower envelope of concave functions is concave. [Kohli et al. 08]

151 More generally, min_x f(x) = min_{x,a ∈ {0,1}} f_1(x) a + f_2(x) ā; the lower envelope of concave functions is concave. [Kohli et al. 08]

152 min_x f(x) = min_{x,a ∈ {0,1}} f_1(x) a + f_2(x) ā: a = 1 selects f_1, a = 0 selects f_2; the lower envelope of concave functions is concave. [Kohli et al. 08]

153 By contrast, min_x max_{a ∈ {0,1}} f_1(x) a + f_2(x) ā takes the upper envelope of linear functions, which is convex: a very hard min-max problem! [Kohli and Kumar, CVPR 2010]

154 Example applications: silhouette constraints in 3D reconstruction (rays must touch the silhouette at least once) [Sinha et al. 05, Cremers et al. 08]; size/volume priors in binary object/background segmentation (a prior on the size of the object, say n/2) [Woodford et al. 2009]. [Kohli and Kumar, CVPR 2010]

155 Mixed (real-integer) problems; higher-order energy functions; multi-label problems: ordered labels (stereo depth labels) and unordered labels (object segmentation: 'car', 'road', 'person').

156 min_y E(y) = Σ_i f_i(y_i) + Σ_{i,j} g_ij(y_i,y_j), y_i ∈ L = {l_1, l_2, …, l_k}. Two approaches: exact transformation to a QPBF [Roy and Cox 98] [Ishikawa 03] [Schlesinger & Flach 06] [Ramalingam, Alahari, Kohli, and Torr 08], or move-making algorithms.

157 So what is the problem? Transform the multi-label problem E_m(y_1,…,y_n), y_i ∈ L = {l_1, …, l_k}, into a binary-label problem E_b(x_1,…,x_m), x_i ∈ {0,1}, such that, with Y and X the sets of feasible solutions: 1. there is a one-one encoding function T: X → Y, and 2. arg min E_m(y) = T(arg min E_b(x)).

158 Popular encoding scheme [Roy and Cox 98, Ishikawa 03, Schlesinger & Flach 06]: #nodes = n·k, #edges = m·k².

159 Ishikawa's result: E(y) = Σ_i θ_i(y_i) + Σ_{i,j} θ_ij(y_i,y_j), y_i ∈ L = {l_1, …, l_k}, is exactly solvable with this encoding (#nodes = n·k, #edges = m·k²) when θ_ij(y_i,y_j) = g(|y_i - y_j|) for a convex function g.

160 Schlesinger & Flach 06 generalize this: the encoding works whenever θ_ij(l_{i+1},l_j) + θ_ij(l_i,l_{j+1}) ≥ θ_ij(l_i,l_j) + θ_ij(l_{i+1},l_{j+1}) (#nodes = n·k, #edges = m·k²).
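
The Schlesinger-Flach condition is easy to test for a given pairwise cost table. The sketch below confirms that the convex cost |y_i - y_j| satisfies it while a truncated cost does not (which is why truncation makes the problem hard):

def exactly_solvable(theta, K):
    # theta(i, j): pairwise cost for labels l_i, l_j.
    # Condition: theta(i+1,j) + theta(i,j+1) >= theta(i,j) + theta(i+1,j+1).
    return all(theta(i+1, j) + theta(i, j+1) >= theta(i, j) + theta(i+1, j+1)
               for i in range(K - 1) for j in range(K - 1))

K = 5
linear = lambda i, j: abs(i - j)               # convex: condition holds
truncated = lambda i, j: min(abs(i - j), 2)    # truncated linear: it fails
print(exactly_solvable(linear, K), exactly_solvable(truncated, K))  # True False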

161 [Figure: image; MAP solution; scanline algorithm result] [Roy and Cox, 98]

162 Applicability: cannot handle truncated costs, so it is non-robust; truncated, discontinuity-preserving potentials [Blake & Zisserman 83, 87] are excluded. Computational cost: very high, since the problem size is #variables × #labels; gray-level denoising of a 1 Mpixel image gives ~2.5 × 10⁸ graph nodes.

163 Other less-known algorithms (T(a,b) = complexity of maxflow with a nodes and b edges):

Algorithm | Unary potentials | Pairwise potentials | Complexity
Ishikawa Transformation [03] | Arbitrary | Convex and symmetric | T(nk, mk²)
Schlesinger Transformation [06] | Arbitrary | Submodular | T(nk, mk²)
Hochbaum [01] | Linear | Convex and symmetric | T(n, m) + n log k
Hochbaum [01] | Convex | Convex and symmetric | O(mn log n log nk)

164 min_y E(y) = Σ_i f_i(y_i) + Σ_{i,j} g_ij(y_i,y_j), y_i ∈ L = {l_1, l_2, …, l_k}. Now: move-making algorithms. [Boykov, Veksler and Zabih 2001] [Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008] [Veksler, 2008] [Kohli, Ladicky, Torr 2008]

165 [Figure: energy landscape over the solution space]

166 [Figure: energy landscape with the current solution, its search neighbourhood, and the optimal move]

167 [Figure: energy landscape; current solution x_c, search neighbourhood, move space (t), optimal move] Key property of the move space: a bigger move space means better solutions, but finding the optimal move becomes hard.

168 Minimizing pairwise functions [Boykov, Veksler and Zabih, PAMI 2001]: a series of locally optimal moves, each move reducing the energy; the optimal move is found by minimizing a submodular function. Move space (t): 2^n; space of solutions (x): L^n (n = number of variables, L = number of labels). Extended to minimize higher-order functions [Kohli et al. 07, 08, 09].

169 New solution x = t x_1 + (1-t) x_2, combining the current solution x_1 and a second solution x_2. Move energy: E_m(t) = E(t x_1 + (1-t) x_2), minimized over the move variables t. For certain x_1 and x_2 the move energy is a submodular QPBF. [Boykov, Veksler and Zabih 2001]

170 Variables labeled α, β can swap their labels [Boykov, Veksler and Zabih 2001]

171 Variables labeled α, β can swap their labels. [Figure: swapping 'sky' and 'house'; scene labels: tree, ground, house, sky] [Boykov, Veksler and Zabih 2001]

172 Swap: variables labeled α, β can swap their labels. The move energy is submodular if the unary potentials are arbitrary and the pairwise potentials are a semi-metric: θ_ij(l_a,l_b) ≥ 0, and θ_ij(l_a,l_b) = 0 ⇔ a = b. Examples: Potts model, truncated convex. [Boykov, Veksler and Zabih 2001]

173 Expansion: variables take label α or retain their current label. [Boykov, Veksler and Zabih 2001]

174 Variables take label α or retain their current label. [Figure: initialize, then expand ground, sky, house and tree in turn] [Boykov, Veksler and Zabih 2001]

175 Expansion: variables take label α or retain their current label. The move energy is submodular if the unary potentials are arbitrary and the pairwise potentials are a metric (semi-metric + triangle inequality): θ_ij(l_a,l_b) + θ_ij(l_b,l_c) ≥ θ_ij(l_a,l_c). Examples: Potts model, truncated linear; truncated quadratic cannot be solved this way. [Boykov, Veksler and Zabih 2001]
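
A toy alpha-expansion loop, with each binary move solved by brute force for clarity (in practice the move energy is submodular for a metric such as Potts and is solved with st-mincut; all numbers below are made up):

from itertools import product

unary = [[0, 5, 5], [4, 1, 6], [6, 5, 1], [5, 2, 6]]   # toy costs: 4 pixels, 3 labels
lam, K, n = 1, 3, 4                                    # Potts strength, labels, pixels

def energy(x):
    return (sum(unary[i][x[i]] for i in range(n)) +
            sum(lam * (a != b) for a, b in zip(x, x[1:])))

x = [0] * n                        # initial labelling
improved = True
while improved:
    improved = False
    for alpha in range(K):
        # Expansion move: each pixel keeps its label or switches to alpha.
        best = min(product(*[{xi, alpha} for xi in x]), key=energy)
        if energy(best) < energy(x):
            x, improved = list(best), True
print(x, energy(x))                # converges to a strong local minimum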

176 Expansion and swap can be derived as a primal-dual scheme: the dual solution is a lower bound on the energy of the returned solution. Weak guarantee: E(x̂) ≤ 2(d_max/d_min) E(x*) for θ_ij(l_i,l_j) = g(|l_i - l_j|), with d_max and d_min the largest and smallest pairwise costs. [Komodakis et al 05, 07]

177 x = t x_1 + (1-t) x_2: new solution from a first and a second solution; minimize over the move variables t.

Move type | First solution | Second solution | Guarantee
Expansion | Old solution | All alpha | Metric
Fusion | Any solution | Any solution | (none: move functions can be non-submodular!)

178 Fusion: x = t x_1 + (1-t) x_2, where x_1, x_2 can be continuous. Optical flow example: fuse the solution x_1 from method 1 with the solution x_2 from method 2 to obtain the final solution. [Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008]

179 Range moves: the move variables can be multi-label, x = (t==1) x_1 + (t==2) x_2 + … + (t==k) x_k, and the optimal move is found using the Ishikawa transform. Useful for minimizing energies with truncated convex pairwise potentials θ_ij(y_i,y_j) = min(|y_i - y_j|², T). [Kumar and Torr, 2008] [Veksler, 2008]

180 [Figure: image; noisy image; expansion-move result; range-moves result] [Veksler, 2008]

181

182 3,600,000,000 pixels, created from megapixel images. [Kopf et al. (MSR Redmond), SIGGRAPH 2007]

183 [Kopf et al. (MSR Redmond) SIGGRAPH 2007 ]

184 Processing videos: a 1-minute video at 1 Mpixel resolution is 3.6 B pixels. 3D reconstruction: 500 × 500 × 500 = 0.125 B voxels.

185 Segment the first frame, then segment the second frame: can we do better? [Kohli & Torr, ICCV05, PAMI07]

186 [Figure: image; flow; segmentation] [Kohli & Torr, ICCV05, PAMI07]

187 Dynamic graph cuts: frame 1 yields energy E_A, minimized to get segmentation S_A. Reuse computation: using the differences between A and B, reparametrize to E_B*, which is simpler and fast to minimize; minimizing E_B for frame 2 then yields S_B with a large speedup. [Kohli & Torr, ICCV05, PAMI07] [Komodakis & Paragios, CVPR07]

188 Original energy: E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2; reparametrized energy: E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2. New energy: E(a_1,a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 7a_1ā_2 + ā_1a_2; new reparametrized energy: E(a_1,a_2) = 8 + ā_1 + 3a_2 + 3ā_1a_2 + 5a_1ā_2. [Kohli & Torr, ICCV05, PAMI07] [Komodakis & Paragios, CVPR07]

189 [Diagram: original problem (large) → approximation algorithm (slow) → approximate solution] [Alahari, Kohli & Torr, CVPR 08]

190 [Diagram: original problem (large) → fast partially optimal algorithm [Kovtun 03] [Kohli et al. 09] → reduced problem plus solved part (global optima) → approximation algorithm (fast) → approximate solution] [Alahari, Kohli & Torr, CVPR 08]

191 3-100 times speedup. Example (sky, building, airplane, grass labelling): tree-reweighted message passing alone takes 9.89 sec; preceded by the fast partially optimal algorithm [Kovtun 03] [Kohli et al. 09], the total time is 0.30 sec for a comparable approximate solution. [Alahari, Kohli & Torr, CVPR 08]

192 Summary: message-passing algorithms (belief propagation, TRW) are exact on tree-structured graphs and connect to linear programming relaxations; graph cuts and submodular functions; move-making algorithms (expansion, swap, fusion); minimizing higher-order functions.

193 Open problems: going beyond small connectivity; representation, inference and learning of higher-order models; topology constraints (e.g. connectivity of objects); parallelization and GPUs; linear programming formulations.

194

195 Past presenters: Pawan, Carsten and Andrew Blake. Special thanks to the IbPRIA team, in particular Joost Van de Weijer, Jordi Gonzàlez and Mario Hernández Tejera.

196

197

198 [Figure: set of label assignments (patterns p_1, p_2, p_3) with associated costs] Introduce a pattern variable z with (t+1) states to get a pairwise function: the unary term assigns the cost of the selected label assignment z, and the pairwise terms ensure x is consistent with the pattern z.

199 E(x) = Σ_i θ_i(x_i, z_i) + Σ_{i,j ∈ N4} θ_ij(x_i,x_j,z_i,z_j), with the contrast-sensitive pairwise term θ_ij(x_i,x_j,z_i,z_j) = |x_i - x_j| exp{-β‖z_i - z_j‖²}, β = (2 Mean(‖z_i - z_j‖²))⁻¹. Conditional random field (CRF): no pure prior, since the pairwise term depends on the data. [Factor graphs: observed variables z_i, z_j; unobserved (latent) variables x_i, x_j; MRF vs CRF]
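
A numpy sketch of this contrast-sensitive term for horizontal neighbours, following the β definition above (toy image; one common convention):

import numpy as np

z = np.random.rand(5, 5, 3)                    # toy RGB image
diff2 = ((z[:, 1:] - z[:, :-1]) ** 2).sum(-1)  # ||z_i - z_j||^2, horizontal pairs
beta = 1.0 / (2.0 * diff2.mean())              # beta = (2 Mean ||z_i - z_j||^2)^-1
w = np.exp(-beta * diff2)                      # pairwise strength in (0, 1]
# Smoothness cost of a labelling x: |x_i - x_j| * w_ij, cheap across strong edges.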

200 E(x) = Σ_i θ_i(x_i,z_i) + w Σ_{ij} θ_ij(x_i,x_j), x_i ∈ R, with convex unary and pairwise terms θ_i(x_i,z_i) = |x_i - z_i| and θ_ij(x_i,x_j) = g(|x_i - x_j|): can be solved globally optimally. [Figure: original input; TRW-S (discrete labels); HBF [Szeliski 06] (continuous labels), ~15 times faster]

201 Non-submodular energy functions; mixed (real-integer) problems; higher-order energy functions; multi-label problems: ordered labels (stereo depth labels) and unordered labels (object segmentation: 'car', 'road', 'person').

202 E(x) = Σ_i θ_i(x_i) + Σ_{i,j} θ_ij(x_i,x_j) with θ_ij(0,1) + θ_ij(1,0) < θ_ij(0,0) + θ_ij(1,1) for some ij. Minimizing general non-submodular functions is NP-hard; the commonly used approach is to solve a relaxation of the problem. [Boros and Hammer, 02]

203

204 [Formulate MAP inference as an integer program]

205 [Relax the integrality constraints to obtain a linear program; feasibility region: the LOCAL polytope]

206 Formulate the program, solve the LP, and obtain an integer solution by rounding. The rounded integral solution is not guaranteed to be optimal; however, we can obtain multiplicative bounds on the error. [Feasibility region: LOCAL polytope]

207 Traditional LP solvers, simplex and interior-point methods [Bertsimas and Tsitsiklis, 1997] [Boyd and Vandenberghe, 2004], are unable to handle large instances (>10⁷ variables). Instead, exploit the structure of the problem with special message-passing solvers; these message-passing algorithms have weak convergence properties. [Kolmogorov and Wainwright] [Ravikumar et al., ICML 2008]

208 A new, weaker relaxation: roof duality [Boros and Hammer, 2002] [Kohli et al., ICML 2008] [Kolmogorov and Rother, PAMI 2006], which can be solved exactly by minimizing a pairwise submodular function. It yields partially optimal solutions, e.g. x = [00*10**1111*], where the starred variables are left unlabelled and all labelled variables agree with x_OPT.

209 [Figure: graph construction for roof duality, with unary terms and pairwise terms θ_pq(0,0), θ_pq(0,1), θ_pq(1,0), θ_pq(1,1), split into a pairwise submodular part and a pairwise non-submodular part] [Boros and Hammer, 02]

210 Double the number of variables: each x_p gets a complemented copy x̄_p. [Boros and Hammer, 02]

211 Double the number of variables: x_p → (x_p, x̄_p) with x̄_p = 1 - x_p; ignore the coupling constraint and solve. [Boros and Hammer, 02]

212 Double the number of variables: x_p → (x_p, x̄_p) with x̄_p = 1 - x_p. The resulting solutions satisfy local (partial) optimality. [Boros and Hammer, 02]


Inference for Order Reduction in Markov Random Fields Inference for Order Reduction in Markov Random Fields Andre C. Gallagher Eastman Kodak Company andre.gallagher@kodak.com Dhruv Batra Toyota Technological Institute at Chicago ttic.uchicago.edu/ dbatra

More information

3 : Representation of Undirected GM

3 : Representation of Undirected GM 10-708: Probabilistic Graphical Models 10-708, Spring 2016 3 : Representation of Undirected GM Lecturer: Eric P. Xing Scribes: Longqi Cai, Man-Chia Chang 1 MRF vs BN There are two types of graphical models:

More information

Higher-order Graph Cuts

Higher-order Graph Cuts ACCV2014 Area Chairs Workshop Sep. 3, 2014 Nanyang Technological University, Singapore 1 Higher-order Graph Cuts Hiroshi Ishikawa Department of Computer Science & Engineering Waseda University Labeling

More information

A Compact Linear Programming Relaxation for Binary Sub-modular MRF

A Compact Linear Programming Relaxation for Binary Sub-modular MRF A Compact Linear Programming Relaxation for Binary Sub-modular MRF Junyan Wang and Sai-Kit Yeung Singapore University of Technology and Design {junyan_wang,saikit}@sutd.edu.sg Abstract. Direct linear programming

More information

Probabilistic Graphical Models Lecture Notes Fall 2009

Probabilistic Graphical Models Lecture Notes Fall 2009 Probabilistic Graphical Models Lecture Notes Fall 2009 October 28, 2009 Byoung-Tak Zhang School of omputer Science and Engineering & ognitive Science, Brain Science, and Bioinformatics Seoul National University

More information

ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning

ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning Topics Markov Random Fields: Representation Conditional Random Fields Log-Linear Models Readings: KF

More information

Clustering with k-means and Gaussian mixture distributions

Clustering with k-means and Gaussian mixture distributions Clustering with k-means and Gaussian mixture distributions Machine Learning and Category Representation 2012-2013 Jakob Verbeek, ovember 23, 2012 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.12.13

More information

Probabilistic Graphical Models

Probabilistic Graphical Models School of Computer Science Probabilistic Graphical Models Variational Inference II: Mean Field Method and Variational Principle Junming Yin Lecture 15, March 7, 2012 X 1 X 1 X 1 X 1 X 2 X 3 X 2 X 2 X 3

More information

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling Professor Erik Sudderth Brown University Computer Science October 27, 2016 Some figures and materials courtesy

More information

Lecture 9: PGM Learning

Lecture 9: PGM Learning 13 Oct 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Table of Contents I Learning parameters in MRFs 1 Learning parameters in MRFs Inference and Learning Given parameters (of potentials) and

More information

Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials

Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials Philipp Krähenbühl and Vladlen Koltun Stanford University Presenter: Yuan-Ting Hu 1 Conditional Random Field (CRF) E x I = φ u

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Brown University CSCI 295-P, Spring 213 Prof. Erik Sudderth Lecture 11: Inference & Learning Overview, Gaussian Graphical Models Some figures courtesy Michael Jordan s draft

More information

Clustering K-means. Clustering images. Machine Learning CSE546 Carlos Guestrin University of Washington. November 4, 2014.

Clustering K-means. Clustering images. Machine Learning CSE546 Carlos Guestrin University of Washington. November 4, 2014. Clustering K-means Machine Learning CSE546 Carlos Guestrin University of Washington November 4, 2014 1 Clustering images Set of Images [Goldberger et al.] 2 1 K-means Randomly initialize k centers µ (0)

More information

Parameter Learning for Log-supermodular Distributions

Parameter Learning for Log-supermodular Distributions Parameter Learning for Log-supermodular Distributions Tatiana Shpakova IRIA - École ormale Supérieure Paris tatiana.shpakova@inria.fr Francis Bach IRIA - École ormale Supérieure Paris francis.bach@inria.fr

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

Accelerated MRI Image Reconstruction

Accelerated MRI Image Reconstruction IMAGING DATA EVALUATION AND ANALYTICS LAB (IDEAL) CS5540: Computational Techniques for Analyzing Clinical Data Lecture 15: Accelerated MRI Image Reconstruction Ashish Raj, PhD Image Data Evaluation and

More information

Submodular Functions Properties Algorithms Machine Learning

Submodular Functions Properties Algorithms Machine Learning Submodular Functions Properties Algorithms Machine Learning Rémi Gilleron Inria Lille - Nord Europe & LIFL & Univ Lille Jan. 12 revised Aug. 14 Rémi Gilleron (Mostrare) Submodular Functions Jan. 12 revised

More information

Submodular Maximization and Diversity in Structured Output Spaces

Submodular Maximization and Diversity in Structured Output Spaces Submodular Maximization and Diversity in Structured Output Spaces Adarsh Prasad Virginia Tech, UT Austin adarshprasad27@gmail.com Stefanie Jegelka UC Berkeley stefje@eecs.berkeley.edu Dhruv Batra Virginia

More information

Graphical Models and Kernel Methods

Graphical Models and Kernel Methods Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.

More information

Inference Methods for CRFs with Co-occurrence Statistics

Inference Methods for CRFs with Co-occurrence Statistics Noname manuscript No. (will be inserted by the editor) Inference Methods for CRFs with Co-occurrence Statistics L ubor Ladický 1 Chris Russell 1 Pushmeet Kohli Philip H. S. Torr the date of receipt and

More information

Bayesian Machine Learning - Lecture 7

Bayesian Machine Learning - Lecture 7 Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1

More information

Machine Learning for Data Science (CS4786) Lecture 24

Machine Learning for Data Science (CS4786) Lecture 24 Machine Learning for Data Science (CS4786) Lecture 24 Graphical Models: Approximate Inference Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ BELIEF PROPAGATION OR MESSAGE PASSING Each

More information

Learning with Structured Inputs and Outputs

Learning with Structured Inputs and Outputs Learning with Structured Inputs and Outputs Christoph H. Lampert IST Austria (Institute of Science and Technology Austria), Vienna Microsoft Machine Learning and Intelligence School 2015 July 29-August

More information

Revisiting Uncertainty in Graph Cut Solutions

Revisiting Uncertainty in Graph Cut Solutions Revisiting Uncertainty in Graph Cut Solutions Daniel Tarlow Dept. of Computer Science University of Toronto dtarlow@cs.toronto.edu Ryan P. Adams School of Engineering and Applied Sciences Harvard University

More information

Inference Methods for CRFs with Co-occurrence Statistics

Inference Methods for CRFs with Co-occurrence Statistics Int J Comput Vis (2013) 103:213 225 DOI 10.1007/s11263-012-0583-y Inference Methods for CRFs with Co-occurrence Statistics L ubor Ladický Chris Russell Pushmeet Kohli Philip H. S. Torr Received: 21 January

More information

Part 4: Conditional Random Fields

Part 4: Conditional Random Fields Part 4: Conditional Random Fields Sebastian Nowozin and Christoph H. Lampert Colorado Springs, 25th June 2011 1 / 39 Problem (Probabilistic Learning) Let d(y x) be the (unknown) true conditional distribution.

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project

More information