Part 6: Structured Prediction and Energy Minimization (1/2)
1 Part 6: Structured Prediction and Energy Minimization (1/2) Providence, 21st June 2012
2 Prediction Problem
y* = f(x) = argmax_{y ∈ Y} g(x, y)
- g(x, y) = p(y | x): factor graphs / MRFs / CRFs,
- g(x, y) = −E(y; x, w): energy-based factor graphs / MRFs / CRFs,
- g(x, y) = ⟨w, ψ(x, y)⟩: linear model (e.g. multiclass SVM).
Difficulty: Y is finite but very large.
4 Prediction Problem (cont)
Definition (Optimization Problem). Given (g, Y, G, x), with feasible set Y ⊆ G over decision domain G, an input instance x ∈ X, and an objective function g : X × G → R, find the optimal value
α = sup_{y ∈ Y} g(x, y),
and, if the supremum is attained, an optimal solution y* ∈ Y such that g(x, y*) = α.
5 The Feasible Set
Ingredients:
- decision domain G, typically simple (G = R^d, G = 2^V, etc.),
- feasible set Y ⊆ G, defining the problem-specific structure,
- objective function g : X × G → R.
Terminology:
- Y = G: unconstrained optimization problem,
- G finite: discrete optimization problem,
- G = 2^Σ for a ground set Σ: combinatorial optimization problem,
- Y = ∅: infeasible problem.
6 Example: Feasible Sets
Ising model with external field: graph G = (V, E), external field h ∈ R^V, interaction matrix J ∈ R^{V×V}. (The figure shows a three-node chain Y_i - Y_j - Y_k with states (+1), (−1), (−1), unary terms h_i y_i, and pairwise terms J_{ij} y_i y_j.)
Objective, defined on y_i ∈ {−1, +1}:
g(y) = h_i y_i + h_j y_j + h_k y_k + J_{ij} y_i y_j + J_{jk} y_j y_k
7 Example: Feasible Sets (cont)
Ising model with external field:
Y = G = {−1, +1}^V
g(y) = (1/2) Σ_{(i,j) ∈ E} J_{i,j} y_i y_j + Σ_{i ∈ V} h_i y_i
Unconstrained; the objective function contains quadratic terms.
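The quadratic objective above can be evaluated directly; a minimal numpy sketch (the three-node chain, the matrix J, and the field h are illustrative, not from the slides):

```python
import numpy as np

# Ising objective, matching the slide:
# g(y) = 1/2 * sum_{(i,j) in E} J[i,j]*y[i]*y[j] + sum_i h[i]*y[i],
# with spins y_i in {-1, +1}. All numbers below are made up.

def ising_objective(y, h, J, edges):
    """Evaluate the Ising objective for a spin configuration y in {-1,+1}^V."""
    pairwise = 0.5 * sum(J[i, j] * y[i] * y[j] for (i, j) in edges)
    unary = sum(h[i] * y[i] for i in range(len(y)))
    return pairwise + unary

# Tiny three-node chain, as in the slide's figure
edges = [(0, 1), (1, 2)]
J = np.zeros((3, 3))
J[0, 1] = J[1, 0] = 1.0
J[1, 2] = J[2, 1] = -0.5
h = np.array([0.2, -0.1, 0.3])

y = np.array([+1, -1, -1])
print(ising_objective(y, h, J, edges))
```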
8 Example: Feasible Sets (cont)
An equivalent formulation over 0/1 indicator variables:
G = {0, 1}^{(V × {−1,+1}) ∪ (E × {−1,+1} × {−1,+1})},
Y = {y ∈ G : ∀i ∈ V: y_{i,−1} + y_{i,+1} = 1,
  ∀(i,j) ∈ E: y_{i,j,+1,+1} + y_{i,j,+1,−1} = y_{i,+1},
  ∀(i,j) ∈ E: y_{i,j,−1,+1} + y_{i,j,−1,−1} = y_{i,−1}},
g(y) = (1/2) Σ_{(i,j) ∈ E} J_{i,j} (y_{i,j,+1,+1} + y_{i,j,−1,−1}) − (1/2) Σ_{(i,j) ∈ E} J_{i,j} (y_{i,j,+1,−1} + y_{i,j,−1,+1}) + Σ_{i ∈ V} h_i (y_{i,+1} − y_{i,−1})
Constrained, with more variables; the objective function contains linear terms only.
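As a sanity check on the two formulations, a small sketch (all names and numbers assumed) verifying that the linear objective over indicator variables reproduces the quadratic Ising objective for every spin configuration:

```python
import numpy as np
from itertools import product

# Overcomplete 0/1 encoding of the Ising model: node indicators y_{i,s} and
# edge indicators y_{i,j,s,t}. The linear objective should agree with the
# quadratic one for every configuration. All numbers are illustrative.

def quadratic(y, h, J, edges):
    return 0.5 * sum(J[i, j] * y[i] * y[j] for (i, j) in edges) \
         + sum(h[i] * y[i] for i in range(len(y)))

def linear_overcomplete(y, h, J, edges):
    node = lambda i, s: 1.0 if y[i] == s else 0.0       # y_{i,s}
    edge = lambda i, j, s, t: node(i, s) * node(j, t)   # y_{i,j,s,t}
    val = 0.0
    for (i, j) in edges:
        val += 0.5 * J[i, j] * (edge(i, j, +1, +1) + edge(i, j, -1, -1))
        val -= 0.5 * J[i, j] * (edge(i, j, +1, -1) + edge(i, j, -1, +1))
    for i in range(len(y)):
        val += h[i] * (node(i, +1) - node(i, -1))
    return val

edges = [(0, 1), (1, 2)]
J = np.zeros((3, 3)); J[0, 1] = 1.0; J[1, 2] = -0.5
h = np.array([0.2, -0.1, 0.3])
for y in product([-1, +1], repeat=3):
    assert abs(quadratic(y, h, J, edges) - linear_overcomplete(y, h, J, edges)) < 1e-9
```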
9 Evaluating f: what do we want?
f(x) = argmax_{y ∈ Y} g(x, y)
For evaluating f(x) we want an algorithm that
1. is general: applicable to all instances of the problem,
2. is optimal: provides an optimal solution y*,
3. has good worst-case complexity: for all instances, runtime and space are acceptably bounded,
4. is integral: its solutions are restricted to Y,
5. is deterministic: its results and runtime are reproducible and depend on the input data only.
Wanting all of these at once is, in general, impossible.
11 Giving up some properties
Hard problem: generality, optimality, worst-case complexity, integrality, determinism.
Giving up one or more of these properties allows us to design algorithms that satisfy the remaining ones, which might be sufficient for the task at hand.
12 G: Generality Hard problem Generality Optimality Worst-case complexity Integrality Determinism
13 G: Generality Giving up Generality: identify an interesting and tractable subset of instances. (Figure: the tractable subset inside the set of all instances.)
14 G: Generality Example: MAP Inference in Markov Random Fields
Although NP-hard in general, it is tractable...
- with low tree-width (Lauritzen and Spiegelhalter, 1988),
- with binary states and pairwise submodular interactions (Boykov and Jolly, 2001),
- with binary states, pairwise interactions only, and planar graph structure (Globerson and Jaakkola, 2006),
- with submodular pairwise interactions (Schlesinger, 2006),
- with P^n-Potts higher-order factors (Kohli, Kumar, and Torr, 2007),
- with perfect graph structure (Jebara, 2009).
15 G: Generality Binary Graph-Cuts
Energy function with unary and pairwise factors:
E(y; x, w) = Σ_{F ∈ F_1} E_F(y_F; x, w^{t_F}) + Σ_{F ∈ F_2} E_F(y_F; x, w^{t_F})
Restriction 1 (w.l.o.g.): E_F(y_i; x, w^{t_F}) ≥ 0.
Restriction 2 (regular/submodular/attractive):
E_F(y_i, y_j; x, w^{t_F}) = 0, if y_i = y_j,
E_F(y_i, y_j; x, w^{t_F}) = E_F(y_j, y_i; x, w^{t_F}) ≥ 0, otherwise.
17 G: Generality Binary Graph-Cuts (cont)
Construct an auxiliary undirected graph with one node per variable i ∈ V and two extra nodes: source s and sink t.
Edges and graph-cut weights:
- {i, j}: E_F(y_i = 0, y_j = 1; x, w^{t_F}),
- {i, s}: E_F(y_i = 1; x, w^{t_F}),
- {i, t}: E_F(y_i = 0; x, w^{t_F}).
Find a linear s-t-mincut; the solution defines an optimal binary labeling of the original energy minimization problem.
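The construction can be sketched end to end on a tiny problem; here the s-t-mincut is solved with a small Edmonds-Karp max-flow and checked against brute-force energy minimization. The energy tables (E0, E1, W) and all numbers are made up for illustration:

```python
from collections import deque
from itertools import product

# Auxiliary-graph construction from the slide for a 3-variable binary energy
# satisfying the two restrictions. The min-cut value should equal the minimum
# energy (verified by brute force below).

E0 = [0.0, 2.0, 1.0]            # E_i(y_i = 0) -> weight of edge {i, t}
E1 = [1.5, 0.5, 0.8]            # E_i(y_i = 1) -> weight of edge {i, s}
W = {(0, 1): 0.7, (1, 2): 0.4}  # symmetric pairwise cost, paid when y_i != y_j

n, s, t = 3, 3, 4               # nodes 0..2 are variables, 3 = source, 4 = sink
cap = [[0.0] * 5 for _ in range(5)]
for i in range(n):
    cap[s][i] = E1[i]           # cut when i lands on the sink side (y_i = 1)
    cap[i][t] = E0[i]           # cut when i lands on the source side (y_i = 0)
for (i, j), w in W.items():
    cap[i][j] += w              # undirected edge -> capacity in both directions
    cap[j][i] += w

def max_flow(cap, s, t):
    """Edmonds-Karp: BFS augmenting paths on the residual graph."""
    flow, cap = 0.0, [row[:] for row in cap]
    while True:
        parent = {s: s}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in range(len(cap)):
                if v not in parent and cap[u][v] > 1e-12:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while v != s:
            path.append((parent[v], v)); v = parent[v]
        aug = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= aug
            cap[v][u] += aug
        flow += aug

def energy(y):
    e = sum(E1[i] if y[i] else E0[i] for i in range(n))
    return e + sum(w for (i, j), w in W.items() if y[i] != y[j])

best = min(product([0, 1], repeat=n), key=energy)
assert abs(max_flow(cap, s, t) - energy(best)) < 1e-9
```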
18 G: Generality Example: Figure-Ground Segmentation Input image
19 G: Generality Example: Figure-Ground Segmentation Color model log-odds
20 G: Generality Example: Figure-Ground Segmentation Independent decisions
21 G: Generality Example: Figure-Ground Segmentation
g(x, y, w) = Σ_{i ∈ V} log p(y_i | x_i) + w Σ_{(i,j) ∈ E} C(x_i, x_j) I(y_i ≠ y_j)
Gradient strength: C(x_i, x_j) = exp(−γ ||x_i − x_j||²), with γ estimated from the mean edge strength (Blake et al., 2004); w ≥ 0 controls smoothing.
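The contrast term can be computed in a few lines; a hedged numpy sketch on a made-up 1D "image", using a common heuristic (setting γ from twice the mean squared difference, one assumed variant of the Blake et al. estimate):

```python
import numpy as np

# Contrast-sensitive pairwise weight from the slide:
# C(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2),
# with gamma set from the mean edge strength. Data are illustrative.

x = np.array([0.1, 0.12, 0.11, 0.8, 0.82])   # pixel intensities along a line
diffs2 = (x[1:] - x[:-1]) ** 2               # squared edge strengths
gamma = 1.0 / (2.0 * diffs2.mean())          # heuristic gamma from mean strength
C = np.exp(-gamma * diffs2)                  # one contrast weight per edge

# Strong intensity edges get small C, so cutting the segmentation there is cheap.
assert C[2] < C[0]   # the 0.11 -> 0.8 jump is the strong edge
```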
22 G: Generality Example: Figure-Ground Segmentation w = 0
23 G: Generality Example: Figure-Ground Segmentation Small w > 0
24 G: Generality Example: Figure-Ground Segmentation Medium w > 0
25 G: Generality Example: Figure-Ground Segmentation Large w > 0
26 G: Generality General Binary Case
Is there a larger class of energies for which binary graph cuts are applicable? (Kolmogorov and Zabih, 2004; Freedman and Drineas, 2005)
Theorem (Regular Binary Energies). Let E(y; x, w) = Σ_{F ∈ F_1} E_F(y_F; x, w^{t_F}) + Σ_{F ∈ F_2} E_F(y_F; x, w^{t_F}) be an energy function of binary variables containing only unary and pairwise factors. The discrete energy minimization problem argmin_y E(y; x, w) is representable as a graph cut problem if and only if all pairwise energy functions E_F, for F = {i, j} ∈ F_2, satisfy
E_{i,j}(0, 0) + E_{i,j}(1, 1) ≤ E_{i,j}(0, 1) + E_{i,j}(1, 0).
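Behind the theorem sits a simple decomposition (due to Kolmogorov and Zabih): any regular pairwise table splits into a constant, two unary terms, and one nonnegative cut weight. A minimal sketch with a made-up table, verified by enumeration:

```python
from itertools import product

# With A = E(0,0), B = E(0,1), C = E(1,0), D = E(1,1) and A + D <= B + C,
# the table decomposes into: constant A, unary (C - A) on y_i = 1, unary
# (D - C) on y_j = 1, and cut weight B + C - A - D paid only at (0, 1).

def decompose(A, B, C, D):
    assert A + D <= B + C, "energy is not regular (submodular)"
    const = A
    u_i = C - A           # added to E_i(y_i = 1)
    u_j = D - C           # added to E_j(y_j = 1)
    cut = B + C - A - D   # nonnegative weight of the edge {i, j}
    return const, u_i, u_j, cut

A, B, C, D = 0.0, 3.0, 2.0, 1.0                 # illustrative regular table
const, u_i, u_j, cut = decompose(A, B, C, D)
table = {(0, 0): A, (0, 1): B, (1, 0): C, (1, 1): D}
for yi, yj in product([0, 1], repeat=2):
    rebuilt = const + u_i * yi + u_j * yj + (cut if (yi, yj) == (0, 1) else 0.0)
    assert abs(rebuilt - table[(yi, yj)]) < 1e-9
```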
28 G: Generality Example: Class-independent Object Hypotheses
(Carreira and Sminchisescu, 2010): PASCAL VOC 2009/2010 segmentation winner; generates class-independent object hypotheses.
Energy (almost) as before:
g(x, y, w) = Σ_{i ∈ V} E_i(y_i) + w Σ_{(i,j) ∈ E} C(x_i, x_j) I(y_i ≠ y_j)
Fixed unaries: E_i(y_i) = ∞ if i ∈ V_fg and y_i = 0; ∞ if i ∈ V_bg and y_i = 1; 0 otherwise.
Test all w ≥ 0 using parametric max-flow (Picard and Queyranne, 1980; Kolmogorov et al., 2007).
31 G: Generality Example: Class-independent Object Hypotheses (cont) Input image
32 G: Generality Example: Class-independent Object Hypotheses (cont) CPMC proposal segmentations (Carreira and Sminchisescu, 2010)
33 Hard problem Generality Optimality Worst-case complexity Integrality Determinism
34 Giving up Optimality
Solving for y* is hard, but is it necessary?
- Pragmatic motivation: in many applications a close-to-optimal solution is good enough.
- Computational motivation: the set of good solutions might be large, and finding just one element can be easy.
For machine learning models:
- Modeling error: we always use the wrong model.
- Estimation error: the preference for y* might be an artifact.
36 Local Search (figure sequence): starting from an initial y^0 ∈ Y, repeatedly move to the best point in the current neighborhood, producing y^1 ∈ N(y^0), y^2 ∈ N(y^1), y^3 ∈ N(y^2), ..., until a point y* is reached that is optimal within its own neighborhood N(y*).
40 Local Search
N_t : Y → 2^Y, a neighborhood system. Optimization with respect to N_t(y) must be tractable:
y^{t+1} = argmax_{y ∈ N_t(y^t)} g(x, y)
41 Example: Iterated Conditional Modes (ICM) (Besag, 1986)
g(x, y) = log p(y | x), y* = argmax_{y ∈ Y} log p(y | x)
Neighborhoods: N_s(y) = {(y_1, ..., y_{s−1}, z_s, y_{s+1}, ..., y_S) : z_s ∈ Y_s}
42 ICM update for variable 1: y^{t+1} = argmax_{y_1 ∈ Y_1} log p(y_1, y^t_2, ..., y^t_{|V|} | x)
43 ICM update for variable 2: y^{t+1} = argmax_{y_2 ∈ Y_2} log p(y^t_1, y_2, y^t_3, ..., y^t_{|V|} | x)
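The coordinate-wise updates above can be sketched directly; a minimal ICM implementation on an assumed chain energy (the Potts smoothing term and all numbers are illustrative):

```python
import numpy as np

# ICM: repeatedly set each variable to its best label given the current
# values of all others, until no single-variable move improves the energy.

K = 3                                  # number of labels
unary = np.array([[0.0, 1.0, 2.0],     # unary[i, k] = E_i(k)
                  [2.0, 0.0, 1.0],
                  [2.0, 1.0, 0.0]])
lam = 0.5                              # Potts smoothing strength

def energy(y):
    e = sum(unary[i, y[i]] for i in range(len(y)))
    e += lam * sum(y[i] != y[i + 1] for i in range(len(y) - 1))
    return e

def icm(y):
    y = list(y)
    improved = True
    while improved:
        improved = False
        for i in range(len(y)):
            best = min(range(K), key=lambda k: energy(y[:i] + [k] + y[i+1:]))
            if best != y[i]:
                y[i] = best
                improved = True
    return y                           # a local optimum w.r.t. single-variable moves

y = icm([2, 2, 2])
print(y, energy(y))
```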
44 Neighborhood Size
ICM neighborhood N_t(y^t): all states reachable from y^t by changing a single variable (Besag, 1986).
Neighborhood size: in general, larger is better (very large-scale neighborhood search; Ahuja, 2000).
Example: neighborhoods along chains.
45 Block ICM update (Kelm et al., 2006; Kittler and Föglein, 1984): y^{t+1} = argmax_{y_{C_1} ∈ Y_{C_1}} log p(y_{C_1}, y^t_{V \ C_1} | x)
46 Block ICM update: y^{t+1} = argmax_{y_{C_2} ∈ Y_{C_2}} log p(y_{C_2}, y^t_{V \ C_2} | x)
47 Example: Multilabel Graph-Cut
Binary graph-cuts are not directly applicable to multilabel energy minimization problems. (Boykov et al., 2001) proposed two local search algorithms for multilabel problems: each solves a sequence of binary directed s-t-mincut problems, iteratively improving the multilabel solution.
48 α-β Swap Neighborhood
Select two different labels α and β; fix all variables i for which y_i ∉ {α, β}; optimize over the remaining i with y_i ∈ {α, β}.
N_{α,β} : Y × N × N → 2^Y,
N_{α,β}(y, α, β) := {z ∈ Y : z_i = y_i if y_i ∉ {α, β}, otherwise z_i ∈ {α, β}}.
49 α-β-swap illustrated (figure sequence): a 5-label problem and a succession of α-β-swap moves.
54 α-β-swap derivation
y^{t+1} = argmin_{y ∈ N_{α,β}(y^t, α, β)} E(y; x)
Constant terms drop out; unary terms combine; pairwise terms reduce to binary pairwise terms.
55 α-β-swap derivation (cont)
y^{t+1} = argmin_{y ∈ N_{α,β}(y^t, α, β)} Σ_{i ∈ V} E_i(y_i; x) + Σ_{(i,j) ∈ E} E_{i,j}(y_i, y_j; x)
56 α-β-swap derivation (cont)
Splitting the sums according to whether y^t_i, y^t_j ∈ {α, β}:
y^{t+1} = argmin_{y ∈ N_{α,β}(y^t, α, β)} [ Σ_{i ∈ V, y^t_i ∉ {α,β}} E_i(y^t_i; x) + Σ_{i ∈ V, y^t_i ∈ {α,β}} E_i(y_i; x)
+ Σ_{(i,j) ∈ E, y^t_i ∉ {α,β}, y^t_j ∉ {α,β}} E_{i,j}(y^t_i, y^t_j; x) + Σ_{(i,j) ∈ E, y^t_i ∈ {α,β}, y^t_j ∉ {α,β}} E_{i,j}(y_i, y^t_j; x)
+ Σ_{(i,j) ∈ E, y^t_i ∉ {α,β}, y^t_j ∈ {α,β}} E_{i,j}(y^t_i, y_j; x) + Σ_{(i,j) ∈ E, y^t_i ∈ {α,β}, y^t_j ∈ {α,β}} E_{i,j}(y_i, y_j; x) ].
Constant terms drop out; unary terms combine; pairwise terms reduce to binary pairwise terms.
58 α-β-swap graph construction
Directed graph G' = (V', E'):
V' = {α, β} ∪ {i ∈ V : y_i ∈ {α, β}},
E' = {(α, i, t^α_i) : i ∈ V, y_i ∈ {α, β}} ∪ {(i, β, t^β_i) : i ∈ V, y_i ∈ {α, β}} ∪ {(i, j, n_{i,j}) : (i, j), (j, i) ∈ E, y_i, y_j ∈ {α, β}}.
Edge weights t^α_i, t^β_i, and n_{i,j}:
n_{i,j} = E_{i,j}(α, β; x),
t^α_i = E_i(α; x) + Σ_{(i,j) ∈ E, y_j ∉ {α,β}} E_{i,j}(α, y_j; x),
t^β_i = E_i(β; x) + Σ_{(i,j) ∈ E, y_j ∉ {α,β}} E_{i,j}(β, y_j; x).
60 α-β-swap move
The side of the cut determines y_i ∈ {α, β}. Iterate over all possible (α, β) combinations.
Semi-metric requirement on the pairwise energies:
E_{i,j}(y_i, y_j; x) = 0 ⟺ y_i = y_j,
E_{i,j}(y_i, y_j; x) = E_{i,j}(y_j, y_i; x) ≥ 0.
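The move structure can be sketched as follows. For illustration the binary subproblem over the active variables is solved by enumeration rather than by an s-t-mincut, which is fine for tiny problems; the outer loop over (α, β) pairs is the same. The chain energy and all numbers are made up:

```python
from itertools import product

# Alpha-beta swap outer loop with a brute-force inner solver.
# Potts pairwise lam * [y_i != y_j] is a semi-metric, as required.

K = 3
unary = [[0.0, 1.0, 2.0], [2.0, 0.0, 1.0], [2.0, 1.0, 0.0], [0.5, 0.5, 2.0]]
edges = [(0, 1), (1, 2), (2, 3)]
lam = 0.6

def energy(y):
    return sum(unary[i][y[i]] for i in range(len(y))) \
         + lam * sum(y[i] != y[j] for (i, j) in edges)

def best_swap(y, a, b):
    """Best move in N_{a,b}(y): variables labeled a or b may switch within {a, b}."""
    active = [i for i in range(len(y)) if y[i] in (a, b)]
    best, best_e = y, energy(y)
    for choice in product((a, b), repeat=len(active)):
        z = list(y)
        for i, c in zip(active, choice):
            z[i] = c
        if energy(z) < best_e:
            best, best_e = z, energy(z)
    return best

def alpha_beta_swap(y):
    improved = True
    while improved:
        improved = False
        for a in range(K):
            for b in range(a + 1, K):     # iterate all (alpha, beta) pairs
                z = best_swap(y, a, b)
                if energy(z) < energy(y):
                    y, improved = z, True
    return y

y = alpha_beta_swap([0, 0, 0, 0])
print(y, energy(y))
```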
63 Example: Stereo Disparity Estimation
Infer depth from two images; a discretized multi-label problem. The α-expansion solution is close to optimal.
65 Model Reduction
An energy minimization problem involves many decisions to be made jointly. Model reduction:
1. Fix a subset of decisions.
2. Optimize the smaller remaining model.
Example: forcing y_i = y_j for pairs (i, j).
66 Example: Superpixels in Labeling Problems Input image: 500-by-375 pixels (187,500 decisions)
67 Example: Superpixels in Labeling Problems Image with 149 superpixels (149 decisions)
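The superpixel reduction can be sketched as tying together the variables inside each group and optimizing one label per group. A minimal numpy sketch with made-up groupings and unaries (pairwise terms between groups are ignored here for brevity):

```python
import numpy as np

# Model reduction by grouping: force y_i = y_j within each superpixel,
# then optimize the small model over group labels. Data are illustrative.

unary = np.array([[0.0, 2.0],   # unary[i, k] = E_i(k): 6 "pixels", 2 labels
                  [0.5, 1.5],
                  [0.2, 1.8],
                  [2.0, 0.0],
                  [1.5, 0.5],
                  [1.8, 0.2]])
group = np.array([0, 0, 0, 1, 1, 1])   # superpixel id for each pixel

# Reduced unaries: a group's cost for label k is the sum over its members.
reduced = np.zeros((2, 2))
for i, g in enumerate(group):
    reduced[g] += unary[i]

labels_per_group = reduced.argmin(axis=1)   # optimize the reduced model
y = labels_per_group[group]                 # expand back to pixel labels
print(y)
```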
Does Better Inference mean Better Learning? Andrew E. Gelfand, Rina Dechter & Alexander Ihler Department of Computer Science University of California, Irvine {agelfand,dechter,ihler}@ics.uci.edu Abstract
More informationInteger and Combinatorial Optimization: Introduction
Integer and Combinatorial Optimization: Introduction John E. Mitchell Department of Mathematical Sciences RPI, Troy, NY 12180 USA November 2018 Mitchell Introduction 1 / 18 Integer and Combinatorial Optimization
More informationACO Comprehensive Exam March 20 and 21, Computability, Complexity and Algorithms
1. Computability, Complexity and Algorithms Part a: You are given a graph G = (V,E) with edge weights w(e) > 0 for e E. You are also given a minimum cost spanning tree (MST) T. For one particular edge
More informationOn Partial Optimality in Multi-label MRFs
Pushmeet Kohli 1 Alexander Shekhovtsov 2 Carsten Rother 1 Vladimir Kolmogorov 3 Philip Torr 4 pkohli@microsoft.com shekhovt@cmp.felk.cvut.cz carrot@microsoft.com vnk@adastral.ucl.ac.uk philiptorr@brookes.ac.uk
More informationRevisiting the Limits of MAP Inference by MWSS on Perfect Graphs
Revisiting the Limits of MAP Inference by MWSS on Perfect Graphs Adrian Weller University of Cambridge CP 2015 Cork, Ireland Slides and full paper at http://mlg.eng.cam.ac.uk/adrian/ 1 / 21 Motivation:
More informationIntroduction to Graphical Models. Srikumar Ramalingam School of Computing University of Utah
Introduction to Graphical Models Srikumar Ramalingam School of Computing University of Utah Reference Christopher M. Bishop, Pattern Recognition and Machine Learning, Jonathan S. Yedidia, William T. Freeman,
More informationLMI Methods in Optimal and Robust Control
LMI Methods in Optimal and Robust Control Matthew M. Peet Arizona State University Lecture 02: Optimization (Convex and Otherwise) What is Optimization? An Optimization Problem has 3 parts. x F f(x) :
More informationAdvanced Structured Prediction
Advanced Structured Prediction Editors: Sebastian Nowozin Microsoft Research Cambridge, CB1 2FB, United Kingdom Peter V. Gehler Max Planck Insitute for Intelligent Systems 72076 Tübingen, Germany Jeremy
More informationAlternative Parameterizations of Markov Networks. Sargur Srihari
Alternative Parameterizations of Markov Networks Sargur srihari@cedar.buffalo.edu 1 Topics Three types of parameterization 1. Gibbs Parameterization 2. Factor Graphs 3. Log-linear Models with Energy functions
More informationThe geometry of Gaussian processes and Bayesian optimization. Contal CMLA, ENS Cachan
The geometry of Gaussian processes and Bayesian optimization. Contal CMLA, ENS Cachan Background: Global Optimization and Gaussian Processes The Geometry of Gaussian Processes and the Chaining Trick Algorithm
More informationLecture 18: Multiclass Support Vector Machines
Fall, 2017 Outlines Overview of Multiclass Learning Traditional Methods for Multiclass Problems One-vs-rest approaches Pairwise approaches Recent development for Multiclass Problems Simultaneous Classification
More informationSubmodularity beyond submodular energies: Coupling edges in graph cuts
Submodularity beyond submodular energies: Coupling edges in graph cuts Stefanie Jegelka and Jeff Bilmes Max Planck Institute for Intelligent Systems Tübingen, Germany University of Washington Seattle,
More informationPartially labeled classification with Markov random walks
Partially labeled classification with Markov random walks Martin Szummer MIT AI Lab & CBCL Cambridge, MA 0239 szummer@ai.mit.edu Tommi Jaakkola MIT AI Lab Cambridge, MA 0239 tommi@ai.mit.edu Abstract To
More informationMinimizing Count-based High Order Terms in Markov Random Fields
EMMCVPR 2011, St. Petersburg Minimizing Count-based High Order Terms in Markov Random Fields Thomas Schoenemann Center for Mathematical Sciences Lund University, Sweden Abstract. We present a technique
More informationDirected and Undirected Graphical Models
Directed and Undirected Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Machine Learning: Neural Networks and Advanced Models (AA2) Last Lecture Refresher Lecture Plan Directed
More informationInference in Graphical Models Variable Elimination and Message Passing Algorithm
Inference in Graphical Models Variable Elimination and Message Passing lgorithm Le Song Machine Learning II: dvanced Topics SE 8803ML, Spring 2012 onditional Independence ssumptions Local Markov ssumption
More informationHigher-Order Energies for Image Segmentation
IEEE TRANSACTIONS ON IMAGE PROCESSING 1 Higher-Order Energies for Image Segmentation Jianbing Shen, Senior Member, IEEE, Jianteng Peng, Xingping Dong, Ling Shao, Senior Member, IEEE, and Fatih Porikli,
More informationLecture 15. Probabilistic Models on Graph
Lecture 15. Probabilistic Models on Graph Prof. Alan Yuille Spring 2014 1 Introduction We discuss how to define probabilistic models that use richly structured probability distributions and describe how
More informationJoint Optimization of Segmentation and Appearance Models
Joint Optimization of Segmentation and Appearance Models David Mandle, Sameep Tandon April 29, 2013 David Mandle, Sameep Tandon (Stanford) April 29, 2013 1 / 19 Overview 1 Recap: Image Segmentation 2 Optimization
More informationUNDERSTANDING BELIEF PROPOGATION AND ITS GENERALIZATIONS
UNDERSTANDING BELIEF PROPOGATION AND ITS GENERALIZATIONS JONATHAN YEDIDIA, WILLIAM FREEMAN, YAIR WEISS 2001 MERL TECH REPORT Kristin Branson and Ian Fasel June 11, 2003 1. Inference Inference problems
More informationTightness of LP Relaxations for Almost Balanced Models
Tightness of LP Relaxations for Almost Balanced Models Adrian Weller University of Cambridge AISTATS May 10, 2016 Joint work with Mark Rowland and David Sontag For more information, see http://mlg.eng.cam.ac.uk/adrian/
More information3 : Representation of Undirected GM
10-708: Probabilistic Graphical Models 10-708, Spring 2016 3 : Representation of Undirected GM Lecturer: Eric P. Xing Scribes: Longqi Cai, Man-Chia Chang 1 MRF vs BN There are two types of graphical models:
More informationRandom Field Models for Applications in Computer Vision
Random Field Models for Applications in Computer Vision Nazre Batool Post-doctorate Fellow, Team AYIN, INRIA Sophia Antipolis Outline Graphical Models Generative vs. Discriminative Classifiers Markov Random
More informationSubmodularization for Binary Pairwise Energies
IEEE conference on Computer Vision and Pattern Recognition (CVPR), Columbus, Ohio, 214 p. 1 Submodularization for Binary Pairwise Energies Lena Gorelick Yuri Boykov Olga Veksler Computer Science Department
More informationGraph Cut based Inference with Co-occurrence Statistics
Graph Cut based Inference with Co-occurrence Statistics Lubor Ladicky 1,3, Chris Russell 1,3, Pushmeet Kohli 2, and Philip H.S. Torr 1 1 Oxford Brookes 2 Microsoft Research Abstract. Markov and Conditional
More informationProbabilistic Graphical Models & Applications
Probabilistic Graphical Models & Applications Learning of Graphical Models Bjoern Andres and Bernt Schiele Max Planck Institute for Informatics The slides of today s lecture are authored by and shown with
More informationDiscriminative Fields for Modeling Spatial Dependencies in Natural Images
Discriminative Fields for Modeling Spatial Dependencies in Natural Images Sanjiv Kumar and Martial Hebert The Robotics Institute Carnegie Mellon University Pittsburgh, PA 15213 {skumar,hebert}@ri.cmu.edu
More informationMultiresolution Graph Cut Methods in Image Processing and Gibbs Estimation. B. A. Zalesky
Multiresolution Graph Cut Methods in Image Processing and Gibbs Estimation B. A. Zalesky 1 2 1. Plan of Talk 1. Introduction 2. Multiresolution Network Flow Minimum Cut Algorithm 3. Integer Minimization
More informationAsaf Bar Zvi Adi Hayat. Semantic Segmentation
Asaf Bar Zvi Adi Hayat Semantic Segmentation Today s Topics Fully Convolutional Networks (FCN) (CVPR 2015) Conditional Random Fields as Recurrent Neural Networks (ICCV 2015) Gaussian Conditional random
More informationA Graph Cut Algorithm for Generalized Image Deconvolution
A Graph Cut Algorithm for Generalized Image Deconvolution Ashish Raj UC San Francisco San Francisco, CA 94143 Ramin Zabih Cornell University Ithaca, NY 14853 Abstract The goal of deconvolution is to recover
More informationMulticlass Classification-1
CS 446 Machine Learning Fall 2016 Oct 27, 2016 Multiclass Classification Professor: Dan Roth Scribe: C. Cheng Overview Binary to multiclass Multiclass SVM Constraint classification 1 Introduction Multiclass
More informationStructured Prediction
Structured Prediction Classification Algorithms Classify objects x X into labels y Y First there was binary: Y = {0, 1} Then multiclass: Y = {1,...,6} The next generation: Structured Labels Structured
More information