Graph Cut based Inference with Co-occurrence Statistics. Ľubor Ladický, Chris Russell, Pushmeet Kohli, Philip Torr

Image labelling problems: assign a label to each image pixel. Examples: geometry estimation, image denoising, object segmentation (with labels such as sky, building, tree, grass).

Pairwise CRF models. The standard CRF energy is a data term plus a smoothness term; this model has restricted expressive power.
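The energy equation itself appears as an image in the original slides and was lost in transcription; in the usual notation (an assumption consistent with the rest of the talk) it reads:

```latex
E(\mathbf{x}) \;=\;
\underbrace{\sum_{i\in\mathcal{V}} \psi_i(x_i)}_{\text{data term}}
\;+\;
\underbrace{\sum_{(i,j)\in\mathcal{E}} \psi_{ij}(x_i,x_j)}_{\text{smoothness term}}
```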

Structures in CRFs: Taskar et al. 02 - associative potentials; Kohli et al. 08 - segment consistency; Woodford et al. 08 - planarity constraint; Vicente et al. 08 - connectivity constraint; Nowozin & Lampert 09 - connectivity constraint; Roth & Black 09 - field of experts; Ladický et al. 09 - consistency over several scales; Woodford et al. 09 - marginal probability; Delong et al. 10 - label occurrence costs.

Pairwise CRF models. The standard CRF energy for object segmentation captures only local context: it cannot encode global consistency of labels!

Detection suppression. If we have 1000 categories (detectors), and each detector produces 1 false positive every 10 images, we will have 100 false alarms per image (detections such as chair, table, car, keyboard, road): pretty much garbage [Torralba et al. 10, Leibe & Schiele 09, Barinova et al. 10]. [Image from Torralba et al. 10]

Encoding co-occurrence. Co-occurrence is a powerful cue [Heitz et al. 08] [Rabinovich et al. 07], covering Thing - Thing, Stuff - Stuff, and Stuff - Thing relationships. Proposed solutions: 1. Csurka et al. 08 - hard decision for label estimation; 2. Torralba et al. 03 - GIST-based unary potential; 3. Rabinovich et al. 07 - fully-connected CRF. [Images from Rabinovich et al. 07]

So... what properties should these global co-occurrence potentials have?

Desired properties:
1. No hard decisions: incorporation in a probabilistic framework, so unlikely possibilities are not completely ruled out.
2. Invariance to region size: the cost for the occurrence of {people, house, road, etc.} is invariant to image area. The only possible solution is a cost defined over the set of assigned labels L(x), combining local and global context (see the energy sketched after this list).
3. Parsimony: simple solutions preferred, e.g. L(x) = {building, tree, grass, sky} rather than L(x) = {aeroplane, tree, flower, building, boat, grass, sky}.
4. Efficiency: (a) memory requirements grow as O(n) with the image size and the number of labels; (b) inference remains tractable.
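Written out, the energy with the global label-set term (a sketch in the notation above; the slides show it only as an image) is:

```latex
E(\mathbf{x}) \;=\; \sum_{i\in\mathcal{V}} \psi_i(x_i)
\;+\; \sum_{(i,j)\in\mathcal{E}} \psi_{ij}(x_i,x_j)
\;+\; C\bigl(L(\mathbf{x})\bigr),
\qquad
L(\mathbf{x}) \;=\; \{\, l \in \mathcal{L} \;:\; \exists\, i,\ x_i = l \,\}
```

Because C depends only on which labels occur, not on how many pixels take each label, the term is invariant to region size by construction.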

Previous work: Torralba et al. (2003) - GIST-based unary potentials; Rabinovich et al. (2007) - complete pairwise graphs; Csurka et al. (2008) - hard estimation of labels present.

Related work: Zhu & Yuille 1996 - MDL prior; Bleyer et al. 2010 - Surface Stereo MDL prior; Hoiem et al. 2007 - 3D Layout CRF MDL prior, C(x) = K |L(x)|; Delong et al. 2010 - label occurrence cost, C(x) = Σ_l K_l δ_l(x). All are special cases of our model.

Inference Pairwise CRF Energy

Inference IP formulation (Schlesinger 73)

Inference: pairwise CRF energy with co-occurrence.

Inference: IP formulation with co-occurrence. The objective adds the co-occurrence cost to the pairwise CRF cost; the constraints are the pairwise CRF constraints together with inclusion and exclusion constraints that tie the label-presence indicators to the pixel variables.
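The IP itself is shown as an equation image on the slides; a sketch of its shape, assuming Schlesinger-style indicator variables x_{i;l} for pixel labels, x_{ij;ll'} for pairs, and y_l for label presence (names chosen here for illustration), is:

```latex
\min_{\mathbf{x},\mathbf{y}} \;
\sum_{i,l} \psi_i(l)\,x_{i;l}
\;+\; \sum_{(i,j),\,l,l'} \psi_{ij}(l,l')\,x_{ij;ll'}
\;+\; C(\mathbf{y})
\quad\text{s.t.}\quad
\underbrace{y_l \ge x_{i;l}\ \ \forall i,l}_{\text{inclusion}},
\qquad
\underbrace{y_l \le \textstyle\sum_i x_{i;l}\ \ \forall l}_{\text{exclusion}},
```

plus the usual pairwise CRF constraints on x, with all variables in {0,1}. Inclusion forces y_l = 1 whenever any pixel takes label l; exclusion forbids y_l = 1 when no pixel does.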

Inference: LP relaxation with relaxed constraints.

Inference: the LP relaxation is very slow! An 80 x 50 subsampled image takes 20 minutes.

Inference: our contribution. A pairwise representation with one auxiliary variable Z ∈ 2^L and infinite pairwise costs if x_i ∉ Z [see technical report]. Solvable using standard methods (BP, TRW, etc.); relatively faster, but still computationally expensive!
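One way to read this construction (a sketch, assuming C is non-decreasing under set inclusion): the global term becomes pairwise between each pixel and the auxiliary variable, since

```latex
C\bigl(L(\mathbf{x})\bigr) \;=\; \min_{Z \in 2^{\mathcal{L}}} \Bigl[\, C(Z) \;+\; \sum_i \phi(x_i, Z) \Bigr],
\qquad
\phi(x_i, Z) \;=\;
\begin{cases}
0 & x_i \in Z,\\
\infty & x_i \notin Z,
\end{cases}
```

so the optimal Z is exactly the set of used labels L(x), and C(Z) acts as a unary cost on Z.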

Inference using moves. Graph cut based move-making algorithms [Boykov, Veksler, Zabih 01]: a series of locally optimal moves, each move reduces the energy, and the optimal move is found by minimizing a submodular function. Space of solutions: x ∈ L^N (N = number of variables, L = set of labels); move space: t ∈ {0,1}^N. The α-expansion transformation function sets x_i to α where t_i = 1 and keeps the current label where t_i = 0.
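A minimal, brute-force sketch of this move-making scheme with a global label-set cost, in Python. All numbers and the MDL-style set cost are made-up illustrations, not from the paper, and exhaustive enumeration of moves stands in for the st-mincut that the paper performs on the (overestimated, submodular) move energy:

```python
import itertools

# Toy chain MRF: N pixels, 3 labels, Potts smoothness, and a global
# co-occurrence cost that depends only on WHICH labels are used.
# All costs are illustrative, not taken from the paper.
N = 5
LABELS = [0, 1, 2]
PAIRWISE = 0.5  # Potts weight between neighbouring pixels

unary = [
    [0.0, 1.5, 2.0],
    [1.0, 0.2, 1.8],
    [2.0, 0.3, 0.4],
    [1.2, 1.1, 0.1],
    [0.1, 1.0, 1.5],
]

def cooccurrence_cost(label_set):
    # MDL-style parsimony prior C(L) = K * |L|: one of the special
    # cases of the paper's general label-set cost.
    return 0.8 * len(label_set)

def energy(x):
    e = sum(unary[i][x[i]] for i in range(N))
    e += sum(PAIRWISE for i in range(N - 1) if x[i] != x[i + 1])
    return e + cooccurrence_cost(set(x))

def expand(x, alpha):
    # Optimal alpha-expansion move by brute force over t in {0,1}^N:
    # t[i] = 1 switches pixel i to alpha, t[i] = 0 keeps its label.
    best = list(x)
    for t in itertools.product([0, 1], repeat=N):
        moved = [alpha if ti else xi for ti, xi in zip(t, x)]
        if energy(moved) < energy(best):
            best = moved
    return best

x = [0] * N                     # initial labelling
prev = energy(x)
while True:                     # sweep over labels until no improvement
    for alpha in LABELS:
        x = expand(x, alpha)
    if energy(x) >= prev:
        break
    prev = energy(x)

print("labelling:", x, "energy:", round(energy(x), 3))
```

Each expansion never increases the energy (the identity move t = 0 is always available), which is exactly the property the overestimation E'(t) below is designed to preserve.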

Inference using moves: co-occurrence representation via label indicator functions.

Inference using moves. Move energy: the cost of the current label set.

Inference using moves. The move energy decomposes into an α-dependent and an α-independent part: after the move, every label present in the image is either α or one of the labels present before the move.
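Stated as a set relation (a small step the slides leave implicit):

```latex
L\bigl(\mathbf{x}(\mathbf{t})\bigr) \;\subseteq\; L(\mathbf{x}) \cup \{\alpha\},
```

since a label l ∈ L(x) survives the move unless every pixel carrying it switches to α, and α joins the set as soon as any t_i = 1.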

Inference using moves. The move energy has a submodular part and a non-submodular part.

Inference. The move energy is non-submodular; the non-submodular energy is overestimated by E'(t), with E'(t) = E(t) for the current solution and E'(t) ≥ E(t) for any other labelling. The overestimation is tight for the occurrence term; the co-occurrence term is overestimated (for the general case, see the paper). The bound admits a quadratic representation.
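Why minimizing the overestimate still helps (a one-line bound-optimization argument, filling in the step the slides state without proof): if t_0 is the identity move and t* minimizes E', then

```latex
E(\mathbf{t}^\ast) \;\le\; E'(\mathbf{t}^\ast) \;\le\; E'(\mathbf{t}_0) \;=\; E(\mathbf{t}_0),
```

so every accepted move decreases (or at worst preserves) the true energy.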

Application: object segmentation. Standard MRF model for object segmentation plus label-based costs: a cost defined over the assigned labels L(x).

Training of label-based potentials. Label set costs are approximated by a 2nd-order representation over indicator variables for the occurrence of each label.
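A sketch of that approximation (the coefficient names c_l and c_{ll'} are chosen here for illustration; the slides only state the 2nd-order form):

```latex
C(L) \;\approx\; \sum_{l\in\mathcal{L}} c_l\,\delta_l(L)
\;+\; \sum_{l\ne l'} c_{ll'}\,\delta_l(L)\,\delta_{l'}(L),
\qquad
\delta_l(L) \;=\; \begin{cases} 1 & l \in L,\\ 0 & \text{otherwise}, \end{cases}
```

with the coefficients fitted to label co-occurrence statistics of the training set.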

Experiments.
Methods: Segment CRF; Segment CRF + co-occurrence potential; Associative HCRF [Ladický et al. 09]; Associative HCRF + co-occurrence potential.
Datasets: MSRC-21 (591 images, 21 classes, 50% training set / 50% test set); PASCAL VOC 2009 (1499 images, 21 classes, 50% training set / 50% test set).

MSRC-21: qualitative results.

PASCAL VOC 2009: qualitative results.

Quantitative results: MSRC-21 and PASCAL VOC 2009.

Summary and further work. We incorporated label-based potentials in CRFs and proposed feasible inference. Open questions: the optimal training method for co-occurrence, and bounds for graph cut based inference. Questions?