Integer programming for the MAP problem in Markov random fields

Similar documents
Cutting Planes in SCIP

Integer Programming for Bayesian Network Structure Learning

Advances in Bayesian Network Learning using Integer Programming

Integer Programming for Bayesian Network Structure Learning

Orbital Conflict. Jeff Linderoth. Jim Ostrowski. Fabrizio Rossi Stefano Smriglio. When Worlds Collide. Univ. of Wisconsin-Madison

Integer Linear Programs

Alternative Methods for Obtaining. Optimization Bounds. AFOSR Program Review, April Carnegie Mellon University. Grant FA

Mixed Integer Programming:

CS 301: Complexity of Algorithms (Term I 2008) Alex Tiskin Harald Räcke. Hamiltonian Cycle. 8.5 Sequencing Problems. Directed Hamiltonian Cycle

Integer Programming Formulations for the Minimum Weighted Maximal Matching Problem

Lift-and-Project cuts: an efficient solution method for mixed-integer programs

Week Cuts, Branch & Bound, and Lagrangean Relaxation

CS/COE

Nonconvex Quadratic Programming: Return of the Boolean Quadric Polytope

- Well-characterized problems, min-max relations, approximate certificates. - LP problems in the standard form, primal and dual linear programs

Review: Directed Models (Bayes Nets)

A hard integer program made easy by lexicography

Indicator Constraints in Mixed-Integer Programming

Machine Learning Lecture 14

8.5 Sequencing Problems

NP-Completeness I. Lecture Overview Introduction: Reduction and Expressiveness

Lift-and-Project Inequalities

Cutting Plane Separators in SCIP

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming

Event-based MIP models for the resource constrained project scheduling problem

5. Partitions and Relations Ch.22 of PJE.

POLYNOMIAL MILP FORMULATIONS

A Capacity Scaling Procedure for the Multi-Commodity Capacitated Network Design Problem. Ryutsu Keizai University Naoto KATAYAMA

Computational testing of exact separation for mixed-integer knapsack problems

1 Algebraic Methods. 1.1 Gröbner Bases Applied to SAT

Monoidal Cut Strengthening and Generalized Mixed-Integer Rounding for Disjunctions and Complementarity Constraints

Combinatorial Auction: A Survey (Part I)

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Limitations of Algorithms

Preprocessing. Complements of Operations Research. Giovanni Righini. Università degli Studi di Milano

From structures to heuristics to global solvers

Lecture 18: More NP-Complete Problems

Markov Networks.

The Traveling Salesman Problem: An Overview. David P. Williamson, Cornell University Ebay Research January 21, 2014

Cutting Planes for First Level RLT Relaxations of Mixed 0-1 Programs

NP-problems continued

Computational complexity

Holistic Convergence of Random Walks

Multi-Row Presolve Reductions in Mixed Integer Programming

Name: Block: Unit 2 Inequalities

Computational Integer Programming Universidad de los Andes. Lecture 1. Dr. Ted Ralphs

CS264: Beyond Worst-Case Analysis Lecture #18: Smoothed Complexity and Pseudopolynomial-Time Algorithms

TRIPARTITE MATCHING, KNAPSACK, Pseudopolinomial Algorithms, Strong NP-completeness

Directed Graphical Models

Bayesian network model selection using integer programming

Section Notes 9. IP: Cutting Planes. Applied Math 121. Week of April 12, 2010

Independencies. Undirected Graphical Models 2: Independencies. Independencies (Markov networks) Independencies (Bayesian Networks)

Parallel PIPS-SBB Multi-level parallelism for 2-stage SMIPS. Lluís-Miquel Munguia, Geoffrey M. Oxberry, Deepak Rajan, Yuji Shinano

Part II Strong lift-and-project cutting planes. Vienna, January 2012

Advanced topic: Space complexity

Clique trees & Belief Propagation. Siamak Ravanbakhsh Winter 2018

CS264: Beyond Worst-Case Analysis Lecture #15: Smoothed Complexity and Pseudopolynomial-Time Algorithms

Solving Box-Constrained Nonconvex Quadratic Programs

Generic Branch-Price-and-Cut

Rapid Introduction to Machine Learning/ Deep Learning

Analyzing the computational impact of individual MINLP solver components

THE CIS PROBLEM AND RELATED RESULTS IN GRAPH THEORY

Polynomial-time Reductions

Stochastic Decision Diagrams

MVE165/MMG631 Linear and integer optimization with applications Lecture 8 Discrete optimization: theory and algorithms

Presolve Reductions in Mixed Integer Programming

Using Sparsity to Design Primal Heuristics for MILPs: Two Stories

Algorithms. Outline! Approximation Algorithms. The class APX. The intelligence behind the hardware. ! Based on

The Multiple Traveling Salesperson Problem on Regular Grids

A brief introduction to Conditional Random Fields

Three-partition Flow Cover Inequalities for Constant Capacity Fixed-charge Network Flow Problems

14 : Theory of Variational Inference: Inner and Outer Approximation

A Tighter Piecewise McCormick Relaxation for Bilinear Problems

Formulations and Algorithms for Minimum Connected Dominating Set Problems

arxiv: v1 [cs.cc] 5 Dec 2018

Integer Linear Programming (ILP)

Statistics 1 - Lecture Notes Chapter 1

Question 1. Find the coordinates of the y-intercept for. f) None of the above. Question 2. Find the slope of the line:

Decision Procedures An Algorithmic Point of View

Lecture 11 - Basic Number Theory.

The degree of the polynomial function is n. We call the term the leading term, and is called the leading coefficient. 0 =

PORTA, NEOS, and Knapsack Covers. Cover Inequalities. Prof. Jeff Linderoth. October 29, June 23, 2004 DIMACS Reconnect Conference on MIP Slide 1

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4

STUDY OF PERMUTATION MATRICES BASED LDPC CODE CONSTRUCTION

NP-problems continued

A COMPUTATIONAL COMPARISON OF SYMMETRY HANDLING METHODS FOR MIXED INTEGER PROGRAMS

Undirected Graphical Models

Lecture 21 (Oct. 24): Max Cut SDP Gap and Max 2-SAT

4.3 Minimizing & Mixed Constraints

Practical Tips for Modelling Lot-Sizing and Scheduling Problems. Waldemar Kaczmarczyk

CSCI3390-Lecture 14: The class NP

Optimization Exercise Set n.5 :

CS 6820 Fall 2014 Lectures, October 3-20, 2014

4/12/2011. Chapter 8. NP and Computational Intractability. Directed Hamiltonian Cycle. Traveling Salesman Problem. Directed Hamiltonian Cycle

CSC Linear Programming and Combinatorial Optimization Lecture 12: The Lift and Project Method

Minimum Linear Arrangements

Properties of Probability

Formal definition of P

Strengthened Benders Cuts for Stochastic Integer Programs with Continuous Recourse

Asymptotic Polynomial-Time Approximation (APTAS) and Randomized Approximation Algorithms

A New Facet Generating Procedure for the Stable Set Polytope

Transcription:

Integer programming for the MAP problem in Markov random fields James Cussens, University of York HIIT, 2015-04-17 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 1 / 21

Markov random fields A Markov random field (MRF) A B A D B C C D - - - - - - - - 0 0 0.10 0 0 0.90 0 0 0.40 0 0 0.50 0 1 0.20 0 1 0.20 0 1 0.70 0 1 0.20 1 0 0.30 1 0 0.70 1 0 0.30 1 0 0.40 1 1 0.20 1 1 0.10 1 1 0.10 1 1 0.10 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 2 / 21

Markov random fields A Markov random field (MRF) A B A D B C C D - - - - - - - - 0 0 0.10 0 0 0.90 0 0 0.40 0 0 0.50 0 1 0.20 0 1 0.20 0 1 0.70 0 1 0.20 1 0 0.30 1 0 0.70 1 0 0.30 1 0 0.40 1 1 0.20 1 1 0.10 1 1 0.10 1 1 0.10 An MRF defines a joint probability distribution over its variables: P(A = 0, B = 1, C = 1, D = 0) 0.2 0.9 0.1 0.4 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 2 / 21

Markov random fields A Markov random field (MRF) A B A D B C C D - - - - - - - - 0 0 0.10 0 0 0.90 0 0 0.40 0 0 0.50 0 1 0.20 0 1 0.20 0 1 0.70 0 1 0.20 1 0 0.30 1 0 0.70 1 0 0.30 1 0 0.40 1 1 0.20 1 1 0.10 1 1 0.10 1 1 0.10 Its associated graph can be used to read off conditional independence relations: B C A D James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 3 / 21

MRF MAP MAP for MRFs The MAP problem for MRF is to find an instantiation of the variables with maximal probability. This is an NP-hard problem. A = 1, B = 0, C = 1, D = 0 is a MAP solution for our example MRF: A B A D B C C D - - - - - - - - 0 0 0.10 0 0 0.90 0 0 0.40 0 0 0.50 0 1 0.20 0 1 0.20 0 1 0.70 0 1 0.20 1 0 0.30 1 0 0.70 1 0 0.30 1 0 0.40 1 1 0.20 1 1 0.10 1 1 0.10 1 1 0.10 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 4 / 21

MIP for MRF MAP MIP for MRF MAP It is easy to encode an MRF MAP problem as an integer program with only binary variables. This is a special case of a mixed integer program (MIP) where there are no real-valued variables, so not properly mixed. But we ll call these MIPs nonetheless due to the nice alliteration. Cue demo... James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 5 / 21

MIP for MRF MAP Solving without thinking 458 MRF MAP instances from the UAI 2011 PIC challenge were given to CPLEX 12.6 using the MIP encoding I have just presented. Machine: 4-core 3.2GHz with 7.8Gb RAM CPLEX solved 313 (to certified optimality) in under 2000 seconds. Half of all instances were solved in under 6 seconds. The fileforgal 350markers instance had had 7,871,997 MIP variables (after presolving)and 312,420 linear constraints. It was solved in 870.97 seconds. Solving its initial linear relaxation took 458.09 seconds. James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 6 / 21

Set partitioning Solving with added thinking A set partitioning constraint states that exactly one of a given set of binary variables has value 1. For example, a 0 b 0 + a 0 b 1 + a 1 b 0 + a 1 b 1 = 1. A set partitioning problem (SPP) is a MIP with only binary variables where all constraints are set partitioning constraints. It is easy to see that any MRF MAP instance can presented as an SPP instance. Cue demo... James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 7 / 21

Set partitioning Set partitioning: why do we care? Set partitioning (SPP), and the closely related problems of set packing (SP) (replace = with ) and set covering (SC) (replace = with ), have been extensively studied in the mathematical programming community for decades [BP76]. For example, we can convert any SPP into either an SP or an SC instance. We can convert any SP instance into a node packing (aka vertex covering, maximum independent set, etc) instance, although this is typically not a good idea, since the linear relaxation is weaker. There are also many results on how good various approximate algorithms are [Pas97]. James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 8 / 21

SPP cuts Intersection graph a 0 b 0 a 0 b 1 b 0 c 1 b 1 c 1 a 1 b 0 a 1 b 1 b 0 c 0 b 1 c 0 a 0 d 0 a 0 d 1 c 1 d 0 c 1 d 1 a 1 d 0 a 1 d 1 c 0 d 0 c 0 d 1 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 9 / 21

SPP cuts An odd hole in the intersection graph a 0 b 0 a 0 b 1 b 0 c 1 b 1 c 1 a 1 b 0 a 1 b 1 b 0 c 0 b 1 c 0 a 0 d 0 a 0 d 1 c 1 d 0 c 1 d 1 a 1 d 0 a 1 d 1 c 0 d 0 c 0 d 1 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 10 / 21

SPP cuts An odd hole in the intersection graph a 0 b 0 a 0 b 1 b 0 c 1 b 1 c 1 a 1 b 0 a 1 b 1 b 0 c 0 b 1 c 0 a 0 d 0 a 0 d 1 c 1 d 0 c 1 d 1 a 1 d 0 a 1 d 1 c 0 d 0 c 0 d 1 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 10 / 21

SPP cuts An odd hole in the intersection graph a 0 b 0 a 0 b 1 b 0 c 1 b 1 c 1 a 1 b 0 a 1 b 1 b 0 c 0 b 1 c 0 a 0 d 0 a 0 d 1 c 1 d 0 c 1 d 1 a 1 d 0 a 1 d 1 c 0 d 0 c 0 d 1 a 0 b 0 + b 1 c 0 + c 1 d 0 + a 0 d 1 + a 1 d 0 2 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 10 / 21

SPP cuts Lifting odd holes a 0 b 0 a 0 b 1 b 0 c 1 b 1 c 1 a 1 b 0 a 1 b 1 b 0 c 0 b 1 c 0 a 0 d 0 a 0 d 1 c 1 d 0 c 1 d 1 a 1 d 0 a 1 d 1 c 0 d 0 c 0 d 1 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 11 / 21

SPP cuts Lifting odd holes a 0 b 0 a 0 b 1 b 0 c 1 b 1 c 1 a 1 b 0 a 1 b 1 b 0 c 0 b 1 c 0 a 0 d 0 a 0 d 1 c 1 d 0 c 1 d 1 a 1 d 0 a 1 d 1 c 0 d 0 c 0 d 1 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 11 / 21

SPP cuts Lifting odd holes a 0 b 0 a 0 b 1 b 0 c 1 b 1 c 1 a 1 b 0 a 1 b 1 b 0 c 0 b 1 c 0 a 0 d 0 a 0 d 1 c 1 d 0 c 1 d 1 a 1 d 0 a 1 d 1 c 0 d 0 c 0 d 1 a 0 b 0 + b 1 c 0 + c 1 d 0 + a 0 d 1 + a 1 d 0 + a 1 d 1 2 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 11 / 21

Finding cuts Odd cycle cuts So how do we (quickly) find these cuts? We could use the MIP solver SCIP and change this default setting: separating/oddcycle/freq = -1 separating/oddcycle/liftoddcycles = FALSE to, say, this: separating/oddcycle/freq = 1 separating/oddcycle/liftoddcycles = TRUE Some (not very thorough) testing indicates that this is helpful. But we can do better. James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 12 / 21

Finding cuts Zero-half cuts Take 5 simple edge inequalities: a 0 b 0 + b 1 c 0 1 b 1 c 0 + c 1 d 0 1 a 0 d 1 + c 1 d 0 1 a 0 d 1 + a 1 d 0 1 a 1 d 0 + a 0 b 0 1 multiply each by half and add to derive: a 0 b 0 + b 1 c 0 + c 1 d 0 + a 0 d 1 + a 1 d 0 5/2 Since the LHS in integer we can tighten to get: a 0 b 0 + b 1 c 0 + c 1 d 0 + a 0 d 1 + a 1 d 0 2 the odd hole cut shown earlier. James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 13 / 21

Finding cuts Zero-half cuts in CPLEX A cut generated by (i) multiplying some existing inequalities by 1/2, (ii) adding and (iii) rounding is a zero-half cut. Any (lifted) odd hole cut can be generated as a zero-half cut. (As can odd anti-hole cuts for which we have no time.) James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 14 / 21

Finding cuts Zero-half cuts in CPLEX A cut generated by (i) multiplying some existing inequalities by 1/2, (ii) adding and (iii) rounding is a zero-half cut. Any (lifted) odd hole cut can be generated as a zero-half cut. (As can odd anti-hole cuts for which we have no time.) So put CPLEX s zero-half cut generator into overdrive! mip.cuts.zerohalfcut.set(2) mip.limits.cutpasses.set(9999999) mip.limits.cutsfactor.set(90) James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 14 / 21

Does it work? Comparing the times: all problems (5 more solved!) 0 500 1000 1500 2000 0 500 1000 1500 2000 default.times nodepalt.times James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 15 / 21

Does it work? Comparing the times: easy problems 0 10 20 30 40 50 60 0 10 20 30 40 50 60 default.times nodepalt.times James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 16 / 21

Does it work? Comparing the times: very easy problems 0 2 4 6 8 10 0 2 4 6 8 10 default.times nodepalt.times James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 17 / 21

Does it work? Comparing gaps: unsolved problems 0 20 40 60 80 100 0 20 40 60 80 100 default.gaps nodepalt.gaps James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 18 / 21

Markov blanket constraints Markov blanket constraints A B A D B C C D - - - - - - - - 0 0 0.10 0 0 0.90 0 0 0.40 0 0 0.50 0 1 0.20 0 1 0.20 0 1 0.70 0 1 0.20 1 0 0.30 1 0 0.70 1 0 0.30 1 0 0.40 1 1 0.20 1 1 0.10 1 1 0.10 1 1 0.10 The Markov blanket for D is {A, C}, so for any given joint instantiation of {A, C}, we can compute the optimal value of D. For A = 0, C = 0 the value for D = 0 is 0.9 0.5 and for D = 1 is 0.2 0.2, so we have the constraint a 0 c 0 d 0 we add this as the linear constraint (1 a 0 ) + (1 c 0 ) + d 0 1 James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 19 / 21

Markov blanket constraints Markov blanket constraints (ctd) In this example, it so happens that we have: a 0 c 0 d 0 a 0 c 0 d 0 a 0 c 0 d 0 a 0 c 0 d 0 so we can fix d 0 to 1 immediately. James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 20 / 21

Markov blanket constraints Markov blanket problems In the 16 deer instances each of the 60 variables has all the other 59 variables in its Markov blanket (every pair of variables is a hyperedge). Naively constructing Markov blanket constraints for such examples had predictably dire consequences! James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 21 / 21

Markov blanket constraints Markov blanket problems In the 16 deer instances each of the 60 variables has all the other 59 variables in its Markov blanket (every pair of variables is a hyperedge). Naively constructing Markov blanket constraints for such examples had predictably dire consequences! In other cases, adding Markov blanket constraints makes solving a bit (technical term!) slower. James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 21 / 21

Markov blanket constraints Markov blanket problems In the 16 deer instances each of the 60 variables has all the other 59 variables in its Markov blanket (every pair of variables is a hyperedge). Naively constructing Markov blanket constraints for such examples had predictably dire consequences! In other cases, adding Markov blanket constraints makes solving a bit (technical term!) slower. Current implementation is, however, very poor. We re mostly cutting not propagating James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 21 / 21

Markov blanket constraints Markov blanket problems In the 16 deer instances each of the 60 variables has all the other 59 variables in its Markov blanket (every pair of variables is a hyperedge). Naively constructing Markov blanket constraints for such examples had predictably dire consequences! In other cases, adding Markov blanket constraints makes solving a bit (technical term!) slower. Current implementation is, however, very poor. We re mostly cutting not propagating Further research needed... James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 21 / 21

Markov blanket constraints Markov blanket problems In the 16 deer instances each of the 60 variables has all the other 59 variables in its Markov blanket (every pair of variables is a hyperedge). Naively constructing Markov blanket constraints for such examples had predictably dire consequences! In other cases, adding Markov blanket constraints makes solving a bit (technical term!) slower. Current implementation is, however, very poor. We re mostly cutting not propagating Further research needed... as always... Acknowledgement: Thanks to COIN for supporting this research visit. James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 21 / 21

Markov blanket constraints Egon Balas and Manfred W. Padberg. Set partitioning: A survey. SIAM Review, 18(4):710 760, October 1976. Vangelis Th. Paschos. A survey of approximately optimal solutions to some covering and packing problems. ACM Comput. Surv., 29(2):171 209, 1997. James Cussens, University of York MIP for MRF MAP HIIT, 2015-04-17 21 / 21