Probabilistic Graphical Models


1 Probabilistic Graphical Models
David Sontag, New York University
Lecture 6, March 7, 2013

2 Today's lecture
1. Dual decomposition
2. MAP inference as an integer linear program
3. Linear programming relaxations for MAP inference

3 MAP inference
Recall the MAP inference task, $\arg\max_x p(x)$, where
$$p(x) = \frac{1}{Z} \prod_{c \in C} \phi_c(x_c)$$
(we assume any evidence has been subsumed into the potentials, as discussed in the last lecture).
Since the normalization term is simply a constant, this is equivalent to
$$\arg\max_x \prod_{c \in C} \phi_c(x_c)$$
(called the max-product inference task).
Furthermore, since log is monotonic, letting $\theta_c(x_c) = \log \phi_c(x_c)$, this is equivalent to
$$\arg\max_x \sum_{c \in C} \theta_c(x_c)$$
(called max-sum).
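To make the equivalence concrete, here is a minimal Python sketch (the tiny 3-variable chain, its potentials, and all names are illustrative assumptions, not from the lecture) that checks by brute force that max-product on $\phi$ and max-sum on $\theta = \log \phi$ select the same assignment:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
# toy chain x1 - x2 - x3, two states per variable, positive edge potentials
phi = {e: rng.uniform(0.1, 1.0, size=(2, 2)) for e in [(0, 1), (1, 2)]}
theta = {e: np.log(p) for e, p in phi.items()}  # log is monotonic

def product_score(x):
    return np.prod([phi[e][x[e[0]], x[e[1]]] for e in phi])

def sum_score(x):
    return np.sum([theta[e][x[e[0]], x[e[1]]] for e in theta])

assignments = list(itertools.product([0, 1], repeat=3))
x_prod = max(assignments, key=product_score)
x_sum = max(assignments, key=sum_score)
assert x_prod == x_sum  # same maximizer for max-product and max-sum
print(x_prod, product_score(x_prod), sum_score(x_sum))
```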

4 Exactly solving MAP, beyond trees
MAP as a discrete optimization problem is
$$\arg\max_x \sum_{i \in V} \theta_i(x_i) + \sum_{c \in C} \theta_c(x_c)$$
This is a very general discrete optimization problem: many hard combinatorial optimization problems can be written in this form (e.g., 3-SAT).
It is studied in the operations research community, theoretical computer science, and AI (constraint satisfaction, weighted SAT), etc.
This is a very fast-moving field, both for theory and for heuristics.

5 Motivating application: protein side-chain placement
Find the minimum-energy conformation of amino acid side-chains along a fixed carbon backbone: given the desired 3D backbone structure, choose the amino acids giving the most stable folding (Yanover, Meltzer, Weiss '06).
Each variable $X_i$ corresponds to one side-chain (one amino acid); orientations of the side-chains are represented by discretized angles called rotamers.
Rotamer choices for nearby amino acids are energetically coupled (attractive and repulsive forces), giving a potential function for each edge, e.g. $\theta_{12}(x_1, x_2)$, $\theta_{13}(x_1, x_3)$, $\theta_{34}(x_3, x_4)$; the joint distribution over the variables is given by these potentials and the partition function.
Key problems: find the marginals; find the most likely assignment (MAP), the focus of this lecture.

6 Motivating application: dependency parsing
Given a sentence, predict the dependency tree that relates the words: an arc goes from the head word of each phrase to the words that modify it. Every word has exactly one parent, i.e., a valid dependency parse is a directed tree. The parse may be non-projective: a word and its descendants may not form a contiguous subsequence.
For a sentence of $m$ words, introduce $m(m-1)$ binary arc-selection variables $x_{ij} \in \{0, 1\}$, and let $x_i = \{x_{ij}\}_{j \neq i}$ (all outgoing edges of word $i$). Predict with:
$$\max_x \; \theta_T(x) + \sum_{ij} \theta_{ij}(x_{ij}) + \sum_i \theta_i(x_i)$$
Without additional restrictions on the choice of potential functions, or on which edges to include, the problem is known to be NP-hard. Using the dual decomposition approach, we will break the problem into much simpler subproblems involving maximizations of each single-node potential $\theta_i(x_i)$ and each edge potential $\theta_{ij}(x_i, x_j)$ independently of the other terms.

7 Dual decomposition
Consider the MAP problem for pairwise Markov random fields:
$$\text{MAP}(\theta) = \max_x \sum_{i \in V} \theta_i(x_i) + \sum_{ij \in E} \theta_{ij}(x_i, x_j)$$
If we push the maximizations inside the sums, the value can only increase:
$$\text{MAP}(\theta) \leq \sum_{i \in V} \max_{x_i} \theta_i(x_i) + \sum_{ij \in E} \max_{x_i, x_j} \theta_{ij}(x_i, x_j)$$
Note that the right-hand side can be easily evaluated.
In PS3, problem 2, you showed that one can always reparameterize a distribution by operations like
$$\theta_i^{new}(x_i) = \theta_i^{old}(x_i) + f(x_i), \qquad \theta_{ij}^{new}(x_i, x_j) = \theta_{ij}^{old}(x_i, x_j) - f(x_i)$$
for any function $f(x_i)$, without changing the distribution/energy.
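A small sketch of the bound (the 4-cycle toy model and all names are illustrative assumptions, not from the lecture): pushing the max inside the sums is cheap to evaluate and upper-bounds the brute-force MAP value:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
V, E = range(4), [(0, 1), (1, 2), (2, 3), (3, 0)]  # a 4-cycle, 2 states each
theta_i = {i: rng.normal(size=2) for i in V}
theta_ij = {e: rng.normal(size=(2, 2)) for e in E}

def score(x):  # the MAP objective for assignment x
    return (sum(theta_i[i][x[i]] for i in V)
            + sum(theta_ij[i, j][x[i], x[j]] for i, j in E))

map_value = max(score(x) for x in itertools.product([0, 1], repeat=4))
bound = (sum(theta_i[i].max() for i in V)      # max pushed inside the sums
         + sum(theta_ij[e].max() for e in E))
assert map_value <= bound + 1e-9
print(map_value, bound)
```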

8 Dual decomposition
(Figure 1.2: illustration of the dual decomposition objective. Left: the original pairwise model, consisting of four factors $f(x_1, x_2)$, $g(x_1, x_3)$, $h(x_2, x_4)$, $k(x_3, x_4)$ over variables $x_1, \dots, x_4$. Right: the maximization problems corresponding to the objective $L(\delta)$; each factor is maximized over independently, together with its local copies of the variables, e.g. $f_1(x_1)$, $f_2(x_2)$ for factor $f$.)

9 Dual decomposition
Define:
$$\bar\theta_i(x_i) = \theta_i(x_i) + \sum_{ij \in E} \delta_{j \to i}(x_i)$$
$$\bar\theta_{ij}(x_i, x_j) = \theta_{ij}(x_i, x_j) - \delta_{j \to i}(x_i) - \delta_{i \to j}(x_j)$$
It is easy to verify that, for all x,
$$\sum_i \bar\theta_i(x_i) + \sum_{ij \in E} \bar\theta_{ij}(x_i, x_j) = \sum_i \theta_i(x_i) + \sum_{ij \in E} \theta_{ij}(x_i, x_j)$$
Thus, we have that:
$$\text{MAP}(\theta) = \text{MAP}(\bar\theta) \leq \sum_{i \in V} \max_{x_i} \bar\theta_i(x_i) + \sum_{ij \in E} \max_{x_i, x_j} \bar\theta_{ij}(x_i, x_j)$$
Every value of δ gives a different upper bound on the value of the MAP!
The tightest upper bound can be obtained by minimizing the r.h.s. with respect to δ!
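A numeric sanity check of the claimed invariance, on the same style of toy model (again illustrative, not from the lecture): for any choice of δ, every assignment keeps the same total score under the reparameterized potentials:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
V, E = range(4), [(0, 1), (1, 2), (2, 3), (3, 0)]
theta_i = {i: rng.normal(size=2) for i in V}
theta_ij = {e: rng.normal(size=(2, 2)) for e in E}
# for edge (i, j): d[i] plays delta_{j->i}(x_i), d[j] plays delta_{i->j}(x_j)
delta = {(i, j): {i: rng.normal(size=2), j: rng.normal(size=2)} for i, j in E}

def total(x, ti, tij):
    return (sum(ti[i][x[i]] for i in V)
            + sum(tij[i, j][x[i], x[j]] for i, j in E))

tb_i = {i: theta_i[i].copy() for i in V}
tb_ij = {}
for (i, j), d in delta.items():
    tb_i[i] += d[i]          # add delta_{j->i}(x_i) into node i
    tb_i[j] += d[j]          # add delta_{i->j}(x_j) into node j
    tb_ij[i, j] = theta_ij[i, j] - d[i][:, None] - d[j][None, :]

for x in itertools.product([0, 1], repeat=4):
    assert abs(total(x, theta_i, theta_ij) - total(x, tb_i, tb_ij)) < 1e-9
```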

10 Dual decomposition
We obtain the following dual objective:
$$L(\delta) = \sum_{i \in V} \max_{x_i} \Big( \theta_i(x_i) + \sum_{ij \in E} \delta_{j \to i}(x_i) \Big) + \sum_{ij \in E} \max_{x_i, x_j} \Big( \theta_{ij}(x_i, x_j) - \delta_{j \to i}(x_i) - \delta_{i \to j}(x_j) \Big)$$
$$\text{DUAL-LP}(\theta) = \min_\delta L(\delta)$$
This provides an upper bound on the value of the MAP assignment:
$$\text{MAP}(\theta) \leq \text{DUAL-LP}(\theta) \leq L(\delta)$$
How can we find δ which give tight bounds?

11 Solving the dual efficiently
There are many ways to solve the dual linear program, i.e., to minimize with respect to δ:
$$L(\delta) = \sum_{i \in V} \max_{x_i} \Big( \theta_i(x_i) + \sum_{ij \in E} \delta_{j \to i}(x_i) \Big) + \sum_{ij \in E} \max_{x_i, x_j} \Big( \theta_{ij}(x_i, x_j) - \delta_{j \to i}(x_i) - \delta_{i \to j}(x_j) \Big)$$
One option is to use the subgradient method.
Can also solve using block coordinate descent, which gives algorithms that look very much like max-sum belief propagation:
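A sketch of the subgradient option (illustrative toy model; the fixed step size 0.1 is an arbitrary choice, and in practice a diminishing step size would be used). The subgradient of $L(\delta)$ with respect to $\delta_{j \to i}(x_i)$ is the indicator of the node subproblem's maximizer minus the indicator of the edge subproblem's maximizer:

```python
import numpy as np

rng = np.random.default_rng(3)
V, E = range(4), [(0, 1), (1, 2), (2, 3), (3, 0)]
theta_i = {i: rng.normal(size=2) for i in V}
theta_ij = {e: rng.normal(size=(2, 2)) for e in E}
# delta[(i, j), i] stores delta_{j->i}(x_i), one value per state of x_i
delta = {(e, v): np.zeros(2) for e in E for v in e}

def node_term(i):
    return theta_i[i] + sum(delta[e, i] for e in E if i in e)

def edge_term(i, j):
    return theta_ij[i, j] - delta[(i, j), i][:, None] - delta[(i, j), j][None, :]

def dual_and_subgrad():
    L, g = 0.0, {key: np.zeros(2) for key in delta}
    for i in V:                      # node subproblems: +indicator at argmax
        t = node_term(i)
        L += t.max()
        for e in E:
            if i in e:
                g[e, i][t.argmax()] += 1.0
    for (i, j) in E:                 # edge subproblems: -indicator at argmax
        t = edge_term(i, j)
        L += t.max()
        xi, xj = np.unravel_index(t.argmax(), t.shape)
        g[(i, j), i][xi] -= 1.0
        g[(i, j), j][xj] -= 1.0
    return L, g

for step in range(50):
    L, g = dual_and_subgrad()
    for key in delta:
        delta[key] -= 0.1 * g[key]   # descend: L(delta) is convex in delta
print("dual bound after 50 steps:", dual_and_subgrad()[0])
```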

12 Max-product linear programming (MPLP) algorithm
Input: a set of factors $\theta_i(x_i)$, $\theta_{ij}(x_i, x_j)$.
Output: an assignment $x_1, \dots, x_n$ that approximates the MAP.
Algorithm:
Initialize $\delta_{i \to j}(x_j) = 0$, $\delta_{j \to i}(x_i) = 0$, $\forall ij \in E, x_i, x_j$.
Iterate until small enough change in $L(\delta)$: for each edge $ij \in E$ (sequentially), perform the updates
$$\delta_{j \to i}(x_i) = -\frac{1}{2} \delta_i^{-j}(x_i) + \frac{1}{2} \max_{x_j} \Big[ \theta_{ij}(x_i, x_j) + \delta_j^{-i}(x_j) \Big] \quad \forall x_i$$
$$\delta_{i \to j}(x_j) = -\frac{1}{2} \delta_j^{-i}(x_j) + \frac{1}{2} \max_{x_i} \Big[ \theta_{ij}(x_i, x_j) + \delta_i^{-j}(x_i) \Big] \quad \forall x_j$$
where $\delta_i^{-j}(x_i) = \theta_i(x_i) + \sum_{ik \in E, k \neq j} \delta_{k \to i}(x_i)$.
Return $x_i \in \arg\max_{\hat x_i} \bar\theta_i^\delta(\hat x_i)$.
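A compact Python sketch of these updates for pairwise models (the toy 4-cycle and all names are illustrative assumptions; this sketches the update rule above, not the lecture's code):

```python
import numpy as np

rng = np.random.default_rng(4)
V, E = range(4), [(0, 1), (1, 2), (2, 3), (3, 0)]
theta_i = {i: rng.normal(size=2) for i in V}
theta_ij = {e: rng.normal(size=(2, 2)) for e in E}
delta = {(e, v): np.zeros(2) for e in E for v in e}  # delta[(i,j), i] = delta_{j->i}

def delta_minus(i, skip_edge):
    """delta_i^{-j}(x_i): node term excluding the message on skip_edge."""
    return theta_i[i] + sum(delta[e, i] for e in E if i in e and e != skip_edge)

for sweep in range(100):
    for (i, j) in E:                 # block update of both edge messages
        di = delta_minus(i, (i, j))
        dj = delta_minus(j, (i, j))
        t = theta_ij[i, j]           # indexed [x_i, x_j]
        delta[(i, j), i] = -0.5 * di + 0.5 * (t + dj[None, :]).max(axis=1)
        delta[(i, j), j] = -0.5 * dj + 0.5 * (t + di[:, None]).max(axis=0)

# decode: x_i in argmax of the reparameterized node term theta_i^delta
x = {i: int((theta_i[i] + sum(delta[e, i] for e in E if i in e)).argmax())
     for i in V}
print(x)
```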

13 Generalization to arbitrary factor graphs
Inputs: a set of factors $\theta_i(x_i)$, $\theta_f(x_f)$.
Output: an assignment $x_1, \dots, x_n$ that approximates the MAP.
Algorithm:
Initialize $\delta_{fi}(x_i) = 0$, $\forall f \in F, i \in f, x_i$.
Iterate until small enough change in $L(\delta)$ (see Eq. 1.2): for each $f \in F$, perform the updates
$$\delta_{fi}(x_i) = -\delta_i^{-f}(x_i) + \frac{1}{|f|} \max_{x_{f \setminus i}} \Big[ \theta_f(x_f) + \sum_{\hat i \in f} \delta_{\hat i}^{-f}(x_{\hat i}) \Big] \qquad (1.16)$$
simultaneously for all $i \in f$ and $x_i$. We define $\delta_i^{-f}(x_i) = \theta_i(x_i) + \sum_{\hat f \neq f} \delta_{\hat f i}(x_i)$.
Return $x_i \in \arg\max_{\hat x_i} \bar\theta_i^\delta(\hat x_i)$ (see Eq. 1.6).
(Figure 1.4: description of the MPLP block coordinate descent algorithm for minimizing the dual $L(\delta)$; similar algorithms can be devised for different choices of coordinate blocks.)

14 Experimental results
Comparison of four coordinate descent algorithms (MSD, MSD++, MPLP, Star) on a two-dimensional Ising grid; the dual objective $L(\delta)$ is plotted as a function of the iteration (Figure 1.5).

15 Experimental results
Performance on a stereo vision inference task: the dual objective and the value of the decoded assignment are plotted per iteration; the duality gap between them shrinks until the problem is solved optimally.

16 Today's lecture
1. Dual decomposition
2. MAP inference as an integer linear program
3. Linear programming relaxations for MAP inference

17 MAP as an integer linear program (ILP)
MAP as a discrete optimization problem is
$$\arg\max_x \sum_{i \in V} \theta_i(x_i) + \sum_{ij \in E} \theta_{ij}(x_i, x_j)$$
To turn this into an integer linear program, we introduce indicator variables:
1. $\mu_i(x_i)$, one for each $i \in V$ and state $x_i$
2. $\mu_{ij}(x_i, x_j)$, one for each edge $ij \in E$ and pair of states $x_i, x_j$
The objective function is then
$$\max_\mu \sum_{i \in V} \sum_{x_i} \theta_i(x_i) \mu_i(x_i) + \sum_{ij \in E} \sum_{x_i, x_j} \theta_{ij}(x_i, x_j) \mu_{ij}(x_i, x_j)$$
What is the dimension of μ, if the variables are binary?
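One way to answer the closing question: with binary variables, μ has one coordinate per node state plus one per edge state pair, i.e. $2|V| + 4|E|$ coordinates in total. A tiny sketch (illustrative 4-cycle, not from the lecture) that builds the index map and confirms the count:

```python
V, E, k = range(4), [(0, 1), (1, 2), (2, 3), (3, 0)], 2  # k = states per var
index, dim = {}, 0
for i in V:                          # node indicators mu_i(x_i)
    for xi in range(k):
        index["mu_%d(%d)" % (i, xi)] = dim; dim += 1
for (i, j) in E:                     # edge indicators mu_ij(x_i, x_j)
    for xi in range(k):
        for xj in range(k):
            index["mu_%d%d(%d,%d)" % (i, j, xi, xj)] = dim; dim += 1
print(dim, 2 * len(V) + 4 * len(E))  # both 24 for this 4-cycle
```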

18 Visualization of feasible μ vectors
Marginal polytope (Wainwright & Jordan, '03): each global assignment maps to a vertex μ whose entries are the node indicators (assignments for $X_1$, $X_2$, $X_3$) and the edge indicators (edge assignments for $X_1 X_2$, $X_1 X_3$, $X_2 X_3$). The average of two vertices, $\mu = \frac{1}{2}\mu^1 + \frac{1}{2}\mu^2$, gives valid (fractional) marginal probabilities.
(Figure 2-1: illustration of the marginal polytope for a Markov random field with three nodes that have states in {0, 1}; the vertices correspond one-to-one with global assignments.)
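A sketch of the same idea in code (the two assignments below are arbitrary examples, not necessarily the ones in the figure): each assignment maps to a 0/1 vertex μ, and the average of two vertices is a fractional but still valid marginal vector:

```python
import numpy as np

V, E = [0, 1, 2], [(0, 1), (0, 2), (1, 2)]  # three binary nodes, all pairs

def vertex(x):
    """Map a global assignment x to its marginal-polytope vertex mu."""
    mu = []
    for i in V:                               # node indicators mu_i(x_i)
        mu += [1.0 if x[i] == s else 0.0 for s in (0, 1)]
    for i, j in E:                            # edge indicators mu_ij(x_i, x_j)
        mu += [1.0 if (x[i], x[j]) == (s, t) else 0.0
               for s in (0, 1) for t in (0, 1)]
    return np.array(mu)

mu1 = vertex((0, 1, 1))     # two example assignments (chosen arbitrarily)
mu2 = vertex((1, 1, 0))
mid = 0.5 * mu1 + 0.5 * mu2  # fractional, but valid marginal probabilities
print(mid)
```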

19 What are the constraints?
Force every cluster of variables to choose a local assignment:
$$\mu_i(x_i) \in \{0, 1\} \;\; \forall i \in V, x_i \qquad \sum_{x_i} \mu_i(x_i) = 1 \;\; \forall i \in V$$
$$\mu_{ij}(x_i, x_j) \in \{0, 1\} \;\; \forall ij \in E, x_i, x_j \qquad \sum_{x_i, x_j} \mu_{ij}(x_i, x_j) = 1 \;\; \forall ij \in E$$
Enforce that these local assignments are globally consistent:
$$\mu_i(x_i) = \sum_{x_j} \mu_{ij}(x_i, x_j) \;\; \forall ij \in E, x_i$$
$$\mu_j(x_j) = \sum_{x_i} \mu_{ij}(x_i, x_j) \;\; \forall ij \in E, x_j$$

20 MAP as an integer linear program (ILP)
$$\text{MAP}(\theta) = \max_\mu \sum_{i \in V} \sum_{x_i} \theta_i(x_i) \mu_i(x_i) + \sum_{ij \in E} \sum_{x_i, x_j} \theta_{ij}(x_i, x_j) \mu_{ij}(x_i, x_j)$$
subject to:
$$\mu_i(x_i) \in \{0, 1\} \;\; \forall i \in V, x_i \qquad \sum_{x_i} \mu_i(x_i) = 1 \;\; \forall i \in V$$
$$\mu_i(x_i) = \sum_{x_j} \mu_{ij}(x_i, x_j) \;\; \forall ij \in E, x_i \qquad \mu_j(x_j) = \sum_{x_i} \mu_{ij}(x_i, x_j) \;\; \forall ij \in E, x_j$$
Many extremely good off-the-shelf solvers exist, such as CPLEX and Gurobi.
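A hedged sketch of this ILP using SciPy's mixed-integer solver (scipy.optimize.milp, available in SciPy 1.9+) rather than CPLEX or Gurobi; the toy 4-cycle model and the index helpers are illustrative assumptions, not from the lecture:

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

rng = np.random.default_rng(5)
V, E, k = range(4), [(0, 1), (1, 2), (2, 3), (3, 0)], 2
theta_i = {i: rng.normal(size=k) for i in V}
theta_ij = {e: rng.normal(size=(k, k)) for e in E}

n, dim = k * len(V), k * len(V) + k * k * len(E)
ni = lambda i, s: k * i + s                      # index of mu_i(x_i = s)
ei = lambda a, s, t: n + k * k * a + k * s + t   # index of mu_ij(s, t), edge #a

c = np.zeros(dim)                                # objective coefficients
for i in V:
    c[ni(i, 0):ni(i, 0) + k] = theta_i[i]
for a, e in enumerate(E):
    c[ei(a, 0, 0):ei(a, 0, 0) + k * k] = theta_ij[e].ravel()

rows, rhs = [], []
for i in V:                                      # sum_s mu_i(s) = 1
    r = np.zeros(dim); r[ni(i, 0):ni(i, 0) + k] = 1
    rows.append(r); rhs.append(1.0)
for a, (i, j) in enumerate(E):                   # local consistency
    for s in range(k):                           # mu_i(s) = sum_t mu_ij(s, t)
        r = np.zeros(dim); r[ni(i, s)] = 1
        r[[ei(a, s, t) for t in range(k)]] = -1
        rows.append(r); rhs.append(0.0)
    for t in range(k):                           # mu_j(t) = sum_s mu_ij(s, t)
        r = np.zeros(dim); r[ni(j, t)] = 1
        r[[ei(a, s, t) for s in range(k)]] = -1
        rows.append(r); rhs.append(0.0)
A, b = np.array(rows), np.array(rhs)

res = milp(c=-c,                                 # milp minimizes, so negate
           constraints=LinearConstraint(A, b, b),
           integrality=np.ones(dim),             # all variables integer
           bounds=Bounds(0, 1))
print("MAP value:", -res.fun)
```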

21 Linear programming relaxation for MAP
The integer linear program was:
$$\text{MAP}(\theta) = \max_\mu \sum_{i \in V} \sum_{x_i} \theta_i(x_i) \mu_i(x_i) + \sum_{ij \in E} \sum_{x_i, x_j} \theta_{ij}(x_i, x_j) \mu_{ij}(x_i, x_j)$$
subject to
$$\mu_i(x_i) \in \{0, 1\} \;\; \forall i \in V, x_i \qquad \sum_{x_i} \mu_i(x_i) = 1 \;\; \forall i \in V$$
$$\mu_i(x_i) = \sum_{x_j} \mu_{ij}(x_i, x_j) \;\; \forall ij \in E, x_i \qquad \mu_j(x_j) = \sum_{x_i} \mu_{ij}(x_i, x_j) \;\; \forall ij \in E, x_j$$
Relax the integrality constraints, allowing the variables to take values between 0 and 1:
$$\mu_i(x_i) \in [0, 1] \;\; \forall i \in V, x_i$$

22 Linear programming relaxation for MAP
The linear programming relaxation is:
$$\text{LP}(\theta) = \max_\mu \sum_{i \in V} \sum_{x_i} \theta_i(x_i) \mu_i(x_i) + \sum_{ij \in E} \sum_{x_i, x_j} \theta_{ij}(x_i, x_j) \mu_{ij}(x_i, x_j)$$
$$\mu_i(x_i) \in [0, 1] \;\; \forall i \in V, x_i \qquad \sum_{x_i} \mu_i(x_i) = 1 \;\; \forall i \in V$$
$$\mu_i(x_i) = \sum_{x_j} \mu_{ij}(x_i, x_j) \;\; \forall ij \in E, x_i \qquad \mu_j(x_j) = \sum_{x_i} \mu_{ij}(x_i, x_j) \;\; \forall ij \in E, x_j$$
Linear programs can be solved efficiently: simplex method, interior point, ellipsoid algorithm.
Since the LP relaxation maximizes over a larger set of solutions, its value can only be higher:
$$\text{MAP}(\theta) \leq \text{LP}(\theta)$$
The LP relaxation is tight for tree-structured MRFs.
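The same construction with the integrality constraints dropped, solved with scipy.optimize.linprog (a sketch under the same illustrative toy-model assumptions as the ILP sketch above); comparing against brute-force MAP illustrates $\text{MAP}(\theta) \leq \text{LP}(\theta)$, with equality whenever the relaxation happens to be tight:

```python
import itertools
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(5)                   # same instance as the ILP sketch
V, E, k = range(4), [(0, 1), (1, 2), (2, 3), (3, 0)], 2
theta_i = {i: rng.normal(size=k) for i in V}
theta_ij = {e: rng.normal(size=(k, k)) for e in E}

n, dim = k * len(V), k * len(V) + k * k * len(E)
ni = lambda i, s: k * i + s
ei = lambda a, s, t: n + k * k * a + k * s + t

c = np.zeros(dim)
for i in V:
    c[ni(i, 0):ni(i, 0) + k] = theta_i[i]
for a, e in enumerate(E):
    c[ei(a, 0, 0):ei(a, 0, 0) + k * k] = theta_ij[e].ravel()

rows, rhs = [], []
for i in V:                                      # normalization constraints
    r = np.zeros(dim); r[ni(i, 0):ni(i, 0) + k] = 1
    rows.append(r); rhs.append(1.0)
for a, (i, j) in enumerate(E):                   # local consistency constraints
    for s in range(k):
        r = np.zeros(dim); r[ni(i, s)] = 1
        r[[ei(a, s, t) for t in range(k)]] = -1
        rows.append(r); rhs.append(0.0)
    for t in range(k):
        r = np.zeros(dim); r[ni(j, t)] = 1
        r[[ei(a, s, t) for s in range(k)]] = -1
        rows.append(r); rhs.append(0.0)

res = linprog(-c, A_eq=np.array(rows), b_eq=np.array(rhs),
              bounds=(0, 1))                     # mu in [0, 1]: the relaxation
lp_value = -res.fun

def score(x):
    return (sum(theta_i[i][x[i]] for i in V)
            + sum(theta_ij[i, j][x[i], x[j]] for i, j in E))
map_value = max(score(x) for x in itertools.product(range(k), repeat=len(V)))
print(map_value, lp_value)                       # MAP(theta) <= LP(theta)
```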

23 Dual decomposition = LP relaxation
Recall we obtained the following dual linear program:
$$L(\delta) = \sum_{i \in V} \max_{x_i} \Big( \theta_i(x_i) + \sum_{ij \in E} \delta_{j \to i}(x_i) \Big) + \sum_{ij \in E} \max_{x_i, x_j} \Big( \theta_{ij}(x_i, x_j) - \delta_{j \to i}(x_i) - \delta_{i \to j}(x_j) \Big)$$
$$\text{DUAL-LP}(\theta) = \min_\delta L(\delta)$$
We showed two ways of upper bounding the value of the MAP assignment:
$$\text{MAP}(\theta) \leq \text{LP}(\theta) \qquad (1)$$
$$\text{MAP}(\theta) \leq \text{DUAL-LP}(\theta) \leq L(\delta) \qquad (2)$$
Although we derived these linear programs in seemingly very different ways, it turns out that:
$$\text{LP}(\theta) = \text{DUAL-LP}(\theta)$$
The dual LP allows us to upper bound the value of the MAP assignment without solving an LP to optimality.

24 Linear programming duality
(Figure: by linear programming duality, the (primal) LP relaxation over the marginal polytope and the dual LP relaxation attain the same optimal value; the optimal assignment $x^*$ corresponds to a vertex $\mu_{x^*}$ of the marginal polytope.)
$$\text{MAP}(\theta) \leq \text{LP}(\theta) = \text{DUAL-LP}(\theta) \leq L(\delta)$$

25 Other approaches to solve MAP
Graph cuts.
Local search: start from an arbitrary assignment (e.g., random), then iterate: choose a variable and change its state to maximize the value of the resulting assignment.
Branch-and-bound: exhaustive search over the space of assignments, pruning branches that can be provably shown not to contain a MAP assignment. Can use the LP relaxation or its dual to obtain upper bounds; a lower bound is obtained from the value of any assignment found.
Branch-and-cut (the most powerful method; used by CPLEX & Gurobi): same as branch-and-bound, but spends more time getting tighter bounds; adds cutting planes to cut off fractional solutions of the LP relaxation, making the upper bound tighter.
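A sketch of the local-search idea above (essentially iterated conditional modes; the toy model and names are illustrative assumptions). Note that it only finds a local optimum of the objective:

```python
import numpy as np

rng = np.random.default_rng(6)
V, E, k = range(4), [(0, 1), (1, 2), (2, 3), (3, 0)], 2
theta_i = {i: rng.normal(size=k) for i in V}
theta_ij = {e: rng.normal(size=(k, k)) for e in E}

def score(x):
    return (sum(theta_i[i][x[i]] for i in V)
            + sum(theta_ij[i, j][x[i], x[j]] for i, j in E))

x = [int(s) for s in rng.integers(0, k, size=len(V))]  # random start
improved = True
while improved:                      # sweep until no single-variable move helps
    improved = False
    for i in V:
        best = max(range(k), key=lambda s: score(x[:i] + [s] + x[i + 1:]))
        if best != x[i]:
            x[i], improved = best, True
print(x, score(x))
```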
