Quadratic Programming Relaxations for Metric Labeling and Markov Random Field MAP Estimation


Pradeep Ravikumar    John Lafferty
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA

Abstract

Quadratic program relaxations are proposed as an alternative to linear program relaxations and tree-reweighted belief propagation for the metric labeling or MAP estimation problem. An additional convex relaxation of the quadratic approximation is shown to have additive approximation guarantees that apply even when the graph weights have mixed sign or do not come from a metric. The approximations are extended in a manner that allows tight variational relaxations of the MAP problem, although they generally involve non-convex optimization. Experiments carried out on synthetic data show that the quadratic approximations can be more accurate and computationally efficient than the linear programming and propagation based alternatives.

Appearing in Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, 2006. Copyright 2006 by the author(s)/owner(s).

1. Introduction

Undirected graphical models, or Markov random fields (MRFs), are natural tools in many domains, from image processing to social network modeling. A key inference problem for MRFs is to compute the maximum a posteriori (MAP) configuration, the most probable labeling, which is used in multiple applications such as image denoising, protein folding and error control coding. For arbitrary graphs and parameter settings this problem is NP-hard, but various approximate techniques have been proposed that have enabled the application of MRFs to a range of practical problems.

For tree-structured distributions, the MAP estimate for random fields can be computed efficiently by dynamic programming. It can also be computed in polynomial time using graph cuts (Greig et al., 1989) when the parameter settings yield a submodular energy function. In the general setting, a widely used approximation technique is max-product belief propagation (Pearl, 1988). The algorithm is convergent on trees, and its fixed point configuration upon convergence can be shown to be locally optimal with respect to a large set of moves (Weiss & Freeman, 2001). A similar message passing algorithm, tree-reweighted max product (Wainwright et al., 2005), has stronger correctness and convergence guarantees. Boykov et al. (2001) have proposed graph-cut based algorithms that efficiently find a local energy minimum with respect to two types of large moves.

A different direction has been taken in recent work on linear program relaxations for the MAP problem in the specific setting of metric labeling. In the metric labeling formulation, a weighted graph and a metric on labels specify the energy or cost of different labelings of a set of objects, and the goal is to find a minimum cost labeling. Casting this as an integer linear program, Kleinberg and Tardos (1999) proposed linear relaxations for specific metrics. Chekuri et al. (2005) recently extended these techniques using the natural linear relaxation of the metric labeling task, and obtained stronger approximation guarantees.

In this paper, we propose a quadratic programming (QP) relaxation to the MAP or metric labeling problem. While the linear relaxations have O(|E| k^2) variables, where |E| is the number of edges in the graph and k is the number of labels, in our QP formulation there are kn variables, and yet we show that the quadratic objective function more accurately represents the energy in the graphical model.
In particular, we show that the QP formulation computes the MAP solution exactly. Under certain conditions, however, the relaxation results in a non-convex problem, which requires an intractable search over local minima. This motivates an additional convex approximation to the relaxation, which we show satisfies an additive approximation guarantee. We also extend the relaxation to general variational inner polytope relaxations, which we also show to compute the MAP exactly. Experiments indicate that our quadratic relaxation with the convex approximation outperforms or is comparable to existing methods under many settings.

In the following section, we establish some notation and recall the relevant background. In Section 2.1 we review linear relaxations for MAP estimation. In Section 3, we describe the quadratic relaxation, prove that it is tight, and detail its convex approximation. In Section 4, we then extend the above relaxation to show the tightness of various variational inner polytope relaxations. Finally, in Section 5 and Section 6, we present our experimental results and conclusions.

2. Notation and Background

Consider a graph G = (V, E), where V denotes the set of nodes and E denotes the set of edges. Let X_s be a random variable associated with node s, for s ∈ V, yielding a random vector X = {X_1, ..., X_n}, and let φ = {φ_α, α ∈ I} denote the set of potential functions (or sufficient statistics) for a set I of cliques in G. Associated with φ is a vector of parameters θ = {θ_α, α ∈ I}. With this notation, the exponential family of distributions of X associated with φ and G is given by

    p(x; θ) = exp( Σ_α θ_α φ_α(x) − Ψ(θ) ).

As discussed in (Yedidia et al., 2001), at the expense of increasing the state space one can assume without loss of generality that the graphical model is a pairwise Markov random field, i.e., the set of cliques I is the set of edges {(s,t) ∈ E}, so that

    p(x; θ) ∝ exp( Σ_{s ∈ V} θ_s φ_s(x_s) + Σ_{(s,t) ∈ E} θ_st φ_st(x_s, x_t) ).

If each X_s takes values in a discrete set X_s, we can represent any potential function as a linear combination of indicator functions, φ_s(x_s) = Σ_j φ_s(j) I_j(x_s) and φ_st(x_s, x_t) = Σ_{j,k} φ_st(j,k) I_{j,k}(x_s, x_t), where

    I_j(x_s) = 1 if x_s = j, and 0 otherwise,
    I_{j,k}(x_s, x_t) = 1 if x_s = j and x_t = k, and 0 otherwise.

We thus consider pairwise MRFs with indicator potential functions,

    p(x | θ) ∝ exp( Σ_{s,j} θ_{s;j} I_j(x_s) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} I_{j,k}(x_s, x_t) ).

The MAP problem is then given by

    x* = argmax_x Σ_{s,j} θ_{s;j} I_j(x_s) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} I_{j,k}(x_s, x_t).

2.1. Linear Relaxations

MAP estimation in the discrete case is essentially a combinatorial optimization problem, and it can be cast as an integer program. Recent work has studied approximate MAP estimation using linear relaxations (Bertsimas & Tsitsiklis, 1997). Letting variables μ_1(s;j) and μ_2(s,j;t,k) correspond to the indicator variables I_j(x_s) and I_{j,k}(x_s, x_t), we obtain the following integer linear program (ILP):

    max  Σ_{s,j} θ_{s;j} μ_1(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ_2(s,j;t,k)        (1)
    such that  Σ_k μ_2(s,j;t,k) = μ_1(s;j)
               Σ_j μ_1(s;j) = 1
               μ_1(s;j) ∈ {0,1},  μ_2(s,j;t,k) ∈ {0,1}.

This ILP can then be relaxed to the following linear program (LP):

    max  Σ_{s,j} θ_{s;j} μ_1(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ_2(s,j;t,k)        (2)
    such that  Σ_k μ_2(s,j;t,k) = μ_1(s;j)
               Σ_j μ_1(s;j) = 1
               0 ≤ μ_1(s;j) ≤ 1,  0 ≤ μ_2(s,j;t,k) ≤ 1.

Chekuri et al. (2005) propose the above LP relaxation as an approximation algorithm for the metric labeling task, which is the MAP problem with spatially homogeneous MRF parameters; thus θ_{s,j;t,k} = w_st d(j,k), where w_st is a non-negative edge weight and d is a metric that is the same for all the edges. Kleinberg and Tardos (1999) proposed related linear relaxations for specific metrics. The above LP relaxation was also proposed for the general pairwise graphical model setting by Wainwright and Jordan (2003). Letting θ and φ(x) denote the vectors of parameters and potential functions, respectively, and letting ⟨θ, φ(x)⟩ denote the inner product

    ⟨θ, φ(x)⟩ = Σ_{s,j} θ_{s;j} I_j(x_s) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} I_{j,k}(x_s, x_t),

the MAP problem is then given by

    x* = argmax_x ⟨θ, φ(x)⟩,   with optimal value   max_x ⟨θ, φ(x)⟩ = sup_{μ ∈ M} ⟨θ, μ⟩,

where M is the set of moment parameters

    M = { μ : Σ_x p(x) φ(x) = μ for some distribution p }.

The polytope M can be seen to be upper-bounded by the set LOCAL(G) of all singleton and pairwise vectors μ_1 and μ_2 that satisfy the local consistency constraints

    Σ_k μ_2(s,j;t,k) = μ_1(s;j),   Σ_j μ_1(s;j) = 1,   0 ≤ μ_1(s;j) ≤ 1,   0 ≤ μ_2(s,j;t,k) ≤ 1.

Wainwright and Jordan (2003) thus proposed the upper-bounding relaxation obtained by using LOCAL(G) as an outer bound for the polytope M,

    sup_{μ ∈ LOCAL(G)} ⟨θ, μ⟩,        (3)

which is the same LP formulation as in equation (2). Furthermore, Wainwright et al. (2005) show that under certain conditions, the tree-reweighted belief propagation updates solve the dual of the LP in equation (3); since strong duality holds, the tree updates also give the optimal primal value for the LP.
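The LP relaxation (2) can be handed directly to an off-the-shelf solver. The following sketch is not from the paper: the tiny chain graph, the random potentials, and the indexing helpers are illustrative assumptions, and it imposes both marginalization directions of the local consistency constraints, as is standard for LOCAL(G).

```python
# A minimal sketch (assumed toy setup) of the LP relaxation (2) using scipy.
import numpy as np
from scipy.optimize import linprog

n, k = 3, 2                          # nodes and labels
edges = [(0, 1), (1, 2)]             # a small chain graph
rng = np.random.default_rng(0)
theta_node = rng.uniform(-1, 1, size=(n, k))                       # theta_{s;j}
theta_edge = {e: rng.uniform(-1, 1, size=(k, k)) for e in edges}   # theta_{s,j;t,k}

# Variable vector: all mu1(s;j) first, then mu2(s,j;t,k) for each edge.
def i1(s, j): return s * k + j
def i2(e, j, m): return n * k + edges.index(e) * k * k + j * k + m
nvar = n * k + len(edges) * k * k

# Objective: linprog minimizes, so negate theta to maximize the energy.
c = np.zeros(nvar)
for s in range(n):
    for j in range(k):
        c[i1(s, j)] = -theta_node[s, j]
for e in edges:
    for j in range(k):
        for m in range(k):
            c[i2(e, j, m)] = -theta_edge[e][j, m]

# Equality constraints: node marginals sum to one, and edge marginals are
# consistent with the node marginals at both endpoints.
A_eq, b_eq = [], []
for s in range(n):
    row = np.zeros(nvar)
    row[[i1(s, j) for j in range(k)]] = 1.0
    A_eq.append(row); b_eq.append(1.0)
for e in edges:
    s, t = e
    for j in range(k):                        # sum_m mu2(s,j;t,m) = mu1(s;j)
        row = np.zeros(nvar); row[i1(s, j)] = -1.0
        for m in range(k): row[i2(e, j, m)] = 1.0
        A_eq.append(row); b_eq.append(0.0)
    for m in range(k):                        # sum_j mu2(s,j;t,m) = mu1(t;m)
        row = np.zeros(nvar); row[i1(t, m)] = -1.0
        for j in range(k): row[i2(e, j, m)] = 1.0
        A_eq.append(row); b_eq.append(0.0)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=(0, 1))
print("LP relaxation value:", -res.fun)
```

On a tree such as this chain the LP is tight, so the reported value coincides with the true MAP energy; on graphs with cycles it is only an upper bound.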

3. Quadratic Relaxation

In the linear relaxation of equation (2), the variables μ_2(s,j;t,k) are relaxations of the indicator variables I_{j,k}(x_s, x_t), with a value of one indicating that for edge (s,t) ∈ E, variable x_s is labeled j and variable x_t is labeled k. These pairwise variables are constrained by demanding that they be consistent with the corresponding marginal variables μ_1(s;j). Note, however, that the binary indicator variables satisfy the additional independence constraint I_{j,k}(x_s, x_t) = I_j(x_s) I_k(x_t). This then suggests that constraining the relaxation variables in a similar manner, μ_2(s,j;t,k) = μ_1(s;j) μ_1(t;k), might yield a tighter relaxation. This leads to the following quadratic program:

    max  Σ_{s,j} θ_{s;j} μ(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ(s;j) μ(t;k)        (4)
    subject to  Σ_j μ(s;j) = 1
                0 ≤ μ(s;j) ≤ 1.

The following result shows that the relaxation is in fact tight; the proof uses the probabilistic method.

Theorem 3.1. The optimal value of problem (4) is equal to the optimal value of the MAP problem (1).

Proof. Let the optimal MAP energy be

    e_MAP = max_x Σ_{s,j} θ_{s;j} I_j(x_s) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} I_{j,k}(x_s, x_t)

and let the optimal value of the relaxed problem be

    e* = max_μ Σ_{s,j} θ_{s;j} μ(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ(s;j) μ(t;k),

where Σ_j μ(s;j) = 1 and μ(s;j) ∈ [0,1]. Clearly, e* ≥ e_MAP, since problem (4) is a relaxation of problem (1). We now show that e_MAP ≥ e*. Let μ* be an optimal solution of problem (4), and consider the following randomized rounding scheme. For each node s, assign it value j with probability μ*(s;j). The expected energy of such a rounding is

    e_R = Σ_{s,j} θ_{s;j} μ*(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ*(s;j) μ*(t;k) = e*.

But there has to exist some discrete assignment y whose energy is at least the expected energy, e(y) ≥ e*. Since e_MAP is the energy of the optimal configuration, e_MAP ≥ e(y), which thus gives e_MAP ≥ e*.

Note that the randomization in the proof just shows the existence of a discrete solution with the same energy as that of the optimal real relaxation. The problem of obtaining such a discrete solution efficiently is considered next.
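The randomized rounding used in the proof is easy to simulate. The sketch below is not the authors' code; the small graph, random potentials, and the arbitrary fractional μ are assumptions. It simply checks numerically that the average energy of independently rounded labelings approaches the QP objective, so some rounding must do at least as well.

```python
# Minimal sketch of the randomized rounding from the proof of Theorem 3.1
# (assumed toy model): sample each node's label from mu(s;.) independently.
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 3
edges = [(0, 1), (1, 2), (2, 3)]
theta_node = rng.uniform(-1, 1, size=(n, k))                       # theta_{s;j}
theta_edge = {e: rng.uniform(-1, 1, size=(k, k)) for e in edges}   # theta_{s,j;t,k}

def energy(x):
    """MAP objective of a discrete labeling x."""
    val = sum(theta_node[s, x[s]] for s in range(n))
    return val + sum(theta_edge[(s, t)][x[s], x[t]] for s, t in edges)

def qp_objective(mu):
    """QP objective (4) at fractional node marginals mu (an n-by-k array)."""
    val = np.sum(theta_node * mu)
    return val + sum(mu[s] @ theta_edge[(s, t)] @ mu[t] for s, t in edges)

# Any feasible fractional point: rows on the probability simplex.
mu = rng.random((n, k))
mu /= mu.sum(axis=1, keepdims=True)

samples = [energy([rng.choice(k, p=mu[s]) for s in range(n)]) for _ in range(20000)]
print("QP objective:       ", qp_objective(mu))
print("mean rounded energy:", np.mean(samples))   # approaches the QP objective
print("best rounded energy:", max(samples))       # at least the QP objective
```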

Theorem 3.2. Any solution of the MAP problem (1) efficiently yields a solution of the relaxation (4) and vice versa. Thus the relaxation (4) is equivalent to the MAP problem (1).

Proof. From Theorem 3.1, the optimal values of problems (1) and (4) are equal; let e* denote this maximum energy. Let x̂ be an optimal solution of the MAP problem (1). As problem (4) is a relaxation of the MAP problem, μ(s;j) = I_j(x̂_s) is also a feasible and optimal solution for (4). For the converse, let μ* be an optimal solution of problem (4). Its energy is given by

    e* = Σ_{s,j} θ_{s;j} μ*(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ*(s;j) μ*(t;k).        (5)

If each μ*(s;j) is integer valued, that is, in {0,1}, then we can use μ* itself as the feasible optimal solution for the MAP problem (1). Otherwise, consider μ* to be real valued; we (efficiently) construct a labeling y with the maximum energy e*. Consider an unlabeled node s. Assign it the label

    y_s = argmax_j  θ_{s;j} + Σ_{t:(s,t) ∈ E; k} θ_{s,j;t,k} μ*(t;k).

Now, set μ*(s; y_s) = 1 and μ*(s;k) = 0 for k ≠ y_s. Continue with this labeling process until all nodes are labeled. It can be shown that the energy of this assignment y is equal to the energy e* of the optimal MAP assignment. In particular, each time we take up an unlabeled node, we select a labeling that does not decrease the expected energy of the unlabeled nodes given the labelings of the labeled nodes. Given that the initial expected energy of all unlabeled nodes was e*, the energy at the end of the process, that is, of the assignment y, is thus at least e*.

3.1. Convex Approximation

The previous section showed that the relaxation in equation (4), while a simple extension of the LP in equation (2), is actually equivalent to the MAP problem. This yields the interesting result that the MAP problem is solvable in polynomial time if the edge parameter matrix Θ = [θ_{s,j;t,k}] is negative definite, since in this case the QP (4) is a convex program. Note also that the quadratic program has a simple set of constraints (only linear and box constraints), which are also small in number, and is thus a simple problem instance of general convex optimization. It should also be stressed that for an n node graph, the QP has only kn variables while the LP has O(k^2 |E|) variables.

The case where the edge parameter matrix Θ is not negative definite yields a non-convex program. While one could carry out an iterative search procedure up to a local maximum as in the max-product algorithm, we now describe a convex approximation that provides a polynomial time solution with additive bound guarantees. Consider the quadratic integer program (QIP) corresponding to the QP, given by

    max  Σ_{s,j} θ_{s;j} μ(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ(s;j) μ(t;k)        (6)
    subject to  Σ_j μ(s;j) = 1
                μ(s;j) ∈ {0,1}.

This is clearly equivalent to the MAP problem in equation (1). Let Θ = [θ_{s,j;t,k}] be a parameter matrix that is not negative semi-definite. Let d(s;i) be the (positive) diagonal terms that need to be subtracted from the matrix to make it negative semi-definite. An upper bound for d is d(s;i) ≤ Σ_{(t,k)} |θ_{s,i;t,k}|, since the negative of a diagonally dominant matrix is negative semi-definite. Let Θ̂ = Θ − diag{d(s;i)} be the negative semi-definite matrix obtained by subtracting off the diagonal elements d(s;i). Also, let

    θ̂_{s;i} = θ_{s;i} + d(s;i).        (7)

Now, for binary μ(s;i) ∈ {0,1}, we have that μ(s;i)^2 = μ(s;i); in particular, d(s;i) μ(s;i) − d(s;i) μ(s;i)^2 = 0. We thus get that the following QIP, with parameters (θ̂, Θ̂), is equivalent to the QIP (6):

    max  Σ_{s,j} θ̂_{s;j} μ(s;j) + Σ_{s,j;t,k} θ̂_{s,j;t,k} μ(s;j) μ(t;k)
    such that  Σ_j μ(s;j) = 1
               μ(s;j) ∈ {0,1}.

Relaxing this QIP as before, we obtain the following optimization problem:

    max  Σ_{s,j} θ̂_{s;j} μ(s;j) + Σ_{s,j;t,k} θ̂_{s,j;t,k} μ(s;j) μ(t;k)        (8)
    such that  Σ_j μ(s;j) = 1
               μ(s;j) ∈ [0,1].

This is a convex program solvable in polynomial time. The optimality results of the previous section do not follow, however, and the relaxation (8) is not always tight. But as shown next, we can get an additive approximation bound for the discrete solution obtained using the rounding procedure described in the previous section.

Theorem 3.3. Let μ* be the optimal solution for the convex QP (8), and let e* be the optimal MAP energy. Then there is a discrete configuration y (from μ*) with energy E(y) satisfying

    E(y) ≥ e* − Σ_{s,i} d(s;i) μ*(s;i)(1 − μ*(s;i)) ≥ e* − (1/4) Σ_{s,i} d(s;i).

This result shows that if either Θ is close to negative definite, so that Σ_{s,i} d(s;i) is small, or if the solution μ* is close to integral, so that μ*(s;i) is close to zero or one, then the convex relaxation achieves a solution that is close to the optimal MAP solution.
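Before turning to the proof of Theorem 3.3, here is a minimal sketch of the diagonal shift behind the convex approximation. It is not the authors' implementation; the small graph and random potentials are assumptions. It builds the edge parameter matrix Θ over (node, label) pairs, subtracts the diagonal dominance bound d(s;i) = Σ_{t,k} |θ_{s,i;t,k}| from the diagonal, verifies that the result is negative semi-definite, and reports the additive slack (1/4) Σ_{s,i} d(s;i).

```python
# Minimal sketch of the diagonal shift used by the convex approximation
# (assumed toy model, not the paper's code).
import numpy as np

rng = np.random.default_rng(1)
n, k = 4, 3
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

# Symmetric Theta with the blocks theta_{s,j;t,k} on the edges, zero elsewhere.
Theta = np.zeros((n * k, n * k))
for s, t in edges:
    block = rng.uniform(-1, 1, size=(k, k))
    Theta[s * k:(s + 1) * k, t * k:(t + 1) * k] = block
    Theta[t * k:(t + 1) * k, s * k:(s + 1) * k] = block.T

# Diagonal terms d(s;i) <= sum_{t,k} |theta_{s,i;t,k}| (diagonal dominance bound).
d = np.abs(Theta).sum(axis=1)
Theta_hat = Theta - np.diag(d)

# The shifted matrix is negative semi-definite (Gershgorin), so the relaxation
# with adjusted node parameters theta_hat_{s;i} = theta_{s;i} + d(s;i) is convex.
print("largest eigenvalue of Theta_hat:", np.linalg.eigvalsh(Theta_hat).max())
print("additive slack (1/4) sum d(s;i):", 0.25 * d.sum())
```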

Proof. Just as in the proof of Theorem 3.2, given μ*, the optimal solution to the convex relaxation, we can efficiently obtain a discrete solution y whose energy satisfies

    E(y) ≥ Σ_{s,j} θ_{s;j} μ*(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ*(s;j) μ*(t;k).

On the other hand, the optimal value of the convex QP is given by

    e_CQP = Σ_{s,i} (θ_{s;i} + d(s;i)) μ*(s;i) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ*(s;j) μ*(t;k) − Σ_{s,i} d(s;i) μ*(s;i)^2.

We then have that

    E(y) ≥ e_CQP − Σ_{s,i} d(s;i) μ*(s;i)[1 − μ*(s;i)]
         ≥ e* − Σ_{s,i} d(s;i) μ*(s;i)(1 − μ*(s;i))
         ≥ e* − (1/4) Σ_{s,i} d(s;i),

and the result follows.

3.2. Iterative Update Procedure

Just as tree-reweighted max product gives a set of iterative updates for solving the LP in equation (3), we might ask if there is an iterative update counterpart for the QP. Max-product is a coordinate ascent algorithm in the dual (Lagrangian) space for the LP; however, since the dual space of the QP (4) is more complicated, we look at a set of fixed point coordinate ascent updates in its primal space. The QP is given by

    max  Σ_{s,j} θ_{s;j} μ(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ(s;j) μ(t;k)
    subject to  Σ_j μ(s;j) = 1,  μ(s;j) ∈ [0,1].

Consider node s, and suppose that the values μ(t;·) are fixed for the other nodes t ≠ s. Then the optimal parameter values μ(s;·) for node s are given by

    μ(s;·) = argmax_{μ(s;·)}  Σ_j θ_{s;j} μ(s;j) + Σ_{t; j,k} θ_{s,j;t,k} μ(s;j) μ(t;k)
    subject to  Σ_j μ(s;j) = 1.

This is easily seen to be solved by taking

    j*(s) = argmax_j  θ_{s;j} + Σ_{t:(s,t) ∈ E; k} θ_{s,j;t,k} μ(t;k)

and setting μ(s;j) = I_{j*(s)}(j). This is essentially the iterative conditional modes algorithm (Besag, 1986), which iteratively updates each node with a labeling that most increases the energy, holding fixed the labels of the other nodes. A better iterative procedure, with stronger and faster convergence properties, albeit for convex programs, is projected conjugate gradient ascent (Axelsson & Barker, 2001). Thus, another advantage of our convex approximation is that we can use conjugate gradient ascent as a simple iterative procedure that is guaranteed to converge (unlike coordinate ascent for max product). This makes the convex approximation to the QP applicable to large scale problems.
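The coordinate ascent update just described reduces to iterative conditional modes, which is straightforward to implement. The sketch below is an illustrative assumption (toy chain graph, random potentials), not the paper's experimental code: each sweep gives every node the label that maximizes its local contribution given the current labels of its neighbors, and the procedure stops at a local maximum of the energy.

```python
# Minimal ICM sketch (assumed toy model) for the primal coordinate ascent update.
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 4
edges = [(i, i + 1) for i in range(n - 1)]       # a chain, for illustration
theta_node = rng.uniform(-1, 1, size=(n, k))
theta_edge = {e: rng.uniform(-1, 1, size=(k, k)) for e in edges}

def icm(x, sweeps=50):
    x = list(x)
    for _ in range(sweeps):
        changed = False
        for s in range(n):
            # Score of each candidate label at node s with neighbors held fixed.
            score = theta_node[s].copy()
            for (a, b), th in theta_edge.items():
                if a == s:
                    score += th[:, x[b]]
                elif b == s:
                    score += th[x[a], :]
            j = int(np.argmax(score))
            if j != x[s]:
                x[s], changed = j, True
        if not changed:                           # local maximum reached
            break
    return x

x0 = rng.integers(0, k, size=n).tolist()
print("ICM labeling:", icm(x0))
```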

4. Inner Polytope Relaxations

In the previous section, we obtained a quadratic relaxation by imposing an independence constraint on the parameters μ(s,j;t,k) in equation (4). We also showed that this relaxation is actually tight, and is equivalent to the MAP problem. In this section, we show how one can think of this relaxation as the counterpart of mean field for MAP, and how any of the corresponding relaxation counterparts of structured mean field are also tight.

Consider Wainwright and Jordan (2003)'s polytope formulation of MAP in equation (3), given by

    max_x ⟨θ, φ(x)⟩ = sup_{μ ∈ M} ⟨θ, μ⟩,

where M is the convex hull of all configuration potentials φ(x). This equality follows from the fact that in a linear program, the optimum occurs at an extremal point φ(x*). Thus, if M_I ⊆ M is any subset of the marginal polytope that includes all of the vertices, then the equality

    max_x ⟨θ, φ(x)⟩ = sup_{μ ∈ M_I} ⟨θ, μ⟩

still holds. In other words, any relaxation of the indicator variables to μ(s,j;t,k) ∈ M_I would lead to a tight relaxation, as long as M_I contains all vertices. In contrast, tree-reweighted max product is not tight, since the domain set for its relaxed parameters is LOCAL(G) ⊇ M; see Section 2.1.

As described in (Wainwright & Jordan, 2003), one can think of structured mean field methods as inner polytope approximations. For the given graph G and a subgraph H, let

    E(H) = { θ : θ_st = θ_st · 1[(s,t) ∈ H] },

where θ_st is the vector of natural parameters associated with edge (s,t). Thus, the parameters not included in the subgraph are set to zero. Now for the subgraph H, we can define the following set of moment parameters:

    M(G;H) = { μ : μ = E_θ[φ(x)] for some θ ∈ E(H) }.

In essence, the moment parameters in M(G;H) must be realizable by a distribution that respects the structure of H. For any H ⊆ G, the relation M(G;H) ⊆ M(G) thus always holds, and M(G;H) is an inner polytope approximation to M. In particular, taking H to be the completely disconnected graph (i.e., no edges) H_0, we have

    M(G;H_0) = { μ(s;j), μ(s,j;t,k) : 0 ≤ μ(s;j) ≤ 1, μ(s,j;t,k) = μ(s;j) μ(t;k) },

which can be seen to be equal to the feasible set of the QP relaxation (4). For this subgraph H = H_0, the mean field relaxation thus becomes

    sup_{μ ∈ M(G;H_0)} ⟨θ, μ⟩ = sup_{μ ∈ M(G;H_0)}  Σ_{s,j} θ_{s;j} μ(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ(s,j;t,k)
                              = sup_{μ ∈ M(G;H_0)}  Σ_{s,j} θ_{s;j} μ(s;j) + Σ_{(s,t) ∈ E; j,k} θ_{s,j;t,k} μ(s;j) μ(t;k),

which is equivalent to the quadratic relaxation in equation (4). Thus, we can, in principle, use any structured mean field relaxation of the form sup_{μ ∈ M(G;H)} ⟨θ, μ⟩ to solve the MAP exactly. The caveat is that this problem, like structured mean field, is a non-convex problem. However, while structured mean field only solves for an approximate value of the log-partition function, the results from Section 3 show that its counterpart for the MAP problem is exact, if the global optimum can be found.

5. Experiments

The quadratic relaxation with the convex approximation was evaluated by comparing it against three competing methods: the linear programming relaxation (Chekuri et al., 2005), the tree-reweighted max product algorithm (Wainwright et al., 2005), and iterative conditional modes (ICM) (Besag, 1986). For tree-reweighted max product, we use the sequential update variant detailed in (Kolmogorov, 2005), which has better convergence properties than the originally proposed algorithm.

The approximate MAP algorithms were compared on different potential functions and coupling types for 2D nearest neighbor grid graphs with 100 nodes and a label set of size four. The node potentials were generated uniformly U(−1, 1), while the edge potentials were generated as a product of an edge weight and a distance function on labels. For different settings of an edge coupling-strength parameter d_coup, the edge weight was selected from U(−d_coup, d_coup) for the mixed coupling, from U(0, 2 d_coup) for the positive coupling, and from U(−2 d_coup, 0) for the negative coupling. The following four commonly used distances were used for the distance function: Ising, φ(l,m) = lm; uniform, or Potts, φ(l,m) = I(l = m); quadratic, φ(l,m) = (l − m)^2; linear, φ(l,m) = |l − m|.
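As a rough guide to what these synthetic models look like, the sketch below generates a grid model of the kind described above. It is not the authors' experimental code; the exact signs, scalings, and random seeds behind the published plots are unknown here, and the choices below are assumptions.

```python
# Minimal sketch (assumed settings) of the synthetic grid models described above.
import numpy as np

def make_grid_model(side=10, k=4, d_coup=1.0, coupling="mixed",
                    distance="potts", seed=0):
    rng = np.random.default_rng(seed)
    node = lambda r, c: r * side + c
    edges = [(node(r, c), node(r, c + 1)) for r in range(side) for c in range(side - 1)]
    edges += [(node(r, c), node(r + 1, c)) for r in range(side - 1) for c in range(side)]

    labels = np.arange(k)
    dist = {
        "ising":     np.outer(labels, labels) * 1.0,                # phi(l,m) = l*m
        "potts":     (labels[:, None] == labels[None, :]) * 1.0,    # I(l = m)
        "quadratic": (labels[:, None] - labels[None, :]) ** 2.0,
        "linear":    np.abs(labels[:, None] - labels[None, :]) * 1.0,
    }[distance]

    lo, hi = {"mixed":    (-d_coup, d_coup),
              "positive": (0.0, 2 * d_coup),
              "negative": (-2 * d_coup, 0.0)}[coupling]

    theta_node = rng.uniform(-1, 1, size=(side * side, k))          # U(-1, 1)
    theta_edge = {e: rng.uniform(lo, hi) * dist for e in edges}     # weight * distance
    return theta_node, theta_edge, edges

theta_node, theta_edge, edges = make_grid_model()
print(len(edges), "edges on a 10x10 grid")
```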
Figures 1 and 2 show plots of the value (energy) of the MAP estimates using the different algorithms for a range of model types. For any given setting of parameters and potential functions, a higher value is closer to the MAP estimate and is thus better. As the plots show, the quadratic relaxation slightly outperforms tree-reweighted max product for mixed and positive couplings, and is comparable or slightly worse for negative coupling. The quadratic approximation typically beats both ICM and the linear relaxation.

Figure 3. Comparison of ICM and tree-reweighted max product with the QP relaxation on larger graphs, using Ising potentials with mixed coupling. The right plot shows the relative differences (e − e_QP)/e for the two competing methods.

In Figure 3 we compare the MAP estimates from different algorithms on larger graphs, using the Ising potential function with mixed coupling. The quadratic relaxation is seen to outperform ICM and tree-reweighted max product, even as the number of nodes increases.

We note that since the convex approximation to the QP is a convex program, it can be solved (in polynomial time) using standard QP solvers for small problems, and for larger-scale problems one can use iterative projected conjugate gradient, which provides a fast iterative method that is guaranteed to converge. In our experiments, the computation time for the QP method was comparable to that required by tree-reweighted max product, which in turn required much less time to solve than the linear programming relaxation. This is due primarily to the fact that the linear program has |E| k^2 variables, while the convex quadratic relaxation has only nk variables, where n is the number of nodes in the graph, |E| is the number of edges, and k is the number of labels.

Figure 1. Comparison of the linear relaxation (LP), iterative conditional modes (ICM), tree-reweighted max product, and the quadratic programming relaxation (QP) on 10 × 10 grid graphs using Ising potentials (top row) and uniform potentials (bottom row) with mixed (left), positive (center) and negative (right) couplings. A better MAP estimate has a higher value.

Figure 2. Comparison of the linear relaxation (LP), iterative conditional modes (ICM), tree-reweighted max product, and the quadratic programming relaxation (QP) on 10 × 10 grid graphs using linear potentials (top row) and quadratic potentials (bottom row) with mixed (left), positive (center) and negative (right) couplings. A better MAP estimate has a higher value.

6. Conclusions

This paper has proposed a quadratic programming relaxation to the MAP problem for random fields, or the metric labeling problem. The quadratic objective function more accurately represents the energy in the graphical model while using fewer variables than the linear relaxation. It was shown that the QP relaxation is tight and computes the MAP solution exactly. However, under certain conditions the relaxation results in a non-convex problem, which requires an intractable search over local minima. This led to an additional convex approximation to the relaxation, for which there is an additive approximation guarantee. The quadratic programming approximation was also extended to general variational inner polytope relaxations that also compute the MAP exactly. Experiments demonstrated that the quadratic relaxation, with the convex approximation, can outperform existing methods under many settings, while also being computationally attractive.

Acknowledgments

This research was supported in part by NSF grants IIS and IIS. The authors thank Ramin Zabih and Vladimir Kolmogorov for helpful comments.

References

Axelsson, O., & Barker, V. A. (2001). Finite element solution of boundary value problems: theory and computation. Society for Industrial and Applied Mathematics.

Bertsimas, D., & Tsitsiklis, J. (1997). Introduction to linear optimization. Athena Scientific.

Besag, J. (1986). On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, Series B.

Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell., 23.

Chekuri, C., Khanna, S., Naor, J., & Zosin, L. (2005). A linear programming formulation and approximation algorithms for the metric labeling problem. SIAM Journal on Discrete Mathematics, 18.

Greig, D., Porteous, B., & Seheult, A. (1989). Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society, Series B, 51.

Kleinberg, J., & Tardos, E. (1999). Approximation algorithms for classification problems with pairwise relationships: Metric partitioning and Markov random fields. IEEE Symposium on the Foundations of Computer Science.

Kolmogorov, V. (2005). Convergent tree-reweighted message passing for energy minimization. AISTATS.

Pearl, J. (1988). Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann Publishers Inc.

Wainwright, M., Jaakkola, T., & Willsky, A. (2005). MAP estimation via agreement on (hyper)trees: Message-passing and linear-programming approaches. IEEE Transactions on Information Theory, 51.

Wainwright, M. J., & Jordan, M. I. (2003). Variational inference in graphical models: The view from the marginal polytope. Allerton Conference on Communication, Control, and Computing.

Weiss, Y., & Freeman, W. T. (2001). On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs. IEEE Transactions on Information Theory, 47.

Yedidia, J. S., Freeman, W. T., & Weiss, Y. (2001). Understanding belief propagation and its generalizations. IJCAI 2001 Distinguished Lecture track.
