CSE 254: MAP estimation via agreement on (hyper)trees: Message-passing and linear programming approaches
1 CSE 254: MAP estimation via agreement on (hyper)trees: Message-passing and linear programming approaches. A presentation by Evan Ettinger, November 11, 2005.
2 Outline
- Introduction: Motivation and Background
- Tree-Reparameterizations: Definitions and Upper Bounds
- LP Approaches: IP to LP to Lagrangian Dual
- Message-passing Approaches: Two Algorithms
- Summary
3 Outline (the outline above is repeated as a section-transition slide)
4 Motivation and Background: Motivation
- Want: MAP estimation on pairwise MRFs, i.e., configurations in the set $\arg\max_{x \in \mathcal{X}^n} p(x; \theta)$.
- We limit ourselves to pairwise MRFs and discrete label sets $\mathcal{X}_s$.
5 Motivation and Background: MRFs as Exponential Family
- Recall that any MRF can be written in exponential family form as: $p(x; \theta) \propto \exp\{\langle \theta, \phi(x) \rangle\} = \exp\{\sum_{\alpha \in I} \theta_\alpha \phi_\alpha(x)\}$.
- Here $I$ indexes over the vertices, edges, and labels.
- We use the overcomplete representation of indicator functions: $\{\delta_j(x_s) \mid j \in \mathcal{X}_s\}$ for $s \in V$, and $\{\delta_j(x_s)\delta_k(x_t) \mid (j,k) \in \mathcal{X}_s \times \mathcal{X}_t\}$ for $(s,t) \in E$.
- Consequence: many exponential parameters correspond to a given distribution (i.e., $p(x; \theta) = p(x; \hat\theta)$ for some $\theta \neq \hat\theta$).
6 Motivation and Background: Marginal Distributions
- With this representation, marginal probabilities become: $\mu_{s;j} = \mathbb{E}[\delta_j(x_s)] := \sum_{x \in \mathcal{X}^n} p(x; \theta)\,\delta_j(x_s)$ and $\mu_{st;jk} = \mathbb{E}[\delta_j(x_s)\delta_k(x_t)] := \sum_{x \in \mathcal{X}^n} p(x; \theta)\,\delta_j(x_s)\delta_k(x_t)$.
- We define the marginal polytope as $\mathrm{MARG}(G) := \{\mu \in \mathbb{R}^d \mid \mu_{s;j} = \mathbb{E}_p[\delta_j(x_s)],\ \mu_{st;jk} = \mathbb{E}_p[\delta_j(x_s)\delta_k(x_t)] \text{ for some } p(\cdot)\}$.
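To make the overcomplete mean parameters concrete, here is a tiny enumeration sketch of my own (not from the slides); the 2-node joint distribution p is an arbitrary assumption.

```python
# Mean parameters mu in the overcomplete representation, computed by direct
# enumeration for a 2-node, 2-label model: mu_{s;j} = P(x_s = j) and
# mu_{st;jk} = P(x_s = j, x_t = k).
import numpy as np

p = np.array([[0.1, 0.2],
              [0.3, 0.4]])          # an arbitrary joint over (x_1, x_2)

mu_1 = p.sum(axis=1)                # mu_{1;j} = P(x_1 = j)
mu_2 = p.sum(axis=0)                # mu_{2;k} = P(x_2 = k)
mu_12 = p                           # mu_{12;jk} = P(x_1 = j, x_2 = k)
print(mu_1, mu_2)                   # any such (mu_1, mu_2, mu_12) lies in MARG(G)
```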
7 Motivation and Background: MAP Estimation
- We can collect the exponential parameters of the sufficient statistics $\phi(x) = \{\delta_j(x_s),\ \delta_j(x_s)\delta_k(x_t)\}$ into potential functions: $\theta_s(x_s) := \sum_{j \in \mathcal{X}_s} \theta_{s;j}\,\delta_j(x_s)$ and $\theta_{st}(x_s, x_t) := \sum_{(j,k) \in \mathcal{X}_s \times \mathcal{X}_t} \theta_{st;jk}\,\delta_j(x_s)\delta_k(x_t)$.
- The MAP problem is then equivalent to: $\Phi_\infty(\theta) := \max_{x \in \mathcal{X}^n} \langle \theta, \phi(x) \rangle = \max_{x \in \mathcal{X}^n} \big[\sum_{s \in V} \theta_s(x_s) + \sum_{(s,t) \in E} \theta_{st}(x_s, x_t)\big]$.
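Since $\Phi_\infty(\theta)$ is just a maximization of a sum of node and edge potentials, a brute-force evaluation makes the objective concrete. This sketch is my own illustration; the 3-node chain, label count, and random potentials are assumptions.

```python
# Brute-force evaluation of the MAP objective
# sum_s theta_s(x_s) + sum_(s,t) theta_st(x_s, x_t) on a toy pairwise MRF.
import itertools
import numpy as np

n_labels = 2
nodes = [0, 1, 2]
edges = [(0, 1), (1, 2)]

rng = np.random.default_rng(0)
theta_s = {s: rng.normal(size=n_labels) for s in nodes}               # node potentials
theta_st = {e: rng.normal(size=(n_labels, n_labels)) for e in edges}  # edge potentials

def objective(x):
    """<theta, phi(x)> in the overcomplete representation."""
    val = sum(theta_s[s][x[s]] for s in nodes)
    val += sum(theta_st[(s, t)][x[s], x[t]] for (s, t) in edges)
    return val

# Exhaustive maximization over X^n -- only feasible for tiny problems.
x_map = max(itertools.product(range(n_labels), repeat=len(nodes)), key=objective)
print(x_map, objective(x_map))
```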
8 Outline (repeated; next section: Tree-Reparameterizations, Definitions and Upper Bounds)
9 Definitions and Upper Bounds: Convex Combinations and Upper Bounds
- Consider $\{\theta^i\}$ and weights $\rho^i$ such that $\sum_i \rho^i \theta^i = \bar\theta$ and $\sum_i \rho^i = 1$. Claim: $\Phi_\infty(\bar\theta) \le \sum_i \rho^i \Phi_\infty(\theta^i)$.
- Proof: let $x^*$ be a MAP solution for $\bar\theta$. Then $\Phi_\infty(\bar\theta) = \max_{x \in \mathcal{X}^n} \langle \bar\theta, \phi(x) \rangle = \langle \bar\theta, \phi(x^*) \rangle = \sum_i \rho^i \langle \theta^i, \phi(x^*) \rangle \le \sum_i \rho^i \max_{x \in \mathcal{X}^n} \langle \theta^i, \phi(x) \rangle = \sum_i \rho^i \Phi_\infty(\theta^i)$.
- Goals: 1. Characterize $\{\theta^i\}$ by tree-structured exponential parameters. 2. When is the upper bound tight? 3. When it is not tight, how can we make it as tight as possible?
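The claim is the familiar fact that the max of a convex combination is at most the convex combination of maxes. A tiny numeric check (my own; the random score vectors stand in for the values $\langle \theta^i, \phi(x) \rangle$ over all configurations $x$):

```python
# Max of an average is at most the average of the maxes.
import numpy as np

rng = np.random.default_rng(3)
scores = rng.normal(size=(5, 16))        # 5 components theta^i, 16 configurations x
rho = np.ones(5) / 5.0                   # convex weights
combined = rho @ scores                  # scores of theta_bar = sum_i rho_i theta^i
assert combined.max() <= (rho @ scores.max(axis=1)) + 1e-12
```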
10 Definitions and Upper Bounds: Tree Reparameterizations
- Let $\{T^i\}$ be a set of spanning trees of $G$, and let $\rho$ be a probability distribution over this set.
- If $\bar\theta = \sum_i \rho(T^i)\,\theta(T^i)$, we call this a $\rho$-reparameterization.
- Let $\rho_e$ denote the probability that a given edge $e \in E$ appears in a spanning tree chosen at random under $\rho$.
- We can always find $\sum_i \rho(T^i)\,\theta(T^i) = \bar\theta$ as long as $\rho_e > 0$ for all $e \in E$.
- E.g., in the figure on the original slide (not reproduced here), if $\rho(T^i) = 1/3$ for all $i$, then $\rho_b = 1$, $\rho_e = 2/3$, $\rho_f = 1/3$.
11 Definitions and Upper Bounds: Example: Ising Model
- Consider a 4-node cycle and let $x \in \{0,1\}^4$. Also let $p(x; \bar\theta) \propto \exp\{x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 x_1\}$.
- So $\bar\theta = [0\ 0\ 0\ 0\ 1\ 1\ 1\ 1]$ (the four node parameters followed by the four edge parameters).
- We will reparameterize $\bar\theta$ by a convex combination of exponential parameters of the four spanning trees of the cycle, each obtained by deleting one edge.
12 Definitions and Upper Bounds: Example (cont.)
- Give each tree $T^i$ equal weight: $\rho(T^i) = 1/4$. Each edge then appears in three of the four trees, so $\rho_e = 3/4$.
- We construct each tree-structured exponential parameter by placing weight $4/3$ on the three edges of $T^i$ and zero on the deleted edge, e.g. $\theta(T^1) = (4/3)\,[0\ 0\ 0\ 0\ 1\ 1\ 1\ 0]$, and similarly for $\theta(T^2), \theta(T^3), \theta(T^4)$.
- This gives us $\bar\theta = \sum_i \rho(T^i)\,\theta(T^i)$, since each edge coordinate sums to $3 \cdot (1/4) \cdot (4/3) = 1$.
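A quick numeric verification of this reparameterization (my own sketch; the parameter ordering, four node entries followed by the edge entries for (1,2), (2,3), (3,4), (4,1), is an assumption):

```python
# Check that the convex combination of the four tree parameters recovers
# theta_bar for the 4-node cycle Ising example.
import numpy as np

theta_bar = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
rho = 0.25                            # equal weight on each spanning tree
trees = []
for dropped in range(4):              # tree T_i drops edge i of the cycle
    edge_part = np.full(4, 4.0 / 3.0)
    edge_part[dropped] = 0.0          # weight 4/3 on kept edges, 0 on dropped edge
    trees.append(np.concatenate([np.zeros(4), edge_part]))

assert np.allclose(sum(rho * t for t in trees), theta_bar)
```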
13 Definitions and Upper Bounds: Tightness of Upper Bound
- For any tree $\rho$-reparameterization we have: $\Phi_\infty(\bar\theta) \le \sum_i \rho(T^i)\,\Phi_\infty(\theta(T^i)) = \sum_i \rho(T^i) \max_{x \in \mathcal{X}^n} \langle \theta(T^i), \phi(x) \rangle$.
- When does this inequality hold with equality?
- Define the set of optimal configurations for a given $\theta$: $\mathrm{OPT}(\theta) := \{x \in \mathcal{X}^n \mid \langle \theta, \phi(x') \rangle \le \langle \theta, \phi(x) \rangle \text{ for all } x' \in \mathcal{X}^n\}$.
- Lemma: let $\bar\theta = \sum_i \rho(T^i)\,\theta(T^i)$; then $\bigcap_i \mathrm{OPT}(\theta(T^i)) \subseteq \mathrm{OPT}(\bar\theta)$. Moreover, the bound is tight iff the intersection on the LHS is non-empty.
14 Definitions and Upper Bounds: Proof
- Let $x^* \in \mathrm{OPT}(\bar\theta)$ and consider the difference between the RHS and the LHS: $0 \le \sum_i \rho(T^i)\,\Phi_\infty(\theta(T^i)) - \Phi_\infty(\bar\theta) = \sum_i \rho(T^i)\big[\Phi_\infty(\theta(T^i)) - \langle \theta(T^i), \phi(x^*) \rangle\big]$.
- Each bracketed term is non-negative, and equals zero exactly when $x^* \in \mathrm{OPT}(\theta(T^i))$.
- So the bound is tight iff $x^* \in \bigcap_i \mathrm{OPT}(\theta(T^i))$.
15 Definitions and Upper Bounds: Tree Reparameterization Wrap-up
- A set of spanning trees for which $\bigcap_i \mathrm{OPT}(\theta(T^i))$ is non-empty is said to satisfy tree agreement.
- Suppose that we fix the spanning tree distribution $\rho$ and our target parameter $\bar\theta$.
- Goal: optimize the upper bound as a function of $\{\theta(T^i)\}$.
- Two approaches: 1. direct minimization through LP relaxation techniques; 2. message-passing approaches.
16 Outline (repeated; next section: LP Approaches, IP to LP to Lagrangian Dual)
17 IP to LP to Lagrangian Dual: MAP as an Integer Program (IP)
- Fix our set of spanning trees $\{T^i\}$ and our distribution $\rho$ over them.
- Main problem: minimize the RHS of the following inequality w.r.t. $\{\theta(T^i)\}$ to make the bound as tight as possible: $\Phi_\infty(\bar\theta) \le \sum_i \rho(T^i)\,\Phi_\infty(\theta(T^i))$.
- Consider the MAP optimization problem: $\arg\max_{x \in \mathcal{X}^n} \langle \bar\theta, \phi(x) \rangle$.
- We first relax this integer program to an equivalent linear program...
18 IP to LP to Lagrangian Dual: LP Relaxation
- Recall the marginal polytope: $\mathrm{MARG}(G) := \{\mu \in \mathbb{R}^d \mid \mu_{s;j} = \mathbb{E}_p[\delta_j(x_s)],\ \mu_{st;jk} = \mathbb{E}_p[\delta_j(x_s)\delta_k(x_t)] \text{ for some } p(\cdot)\}$.
- Claim: $\max_{x \in \mathcal{X}^n} \langle \theta, \phi(x) \rangle = \max_{\mu \in \mathrm{MARG}(G)} \langle \theta, \mu \rangle$.
- Let $\mathbb{P} = \{p(\cdot) \mid p(x) \ge 0,\ \sum_x p(x) = 1\}$. Since a linear function over the probability simplex attains its maximum at a vertex, $\max_{x \in \mathcal{X}^n} \langle \theta, \phi(x) \rangle = \max_{p \in \mathbb{P}} \sum_{x \in \mathcal{X}^n} p(x) \langle \theta, \phi(x) \rangle$.
- We now transform the RHS of this equation...
19 IP to LP to Lagrangian Dual: LP Relaxation (cont.)
- From the last slide: $\max_{p \in \mathbb{P}} \sum_{x \in \mathcal{X}^n} p(x) \langle \theta, \phi(x) \rangle = \max_{p \in \mathbb{P}} \sum_{x \in \mathcal{X}^n} p(x) \big[\sum_{s \in V} \theta_s(x_s) + \sum_{(s,t) \in E} \theta_{st}(x_s, x_t)\big] = \max_\mu \big[\sum_{s \in V} \sum_{j \in \mathcal{X}_s} \theta_{s;j}\,\mu_{s;j} + \sum_{(s,t) \in E} \sum_{(j,k) \in \mathcal{X}_s \times \mathcal{X}_t} \theta_{st;jk}\,\mu_{st;jk}\big]$.
- As $p$ ranges over $\mathbb{P}$, the marginals $\mu$ range over $\mathrm{MARG}(G)$.
- Conclusion: $\max_{x \in \mathcal{X}^n} \langle \theta, \phi(x) \rangle = \max_{\mu \in \mathrm{MARG}(G)} \langle \theta, \mu \rangle$.
20 IP to LP to Lagrangian Dual: Lagrangian Dual
- We now have an LP formulation of MAP, which is much easier to solve.
- Recall: $\Phi_\infty(\bar\theta) \le \sum_i \rho(T^i)\,\Phi_\infty(\theta(T^i)) = \sum_i \rho(T^i) \max_{x \in \mathcal{X}^n} \langle \theta(T^i), \phi(x) \rangle$.
- We now address the problem: $\min_{\{\theta(T^i)\}} \sum_i \rho(T^i)\,\Phi_\infty(\theta(T^i))$ subject to $\sum_i \rho(T^i)\,\theta(T^i) = \bar\theta$.
- Attacking this problem via its Lagrangian dual yields the relaxed LP $\Phi_\infty(\bar\theta) \le \max_{\tau \in \mathrm{LOCAL}(G)} \langle \bar\theta, \tau \rangle$, where $\mathrm{LOCAL}(G) := \{\tau \in \mathbb{R}^d_+ \mid \sum_{j \in \mathcal{X}_s} \tau_{s;j} = 1\ \forall s \in V;\ \sum_{k \in \mathcal{X}_t} \tau_{st;jk} = \tau_{s;j}\ \forall (s,t) \in E,\ j \in \mathcal{X}_s\}$.
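For a concrete view of the relaxed LP, here is a hedged sketch that solves $\max_{\tau \in \mathrm{LOCAL}(G)} \langle \theta, \tau \rangle$ with SciPy for a 2-node, 2-label model. The variable ordering, the toy potentials, and imposing the marginalization constraint in both edge directions are my assumptions.

```python
# Solve the LOCAL(G) LP for one edge (1,2) with 2 labels per node.
import numpy as np
from scipy.optimize import linprog

k = 2
theta_s = np.array([[0.1, -0.2], [0.3, 0.0]])    # coefficients of tau_{s;j}
theta_st = np.array([[1.0, -1.0], [-1.0, 1.0]])  # coefficients of tau_{12;jk}

# tau = [tau_{1;0}, tau_{1;1}, tau_{2;0}, tau_{2;1},
#        tau_{12;00}, tau_{12;01}, tau_{12;10}, tau_{12;11}]
c = -np.concatenate([theta_s.ravel(), theta_st.ravel()])  # linprog minimizes

A_eq, b_eq = [], []
# Normalization: sum_j tau_{s;j} = 1 for each node s.
for s in range(2):
    row = np.zeros(8); row[2 * s: 2 * s + 2] = 1.0
    A_eq.append(row); b_eq.append(1.0)
# Marginalization: sum_k tau_{12;jk} = tau_{1;j} and sum_j tau_{12;jk} = tau_{2;k}.
for j in range(k):
    row = np.zeros(8); row[4 + 2 * j: 4 + 2 * j + 2] = 1.0; row[j] = -1.0
    A_eq.append(row); b_eq.append(0.0)
for kk in range(k):
    row = np.zeros(8); row[4 + kk] = 1.0; row[6 + kk] = 1.0; row[2 + kk] = -1.0
    A_eq.append(row); b_eq.append(0.0)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=[(0, None)] * 8)
print(res.x, -res.fun)   # optimal tau and the resulting upper bound on Phi_inf(theta)
```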
21 IP to LP to Lagrangian Dual: LOCAL(G)
- We've relaxed our original constraint set from $\mathrm{MARG}(G)$ to $\mathrm{LOCAL}(G)$.
- The relaxation is exact for any problem on a tree-structured graph.
- $\mathrm{LOCAL}(G)$ is characterized by $O(mn + m^2|E|)$ constraints, where $m := \max_s |\mathcal{X}_s|$.
- Two possibilities at the optimum: 1. the optimum is a vertex lying in both $\mathrm{MARG}(G)$ and $\mathrm{LOCAL}(G)$; then tree agreement holds. 2. the optimum is a fractional vertex of $\mathrm{LOCAL}(G)$; then no tree agreement can occur.
22 IP to LP to Lagrangian Dual: LP Summary
- We've now reformulated the MAP estimation problem as an approximate LP.
- We could use classical LP techniques (e.g., the simplex method) to solve this problem.
- However, we would like to exploit the graphical structure of our MRF.
- Want: message-passing algorithms that solve the same LP over $\mathrm{LOCAL}(G)$, with fixed points of the iterative algorithm = optima of the LP relaxation.
23 Outline (repeated; next section: Message-passing Approaches, Two Algorithms)
24 Two Algorithms: Preliminaries
- Define max-marginals (with normalization constants $\kappa_s, \kappa_{st}$): $\nu_{s;j} := \kappa_s \max_{\{x \mid x_s = j\}} p(x; \theta(T))$ and $\nu_{st;jk} := \kappa_{st} \max_{\{x \mid (x_s, x_t) = (j,k)\}} p(x; \theta(T))$, collected as $\nu_s(x_s) := \sum_{j \in \mathcal{X}_s} \nu_{s;j}\,\delta_j(x_s)$ and $\nu_{st}(x_s, x_t) := \sum_{(j,k) \in \mathcal{X}_s \times \mathcal{X}_t} \nu_{st;jk}\,\delta_j(x_s)\delta_k(x_t)$.
- For trees we can decompose the joint distribution in terms of its max-marginals via the junction tree decomposition: $p(x; \theta(T)) \propto \prod_{s \in V} \nu_s(x_s) \prod_{(s,t) \in E(T)} \frac{\nu_{st}(x_s, x_t)}{\nu_s(x_s)\,\nu_t(x_t)}$.
25 Two Algorithms: Trees and Max-marginals
- For every edge $(s,t) \in E(T)$: $\nu_s(x_s) = \kappa \max_{x'_t \in \mathcal{X}_t} \nu_{st}(x_s, x'_t)$.
- $x^*$ is in $\mathrm{OPT}(\theta(T))$ iff: $x^*_s \in \arg\max_{x_s} \nu_s(x_s)$ for all $s$, and $(x^*_s, x^*_t) \in \arg\max_{(x_s, x_t)} \nu_{st}(x_s, x_t)$ for all $(s,t)$.
- All of the above are local conditions.
- So, to find a MAP configuration: 1. root $T$ at node $r$ and choose $x^*_r \in \arg\max_{x_r} \nu_r(x_r)$; 2. moving from parent $s$ to child $t$, choose $x^*_t \in \arg\max_{x_t} \nu_{st}(x^*_s, x_t)$.
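The two-step recovery above is easy to make concrete on a chain. This is my own sketch (the chain structure and random potentials are assumptions): a backward max-product pass computes the tail maximizations, then a root-to-leaf backtrack reads off a MAP configuration.

```python
# Max-product on a chain plus backtracking: exact MAP on a tree-structured MRF.
import numpy as np

rng = np.random.default_rng(1)
n, k = 4, 3                                  # chain of 4 nodes, 3 labels
theta_s = rng.normal(size=(n, k))
theta_st = rng.normal(size=(n - 1, k, k))    # potential for edge (i, i+1)

# Backward pass: m[i][a] = max over the tail x_{i+1..n-1} of its score, given x_i = a.
m = np.zeros((n, k))
for i in range(n - 2, -1, -1):
    m[i] = np.max(theta_st[i] + theta_s[i + 1] + m[i + 1], axis=1)

# Root the chain at node 0, then backtrack parent -> child.
x = np.zeros(n, dtype=int)
x[0] = np.argmax(theta_s[0] + m[0])
for i in range(n - 1):
    x[i + 1] = np.argmax(theta_st[i, x[i]] + theta_s[i + 1] + m[i + 1])

print(x)                                     # a MAP configuration
```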
26 Two Algorithms: Algorithm Idea
- For graphs with cycles, we develop the idea of pseudo-max-marginals $\nu = \{\nu_s, \nu_{st}\}$.
- Denote by $\nu(T)$ the elements of $\nu$ pertaining to tree $T$.
- $\nu(T)$ implicitly specifies a tree exponential parameter $\theta(T)$ via the junction tree decomposition.
- Goal: given $\rho$ and $\{T^i\}$, find a $\nu$ that satisfies the following two properties: 1. $\nu$ implicitly specifies a $\rho$-reparameterization: $\nu(T^i) \to \theta(T^i)$ with $\bar\theta = \sum_i \rho(T^i)\,\theta(T^i)$; 2. $\nu(T^i)$ gives the exact max-marginals under $p(x; \theta(T^i))$ for all $i$ (tree consistency).
- We devise an iterative algorithm that maintains (1) after each iteration and achieves (2) upon convergence.
27 Two Algorithms: Preliminary Idea
- Update the pseudo-max-marginals iteratively.
- Two strategies: 1. tree-based updates: update $\nu(T)$ one tree at a time; 2. edge-based updates: update $\nu_{st}$ along with the associated $\nu_s, \nu_t$ in parallel.
- The algorithm to follow uses (2) and is motivated by the tree-consistency condition: $\nu_s(x_s) = \kappa \max_{x'_t \in \mathcal{X}_t} \nu_{st}(x_s, x'_t)$.
- So $\nu$ is consistent on every tree $T^i$ iff the above holds for every edge $(s,t) \in E$.
28 Two Algorithms: Algorithm 1: Edge-based Updates
- [The update equations appear only as a figure on the original slide and are not reproduced in this transcription; a sketch follows below.]
- Here $\rho_{st}$ is the edge-appearance probability of $(s,t)$ in our $\{T^i\}$.
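Since the algorithm itself is only an image on the original slide, the following is a sketch I back-solved from the reparameterization proof on the next two slides; the 3-cycle test problem, the initialization, the normalization choices, and the update order are all assumptions, not the authors' code.

```python
# Edge-based pseudo-max-marginal updates on a 3-cycle (sketch).
import numpy as np

rng = np.random.default_rng(2)
k = 2
nodes = [0, 1, 2]
edges = [(0, 1), (1, 2), (0, 2)]             # a 3-cycle
rho = {e: 2.0 / 3.0 for e in edges}          # edge appearance prob. under uniform trees
theta_s = {s: rng.normal(size=k) for s in nodes}
theta_st = {e: rng.normal(size=(k, k)) for e in edges}

# Initialization: nu_s ~ exp(theta_s); nu_st ~ exp(theta_st/rho_st + theta_s + theta_t).
nu_s = {s: np.exp(theta_s[s]) for s in nodes}
nu_st = {(s, t): np.exp(theta_st[(s, t)] / rho[(s, t)]
                        + theta_s[s][:, None] + theta_s[t][None, :])
         for (s, t) in edges}

for _ in range(50):
    new_s = {}
    for s in nodes:                          # node update: reweighted max-messages
        msg = np.ones(k)
        for (a, b) in edges:
            if s == a:
                msg *= (nu_st[(a, b)].max(axis=1) / nu_s[s]) ** rho[(a, b)]
            elif s == b:
                msg *= (nu_st[(a, b)].max(axis=0) / nu_s[s]) ** rho[(a, b)]
        new_s[s] = nu_s[s] * msg
        new_s[s] /= new_s[s].sum()           # normalize for numerical stability
    for (s, t) in edges:                     # edge update in terms of new node terms
        row_max = nu_st[(s, t)].max(axis=1)[:, None]
        col_max = nu_st[(s, t)].max(axis=0)[None, :]
        nu_st[(s, t)] = (nu_st[(s, t)] / (row_max * col_max)
                         * new_s[s][:, None] * new_s[t][None, :])
    nu_s = new_s

# At a fixed point, nu_s(x_s) is proportional to max_{x_t'} nu_st(x_s, x_t')
# on every edge: the tree-consistency condition.
```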
29 Two Algorithms: $\rho$-Reparameterization Proof
- Note: from the junction tree decomposition we can define, at iteration $n$: $\theta^n_s(T)(x_s) = \log \nu^n_s(x_s)$ for all $s \in V$, and $\theta^n_{st}(T)(x_s, x_t) = \log \frac{\nu^n_{st}(x_s, x_t)}{\nu^n_s(x_s)\,\nu^n_t(x_t)}$ if $(s,t) \in E(T)$, and $0$ otherwise, for all $(s,t) \in E$.
- Lemma: at each iteration we satisfy the $\rho$-reparameterization requirement (proof by induction).
- Base case ($n = 0$): $\sum_i \rho(T^i)\big[\sum_{s \in V} \theta^0_s(T^i) + \sum_{(s,t) \in E(T^i)} \theta^0_{st}(T^i)\big] = \sum_{s \in V} \bar\theta_s + \sum_{(s,t) \in E} \rho_{st} \cdot \tfrac{1}{\rho_{st}}\,\bar\theta_{st} = \bar\theta$.
30 Two Algorithms: Induction Step
- Induction step: $\sum_i \rho(T^i)\,\theta^{n+1}(T^i)(x) = \sum_{s \in V}\big[\log \nu^n_s(x_s) + \sum_{t \in \Gamma(s)} \rho_{st} \log \frac{\max_{x'_t} \nu^n_{st}(x_s, x'_t)}{\nu^n_s(x_s)}\big] + \sum_{(s,t) \in E} \rho_{st} \log \frac{\nu^n_{st}(x_s, x_t)}{\max_{x'_t} \nu^n_{st}(x_s, x'_t)\,\max_{x'_s} \nu^n_{st}(x'_s, x_t)} = \sum_{s \in V} \log \nu^n_s(x_s) + \sum_{(s,t) \in E} \rho_{st} \log \frac{\nu^n_{st}(x_s, x_t)}{\nu^n_s(x_s)\,\nu^n_t(x_t)}$.
- This last line equals $\sum_i \rho(T^i)\,\theta^n(T^i)(x)$ up to an additive constant independent of $x$.
31 Two Algorithms: Convergence Criteria Proof
- Lemma: a fixed point of the algorithm satisfies the tree consistency condition.
- Substituting $\nu = \nu^{n+1} = \nu^n$ into all the updates of the algorithm yields $\nu_s(x_s) = \kappa \max_{x'_t} \nu_{st}(x_s, x'_t)$ and $\nu_t(x_t) = \kappa \max_{x'_s} \nu_{st}(x'_s, x_t)$ for all $(x_s, x_t)$.
- This implies $\nu$ is consistent on every tree $T^i$.
- Reminder: consistency on every tree does not imply consistency on the entire graph (i.e., a unique MAP).
32 Two Algorithms: Algorithm 2: Message-passing Version
- We can create a message-passing version of the same algorithm by utilizing the following mapping: $\nu_s(x_s) := \kappa \exp(\bar\theta_s(x_s)) \prod_{v \in \Gamma(s)} [M_{vs}(x_s)]^{\rho_{vs}}$ and $\nu_{st}(x_s, x_t) := \kappa \exp\big(\tfrac{1}{\rho_{st}}\bar\theta_{st}(x_s, x_t) + \bar\theta_s(x_s) + \bar\theta_t(x_t)\big) \cdot \frac{\prod_{v \in \Gamma(s) \setminus t} [M_{vs}(x_s)]^{\rho_{vs}}}{[M_{ts}(x_s)]^{(1 - \rho_{ts})}} \cdot \frac{\prod_{v \in \Gamma(t) \setminus s} [M_{vt}(x_t)]^{\rho_{vt}}}{[M_{st}(x_t)]^{(1 - \rho_{st})}}$.
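A hedged sketch of the single message update $M_{ts}$ implied by the mapping above (my reading of the tree-reweighted max-product update; the function name, argument layout, and optional damping are illustrative, not the authors' code):

```python
import numpy as np

def trw_message_update(theta_st, theta_t, rho_st, msgs_into_t, M_st,
                       M_ts_old=None, damp=0.0):
    """New message M_ts(x_s); theta_st is indexed [x_s, x_t].

    msgs_into_t maps each neighbor v != s of t to a pair (M_vt, rho_vt).
    """
    prod = np.ones_like(theta_t)
    for M_vt, rho_vt in msgs_into_t.values():
        prod *= M_vt ** rho_vt                    # reweighted incoming messages
    weight = np.exp(theta_t) * prod / (M_st ** (1.0 - rho_st))
    # Maximize over x_t the reweighted pairwise term times the evidence at t.
    M_ts = np.max(np.exp(theta_st / rho_st) * weight[None, :], axis=1)
    M_ts /= M_ts.sum()                            # normalize to avoid under/overflow
    if damp > 0.0 and M_ts_old is not None:       # optional geometric damping
        M_ts = M_ts ** (1.0 - damp) * M_ts_old ** damp
    return M_ts
```

Note that with $\rho_{st} = 1$ on every edge this reduces to the standard max-product message, which matches observation (1) on the next slide.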
33 Two Algorithms: Analysis of Algorithms
- The equivalence between Algorithms 1 and 2 follows from the equivalence between message-passing and reparameterization (Buhm's talk).
- Key observations about Algorithm 2: 1. if $\rho_{st} = 1$ for every edge, it is equivalent to loopy BP; 2. the key difference from standard message-passing is that $M_{ts}$ depends on $M_{st}$.
- The algorithm is not always guaranteed to converge; in practice a dampening technique helps encourage convergence.
- Connection to the LP relaxation: let $\tau^* = \log M^*$ elementwise, where $M^*$ is a fixed point of Algorithm 2. Then $\tau^*$ is an optimal solution to the LP relaxation given earlier.
34 Outline (repeated; next section: Summary)
35 Wrapping Things Up
- MAP can be attacked via tree reparameterizations and LP relaxations.
- Message-passing can be viewed as an alternative way to solve the proposed LP dual problem.
- The authors claim they've successfully used these algorithms in a few applications. Results?
- When we do not converge, can we perform a further relaxation by clustering random variables, as in Kikuchi approximations?
36 Citations
- V. Kolmogorov and M. Wainwright. On the optimality of tree-reweighted max-product message-passing. Uncertainty in Artificial Intelligence, July 2005.
- M. Wainwright, T. Jaakkola, and A. Willsky. Tree consistency and bounds on the performance of the max-product algorithm and its generalizations. Statistics and Computing, vol. 14.
- M. Wainwright, T. Jaakkola, and A. Willsky. MAP estimation via agreement on (hyper)trees: Message-passing and linear programming. UC Berkeley technical report.
- V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. Microsoft technical report, June 2005.