UNDERSTANDING BELIEF PROPAGATION AND ITS GENERALIZATIONS

Jonathan Yedidia, William Freeman, Yair Weiss. 2001 MERL Tech Report.
Presented by Kristin Branson and Ian Fasel, June 11, 2003.

1. Inference
Inference problems in networks include:
Conditional probability query: what is $p(x_i \mid X_E = x_E)$?
Most probable explanation: what is $\arg\max_{x_U} p(X_U = x_U \mid X_E = x_E)$?
Inference problems also arise in statistical physics. In general, these problems are intractable.

1.1. Inference Approximations
Pearl's Belief Propagation (BP) algorithm solves inference exactly for tree networks and approximately for other networks. BP is not well understood for loopy networks. BP is closely connected to the Bethe approximation of statistical physics.

1.2. Purpose
Our goal in this paper is to gain an understanding of the BP approximation, and of how to make it more exact, by exploring equivalent approximations in statistical physics. This has led to an improved inference algorithm, Generalized BP.

Outline
The settings: pairwise Markov random fields; the Potts and Ising models.
The approximations: belief propagation; the mean-field approximation; the Bethe approximation; the connection between the Bethe and BP approximations; the Kikuchi approximation.
Generalized belief propagation.

2. The Settings

2.1. Pairwise Markov Random Fields
A pairwise MRF is an undirected network whose cliques have size two, with each hidden node $x_i$ attached to an observed node $y_i$. With hidden variables $x$ and observed variables $y$,
$$p(x \mid y) = \frac{1}{Z} \prod_{(i,j)} \psi_{ij}(x_i, x_j) \prod_i \phi_i(x_i, y_i).$$
Any MRF can be converted into this form (Weiss, 2001). We want to calculate $p(x_i \mid y)$ or $\arg\max_x p(x \mid y)$.
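
To make the setting concrete, here is a minimal sketch (not from the paper) that builds a tiny pairwise MRF and computes $p(x_0 \mid y)$ by brute-force enumeration. The chain structure, the potential values, and all names are invented for illustration; enumeration is exactly the exponential cost that the approximations below avoid.

```python
import itertools
import numpy as np

# Invented toy model: a 3-node chain of binary hidden variables.
# psi[(i, j)] is the pairwise compatibility on edge (i, j); phi[i] is the
# local evidence phi_i(x_i, y_i) with the observed y folded into a vector.
edges = [(0, 1), (1, 2)]
psi = {e: np.array([[2.0, 1.0], [1.0, 2.0]]) for e in edges}
phi = [np.array([1.0, 0.5]), np.array([0.7, 0.7]), np.array([0.2, 1.0])]

def unnormalized(x):
    """Product of all pairwise and local factors for one configuration x."""
    p = 1.0
    for (i, j) in edges:
        p *= psi[(i, j)][x[i], x[j]]
    for i, f in enumerate(phi):
        p *= f[x[i]]
    return p

configs = list(itertools.product([0, 1], repeat=3))
Z = sum(unnormalized(x) for x in configs)   # partition function
marginal = np.zeros(2)
for x in configs:
    marginal[x[0]] += unnormalized(x) / Z   # exact p(x_0 | y)
print("exact p(x_0 | y):", marginal)
```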

2.2. The Potts Model
The pairwise MRF can be brought into a form recognizable to physicists as the Potts model,
$$p(x \mid y) = \frac{1}{Z} \exp\left( \left[ \sum_{(i,j)} J_{ij}(x_i, x_j) + \sum_i h_i(x_i) \right] / T \right),$$
by setting the interaction $J_{ij}(x_i, x_j) = \log \psi_{ij}(x_i, x_j)$, the field $h_i(x_i) = \log \phi_i(x_i, y_i)$, and the temperature $T = 1$.

2.3. The Ising Model
The Ising model is a special case of the Potts model in which each variable is binary, $x_i \in \{-1, +1\}$, the interactions $J_{ij}$ are symmetric, and the distribution can be expressed as
$$p(x) = \frac{1}{Z} \exp\left( \left[ \sum_{(i,j)} J_{ij} x_i x_j + \sum_i h_i x_i \right] / T \right).$$

Use of the Ising Model
The Ising model was invented to describe phase transitions in magnets. $J$ encourages neighboring particles to have equal values; $h$ represents an external magnetic field. The magnetization of a particle is its expected value, $m_i = p(x_i = +1) - p(x_i = -1)$. [Figure: phase transitions (Veytsman and Kotelyanskii, 1997).]

3. The Approximations

3.1. Standard Belief Propagation
BP passes messages $m_{ji}(x_i)$ between neighboring nodes and computes beliefs
$$b_i(x_i) = k\, \phi_i(x_i) \prod_{j \in N(i)} m_{ji}(x_i).$$
The belief is the BP approximation of the marginal probability.

BP Message-Update Rules
To get the marginal beliefs $b_i(x_i)$, sum over the other variables:
$$b_i(x_i) = \sum_{X_a \setminus x_i} b_a(X_a).$$
So we write the messages as
$$m_{ij}(x_j) \propto \sum_{x_i} \phi_i(x_i)\, \psi_{ij}(x_i, x_j) \prod_{k \in N(i) \setminus j} m_{ki}(x_i).$$
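
These two equations are all an implementation needs. The following is a hedged sketch of sum-product BP with parallel ("flooding") updates; the data structures and the toy chain at the bottom are our assumptions, not the paper's code. On the chain, which is a tree, the BP belief matches the enumeration result above exactly, anticipating the next slide.

```python
import numpy as np

def belief_propagation(psi, phi, n_nodes, n_iters=50):
    """Sum-product BP on a pairwise MRF (sketch).

    psi: dict mapping a directed edge (i, j) to a matrix indexed [x_i, x_j];
         both directions of each undirected edge must be present.
    phi: list of local-evidence vectors phi_i(x_i), with y folded in.
    Returns the normalized single-node beliefs b_i(x_i).
    """
    neighbors = {i: set() for i in range(n_nodes)}
    for (i, j) in psi:
        neighbors[i].add(j)
    # m[(i, j)] is the message from node i to node j, initialized uniform.
    m = {(i, j): np.ones(len(phi[j])) for (i, j) in psi}
    for _ in range(n_iters):
        new_m = {}
        for (i, j) in m:
            # m_ij(x_j) ∝ sum_{x_i} phi_i psi_ij prod_{k in N(i)\j} m_ki
            prod = phi[i].copy()
            for k in neighbors[i] - {j}:
                prod = prod * m[(k, i)]
            msg = psi[(i, j)].T @ prod
            new_m[(i, j)] = msg / msg.sum()  # normalize for stability
        m = new_m
    beliefs = []
    for i in range(n_nodes):
        b = phi[i].copy()
        for k in neighbors[i]:
            b = b * m[(k, i)]
        beliefs.append(b / b.sum())
    return beliefs

# Same toy 3-node chain as in the enumeration sketch above.
edges = [(0, 1), (1, 2)]
psi, M = {}, np.array([[2.0, 1.0], [1.0, 2.0]])
for (i, j) in edges:
    psi[(i, j)], psi[(j, i)] = M, M.T
phi = [np.array([1.0, 0.5]), np.array([0.7, 0.7]), np.array([0.2, 1.0])]
print("BP b_0(x_0):", belief_propagation(psi, phi, 3)[0])
```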

BP is Exact for Trees
For example, on a tree where node 1 is connected to node 2, and node 2 to leaves 3 and 4:
$$b_1(x_1) \propto m_{21}(x_1) \propto \sum_{x_2} \psi_{12}(x_1, x_2)\, m_{32}(x_2)\, m_{42}(x_2) \propto \sum_{x_2, x_3, x_4} \psi_{12}(x_1, x_2)\, \psi_{23}(x_2, x_3)\, \psi_{24}(x_2, x_4).$$

3.2. Mean-Field Approximations
The true distribution $p(x \mid y)$ is approximated by the closest distribution $b(x)$ in a family $Q$. What is the definition of closest?
The maximum-likelihood definition is $b^*(\cdot) = \arg\min_{b(\cdot)} KL(p(\cdot \mid y) \,\|\, b(\cdot))$. For reasonable $Q$, computing $b^*(\cdot)$ is intractable.
The mean-field definition is $b(\cdot) = \arg\min_{b(\cdot)} KL(b(\cdot) \,\|\, p(\cdot \mid y))$.

3.3. Why This Measure?
For some reasonable $Q$, we can compute $b(\cdot)$. If $p \in Q$, we will exactly compute $b(x) = p(x)$. Minimizing $KL(b(\cdot) \,\|\, p(\cdot \mid y))$ yields the best lower bound on the log-likelihood $\ell(p; y)$:
$$\ell(p; y) = \sum_x b(x) \log p(y) = \sum_x b(x) \log \frac{p(x, y)\, b(x)}{b(x)\, p(x \mid y)} = \sum_x b(x) \log \frac{p(x, y)}{b(x)} + \sum_x b(x) \log \frac{b(x)}{p(x \mid y)} = \left\langle \log \frac{p(x, y)}{b(x)} \right\rangle_b + KL(b(\cdot) \,\|\, p(\cdot \mid y)) \geq \left\langle \log \frac{p(x, y)}{b(x)} \right\rangle_b$$
(Tanaka, 2001; Jaakkola, 2001).
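
The chain of equalities is easy to verify numerically. A small sketch (the four-state joint is invented) checking that the log-likelihood equals the lower bound plus the KL term, so minimizing the KL is the same as maximizing the bound:

```python
import numpy as np

rng = np.random.default_rng(0)
p_xy = rng.random(4)      # invented unnormalized joint p(x, y0), 4 states of x
p_y = p_xy.sum()          # p(y0)
p_post = p_xy / p_y       # p(x | y0)
b = rng.random(4)
b /= b.sum()              # an arbitrary approximating distribution b(x)

lower_bound = np.sum(b * np.log(p_xy / b))   # <log p(x,y)/b(x)> under b
kl = np.sum(b * np.log(b / p_post))          # KL(b || p(.|y))
print(np.log(p_y), lower_bound + kl)         # the two values agree exactly
```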

3.4. The Gibbs Free Energy
In physics, this distance is the Gibbs free energy
$$G(b(\cdot), y) = KL(b(\cdot) \,\|\, p(\cdot \mid y)) = \sum_x b(x) E(x, y) + \sum_x b(x) \log b(x) + \log Z,$$
where the energy is
$$E(x, y) = -\sum_{(i,j)} \log \psi_{ij}(x_i, x_j) - \sum_i \log \phi_i(x_i, y_i)$$
(thus $p(x \mid y) = \frac{1}{Z} \exp[-E(x, y)]$).

The Helmholtz Free Energy
Minimizing the Gibbs free energy is equivalent to minimizing the variational free energy
$$F(b(\cdot), y) = \sum_x b(x) E(x, y) + \sum_x b(x) \log b(x).$$
If $b(\cdot) = p(\cdot \mid y)$, then the variational free energy achieves the Helmholtz free energy $F(y) = -\log Z$. Otherwise, $F(b(\cdot), y)$ is an upper bound on $F(y)$. Minimizing $G(b(\cdot), y)$ thus minimizes this upper bound (Yedidia, 2001).
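
A quick numerical check of these identities on an invented two-node model: the variational free energy evaluated at the true posterior equals $-\log Z$, and any other $b$ gives a strictly larger value.

```python
import itertools
import numpy as np

# Tiny two-node model with E(x) = -log psi(x_0, x_1) - sum_i log phi_i(x_i).
psi = np.array([[2.0, 1.0], [1.0, 2.0]])
phi = [np.array([1.0, 0.5]), np.array([0.2, 1.0])]
configs = list(itertools.product([0, 1], repeat=2))

def energy(x):
    return -np.log(psi[x[0], x[1]]) - np.log(phi[0][x[0]]) - np.log(phi[1][x[1]])

E = np.array([energy(x) for x in configs])
p = np.exp(-E)
Z = p.sum()
p /= Z  # the exact posterior p(x | y)

def variational_free_energy(b):
    """F(b) = sum_x b(x) E(x) + sum_x b(x) log b(x)."""
    return float(b @ E + b @ np.log(b))

print("-log Z        :", -np.log(Z))
print("F at b = p    :", variational_free_energy(p))        # equals -log Z
uniform = np.full(len(configs), 1.0 / len(configs))
print("F at uniform b:", variational_free_energy(uniform))  # strictly larger
```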

3.5. The Naive MF Approximation
The naive MF approximation restricts $Q$ to the set of distributions that factorize as $b(x) = \prod_i b_i(x_i)$. The Gibbs free energy is then
$$G_{MF} = -\sum_{(i,j)} \sum_{x_i, x_j} b_i(x_i)\, b_j(x_j) \log \psi_{ij}(x_i, x_j) - \sum_i \sum_{x_i} b_i(x_i) \log \phi_i(x_i, y_i) + \sum_i \sum_{x_i} b_i(x_i) \log b_i(x_i) + \log Z.$$

3.6. The Naive MF Algorithm
$G_{MF}$ is minimized (subject to $\sum_{x_i} b_i(x_i) = 1$) by setting the derivative with respect to $b_i(x_i)$ to zero, yielding
$$b_i(x_i) = \alpha\, \phi_i(x_i, y_i) \exp\left( \sum_{j \in N(i)} \sum_{x_j} b_j(x_j) \log \psi_{ij}(x_i, x_j) \right),$$
where $\alpha$ is a normalization constant. Each estimate $b_i(x_i)$ is iteratively updated until convergence, and $p(x_i \mid y)$ is approximated by the steady-state $b_i(x_i)$ (Weiss, 2001).
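
A hedged sketch of this fixed-point iteration, under the same toy-model conventions as the BP sketch above (the sequential update schedule and iteration count are our choices):

```python
import numpy as np

def naive_mean_field(psi, phi, n_nodes, n_iters=100):
    """Naive MF: iterate b_i(x_i) = alpha * phi_i(x_i) *
       exp( sum_{j in N(i)} sum_{x_j} b_j(x_j) log psi_ij(x_i, x_j) ).
    psi maps a directed edge (i, j) to a matrix indexed [x_i, x_j]."""
    neighbors = {i: set() for i in range(n_nodes)}
    for (i, j) in psi:
        neighbors[i].add(j)
    b = [np.full(len(phi[i]), 1.0 / len(phi[i])) for i in range(n_nodes)]
    for _ in range(n_iters):
        for i in range(n_nodes):
            log_b = np.log(phi[i])
            for j in neighbors[i]:
                # add the sum over x_j of b_j(x_j) log psi_ij(x_i, x_j)
                log_b = log_b + np.log(psi[(i, j)]) @ b[j]
            b[i] = np.exp(log_b - log_b.max())  # subtract max for stability
            b[i] /= b[i].sum()                  # alpha is this normalizer
    return b

# Same toy chain as before; compare the MF marginal to BP / enumeration.
edges = [(0, 1), (1, 2)]
psi, M = {}, np.array([[2.0, 1.0], [1.0, 2.0]])
for (i, j) in edges:
    psi[(i, j)], psi[(j, i)] = M, M.T
phi = [np.array([1.0, 0.5]), np.array([0.7, 0.7]), np.array([0.2, 1.0])]
print("MF b_0(x_0):", naive_mean_field(psi, phi, 3)[0])
```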

4. Region-Based Inference Approximations
Naive MF restricts $Q$ to distributions expressible in terms of single-node beliefs. A better approximation is to restrict $Q$ to distributions expressible in terms of node-cluster beliefs.

4.1. The Bethe Approximation
The Bethe approximation restricts $Q$ to distributions $b(x)$ that can be expressed in terms of the one- and two-node beliefs,
$$b(x) = \prod_{(i,j)} b_{ij}(x_i, x_j) \prod_i b_i(x_i)^{1 - q_i},$$
where $q_i$ is the number of neighbors of node $i$. The Gibbs free energy is then
$$G_{Bethe} = \sum_{(i,j)} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \left( E_{ij}(x_i, x_j) + \log b_{ij}(x_i, x_j) \right) + \sum_i (1 - q_i) \sum_{x_i} b_i(x_i) \left( E_i(x_i) + \log b_i(x_i) \right),$$
with $E_{ij} = -\log(\psi_{ij}\, \phi_i\, \phi_j)$ and $E_i = -\log \phi_i$.
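
For concreteness, a sketch (the data layout is an assumption, not the paper's) that evaluates $G_{Bethe}$ for given one- and two-node beliefs. On a tree, plugging in the exact marginals recovers $-\log Z$; on loopy graphs it is only an approximation, which is precisely the approximation BP inherits.

```python
import numpy as np

def bethe_free_energy(b_pair, b_node, psi, phi, q):
    """Evaluate G_Bethe for given beliefs (sketch).

    b_pair[(i, j)]: pairwise beliefs b_ij(x_i, x_j), one entry per edge
    b_node[i]:      single-node beliefs b_i(x_i)
    q[i]:           number of neighbors of node i
    Energies: E_ij = -log(psi_ij phi_i phi_j), E_i = -log(phi_i).
    """
    G = 0.0
    for (i, j), b_ij in b_pair.items():
        E_ij = -(np.log(psi[(i, j)])
                 + np.log(phi[i])[:, None]
                 + np.log(phi[j])[None, :])
        G += np.sum(b_ij * (E_ij + np.log(b_ij)))
    for i, b_i in b_node.items():
        E_i = -np.log(phi[i])
        G += (1 - q[i]) * np.sum(b_i * (E_i + np.log(b_i)))
    return G

# Two-node tree: the exact marginals make G_Bethe equal -log Z.
psi_m = np.array([[2.0, 1.0], [1.0, 2.0]])
phi = {0: np.array([1.0, 0.5]), 1: np.array([0.2, 1.0])}
joint = psi_m * phi[0][:, None] * phi[1][None, :]
Z = joint.sum()
b_pair = {(0, 1): joint / Z}
b_node = {0: (joint / Z).sum(1), 1: (joint / Z).sum(0)}
print(bethe_free_energy(b_pair, b_node, {(0, 1): psi_m}, phi, {0: 1, 1: 1}))
print(-np.log(Z))  # the two agree on a tree
```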

Minimizing the Bethe Free Energy
Minimize by constrained optimization using Lagrange multipliers:
$$L = G_{Bethe} + \sum_i \lambda_i \left\{ \sum_{x_i} b_i(x_i) - 1 \right\} + \sum_a \sum_{i \in N(a)} \sum_{x_i} \lambda_{ai}(x_i) \left\{ \sum_{X_a \setminus x_i} b_a(X_a) - b_i(x_i) \right\},$$
where $a$ ranges over the two-node regions. This results in the belief equations:
$$\frac{\partial L}{\partial b_i(x_i)} = 0 \;\Rightarrow\; b_i(x_i) \propto \exp\left( \frac{1}{q_i - 1} \sum_{a \in N(i)} \lambda_{ai}(x_i) \right),$$
$$\frac{\partial L}{\partial b_a(X_a)} = 0 \;\Rightarrow\; b_a(X_a) \propto \exp\left( -E_a(X_a) + \sum_{i \in N(a)} \lambda_{ai}(x_i) \right).$$

Equivalence of BP to the Bethe Free Energy
Identify
$$\lambda_{ij}(x_j) = \log \prod_{k \in N(j) \setminus i} m_{kj}(x_j)$$
to obtain the BP equations:
$$b_i(x_i) \propto \phi_i(x_i) \prod_{j \in N(i)} m_{ji}(x_i), \qquad m_{ij}(x_j) \propto \sum_{x_i} \phi_i(x_i)\, \psi_{ij}(x_i, x_j) \prod_{k \in N(i) \setminus j} m_{ki}(x_i).$$

4.2. Generalizing the Bethe Approximation
The Bethe approximation restricts $Q$ to distributions $b(x)$ that can be expressed in terms of the one- and two-node beliefs: $b(x) = \prod_{(i,j)} b_{ij}(x_i, x_j) \prod_i b_i(x_i)^{1 - q_i}$.
Another way to write the Bethe approximation: let $C = \{\{i, j\}\}$ be the set of two-node clusters, and let $R = \{C_i\} \cup \{C_i \cap C_j\}$ be the clusters together with their intersections. Define
$$c_r = 1 - \sum_{s \in super(r)} c_s, \qquad super(r) = \{s \in R : r \subset s\}.$$
Thus:
If $r = \{i, j\}$, then $super(r) = \emptyset$, so $c_r = 1$.
If $r = \{i\}$, then $super(r) = \{\{i, j\} : j \in N(i)\}$, so $c_r = 1 - q_i$.
Then
$$b(x) = \prod_{r \in R} b_r(x_r)^{c_r}.$$
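
The recursion for the counting numbers translates directly into a few lines of code. This sketch (the 4-cycle region set is an invented example) works for any region set:

```python
def counting_numbers(regions):
    """c_r = 1 - sum of c_s over strict supersets s of r in R (sketch)."""
    regions = [frozenset(r) for r in regions]
    c = {}
    for r in sorted(regions, key=len, reverse=True):  # largest regions first
        c[r] = 1 - sum(c[s] for s in regions if r < s)
    return c

# Bethe regions for a 4-cycle 0-1-2-3: every edge plus every single node.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
R = [frozenset(e) for e in edges] + [frozenset({i}) for i in range(4)]
for r, cr in counting_numbers(R).items():
    print(sorted(r), cr)  # edges get c_r = 1; each node gets 1 - q_i = -1
```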

4.3. Kikuchi Approximations
The Kikuchi approximation allows $C$ to be any set of clusters of nodes such that every edge and node of the MRF is in some cluster, and sets
$$R = C \cup \{C_i \cap C_j\} \cup \{(C_i \cap C_j) \cap (C_k \cap C_l)\} \cup \cdots$$
[Figure: the original MRF, the clusters $C$, the intersections $C_i \cap C_j$, and the intersections of intersections $(C_i \cap C_j) \cap (C_k \cap C_l)$.]

4.3. Kikuchi Approximations (continued)
The Kikuchi approximation restricts $Q$ to distributions $b(x)$ that can be expressed in terms of cluster beliefs:
$$b(x) = \prod_{r \in R} b_r(x_r)^{c_r}.$$
$p(\cdot \mid y)$ is approximated by $b(\cdot) = \arg\min_{b(\cdot) \in Q} KL(b(\cdot) \,\|\, p(\cdot \mid y))$. The Kikuchi approximation can be made exact, in which case it is equivalent to the junction tree algorithm.

4.4. Other Approximations
In general, it is possible to define a set of rules (involving counting numbers, etc.) for constructing valid region-based approximations. Bethe restricts the regions to neighboring pairs; Kikuchi restricts the regions to connected clusters of nodes. There are regions that cannot be created by either method, and many other valid choices besides.

5. Generalizing BP
Given a valid region-based approximation, we can construct a generalized belief propagation (GBP) algorithm. The belief in a region is the product of:
Local information (the factors in the region),
Messages from parent regions,
Messages into descendant regions from parents that are not descendants.
The message-update rule is obtained by enforcing the marginalization constraints. We do this for the cluster variational method (Kikuchi) as an example.

5.1. Constructing Clusters
First, construct the clusters and find the intersection regions (counting numbers are assigned at this point).

5.2. Constructing the Region Graph
Next, build a hierarchy of regions and their direct sub-regions. (The example here is the region graph produced by the Kikuchi method.)

5.3. Constructing the Belief Equations
The belief equation for every region $r$ is proportional to the product of:
local information (the factors in the region),
messages from parent regions,
messages into descendant regions from parents that are not descendants.

Constructing Belief Equations
Step 3: construct belief equations for each region. In the paper's 3-by-3 grid example (nodes numbered 1 through 9), the belief for the single-node region $\{5\}$ is
$$b_5(x_5) = k\, [\phi_5]\, [m_{2 \to 5}\, m_{4 \to 5}\, m_{6 \to 5}\, m_{8 \to 5}].$$

For the two-node region $\{4, 5\}$:
$$b_{45} = k\, [\phi_4 \phi_5 \psi_{45}]\, [m_{12 \to 45}\, m_{78 \to 45}\, m_{2 \to 5}\, m_{6 \to 5}\, m_{8 \to 5}].$$

For the four-node region $\{1, 2, 4, 5\}$:
$$b_{1245} = k\, [\phi_1 \phi_2 \phi_4 \phi_5\, \psi_{12} \psi_{14} \psi_{25} \psi_{45}]\, [m_{36 \to 25}\, m_{78 \to 45}\, m_{6 \to 5}\, m_{8 \to 5}].$$

5.4. Message-Update Rule
Use the marginalization constraints to define the message-update rules:
$$b_5(x_5) = \sum_{x_4} b_{45}(x_4, x_5).$$
Combining the previous belief equations,
$$k_1\, [\phi_5]\, [m_{2 \to 5}\, m_{4 \to 5}\, m_{6 \to 5}\, m_{8 \to 5}] = k_2 \sum_{x_4} [\phi_4 \phi_5 \psi_{45}]\, [m_{12 \to 45}\, m_{78 \to 45}\, m_{2 \to 5}\, m_{6 \to 5}\, m_{8 \to 5}],$$
gives us
$$m_{4 \to 5}(x_5) = k \sum_{x_4} \phi_4(x_4)\, \psi_{45}(x_4, x_5)\, m_{12 \to 45}(x_4, x_5)\, m_{78 \to 45}(x_4, x_5).$$

5.5. Running the GBP Algorithm
The GBP algorithm runs in the same way as BP:
1. Initialize the messages to unbiased states.
2. Iterate through the message-update rules until (hopefully) convergence.
Occasionally it is helpful to move only part-way to the new values of the messages at each iteration, as sketched below.
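
"Moving only part-way" is usually implemented as damping. A minimal sketch of one common choice, geometric damping of the messages (the step-size name alpha is ours, not the paper's):

```python
import numpy as np

def damped_update(m_old, m_new, alpha=0.5):
    """Blend the old message with the newly computed one.

    alpha = 1 recovers the plain update; smaller alpha is more cautious.
    Damping geometrically (in the log domain) keeps messages positive.
    """
    m = m_old ** (1.0 - alpha) * m_new ** alpha
    return m / m.sum()

print(damped_update(np.array([0.9, 0.1]), np.array([0.2, 0.8])))
```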

5.6. Generalized Belief Propagation
Theorems:
Stationary points of the Bethe approximation are fixed points of BP.
Stationary points of the Kikuchi approximation are fixed points of GBP.
When the region graph is a tree, the message-passing algorithm is exact.
Empirically, GBP is more likely to converge than BP, and it can be nearly as fast yet much more accurate (depending on the details of the region graph).

6. Conclusions
Standard BP is equivalent to minimizing the Bethe free energy.
The Bethe and junction tree approximations are both sub-classes of the Kikuchi approximation.
GBP is equivalent to minimizing the Kikuchi free energy.
GBP is exact when the region graph is a tree.

References
Jaakkola, T. (2001). Tutorial on variational approximation methods. In Opper, M. and Saad, D., editors, Advanced Mean Field Methods: Theory and Practice, chapter 10. MIT Press.
Kappen, H. and Wiegerinck, W. (2001). Mean field theory for graphical models. In Opper, M. and Saad, D., editors, Advanced Mean Field Methods: Theory and Practice, chapter 4. MIT Press.
Opper, M. and Winther, O. (2001). From naive mean field theory to the TAP equations. In Opper, M. and Saad, D., editors, Advanced Mean Field Methods: Theory and Practice, chapter 2. MIT Press.
Tanaka, T. (2001). Information geometry of mean-field approximation. In Opper, M. and Saad, D., editors, Advanced Mean Field Methods: Theory and Practice, chapter 17. MIT Press.
Veytsman, B. and Kotelyanskii, M. (1997). Ising model and its applications. Website, www/matsc597c-1997/phases/lecture3/.

Weiss, Y. (2001). Comparing the mean field method and belief propagation for approximate inference in MRFs. In Opper, M. and Saad, D., editors, Advanced Mean Field Methods: Theory and Practice, chapter 15. MIT Press.
Yedidia, J. (2001). An idiosyncratic journey beyond mean field theory. In Opper, M. and Saad, D., editors, Advanced Mean Field Methods: Theory and Practice, chapter 3. MIT Press.
