Exact Inference: Clique Trees. Sargur Srihari


1 Exact Inference: Clique Trees. Sargur Srihari, srihari@cedar.buffalo.edu

2 Topics
1. Overview
2. Variable Elimination and Clique Trees
3. Message Passing: Sum-Product
   - VE in a Clique Tree
   - Clique-Tree Calibration
4. Message Passing: Belief Update
5. Constructing a Clique Tree

3 Overview
Two methods of inference using a set of factors Φ over variables χ:
1. The variable elimination (VE) algorithm uses the factor representation and local operations instead of generating the entire distribution (see next slide).
2. Clique trees: an alternative implementation of the same insight, using a more global data structure for scheduling the operations.

4 Sum-Product VE
P(C,D,I,G,S,L,J,H) = P(C) P(D|C) P(I) P(G|I,D) P(S|I) P(L|G) P(J|L,S) P(H|G,J)
                   = φ_C(C) φ_D(D,C) φ_I(I) φ_G(G,I,D) φ_S(S,I) φ_L(L,G) φ_J(J,L,S) φ_H(H,G,J)
Goal: P(J) = Σ_L Σ_S Σ_G Σ_H Σ_I Σ_D Σ_C P(C,D,I,G,S,L,J,H), with elimination ordering C,D,I,H,G,S,L. Each step involves a factor product and a factor marginalization:
1. Eliminating C: ψ_1(C,D) = φ_C(C) φ_D(D,C);  τ_1(D) = Σ_C ψ_1(C,D)
2. Eliminating D: ψ_2(G,I,D) = φ_G(G,I,D) τ_1(D);  τ_2(G,I) = Σ_D ψ_2(G,I,D). Note we already eliminated one factor with D, but introduced τ_1 involving D.
3. Eliminating I: ψ_3(G,I,S) = φ_I(I) φ_S(S,I) τ_2(G,I);  τ_3(G,S) = Σ_I ψ_3(G,I,S)
4. Eliminating H: ψ_4(G,J,H) = φ_H(H,G,J);  τ_4(G,J) = Σ_H ψ_4(G,J,H). Note τ_4(G,J) = 1.
5. Eliminating G: ψ_5(G,J,L,S) = τ_4(G,J) τ_3(G,S) φ_L(L,G);  τ_5(J,L,S) = Σ_G ψ_5(G,J,L,S)
6. Eliminating S: ψ_6(J,L,S) = τ_5(J,L,S) φ_J(J,L,S);  τ_6(J,L) = Σ_S ψ_6(J,L,S)
7. Eliminating L: ψ_7(J,L) = τ_6(J,L);  τ_7(J) = Σ_L ψ_7(J,L)
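The two factor operations above are easy to sketch in code. Below is a minimal NumPy sketch of the first elimination step, representing a factor as a pair (variable list, array with one axis per variable); the CPD values are hypothetical, since the actual Student-network CPDs are not given on the slide.

```python
import numpy as np

def factor_product(f, g):
    """Factor product: union the scopes and multiply by broadcasting."""
    fv, fa = f
    gv, ga = g
    out = fv + [v for v in gv if v not in fv]
    fa2 = fa.reshape(fa.shape + (1,) * (len(out) - len(fv)))  # pad f's axes
    perm = sorted(range(len(gv)), key=lambda k: out.index(gv[k]))
    ga2 = np.transpose(ga, perm).reshape(
        [ga.shape[gv.index(v)] if v in gv else 1 for v in out])
    return out, fa2 * ga2

def sum_out(f, var):
    """Factor marginalization: sum out one variable."""
    fv, fa = f
    return [v for v in fv if v != var], fa.sum(axis=fv.index(var))

# Hypothetical binary CPDs: phi_C(C) = P(C), phi_D(D,C) = P(D|C)
phi_C = (['C'], np.array([0.5, 0.5]))
phi_D = (['D', 'C'], np.array([[0.6, 0.3],
                               [0.4, 0.7]]))
psi_1 = factor_product(phi_C, phi_D)   # psi_1(C,D) = phi_C(C) * phi_D(D,C)
tau_1 = sum_out(psi_1, 'C')            # tau_1(D) = sum_C psi_1(C,D)
print(tau_1)                           # scope ['D'], table [0.45, 0.55]
```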

5 Unnormalized Measure with Factors
1. We deal here with the unnormalized measure P̃_Φ(χ) = Π_{φ_i ∈ Φ} φ_i(D_i).
2. For a BN:
   - without evidence, the factors are CPDs and P̃_Φ(χ) is a normalized distribution;
   - with evidence E = e, the factors are CPDs restricted to e, and P̃_Φ(χ) = P_B(χ, e).
3. For a Gibbs distribution, the factors are potentials and P̃_Φ(χ) is the unnormalized Gibbs measure.
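As a tiny illustration of the evidence case, here is a sketch with a hypothetical factor: restricting the factor to the observed value gives an unnormalized measure, and normalizing it recovers the conditional.

```python
import numpy as np

# Hypothetical factor over (X, E); observe evidence E = e.
phi = np.array([[4.0, 1.0],
                [2.0, 3.0]])
e = 1
p_tilde = phi[:, e]               # factor restricted to e: unnormalized measure
p_cond = p_tilde / p_tilde.sum()  # normalizing recovers P(X | E = e)
print(p_cond)                     # [0.25, 0.75]
```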

6 Marginalizing with the Unnormalized Measure
The unnormalized conditional measure is equivalent to the normalized conditional probability: P̃_Φ(X | Y) = P_Φ(X | Y), since
P̃_Φ(X | Y) = P̃_Φ(X,Y) / P̃_Φ(Y) = Π_{φ_i ∈ Φ} φ_i(D_i) / Σ_X Π_{φ_i ∈ Φ} φ_i(D_i)
and
P_Φ(X | Y) = P_Φ(X,Y) / P_Φ(Y) = (1/Z) Π_{φ_i ∈ Φ} φ_i(D_i) / [(1/Z) Σ_X Π_{φ_i ∈ Φ} φ_i(D_i)],
so the partition function Z cancels.

7 Factor Product
Let X, Y, and Z be three disjoint sets of variables, and let Φ_1(X,Y) and Φ_2(Y,Z) be two factors. The factor product is the mapping Val(X,Y,Z) → R defined by ψ(X,Y,Z) = Φ_1(X,Y) · Φ_2(Y,Z).
An example: Φ_1 has 3 × 2 = 6 entries and Φ_2 has 2 × 2 = 4 entries, while ψ = Φ_1 × Φ_2 has 3 × 2 × 2 = 12 entries.
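The entry counts follow directly from broadcasting the two tables over the union of their scopes; a one-line NumPy check with made-up entries:

```python
import numpy as np

phi1 = np.arange(1, 7, dtype=float).reshape(3, 2)  # Phi_1(X,Y): 6 entries
phi2 = np.arange(1, 5, dtype=float).reshape(2, 2)  # Phi_2(Y,Z): 4 entries
psi = np.einsum('xy,yz->xyz', phi1, phi2)          # psi(X,Y,Z) = Phi_1 * Phi_2
print(psi.shape, psi.size)                         # (3, 2, 2) 12
```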

8 VE and Factor Creation
In variable elimination, each step creates a factor ψ_i by multiplying existing factors. A variable is then eliminated (summed out) to create a factor τ_i, which is in turn used to create another factor.
P(C,D,I,G,S,L,J,H) = P(C) P(D|C) P(I) P(G|I,D) P(S|I) P(L|G) P(J|L,S) P(H|G,J)
                   = φ_C(C) φ_D(D,C) φ_I(I) φ_G(G,I,D) φ_S(S,I) φ_L(L,G) φ_J(J,L,S) φ_H(H,G,J)
E.g., ψ_1(C,D) = φ_C(C) φ_D(D,C);  τ_1(D) = Σ_C ψ_1(C,D)

9 VE: Alternative View
Alternative view: take ψ_i to be a data structure that takes the messages τ_j generated by other factors ψ_j, and generates a message τ_i used by another factor ψ_l:
ψ_1(C,D) = φ_C(C) φ_D(D,C);  τ_1(D) = Σ_C ψ_1(C,D)   → message τ_1(D)
ψ_2(G,I,D) = φ_G(G,I,D) τ_1(D);  τ_2(G,I) = Σ_D ψ_2(G,I,D)   → message τ_2(G,I)
ψ_3(G,I,S) = φ_I(I) φ_S(S,I) τ_2(G,I);  τ_3(G,S) = Σ_I ψ_3(G,I,S)   → message τ_3(G,S)
ψ_4(G,J,H) = φ_H(H,G,J);  τ_4(G,J) = Σ_H ψ_4(G,J,H)   → message τ_4(G,J)
ψ_5(G,J,L,S) = τ_4(G,J) τ_3(G,S) φ_L(L,G);  τ_5(J,L,S) = Σ_G ψ_5(G,J,L,S)

10 Example of a Cluster Graph
A VE execution defines a cluster graph (a flow chart):
- a cluster for each factor ψ_i;
- an edge between clusters C_i and C_j if the message τ_i, produced by eliminating a variable in ψ_i, is used in the computation of τ_j.
E.g., ψ_1(C,D) = φ_C(C) φ_D(D,C) with τ_1(D) = Σ_C ψ_1(C,D), and ψ_2(G,I,D) = φ_G(G,I,D) τ_1(D) with τ_2(G,I) = Σ_D ψ_2(G,I,D). There is an edge between C_1 and C_2 since the message τ_1(D), produced by eliminating C, is used to compute τ_2(G,I). Arrows indicate the flow of messages: τ_1(D), generated from ψ_1(C,D), participates in the computation of ψ_2.

11 Cluster Graph Definition
A cluster graph U for factors Φ over χ is an undirected graph:
1. Each node i is associated with a subset (cluster) C_i ⊆ χ. The cluster graph is family-preserving: each factor φ must be associated with a cluster C_i, denoted α(φ), such that Scope[φ] ⊆ C_i.
2. Each edge between a pair of clusters C_i and C_j is associated with a sepset S_{i,j} ⊆ C_i ∩ C_j. E.g., the sepset {D} between the clusters {C,D} and {D,I,G}.

12 Cluster Graph is a Directed Tree
In a tree there are no cycles. Directions for the tree are specified by the messages: since an intermediate factor τ_i is used only once, a node never has more than one outgoing link. Such a tree is called a clique tree (or junction tree, or join tree). The root is drawn at the top, the leaves at the bottom.

13 Definition of a Tree
Tree: a graph with only one path between any pair of nodes; such graphs have no loops. In directed graphs, a tree has a single node with no parents, called the root. Converting a directed tree to an undirected one adds no moralization links, since every node has only one parent.
Polytree: a directed graph whose nodes may have more than one parent, but with only one path between any pair of nodes (ignoring arrow direction). Here moralization will add links.
(Figures: an undirected tree, a directed tree, and a directed polytree.)

14 Running Intersection Property
1. Definition: if X ∈ C_i and X ∈ C_j, then X is in every clique on the path between C_i and C_j. (In a clique, every pair of nodes is connected; in a maximal clique, no more nodes can be added.) E.g., in the cluster graph below, G is present in C_2 and C_4, and also in every clique on the path between them.
2. A VE-generated cluster graph satisfies the running intersection property.

15 Clique Tree
1. BN
2. Induced graph
3. Some cliques: {C,D}, {G,I,D}, {G,S,I}, {G,J,S,L}, {H,G,J}
4. A clique tree that satisfies running intersection

16 Clique Tree Definition
A tree T is a clique tree for graph H if:
- Each node in T corresponds to a clique in H, and each maximal clique in H is a node in T.
- Each sepset S_{i,j} separates W_{<(i,j)} and W_{<(j,i)} in H.
E.g., the edge sepset S_{2,3} = {G,I} separates W_{<(2,3)} = {G,I,D} and W_{<(3,2)} = {G,S,I}.

17 Message Passing: Sum-Product
We proceed in the opposite direction from the VE algorithm: starting from a clique tree, how do we perform VE? The clique tree is a very versatile data structure.

18 Variable Elimination in a Clique Tree
The clique tree can be used as guidance for VE: factors are computed in the cliques, and messages are sent along the edges.

19 Variable Elimination in a Clique Tree
A clique tree for the Student network. This tree satisfies:
- the running intersection property: if X ∈ C_i and X ∈ C_j, then X is in every clique on the path between them;
- the family preservation property: each factor is associated with a cluster.

20 Example of VE in a Clique Tree
A clique tree for the Student network; non-maximal cliques C_6 and C_7 are absent. Assign the initial factors (CPDs) to cliques via α. First step: generate the initial set of potentials by multiplying out the factors, e.g., ψ_5(J,L,G,S) = φ_L(L,G) · φ_J(J,L,S). The root is selected to contain the variable J, since we are interested in determining P(J): e.g., C_5.

21 Message Propagation in a Clique Tree
Root = C_5, to compute P(J).
In C_1: eliminate C by computing Σ_C ψ_1(C,D). The resulting factor has scope {D}; we send it as a message δ_{1→2}(D) to C_2.
In C_2: define β_2(G,I,D) = δ_{1→2}(D) · ψ_2(G,I,D). We then eliminate D to get a factor over {G,I}: the resulting factor δ_{2→3}(G,I) is sent to C_3.
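In NumPy these two steps are just a sum and a broadcast multiply; a minimal sketch with hypothetical tables (binary C, D, I and a ternary G):

```python
import numpy as np

rng = np.random.default_rng(0)
psi1 = rng.random((2, 2))     # psi_1(C,D), axes ordered (C, D)
psi2 = rng.random((3, 2, 2))  # psi_2(G,I,D), axes ordered (G, I, D)

delta_12 = psi1.sum(axis=0)   # delta_{1->2}(D) = sum_C psi_1(C,D)
beta2 = psi2 * delta_12       # beta_2(G,I,D): broadcast over trailing D axis
delta_23 = beta2.sum(axis=2)  # delta_{2->3}(G,I) = sum_D beta_2(G,I,D)
print(delta_23.shape)         # (3, 2)
```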

22 Message Propagation in a Clique Tree
With root C_5 we compute P(J); with root C_3 we compute P(G).

23 VE as Clique Tree Message Passing
1. Let T be a clique tree with cliques C_1, ..., C_k.
2. Begin by multiplying the factors assigned to each clique, giving the initial potentials ψ_j(C_j) = Π_{φ: α(φ)=j} φ.
3. Pass messages between neighboring cliques, sending toward the root:
   δ_{i→j} = Σ_{C_i − S_{i,j}} ψ_i · Π_{k ∈ (Nb_i − {j})} δ_{k→i}
4. Message passing culminates at the root node. The result is a factor called the beliefs, denoted β_r(C_r), which is equivalent to P̃_Φ(C_r) = Σ_{χ − C_r} Π_φ φ.

24 Algorithm: Upward Pass of VE in a Clique Tree
Procedure CTree-SP-Upward (
  Φ,    // Set of factors
  T,    // Clique tree over Φ
  α,    // Initial assignment of factors to cliques
  C_r   // Some selected root clique
)
1  Initialize-Cliques
2  while C_r is not ready
3    Let C_i be a ready clique
4    δ_{i→pr(i)}(S_{i,pr(i)}) ← SP-Message(i, pr(i))
5  β_r ← ψ_r · Π_{k ∈ Nb(C_r)} δ_{k→r}
6  return β_r

Procedure Initialize-Cliques ( )
1  for each clique C_i
2    ψ_i(C_i) ← Π_{φ_j : α(φ_j) = i} φ_j

Procedure SP-Message (
  i,    // sending clique
  j     // receiving clique
)
1  ψ(C_i) ← ψ_i · Π_{k ∈ (Nb_i − {j})} δ_{k→i}
2  τ(S_{i,j}) ← Σ_{C_i − S_{i,j}} ψ(C_i)
3  return τ(S_{i,j})
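A runnable sketch of this upward pass, with a factor stored as a (scope tuple, {assignment: value}) pair over binary variables; the clique scopes and potential values in the demo are hypothetical.

```python
from itertools import product

def f_product(f, g):
    """Factor product: union the scopes, multiply matching entries."""
    sf, tf = f
    sg, tg = g
    scope = sf + tuple(v for v in sg if v not in sf)
    table = {}
    for asg in product((0, 1), repeat=len(scope)):
        val = dict(zip(scope, asg))
        table[asg] = (tf[tuple(val[v] for v in sf)] *
                      tg[tuple(val[v] for v in sg)])
    return scope, table

def f_marginalize(f, keep):
    """Sum out every variable not in `keep` (factor marginalization)."""
    sf, tf = f
    scope = tuple(v for v in sf if v in keep)
    table = {}
    for asg, p in tf.items():
        key = tuple(a for v, a in zip(sf, asg) if v in keep)
        table[key] = table.get(key, 0.0) + p
    return scope, table

def sp_message(i, j, tree, psi, sepset, memo=None):
    """SP-Message(i, j): multiply psi_i with the messages from all
    neighbors except j, then marginalize onto the sepset S_{i,j}."""
    memo = {} if memo is None else memo
    if (i, j) not in memo:
        f = psi[i]
        for k in tree[i]:
            if k != j:
                f = f_product(f, sp_message(k, i, tree, psi, sepset, memo))
        memo[(i, j)] = f_marginalize(f, sepset[frozenset((i, j))])
    return memo[(i, j)]

# Two-clique demo: C1 = {C,D}, C2 = {D,G}, sepset {D}; root is C2.
tree = {1: [2], 2: [1]}
psi = {
    1: (('C', 'D'), {(0, 0): 0.3, (0, 1): 0.7, (1, 0): 0.6, (1, 1): 0.4}),
    2: (('D', 'G'), {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.2, (1, 1): 0.8}),
}
sepset = {frozenset((1, 2)): ('D',)}
beta_2 = f_product(psi[2], sp_message(1, 2, tree, psi, sepset))
print(beta_2)
```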

25 Clique Tree Calibration
We have seen how to use the same clique tree to compute the probability of any single variable. We now wish to compute the probabilities of a large number of variables: consider the task of computing the posterior distribution over every random variable in the network, as with HMMs with several latent variables.

26 Ready Cliques
C_i is ready to transmit to a neighbor C_j when C_i has received messages from all of its neighbors except C_j. The sum-product belief propagation algorithm uses yet another layer of dynamic programming and is defined asynchronously.

27 Sum-Product Belief Propagation
Algorithm: calibration using sum-product message passing in a clique tree.
Procedure CTree-SP-Calibrate (
  Φ,    // Set of factors
  T     // Clique tree over Φ
)
1  Initialize-Cliques
2  while there exist i, j such that i is ready to transmit to j
3    δ_{i→j}(S_{i,j}) ← SP-Message(i, j)
4  for each clique i
5    β_i ← ψ_i · Π_{k ∈ Nb_i} δ_{k→i}
6  return {β_i}
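Continuing the dict-factor sketch above (f_product, f_marginalize, sp_message, and the demo tree), calibration amounts to sending a message across every edge in both directions and then forming each clique's beliefs:

```python
def calibrate(tree, psi, sepset):
    """Compute beliefs for every clique: its initial potential times the
    incoming message from each neighbor (messages memoized across cliques,
    so both directions of every edge are computed exactly once)."""
    memo = {}
    beliefs = {}
    for i in tree:
        b = psi[i]
        for k in tree[i]:
            b = f_product(b, sp_message(k, i, tree, psi, sepset, memo))
        beliefs[i] = b
    return beliefs

beliefs = calibrate(tree, psi, sepset)  # reusing the demo tree above
```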

28 Result at the End of the Algorithm
The algorithm computes the beliefs of all cliques by multiplying each initial potential with each of its incoming messages. For each clique i, β_i is computed as
β_i(C_i) = Σ_{χ − C_i} P̃_Φ(χ),
which is the unnormalized marginal distribution of the variables in C_i.
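Continuing the sketch, normalizing a calibrated belief therefore yields the posterior marginal over that clique's variables:

```python
def clique_marginal(beliefs, i):
    """Normalize the (unnormalized) belief of clique i to a distribution."""
    scope, table = beliefs[i]
    z = sum(table.values())
    return scope, {asg: v / z for asg, v in table.items()}

print(clique_marginal(beliefs, 2))  # posterior over C2's variables {D, G}
```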

29 Calibration Definition
If X appears in two cliques, they must agree on its marginal. Two adjacent cliques C_i and C_j are said to be calibrated if
Σ_{C_i − S_{i,j}} β_i(C_i) = Σ_{C_j − S_{i,j}} β_j(C_j).
A clique tree T is calibrated if all adjacent pairs of cliques are calibrated.
Terminology:
- Clique beliefs: β_i(C_i)
- Sepset beliefs: μ_{i,j}(S_{i,j}) = Σ_{C_i − S_{i,j}} β_i(C_i) = Σ_{C_j − S_{i,j}} β_j(C_j)
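A quick check of this definition against the sketch above: both cliques' beliefs, marginalized onto their sepset, should give the same sepset belief μ_{i,j}:

```python
def is_calibrated(beliefs, i, j, sepset, tol=1e-9):
    """True if beliefs i and j agree on their sepset marginal mu_{i,j}."""
    s = sepset[frozenset((i, j))]
    mu_i = f_marginalize(beliefs[i], s)[1]
    mu_j = f_marginalize(beliefs[j], s)[1]
    return all(abs(mu_i[a] - mu_j[a]) < tol for a in mu_i)

print(is_calibrated(beliefs, 1, 2, sepset))  # True after calibration
```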

30 Calibrated Tree as a Distribution
A calibrated clique tree is more than a data structure that stores the results of probabilistic inference: it can be viewed as an alternative representation of P_Φ. At convergence of the clique-tree calibration algorithm,
P̃_Φ(χ) = Π_{i ∈ V_T} β_i(C_i) / Π_{(i–j) ∈ E_T} μ_{i,j}(S_{i,j})

31 Misconception Markov Network
A network over A, B, C, D arranged in a square (A–B–C–D–A), with factors given as potentials.
Gibbs distribution:
P(a,b,c,d) = (1/Z) φ_1(a,b) φ_2(b,c) φ_3(c,d) φ_4(d,a),
where Z = Σ_{a,b,c,d} φ_1(a,b) φ_2(b,c) φ_3(c,d) φ_4(d,a) = 7,201,840.
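For reference, Z can be checked by brute-force enumeration. The sketch below assumes the standard Misconception potential tables from Koller & Friedman, which are not reproduced on the slide:

```python
from itertools import product

# Assumed potential tables (Koller & Friedman's Misconception example)
phi1 = {(0, 0): 30, (0, 1): 5, (1, 0): 1, (1, 1): 10}    # phi_1(A,B)
phi2 = {(0, 0): 100, (0, 1): 1, (1, 0): 1, (1, 1): 100}  # phi_2(B,C)
phi3 = {(0, 0): 1, (0, 1): 100, (1, 0): 100, (1, 1): 1}  # phi_3(C,D)
phi4 = {(0, 0): 100, (0, 1): 1, (1, 0): 1, (1, 1): 100}  # phi_4(D,A)

Z = sum(phi1[a, b] * phi2[b, c] * phi3[c, d] * phi4[d, a]
        for a, b, c, d in product((0, 1), repeat=4))
print(Z)  # 7201840, matching the slide
```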

32 Beliefs for the Misconception Example
One clique tree consists of the cliques {A,B,D} and {B,C,D} with sepset {B,D}:
  {A,B,D} — {B,D} — {B,C,D}
The tree is obtained either (i) from VE or (ii) from triangulation (constructing a chordal graph).
(The tables of final clique beliefs β_1(A,B,D), β_2(B,C,D) and sepset beliefs μ_{1,2}(B,D) appear in the original figure.)
The unnormalized measures from the Gibbs distribution and from the clique tree are the same, e.g.
P̃_Φ(a^1, b^0, c^1, d^0) = β_1(a^1, b^0, d^0) β_2(b^0, c^1, d^0) / μ_{1,2}(b^0, d^0) = 100

33 Message Passing: Belief Update
An alternative message passing scheme. It involves operations on the reparameterized distribution in terms of clique beliefs {β_i(C_i)}, i ∈ V_T, and sepset beliefs {μ_{i,j}(S_{i,j})}, (i–j) ∈ E_T.

34 Message Passing with Division
Multiply all the messages in, and then divide the resulting factor by δ_{j→i}:
δ_{i→j} = (Σ_{C_i − S_{i,j}} β_i) / δ_{j→i}

35 Factor Division
The division operation divides two factors entrywise over matching assignments, with the convention 0/0 = 0. (An example of factor division appears in the original figure.)
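A minimal NumPy sketch of factor division under that convention, with hypothetical values; dividing β(X,Y) by a message over Y broadcasts along the X axis:

```python
import numpy as np

def factor_divide(beta, msg):
    """Entrywise factor division with the convention 0/0 = 0."""
    out = np.zeros_like(beta)
    np.divide(beta, msg, out=out, where=(msg != 0))
    return out

beta = np.array([[2.0, 0.0],
                 [6.0, 0.0]])    # beta(X,Y); zeros line up with msg's zeros
msg = np.array([2.0, 0.0])       # delta(Y)
print(factor_divide(beta, msg))  # [[1. 0.] [3. 0.]]
```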

36 Constructing a Clique Tree
Two approaches to constructing a clique tree from a graph:
1. From variable elimination
2. From chordal graphs

37 Clique Tree from VE
An execution of variable elimination can be associated with a cluster graph. This cluster graph satisfies the running intersection property and is hence a clique tree. For the Unambitious Student network, variable elimination with the ordering J,L,S,H,C,D,I,G results in the clique tree shown.

38 Clique Tree from Chordal Graphs
There exists a clique tree for Φ whose cliques are precisely the maximal cliques in the induced graph I_{Φ,≺}. Triangulation: construct a chordal graph subsuming the existing graph.
(Figures: 1. the undirected factor graph; 2. a triangulation; 3. the cluster graph, with edge weights.)

39 Algorithm: Clique Tree from a Chordal Graph
1. Given a set of factors, construct the undirected graph H_Φ.
2. Triangulate H_Φ to construct a chordal graph H*.
3. Find the cliques in H*, and make each one a node in a cluster graph.
4. Run the maximum spanning tree algorithm on the cluster graph (with sepset-size edge weights) to construct a tree.
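The whole pipeline fits in a few lines of networkx (version ≥ 2.4 for complete_to_chordal_graph); a sketch for the Student factor scopes. The triangulation heuristic and clique ordering are whatever the library chooses, so the result is one valid clique tree, not the unique one:

```python
import networkx as nx
from itertools import combinations

# 1. Undirected graph H_Phi: connect every pair of variables sharing a factor
scopes = [("C", "D"), ("G", "I", "D"), ("S", "I"), ("L", "G"),
          ("J", "L", "S"), ("H", "G", "J")]
H = nx.Graph()
for scope in scopes:
    H.add_edges_from(combinations(scope, 2))

# 2. Triangulate to a chordal graph H*
H_star, _ = nx.complete_to_chordal_graph(H)

# 3. One cluster-graph node per maximal clique; edge weights = sepset sizes
cluster = nx.Graph()
cliques = [frozenset(c) for c in nx.find_cliques(H_star)]
cluster.add_nodes_from(cliques)
for ci, cj in combinations(cliques, 2):
    if ci & cj:
        cluster.add_edge(ci, cj, weight=len(ci & cj))

# 4. A maximum spanning tree of the cluster graph is a clique tree
T = nx.maximum_spanning_tree(cluster)
print(sorted(tuple(sorted(c)) for c in T.nodes))
```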
