Markov properties for undirected graphs

Graphical Models, Lecture 2, Michaelmas Term 2009 October 15, 2009

Formal definition. Random variables X and Y are conditionally independent given the random variable Z if L(X | Y, Z) = L(X | Z). We then write X ⊥⊥ Y | Z (or X ⊥⊥_P Y | Z). Intuitively: knowing Z renders Y irrelevant for predicting X.

Factorisation of densities:

X ⊥⊥ Y | Z  ⟺  f(x, y, z) f(z) = f(x, z) f(y, z)  ⟺  ∃ a, b : f(x, y, z) = a(x, z) b(y, z).
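The factorisation criterion can be checked numerically on a toy discrete joint. A minimal sketch (all names are illustrative; variables are binary, and the positive example is built so that X ⊥⊥ Y | Z holds by construction):

```python
import numpy as np

rng = np.random.default_rng(0)

# By construction f(x, y, z) = a(x, z) b(y, z), so X ⊥⊥ Y | Z holds.
# Axes of every table are ordered (x, y, z).
a = rng.random((2, 1, 2))              # a(x, z), constant in y
b = rng.random((1, 2, 2))              # b(y, z), constant in x
f = a * b
f /= f.sum()                           # normalise to a probability table

def ci_given_z(p):
    """Check f(x,y,z) f(z) = f(x,z) f(y,z) for a table with axes (x, y, z)."""
    p_z  = p.sum(axis=(0, 1), keepdims=True)   # f(z)
    p_xz = p.sum(axis=1, keepdims=True)        # f(x, z)
    p_yz = p.sum(axis=0, keepdims=True)        # f(y, z)
    return np.allclose(p * p_z, p_xz * p_yz)

ci_holds = ci_given_z(f)               # True: f was built to factorise

# A generic random joint has no reason to satisfy the criterion.
g = rng.random((2, 2, 2))
g /= g.sum()
ci_fails = not ci_given_z(g)
```

The `keepdims=True` marginalisations let numpy broadcasting compare the two sides pointwise over all (x, y, z).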

Fundamental properties. For random variables X, Y, Z, and W it holds that

(C1) if X ⊥⊥ Y | Z then Y ⊥⊥ X | Z;
(C2) if X ⊥⊥ Y | Z and U = g(Y), then X ⊥⊥ U | Z;
(C3) if X ⊥⊥ Y | Z and U = g(Y), then X ⊥⊥ Y | (Z, U);
(C4) if X ⊥⊥ Y | Z and X ⊥⊥ W | (Y, Z), then X ⊥⊥ (Y, W) | Z.

If the density w.r.t. a product measure satisfies f(x, y, z, w) > 0, then also

(C5) if X ⊥⊥ Y | (Z, W) and X ⊥⊥ Z | (Y, W), then X ⊥⊥ (Y, Z) | W.

Graphoid axioms. A ternary relation ⊥σ is a graphoid if for all disjoint subsets A, B, C, and D of V:

(S1) if A ⊥σ B | C then B ⊥σ A | C;
(S2) if A ⊥σ B | C and D ⊆ B, then A ⊥σ D | C;
(S3) if A ⊥σ B | C and D ⊆ B, then A ⊥σ B | (C ∪ D);
(S4) if A ⊥σ B | C and A ⊥σ D | (B ∪ C), then A ⊥σ (B ∪ D) | C;
(S5) if A ⊥σ B | (C ∪ D) and A ⊥σ C | (B ∪ D), then A ⊥σ (B ∪ C) | D.

It is a semigraphoid if only (S1)–(S4) hold.

Separation in undirected graphs. Let G = (V, E) be a finite and simple undirected graph (no self-loops, no multiple edges). For subsets A, B, S of V, let A ⊥G B | S denote that S separates A from B in G, i.e. that all paths from A to B intersect S.

Fact: the relation ⊥G on subsets of V is a graphoid. This fact is the reason for choosing the name graphoid for such separation relations.
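The relation ⊥G is easy to compute: delete the vertices of S and ask whether A can still reach B. A small sketch (function and variable names are mine, not from the slides):

```python
from collections import deque

def separates(S, A, B, edges):
    """True if S separates A from B in the undirected graph given by `edges`,
    i.e. every path from A to B intersects S. Implemented as a BFS from A
    in the graph with the vertices of S removed."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    S, B = set(S), set(B)
    seen = set(A) - S
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        if u in B:
            return False              # found an S-avoiding path into B
        for w in adj.get(u, set()) - S - seen:
            seen.add(w)
            queue.append(w)
    return True
```

On the 4-cycle 1–2–3–4–1, for instance, {2, 4} separates {1} from {3}, while {2} alone does not (the path 1–4–3 avoids it).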

Systems of random variables. For a system V of labelled random variables X_v, v ∈ V, we use the shorthand

A ⊥⊥ B | C  ⟺  X_A ⊥⊥ X_B | X_C,

where X_A = (X_v, v ∈ A) denotes the set of variables with labels in A. The properties (C1)–(C4) imply that ⊥⊥ satisfies the semigraphoid axioms for such a system, and the graphoid axioms if the joint density of the variables is strictly positive.

Definitions. Let G = (V, E) be a simple undirected graph and ⊥σ a (semi)graphoid relation. (Here bd(α) denotes the set of neighbours of α, and cl(α) = bd(α) ∪ {α}.) Say ⊥σ satisfies

(P) the pairwise Markov property if α ≁ β ⟹ α ⊥σ β | V \ {α, β};
(L) the local Markov property if ∀ α ∈ V: α ⊥σ V \ cl(α) | bd(α);
(G) the global Markov property if A ⊥G B | S ⟹ A ⊥σ B | S.

Pairwise Markov property. [Figure: the example graph on vertices 1–7.] Any non-adjacent pair of random variables are conditionally independent given the remaining variables. For example, 1 ⊥σ 5 | {2, 3, 4, 6, 7} and 4 ⊥σ 6 | {1, 2, 3, 5, 7}.

Local Markov property. [Figure: the example graph on vertices 1–7.] Every variable is conditionally independent of the remaining variables, given its neighbours. For example, 5 ⊥σ {1, 4} | {2, 3, 6, 7} and 7 ⊥σ {1, 2, 3} | {4, 5, 6}.

Global Markov property. [Figure: the example graph on vertices 1–7.] To find conditional independence relations, one should look for separating sets, such as {2, 3}, {4, 5, 6}, or {2, 5, 6}. For example, it follows that 1 ⊥σ 7 | {2, 5, 6} and 2 ⊥σ 6 | {3, 4, 5}.
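These separation statements can be verified mechanically. A sketch, with the edge set of the example graph read off the cliques {1,2}, {1,3}, {2,4}, {2,5}, {3,5,6}, {4,7}, {5,6,7} listed later in the slides:

```python
from collections import deque

# Edges of the seven-vertex example graph.
EDGES = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 5),
         (3, 6), (5, 6), (4, 7), (5, 7), (6, 7)]

ADJ = {}
for u, v in EDGES:
    ADJ.setdefault(u, set()).add(v)
    ADJ.setdefault(v, set()).add(u)

def separated(a, b, S):
    """True if every path from vertex a to vertex b intersects S
    (BFS from a, never entering S)."""
    S = set(S)
    seen, queue = {a}, deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            return False
        for w in ADJ[u] - S - seen:
            seen.add(w)
            queue.append(w)
    return True
```

With this, `separated(1, 7, {2, 5, 6})` and `separated(2, 6, {3, 4, 5})` both hold, matching the two claims above, whereas the smaller set {2, 5} fails to separate 1 from 7 because of the path 1–3–6–7.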

Structural relations among Markov properties. For any semigraphoid it holds that

(G) ⟹ (L) ⟹ (P).

If ⊥σ satisfies the graphoid axioms, it further holds that (P) ⟹ (G), so that in the graphoid case

(G) ⟺ (L) ⟺ (P).

The latter holds in particular for ⊥⊥ when f(x) > 0.

Proof that (G) ⟹ (L) ⟹ (P): (G) implies (L) because bd(α) separates α from V \ cl(α).

Assume (L) and α ≁ β. Then β ∈ V \ cl(α), and

bd(α) ∪ ((V \ cl(α)) \ {β}) = V \ {α, β}.

Hence by (L) and (S3) we get α ⊥σ (V \ cl(α)) | V \ {α, β}; (S2) then gives α ⊥σ β | V \ {α, β}, which is (P).

(P) ⟹ (G) for graphoids. Assume (P) and A ⊥G B | S. We must show A ⊥σ B | S. W.l.o.g. assume A and B non-empty. The proof is reverse induction on n = |S|.

If n = |V| − 2 then A and B are singletons and (P) yields A ⊥σ B | S directly. So assume |S| = n < |V| − 2 and the conclusion established for |S| > n.

First assume V = A ∪ B ∪ S. Then either A or B has at least two elements, say A. If α ∈ A then B ⊥G (A \ {α}) | (S ∪ {α}) and also α ⊥G B | (S ∪ A \ {α}) (as ⊥G is a semigraphoid). Thus by the induction hypothesis, (A \ {α}) ⊥σ B | (S ∪ {α}) and {α} ⊥σ B | (S ∪ A \ {α}). Now (S5) gives A ⊥σ B | S.

(P) ⟹ (G) for graphoids, continued. If A ∪ B ∪ S ⊂ V, choose α ∈ V \ (A ∪ B ∪ S). Then A ⊥G B | (S ∪ {α}), and hence the induction hypothesis yields A ⊥σ B | (S ∪ {α}). Further, either A ∪ S separates B from {α} or B ∪ S separates A from {α}. Assuming the former gives α ⊥σ B | (A ∪ S). Using (S5) we get (A ∪ {α}) ⊥σ B | S, and from (S2) we derive A ⊥σ B | S. The latter case is similar.

Factorization. Assume the density f is w.r.t. a product measure on X. For a ⊆ V, ψ_a(x) denotes a function which depends on x_a only, i.e.

x_a = y_a ⟹ ψ_a(x) = ψ_a(y).

We can then write ψ_a(x) = ψ_a(x_a) without ambiguity. The distribution of X factorizes w.r.t. G, or satisfies (F), if

f(x) = ∏_{a ∈ A} ψ_a(x),

where A is a collection of complete subsets of G. Complete subsets of a graph are sets with all elements pairwise neighbours.

Factorization example. [Figure: the example graph on vertices 1–7.] The cliques of this graph, i.e. its maximal complete subsets, are {1, 2}, {1, 3}, {2, 4}, {2, 5}, {3, 5, 6}, {4, 7}, and {5, 6, 7}; any complete set is a subset of one of these. The graph corresponds to a factorization as

f(x) = ψ_{12}(x_1, x_2) ψ_{13}(x_1, x_3) ψ_{24}(x_2, x_4) ψ_{25}(x_2, x_5) ψ_{356}(x_3, x_5, x_6) ψ_{47}(x_4, x_7) ψ_{567}(x_5, x_6, x_7).
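This factorization can be probed numerically: build a strictly positive joint over seven binary variables as a product of random clique potentials, and test a global Markov statement, say 1 ⊥⊥ 7 | {2, 5, 6}, via the criterion f(x,y,z) f(z) = f(x,z) f(y,z) from the start of the lecture. A sketch with 0-based axes (vertex i becomes axis i − 1):

```python
import numpy as np

rng = np.random.default_rng(1)

# Cliques of the example graph, relabelled to 0-based axis indices.
CLIQUES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 4, 5), (3, 6), (4, 5, 6)]

f = np.ones((2,) * 7)
for c in CLIQUES:
    psi = rng.random((2,) * len(c)) + 0.1        # strictly positive psi_c
    # broadcast psi over the axes not in the clique
    f = f * psi.reshape([2 if i in c else 1 for i in range(7)])
f /= f.sum()

# Marginalise out X3 and X4 (axes 2 and 3); remaining axes: (x1,x2,x5,x6,x7).
g = f.sum(axis=(2, 3))
g_c  = g.sum(axis=(0, 4), keepdims=True)   # f(x2, x5, x6)
g_ac = g.sum(axis=4, keepdims=True)        # f(x1, x2, x5, x6)
g_bc = g.sum(axis=0, keepdims=True)        # f(x2, x5, x6, x7)

# Global Markov statement 1 ⊥⊥ 7 | {2, 5, 6}:
global_markov_ok = np.allclose(g * g_c, g_ac * g_bc)
```

Because {2, 5, 6} separates 1 from 7 in the graph, the factorization forces this identity to hold (up to floating-point error), illustrating (F) ⟹ (G) on a concrete example.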

Factorization theorem. Let (F) denote the property that f factorizes w.r.t. G, and let (G), (L), and (P) denote the Markov properties w.r.t. ⊥⊥. It then holds that (F) ⟹ (G), and further: if f(x) > 0 for all x, then (P) ⟹ (F). The former is a simple direct consequence of the factorization, whereas the second implication is more subtle and is known as the Hammersley–Clifford theorem. Thus in the case of a positive density (but typically only then), all the properties coincide:

(F) ⟺ (G) ⟺ (L) ⟺ (P).

Dependence graph. Any joint probability distribution P of X = (X_v, v ∈ V) has a dependence graph G(P) = (V, E(P)), defined by letting α ≁ β in G(P) exactly when α ⊥⊥_P β | V \ {α, β}. X then satisfies the pairwise Markov property w.r.t. G(P), and G(P) is the smallest graph with this property, i.e. P is pairwise Markov w.r.t. G if and only if G(P) ⊆ G. If f(x) > 0 for all x, then P is also globally Markov w.r.t. G(P).
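G(P) can be recovered from a joint table by testing each pairwise statement α ⊥⊥_P β | V \ {α, β}. An illustrative sketch on a four-variable chain (0-based axes; the joint is built to factorize along the chain 0 − 1 − 2 − 3, so the recovered edge set should be exactly the chain):

```python
import numpy as np

rng = np.random.default_rng(2)

# Joint over four binary variables factorizing along the chain 0 - 1 - 2 - 3.
f = np.ones((2,) * 4)
for c in [(0, 1), (1, 2), (2, 3)]:
    psi = rng.random((2, 2)) + 0.1             # strictly positive pair potential
    f = f * psi.reshape([2 if i in c else 1 for i in range(4)])
f /= f.sum()

def pairwise_ci(p, i, j):
    """True if X_i ⊥⊥ X_j given all remaining variables, for joint table p."""
    p_rest = p.sum(axis=(i, j), keepdims=True)
    p_i = p.sum(axis=j, keepdims=True)         # marginalise out X_j
    p_j = p.sum(axis=i, keepdims=True)         # marginalise out X_i
    return np.allclose(p * p_rest, p_i * p_j)

# α ~ β in G(P) exactly when the pairwise independence FAILS.
edges = {(i, j) for i in range(4) for j in range(i + 1, 4)
         if not pairwise_ci(f, i, j)}
```

With generic (random) potentials the adjacent pairs fail the test and the non-adjacent ones pass it exactly, so `edges` comes out as the chain; for a degenerate choice of potentials G(P) could be strictly smaller than the graph used to build f.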