Markov properties for undirected graphs


Graphical Models, Lecture 2, Michaelmas Term 2011. October 12, 2011.

Formal definition. Random variables X and Y are conditionally independent given the random variable Z if L(X | Y, Z) = L(X | Z). We then write X ⊥⊥ Y | Z (or X ⊥⊥_P Y | Z). Intuitively: knowing Z renders Y irrelevant for predicting X. Factorisation of densities:

X ⊥⊥ Y | Z ⇔ f(x, y, z) f(z) = f(x, z) f(y, z) ⇔ ∃ a, b : f(x, y, z) = a(x, z) b(y, z).
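The factorisation criterion above can be checked directly for a finite discrete joint distribution. A minimal sketch in Python (the function name and the two toy distributions are illustrative, not from the lecture):

```python
import itertools

def cond_indep(p, tol=1e-12):
    """Check X ⊥⊥ Y | Z for a discrete joint p[(x, y, z)] via the
    criterion f(x, y, z) f(z) = f(x, z) f(y, z) for all (x, y, z)."""
    xs = sorted({x for x, _, _ in p})
    ys = sorted({y for _, y, _ in p})
    zs = sorted({z for _, _, z in p})
    f_z = {z: sum(p.get((x, y, z), 0.0) for x in xs for y in ys) for z in zs}
    f_xz = {(x, z): sum(p.get((x, y, z), 0.0) for y in ys) for x in xs for z in zs}
    f_yz = {(y, z): sum(p.get((x, y, z), 0.0) for x in xs) for y in ys for z in zs}
    return all(abs(p.get((x, y, z), 0.0) * f_z[z] - f_xz[x, z] * f_yz[y, z]) < tol
               for x, y, z in itertools.product(xs, ys, zs))

# X = Y = Z a single fair coin: knowing Z renders Y irrelevant for X.
chain = {(z, z, z): 0.5 for z in (0, 1)}
# X, Y independent fair coins, Z = X XOR Y: given Z, Y determines X.
parity = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
```

Here cond_indep(chain) is True while cond_indep(parity) is False, matching the intuition that conditioning on the parity Z couples X and Y.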

Fundamental properties. For random variables X, Y, Z, and W it holds that:
(C1) if X ⊥⊥ Y | Z, then Y ⊥⊥ X | Z;
(C2) if X ⊥⊥ Y | Z and U = g(Y), then X ⊥⊥ U | Z;
(C3) if X ⊥⊥ Y | Z and U = g(Y), then X ⊥⊥ Y | (Z, U);
(C4) if X ⊥⊥ Y | Z and X ⊥⊥ W | (Y, Z), then X ⊥⊥ (Y, W) | Z.
If there is a density w.r.t. a product measure with f(x, y, z, w) > 0, then also:
(C5) if X ⊥⊥ Y | (Z, W) and X ⊥⊥ Z | (Y, W), then X ⊥⊥ (Y, Z) | W.

Graphoids and semi-graphoids. An independence model ⊥σ is a ternary relation over subsets of a finite set V. It is a semi-graphoid if for all subsets A, B, C, D:
(S1) if A ⊥σ B | C, then B ⊥σ A | C (symmetry);
(S2) if A ⊥σ (B ∪ D) | C, then A ⊥σ B | C and A ⊥σ D | C (decomposition);
(S3) if A ⊥σ (B ∪ D) | C, then A ⊥σ B | (C ∪ D) (weak union);
(S4) if A ⊥σ B | C and A ⊥σ D | (B ∪ C), then A ⊥σ (B ∪ D) | C (contraction).
It is a graphoid if (S1)–(S4) hold and also
(S5) if A ⊥σ B | (C ∪ D) and A ⊥σ C | (B ∪ D), then A ⊥σ (B ∪ C) | D (intersection).
It is compositional if also
(S6) if A ⊥σ B | C and A ⊥σ D | C, then A ⊥σ (B ∪ D) | C (composition).

Separation in undirected graphs. Let G = (V, E) be a finite and simple undirected graph (no self-loops, no multiple edges). For subsets A, B, S of V, let A ⊥G B | S denote that S separates A from B in G, i.e. that all paths from A to B intersect S. Fact: the relation ⊥G on subsets of V is a graphoid. This fact is the reason for choosing the name "graphoid" for such separation relations.
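Separation is easy to test mechanically: delete S and ask whether A and B are still connected. A minimal sketch (function and variable names are illustrative), with edges read off the clique list of the seven-vertex example graph used in these slides:

```python
from collections import deque

def separates(edges, A, B, S):
    """True iff S separates A from B in the undirected graph `edges`,
    i.e. every path from A to B intersects S."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    A, B, S = set(A), set(B), set(S)
    seen = set(A - S)          # breadth-first search from A, avoiding S
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        if u in B:
            return False       # reached B without touching S
        for w in adj.get(u, set()) - S - seen:
            seen.add(w)
            queue.append(w)
    return True

# Edges of the example graph, read off its cliques
# {1,2}, {1,3}, {2,4}, {2,5}, {3,5,6}, {4,7}, {5,6,7}.
EDGES = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 5), (3, 6),
         (5, 6), (4, 7), (5, 7), (6, 7)]
```

For instance, separates(EDGES, {1}, {7}, {2, 5, 6}) holds, while {2, 5} alone does not block the path 1–3–6–7.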

Systems of random variables. For a system V of labelled random variables X_v, v ∈ V, we use the shorthand A ⊥⊥ B | C ⇔ X_A ⊥⊥ X_B | X_C, where X_A = (X_v, v ∈ A) denotes the variables with labels in A. The properties (C1)–(C4) imply that ⊥⊥ satisfies the semi-graphoid axioms for such a system, and the graphoid axioms if the joint density of the variables is strictly positive. In general, composition (S6) does not hold, but it does in interesting special cases. An independence model determined in this way is said to be probabilistic.
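The failure of composition (S6) for probabilistic independence models can be seen in the classical parity example: X and Y independent fair coins and Z = X ⊕ Y, so X ⊥⊥ Y and X ⊥⊥ Z, yet X is a function of (Y, Z). A small sketch (names are illustrative):

```python
import itertools

# X, Y independent fair coins, Z = X XOR Y.
joint = {k: 0.0 for k in itertools.product((0, 1), repeat=3)}
for x, y in itertools.product((0, 1), repeat=2):
    joint[x, y, x ^ y] = 0.25

def marg(keep):
    """Marginal distribution of the coordinates listed in `keep`."""
    out = {}
    for xyz, pr in joint.items():
        key = tuple(xyz[i] for i in keep)
        out[key] = out.get(key, 0.0) + pr
    return out

def indep(i, js):
    """Check coordinate i ⊥⊥ the coordinates js (unconditionally)."""
    pi, pj, pij = marg([i]), marg(js), marg([i] + js)
    return all(abs(v - pi[k[:1]] * pj[k[1:]]) < 1e-12 for k, v in pij.items())
```

Here indep(0, [1]) and indep(0, [2]) both hold, but indep(0, [1, 2]) fails: composition would wrongly conclude X ⊥⊥ (Y, Z).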

Definitions. Let G = (V, E) be a simple undirected graph and let σ be an independence model. We say σ satisfies
(P) the pairwise Markov property w.r.t. G if α ≁ β ⇒ α ⊥σ β | V \ {α, β};
(L) the local Markov property w.r.t. G if for all α ∈ V: α ⊥σ (V \ cl(α)) | bd(α);
(G) the global Markov property w.r.t. G if A ⊥G B | S ⇒ A ⊥σ B | S.
Here bd(α) denotes the set of neighbours of α and cl(α) = {α} ∪ bd(α) its closure.

Pairwise Markov property. [Figure: example graph on vertices 1–7.] Any non-adjacent pair of random variables is conditionally independent given the remaining ones. For example, 1 ⊥σ 5 | {2, 3, 4, 6, 7} and 4 ⊥σ 6 | {1, 2, 3, 5, 7}.

Local Markov property. [Figure: example graph on vertices 1–7.] Every variable is conditionally independent of the remaining ones, given its neighbours. For example, 5 ⊥σ {1, 4} | {2, 3, 6, 7} and 7 ⊥σ {1, 2, 3} | {4, 5, 6}.

Global Markov property. [Figure: example graph on vertices 1–7.] To find conditional independence relations, one should look for separating sets, such as {2, 3}, {4, 5, 6}, or {2, 5, 6}. For example, it follows that 1 ⊥σ 7 | {2, 5, 6} and 2 ⊥σ 6 | {3, 4, 5}.

Structural relations among Markov properties. For any semi-graphoid it holds that (G) ⇒ (L) ⇒ (P). If σ satisfies the graphoid axioms, it further holds that (P) ⇒ (G), so that in the graphoid case (G) ⇔ (L) ⇔ (P). The latter holds in particular for ⊥⊥ when f(x) > 0.

Proof that (G) ⇒ (L) ⇒ (P). (G) implies (L) because bd(α) separates α from V \ cl(α). Assume (L) and let α ≁ β. Then β ∈ V \ cl(α), and thus bd(α) ∪ ((V \ cl(α)) \ {β}) = V \ {α, β}. Hence by (L) and (S3) we get α ⊥σ (V \ cl(α)) | V \ {α, β}. (S2) then gives α ⊥σ β | V \ {α, β}, which is (P).

(P) ⇒ (G) for graphoids. Assume (P) and A ⊥G B | S. We must show A ⊥σ B | S. W.l.o.g. assume A and B non-empty. The proof is reverse induction on n = |S|. If n = |V| − 2, then A and B are singletons and (P) yields A ⊥σ B | S directly. So assume |S| = n < |V| − 2 and the conclusion established for |S| > n. First assume V = A ∪ B ∪ S. Then either A or B has at least two elements, say A. If α ∈ A, then B ⊥G (A \ {α}) | (S ∪ {α}) and also α ⊥G B | (S ∪ A \ {α}) (as ⊥G is a semi-graphoid). Thus by the induction hypothesis, (A \ {α}) ⊥σ B | (S ∪ {α}) and {α} ⊥σ B | (S ∪ A \ {α}). Now the intersection property (S5) gives A ⊥σ B | S.

(P) ⇒ (G) for graphoids, continued. For A ∪ B ∪ S ⊂ V we choose α ∈ V \ (A ∪ B ∪ S). Then A ⊥G B | (S ∪ {α}), and hence the induction hypothesis yields A ⊥σ B | (S ∪ {α}). Further, either A ∪ S separates B from {α} or B ∪ S separates A from {α}. Assuming the former gives α ⊥σ B | (A ∪ S). Using (S5) we get (A ∪ {α}) ⊥σ B | S, and from (S2) we derive that A ⊥σ B | S. The latter case is similar.

Factorization: definition. Assume a density f w.r.t. a product measure on X. For a ⊆ V, ψ_a(x) denotes a function which depends on x_a only, i.e. x_a = y_a ⇒ ψ_a(x) = ψ_a(y). We can then write ψ_a(x) = ψ_a(x_a) without ambiguity. The distribution of X factorizes w.r.t. G, or satisfies (F), if

f(x) = ∏_{a ∈ A} ψ_a(x),

where A is a collection of complete subsets of G. Complete subsets of a graph are sets whose elements are all pairwise neighbours.

Factorization example. [Figure: example graph on vertices 1–7.] The cliques of this graph, i.e. its maximal complete subsets, are {1, 2}, {1, 3}, {2, 4}, {2, 5}, {3, 5, 6}, {4, 7}, and {5, 6, 7}; every complete set is a subset of one of these. The graph corresponds to a factorization as

f(x) = ψ₁₂(x₁, x₂) ψ₁₃(x₁, x₃) ψ₂₄(x₂, x₄) ψ₂₅(x₂, x₅) ψ₃₅₆(x₃, x₅, x₆) ψ₄₇(x₄, x₇) ψ₅₆₇(x₅, x₆, x₇).

Factorization theorem. Let (F) denote the property that f factorizes w.r.t. G, and let (G), (L) and (P) denote the Markov properties w.r.t. G. It then holds that (F) ⇒ (G), and further: if f(x) > 0 for all x, then (P) ⇒ (F). The former is a simple direct consequence of the factorization, whereas the second implication is more subtle and is known as the Hammersley–Clifford theorem. Thus in the case of a positive density (but typically only then), all the properties coincide: (F) ⇔ (G) ⇔ (L) ⇔ (P).
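The direction (F) ⇒ (P) can be illustrated numerically: build a strictly positive density over binary variables from arbitrary positive potentials on the cliques of the example graph, and verify a pairwise independence by brute force. A sketch (all names and the random potentials are illustrative):

```python
import itertools
import random

random.seed(0)
CLIQUES = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 5, 6), (4, 7), (5, 6, 7)]
V = range(1, 8)

# One arbitrary strictly positive potential per clique.
psi = {c: {vals: random.uniform(0.5, 2.0)
           for vals in itertools.product((0, 1), repeat=len(c))}
       for c in CLIQUES}

def f(x):
    """Unnormalised factorised density; x maps vertex -> {0, 1}."""
    prod = 1.0
    for c in CLIQUES:
        prod *= psi[c][tuple(x[v] for v in c)]
    return prod

states = [dict(zip(V, bits)) for bits in itertools.product((0, 1), repeat=7)]
Z = sum(f(x) for x in states)
p = {tuple(x[v] for v in V): f(x) / Z for x in states}

def pairwise_ci(i, j, tol=1e-10):
    """Check X_i ⊥⊥ X_j | X_rest via p(i,j,c) p(c) = p(i,c) p(j,c)."""
    rest = [v for v in V if v not in (i, j)]
    for c in itertools.product((0, 1), repeat=len(rest)):
        assign = dict(zip(rest, c))
        pij = {}
        for a, b in itertools.product((0, 1), repeat=2):
            full = dict(assign)
            full[i], full[j] = a, b
            pij[a, b] = p[tuple(full[v] for v in V)]
        tot = sum(pij.values())
        for a, b in itertools.product((0, 1), repeat=2):
            pa = pij[a, 0] + pij[a, 1]
            pb = pij[0, b] + pij[1, b]
            if abs(pij[a, b] * tot - pa * pb) > tol:
                return False
    return True
```

Here pairwise_ci(1, 5) and pairwise_ci(4, 6) hold since 1 ≁ 5 and 4 ≁ 6, while the adjacent pair 1, 2 is (for generic potentials) dependent.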