Machine learning: lecture 20 ommi. Jaakkola MI AI tommi@csail.mit.edu
Bayesian networks examples, specification graphs and independence associated distribution Outline ommi Jaakkola, MI AI 2
Bayesian networks Bayesian networks are directed acyclic graphs, where the nodes represent variables and directed edges capture dependencies "parent of x" A mixture model as a Bayesian network "i influences x" "i causes x" "x depends on i" i x P (i)p (x i) "child of i" ommi Jaakkola, MI AI 3
Bayesian networks Bayesian networks are directed acyclic graphs, where the nodes represent variables and directed edges capture dependencies "parent of x" A mixture model as a Bayesian network "i influences x" "i causes x" "x depends on i" i x P (i)p (x i) Graph semantics: "child of i" graph separation properties independence Association with probability distributions: independence family of distributions ommi Jaakkola, MI AI 4
Example A simple Bayesian network: coin tosses x 1 x 2 ommi Jaakkola, MI AI 5
Example A simple Bayesian network: coin tosses P (x 1 ) : x 1 x 2 P (x2 ) : ommi Jaakkola, MI AI 6
Example A simple Bayesian network: coin tosses P (x 1 ) : x 1 x 2 P (x2 ) : x 3 = same? ommi Jaakkola, MI AI 7
Example A simple Bayesian network: coin tosses P (x 1 ) : x 1 x 2 P (x2 ) : P (x 3 x 1, x 2 ) : x 3 = same? hh ht th tt y 1.0 0.0 0.0 1.0 n 0.0 1.0 1.0 0.0 ommi Jaakkola, MI AI 8
Example A simple Bayesian network: coin tosses P (x 1 ) : x 1 x 2 P (x2 ) : x 3 = same? hh ht th tt y 1.0 0.0 0.0 1.0 P (x 3 x 1, x 2 ) : n 0.0 1.0 1.0 0.0 wo levels of description 1. graph structure (dependencies, independencies) 2. associated probability distribution ommi Jaakkola, MI AI 9
Example cont d What can the graph alone tell us? x 1 x 2 x 3 = same? ommi Jaakkola, MI AI 10
Example cont d What can the graph alone tell us? x 1 x 2 x 3 = same? x 1 and x 2 are marginally independent ommi Jaakkola, MI AI 11
Example cont d What can the graph alone tell us? x 1 x 2 x 3 = same? x 1 and x 2 are marginally independent x 1 x 2 x 3 = same? x 1 and x 2 become dependent if we know x 3 (the dependence concerns our beliefs about the outcomes) ommi Jaakkola, MI AI 12
raffic example = X is nice? = traffic light = X decides to stop? = the other car turns left? = crash? ommi Jaakkola, MI AI 13
raffic example = X is nice? = traffic light = X decides to stop? = the other car turns left? = crash? ommi Jaakkola, MI AI 14
raffic example = X is nice? = traffic light = X decides to stop? = the other car turns left? = crash? ommi Jaakkola, MI AI 15
raffic example = X is nice? = traffic light = X decides to stop? = the other car turns left? = crash? ommi Jaakkola, MI AI 16
raffic example = X is nice? = traffic light = X decides to stop? = the other car turns left? = crash? ommi Jaakkola, MI AI 17
raffic example = X is nice? = traffic light = X decides to stop? = the other car turns left? = crash? If we only know that X decided to stop, can X s character (variable ) tell us anything about the other car turning (variable )? ommi Jaakkola, MI AI 18
Graph, independence, d-separation Are and independent given? ommi Jaakkola, MI AI 19
Graph, independence, d-separation Are and independent given? Definition: Variables and are D-separated given if separates them in the moralized ancestral graph ommi Jaakkola, MI AI 20
Graph, independence, d-separation Are and independent given? Definition: Variables and are D-separated given if separates them in the moralized ancestral graph original ommi Jaakkola, MI AI 21
Graph, independence, d-separation Are and independent given? Definition: Variables and are D-separated given if separates them in the moralized ancestral graph original ancestral ommi Jaakkola, MI AI 22
Graph, independence, d-separation Are and independent given? Definition: Variables and are D-separated given if separates them in the moralized ancestral graph original ancestral moralized ancestral ommi Jaakkola, MI AI 23
Graph, independence, d-separation Are and independent given? Definition: Variables and are D-separated given if separates them in the moralized ancestral graph original ancestral moralized ancestral ommi Jaakkola, MI AI 24
Graphs and distributions A graph is a compact representation of a large collection of independence properties ommi Jaakkola, MI AI 25
Graphs and distributions A graph is a compact representation of a large collection of independence properties heorem: Any probability distribution that is consistent with a directed graph G has to factor according to node given parents : d P (x G) = P (x i x pai ) i=1 where x pai are the parents of x i and d is the number of nodes (variables) in the graph. ommi Jaakkola, MI AI 26
Model Explaining away phenomenon Earthquake Burglary Radio report Alarm ommi Jaakkola, MI AI 27
Model Explaining away phenomenon Earthquake Burglary Radio report Alarm Evidence, competing causes Earthquake Burglary Radio report Alarm ommi Jaakkola, MI AI 28
Model Explaining away phenomenon Earthquake Burglary Radio report Alarm Evidence, competing causes Earthquake Burglary Radio report Alarm Additional evidence and explaining away Earthquake Burglary Radio report Alarm ommi Jaakkola, MI AI 29