University of Washington Department of Electrical Engineering EE512 Spring, 2006 Graphical Models


University of Washington Department of Electrical Engineering
EE512 Spring, 2006 Graphical Models
Jeff A. Bilmes <bilmes@ee.washington.edu>
Lecture 1 Slides, March 28th, 2006
Lec 1: March 28th, 2006 EE512 - Graphical Models - J. Bilmes Page 1

Outline of Today's Lecture
- Class overview
- What are graphical models
- Semantics of Bayesian networks

Books and Sources for Today
- Jordan: Chapters 1 and 2

Class Road Map
L1: Tues, 3/28: Overview, GMs, Intro BNs.
L2: Thur, 3/30
L3: Tues, 4/4
L4: Thur, 4/6
L5: Tue, 4/11
L6: Thur, 4/13
L7: Tues, 4/18
L8: Thur, 4/20
L9: Tue, 4/25
L10: Thur, 4/27
L11: Tues, 5/2
L12: Thur, 5/4
L13: Tues, 5/9
L14: Thur, 5/11
L15: Tue, 5/16
L16: Thur, 5/18
L17: Tues, 5/23
L18: Thur, 5/25
L19: Tue, 5/30
L20: Thur, 6/1: final presentations

Announcements
- READING: Chapters 1 and 2 in Jordan's book (pick up the book from the basement of the communications copy center).
- List handout: name, department, and email
- List handout: regular makeup slot, and discussion section
- Syllabus
- Course web page: http://ssli.ee.washington.edu/ee512
- Goal: PowerPoint slides this quarter; they will be on the web page after lecture (but not before, as you are getting them hot off the press).

Graphical Models
- A graphical model is a visual, abstract, and mathematically formal description of properties of families of probability distributions (densities, mass functions).
- There are many different types of graphical model, e.g.:
  - Bayesian Networks
  - Factor Graphs
  - Markov Random Fields
  - Chain Graphs

GMs cover many well-known methods
[Figure: taxonomy of graphical models. Labels: Graphical Models, Other Semantics, Causal Models, Chain Graphs, Factor Graphs, Dependency Networks, DGMs, UGMs, AR, FST, HMM, ZMs, DBNs, Factorial HMM/Mixed Memory Markov Models, BMMs, Bayesian Networks, Mixture Models, Kalman, Decision Trees, Segment Models, Simple Models, PCA, LDA, MRFs, Gibbs/Boltzmann Distributions.]

Graphical Models Provide:
- Structure
- Algorithms
- Language
- Approximations
- Data-Bases

Graphical Models Provide
GMs give us:
I. Structure: A method to explore the structure of natural phenomena (causal vs. correlated relations, properties of natural signals and scenes).
II. Algorithms: A set of algorithms that provide efficient probabilistic inference and statistical decision making.
III. Language: A mathematically formal, abstract, visual language with which to efficiently discuss families of probabilistic models and their properties.

GMs Provide
GMs give us (cont.):
IV. Approximation: Methods to explore systems of approximation and their implications. E.g., what are the consequences of a (perhaps known to be) wrong assumption?
V. Data-base: Provide a probabilistic data-base and corresponding search algorithms for making queries about properties in such model families.

GMs
- There are many different types of GM.
- Each GM has its semantics.
- A GM (under the current semantics) is really a set of constraints. The GM represents all probability distributions that obey these constraints, including those that obey additional constraints (but not including those that obey fewer constraints).
- Most often, the constraints are some form of factorization property, e.g., f() factorizes (is composed of a product of factors of subsets of arguments).

Types of Queries
Several types of queries we may be interested in:
- Compute: p(one subset of vars)
- Compute: p(one subset of vars | another subset of vars)
- Find the N most probable configurations of one subset of variables given assignments of values to some other sets
- Q: Is one subset independent of another subset?
- Q: Is one subset independent of another given a third?
How efficiently can we do this? Can this question be answered? What if it is too costly, can we approximate, and if so, how well?
These are questions we will answer this term.
GMs are like a probabilistic data-base (or data structure), a system that can be queried to provide answers to these sorts of questions.
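As a rough illustration of the first three query types, the sketch below answers them by brute force on an explicit joint table. The three binary variables and their weights are hypothetical (not from the lecture), and the helper names `marginal`, `conditional`, and `most_probable` are introduced here only for illustration. Brute-force enumeration costs time exponential in the number of variables, which is exactly the cost that graphical-model algorithms aim to avoid.

```python
# Hypothetical joint distribution over three binary variables (A, B, C).
# The weights are illustrative only; they are not taken from the lecture.
weights = {
    (0, 0, 0): 4, (0, 0, 1): 1, (0, 1, 0): 2, (0, 1, 1): 3,
    (1, 0, 0): 1, (1, 0, 1): 2, (1, 1, 0): 3, (1, 1, 1): 4,
}
total = sum(weights.values())
joint = {cfg: w / total for cfg, w in weights.items()}


def marginal(joint, keep):
    """p(subset): sum out every variable whose index is not in `keep`."""
    out = {}
    for cfg, p in joint.items():
        key = tuple(cfg[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def conditional(joint, query, given):
    """p(query | given), keyed by (query_values, given_values)."""
    num = marginal(joint, query + given)
    den = marginal(joint, given)
    nq = len(query)
    return {(k[:nq], k[nq:]): p / den[k[nq:]] for k, p in num.items()}


def most_probable(joint, n):
    """The n most probable full configurations, highest probability first."""
    return sorted(joint.items(), key=lambda kv: -kv[1])[:n]


p_a = marginal(joint, (0,))                  # p(A)
p_a_given_c = conditional(joint, (0,), (2,)) # p(A | C)
top2 = most_probable(joint, 2)
```

The independence queries can be answered on the same table by comparing a joint marginal with the product of its factors, again at exponential cost.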

Example
Typical goal of pattern recognition:
- Training (say, EM or gradient descent): need query of form:
  In this form, we need to compute p(o,h) efficiently.
- Bayes decision rule: need to find the best class for a given unknown pattern:
  but this is yet another query on a probability distribution.
- We can train, and perform Bayes decision theory, quickly if we can compute with probabilities quickly.
- Graphical models provide a way to reason about, and understand, when this is possible, and if not, how to reasonably approximate.
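The two equations on this slide did not survive extraction. One plausible reading, offered only as an assumption, takes o as the observed variables, h as the hidden variables, x as the unknown pattern, and c as the class label; the standard forms of the two queries would then be:

```latex
% Assumed reconstruction; the symbols o, h, x, c follow the surrounding text,
% but the exact form on the original slide is not known.
\begin{align}
  p(h \mid o) &= \frac{p(o, h)}{\sum_{h'} p(o, h')}
    && \text{(posterior over hidden variables, as used in EM)} \\
  c^{*} &= \arg\max_{c} \, p(c \mid x) = \arg\max_{c} \, p(x \mid c)\, p(c)
    && \text{(Bayes decision rule)}
\end{align}
```

Both are ratios or maximizations over sums of the joint, which is why fast computation with p(o,h) is the bottleneck in either case.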

Some Notation
- Random variables (scalar or vector)
- Distributions
- Subsets

Main types of Graphical Models
- Markov Random Fields
  - a form of undirected graphical model
  - relatively simple to understand their semantics
  - also: log-linear models, Gibbs distributions, Boltzmann distributions, many exponential models, conditional random fields (CRFs), etc.
- Bayesian networks
  - a form of directed graphical model
  - originally developed to represent a form of causality, but not ideal for that (they still represent factorization)
  - semantics more interesting (but trickier) than MRFs
- Factor Graphs
  - pure, the "assembly language" models for factorization properties
  - came out of the coding theory community (LDPC, Turbo codes)

Main types of Graphical Models
- Chain graphs:
  - Hybrid between Bayesian networks and MRFs
  - A set of clusters of undirected nodes connected as directed links
  - Not as widely used, but very powerful.
- Ancestral graphs
  - we probably won't cover these.

Bayesian Network Examples
- Mixture models (nodes C, X): $p(x) = \sum_i c_i \, p_i(x)$
- Markov chains (nodes Q_1, Q_2, Q_3, Q_4):
  - first order: $p(q_t \mid q_{1:t-1}) = p(q_t \mid q_{t-1})$
  - second order: $p(q_t \mid q_{1:t-1}) = p(q_t \mid q_{t-1}, q_{t-2})$
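The two formulas above can be sketched in a few lines. This is a minimal illustration, assuming Gaussian mixture components and a two-state chain; the function names and all numbers are hypothetical and not from the lecture.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def mixture_pdf(x, weights, components):
    """p(x) = sum_i c_i p_i(x); `components` is a list of density functions."""
    return sum(c * p_i(x) for c, p_i in zip(weights, components))


def markov_chain_prob(states, initial, transition):
    """First-order chain: p(q_1) * prod_t p(q_t | q_{t-1})."""
    p = initial[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= transition[prev][cur]
    return p


# Hypothetical two-component mixture (illustrative parameters).
components = [lambda x: gaussian_pdf(x, -1.0, 1.0),
              lambda x: gaussian_pdf(x, 2.0, 0.5)]
density_at_zero = mixture_pdf(0.0, [0.3, 0.7], components)

# Hypothetical two-state chain: initial distribution and transition matrix.
initial = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
path_prob = markov_chain_prob([0, 1, 1], initial, transition)
```

A second-order chain would index the transition table by the pair (q_{t-2}, q_{t-1}) instead of by q_{t-1} alone.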

GMs: PCA and Factor Analysis
- Model (nodes X, Y, with covariances Q and R): $Y = AX + u$, $X \sim N(0, Q)$, $u \sim N(\mu, R)$
- PCA: $Q = \Lambda$, $R \to 0$, $A$ = orthonormal; $Y = \Gamma X + \mu$
- FA: $Q = I$, $R$ = diagonal; $Y = \Gamma X + u + \mu$
- Other generalizations possible, e.g., $Q$ = general diagonal, or capture using general $A$, since $Y \sim N(\mu, A Q A^T + R)$
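The marginal $Y \sim N(\mu, AQA^T + R)$ quoted on this slide follows in two lines. The derivation below is a sketch under assumptions not stated on the slide: the model is read as $Y = AX + u$ with $X \sim N(0, Q)$ and $u \sim N(\mu, R)$, and $X$ and $u$ are taken to be independent.

```latex
% Assumes X and u are independent, X ~ N(0, Q), u ~ N(mu, R), Y = A X + u.
\begin{align}
  \mathbb{E}[Y] &= A\,\mathbb{E}[X] + \mathbb{E}[u] = \mu \\
  \operatorname{Cov}(Y) &= A \operatorname{Cov}(X) A^{T} + \operatorname{Cov}(u)
                         = A Q A^{T} + R
\end{align}
```

Since a linear function of jointly Gaussian variables is Gaussian, these two moments determine the distribution of $Y$.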

Independent Component Analysis
[Figure: sources I_1 and I_2, with $I_1 \perp I_2$, and observations X_1, X_2, X_3, X_4.]
The data X_{1:4} is explained by the two (marginally) independent causes.

Linear Discriminant Analysis
$P(C = i \mid X) = \dfrac{f_i(X)\, p(i)}{\sum_k f_k(X)\, p(k)}$, with $f_j(X) = N(\mu_j, \Sigma)$
- Class conditional data has different means but a common covariance matrix.
- Fisher's formulation: project onto the space spanned by the means.
[Figure: graph with nodes μ, C, X.]
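The class posterior above can be checked numerically. The sketch below is a one-dimensional illustration with a shared variance; the class means, priors, and the helper name `class_posterior` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def class_posterior(x, means, priors, var):
    """P(C = i | x) = f_i(x) p(i) / sum_k f_k(x) p(k), with a shared variance."""
    scores = [gaussian_pdf(x, m, var) * p for m, p in zip(means, priors)]
    z = sum(scores)
    return [s / z for s in scores]


# Hypothetical two-class problem: different means, common variance.
means, priors, var = [-1.0, 1.0], [0.5, 0.5], 1.0
post_at_0 = class_posterior(0.0, means, priors, var)
post_at_1 = class_posterior(1.0, means, priors, var)
```

With a common variance the quadratic terms cancel in the ratio, so the log-odds are linear in x; for these parameters the posterior for the second class is the logistic function of 2x.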

Extensions to LDA
- HDA/QDA: Heteroscedastic Discriminant Analysis / Quadratic Discriminant Analysis
- MDA: Mixture Discriminant Analysis
- HMDA: both
[Figure: three graphs, one per extension; node labels C, M, μ, X.]

Generalized Decision Trees
- Generalized Probabilistic Decision Trees
- Hierarchical Mixtures of Experts (Jordan)
[Figure: graph with nodes I, D_1, D_2, D_3, O.]

Example: Printer Troubleshooting (Dechter)

Classifier Combination
- Mixture of Experts (sum rule)
- Naive Bayes (product rule)
- Other combination schemes
[Figure: three graphs; node labels X, X_1, X_2, ..., X_N, C, S, C_1, C_2.]

Discriminative and Generative Models
[Figure: a generative model and a discriminative model, each drawn as a graph over C and X.]

HMMs/Kalman Filter
[Figure: HMM with states Q_1, ..., Q_4 and observations X_1, ..., X_4; autoregressive HMM over the same variables.]

Switching Kalman Filter
[Figure: graph over Q_1, ..., Q_4, S_1, ..., S_4, and X_1, ..., X_4.]

Factorial HMM
[Figure: two state chains, each Q_1, ..., Q_4, with observations X_1, ..., X_4.]

Standard Language Modeling
Example: standard 4-gram
$P(w_t \mid h_t) = P(w_t \mid w_{t-1}, w_{t-2}, w_{t-3})$
[Figure: graph over W_{t-4}, W_{t-3}, W_{t-2}, W_{t-1}, W_t.]

Interpolated Uni-, Bi-, Tri-Grams
$P(w_t \mid h_t) = P(\alpha_t = 1)\, P(w_t) + P(\alpha_t = 2)\, P(w_t \mid w_{t-1}) + P(\alpha_t = 3)\, P(w_t \mid w_{t-1}, w_{t-2})$
Nothing gets zero probability.
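The interpolation formula can be sketched with maximum-likelihood counts. This is a minimal illustration, assuming a toy corpus and fixed mixture weights standing in for P(α_t = 1), P(α_t = 2), P(α_t = 3); the function names and data are hypothetical.

```python
from collections import Counter


def train_counts(tokens):
    """Collect n-gram counts and the matching history (context) counts."""
    return {
        "uni": Counter(tokens),
        "bi": Counter(zip(tokens, tokens[1:])),
        "tri": Counter(zip(tokens, tokens[1:], tokens[2:])),
        "ctx1": Counter(tokens[:-1]),                     # bigram histories
        "ctx2": Counter(zip(tokens[:-2], tokens[1:-1])),  # trigram histories
        "n": len(tokens),
    }


def interpolated_prob(w, prev1, prev2, counts, mix):
    """P(w | h) with prev1 = w_{t-1}, prev2 = w_{t-2}, mix = (P(a=1), P(a=2), P(a=3))."""
    p_uni = counts["uni"][w] / counts["n"]
    c1 = counts["ctx1"][prev1]
    p_bi = counts["bi"][(prev1, w)] / c1 if c1 else 0.0
    c2 = counts["ctx2"][(prev2, prev1)]
    p_tri = counts["tri"][(prev2, prev1, w)] / c2 if c2 else 0.0
    return mix[0] * p_uni + mix[1] * p_bi + mix[2] * p_tri


# Hypothetical toy corpus and mixture weights (illustrative only).
counts = train_counts("a b a b a c".split())
mix = (0.2, 0.3, 0.5)
p_seen = interpolated_prob("c", "a", "b", counts, mix)    # history "b a" was seen
p_unseen = interpolated_prob("b", "c", "a", counts, mix)  # history "a c" was never seen
```

In this sketch an in-vocabulary word still gets nonzero probability after an unseen history, because the unigram term always contributes when its weight is positive. The conditional-mixture variant on the next slide would replace the fixed `mix` with weights that depend on (w_{t-1}, w_{t-2}).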

Conditional mixture tri-grams
$P(w_t \mid h_t) = P(\alpha_t = 1 \mid w_{t-1}, w_{t-2})\, P(w_t) + P(\alpha_t = 2 \mid w_{t-1}, w_{t-2})\, P(w_t \mid w_{t-1}) + P(\alpha_t = 3 \mid w_{t-1}, w_{t-2})\, P(w_t \mid w_{t-1}, w_{t-2})$

Bayesian Networks
... and so on.
- We need to be more formal about what BNs mean.
- In the rest of this lecture, and in the next, we start with basic semantics of BNs, move on to undirected models, and then come back to BNs again to clean up.

Bayesian Networks
Have nothing to do with Bayesian statistical models (there are Bayesian and non-Bayesian Bayesian networks).
[Figure: examples of a valid and an invalid graph.]

BN: Alarm Network Example
Compact representation: factors of probabilities of children given parents (J. Pearl, 1988)
- Sub-family specification: directed acyclic graph (DAG)
  - Nodes - random variables
  - Edges - direct influence
- Instantiation: set of conditional probability distributions, e.g. P(A | E, B):

  E    B    P(a | E,B)   P(¬a | E,B)
  e    b    0.9          0.1
  e    ¬b   0.2          0.8
  ¬e   b    0.9          0.1
  ¬e   ¬b   0.01         0.99

- Together (graph and instantiation): defines a unique distribution in a factored form
  $P(B, E, A, C, R) = P(B)\, P(E)\, P(A \mid B, E)\, P(R \mid E)\, P(C \mid A)$
[Figure: graph over Earthquake, Burglary, Radio, Alarm, Call.]
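The factored form can be evaluated directly. In the sketch below only the P(A | E, B) entries come from the slide, and their row order (e b, e ¬b, ¬e b, ¬e ¬b) is an assumption about how the garbled table reads; the priors on B and E and the tables for R and C are hypothetical values chosen purely for illustration.

```python
from itertools import product

# P(A=1 | E, B), keyed by (e, b). Values from the slide; row order is assumed.
p_a_given_eb = {(1, 1): 0.9, (1, 0): 0.2, (0, 1): 0.9, (0, 0): 0.01}

# Everything below is hypothetical (not given in the lecture).
p_b = 0.01                         # P(B=1)
p_e = 0.02                         # P(E=1)
p_r_given_e = {1: 0.8, 0: 0.001}   # P(R=1 | E)
p_c_given_a = {1: 0.7, 0: 0.05}    # P(C=1 | A)


def bern(p_one, value):
    """Probability of a binary value under a Bernoulli with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(b, e, a, c, r):
    """P(B,E,A,C,R) = P(B) P(E) P(A|B,E) P(R|E) P(C|A)."""
    return (bern(p_b, b) * bern(p_e, e)
            * bern(p_a_given_eb[(e, b)], a)
            * bern(p_r_given_e[e], r)
            * bern(p_c_given_a[a], c))


def posterior_burglary_given_call():
    """P(B=1 | C=1) by summing the factored joint over the other variables."""
    num = sum(joint(1, e, a, 1, r) for e, a, r in product((0, 1), repeat=3))
    den = sum(joint(b, e, a, 1, r) for b, e, a, r in product((0, 1), repeat=4))
    return num / den


total = sum(joint(*cfg) for cfg in product((0, 1), repeat=5))
p_b_given_call = posterior_burglary_given_call()
```

The factored model needs 1 + 1 + 4 + 2 + 2 = 10 numbers, against 31 for an unconstrained joint over five binary variables, which is the sense in which the representation is compact.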

Bayesian Networks
Reminder: chain rule of probability. For any order σ
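The equation on this slide was not recovered. The standard chain rule for N variables under a permutation σ, written in notation assumed here rather than taken from the slide, is:

```latex
% Standard chain rule; the indexing x_{\sigma(i)} is assumed notation.
\begin{equation}
  p(x_1, \dots, x_N)
    = \prod_{i=1}^{N} p\bigl(x_{\sigma(i)} \mid x_{\sigma(1)}, \dots, x_{\sigma(i-1)}\bigr)
\end{equation}
```

The identity holds for every distribution and every order. A Bayesian network adds constraints by shrinking each conditioning set to the node's parents, as in the children-given-parents factors of the alarm example.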

Bayesian Networks

Conditional Independence

Conditional Independence & BNs
GMs can help answer CI queries: is $X \perp Z \mid Y$ ??? No!!
[Figure: graph over nodes X, A, B, Z, C, E, D, Y.]

Bayesian Networks & CI

Bayesian Networks & CI

Bayesian Networks & CI

Bayesian Networks & CI

Pictorial d-separation: blocked/unblocked paths
[Figure: two panels, "Blocked Paths" and "Unblocked Paths".]

Three Canonical Cases
Three 3-node examples of BNs and their conditional independence statements (nodes V_1, V_2, V_3):
- Case 1: $V_2 \perp V_3 \mid V_1$, but $V_2 \not\perp V_3$
- Case 2: $V_2 \perp V_3 \mid V_1$, but $V_2 \not\perp V_3$
- Case 3: $V_2 \perp V_3$, but $V_2 \not\perp V_3 \mid V_1$

Case 1 Markov Chain

Case 2 Still a Markov Chain

Case 3 NOT a Markov Chain
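The three cases can be checked numerically on small tables. The sketch below assumes the usual reading of the three graphs (chain V_2 → V_1 → V_3, common parent V_2 ← V_1 → V_3, and common child V_2 → V_1 ← V_3); the edge directions and every probability value are assumptions for illustration, not taken from the slides.

```python
from itertools import product

BITS = (0, 1)


def bern(p_one, value):
    """Probability of a binary value under a Bernoulli with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


# Each joint is keyed by (v1, v2, v3). All parameters are hypothetical.
chain = {(v1, v2, v3): bern(0.3, v2) * bern(0.9 if v2 else 0.2, v1) * bern(0.7 if v1 else 0.1, v3)
         for v1, v2, v3 in product(BITS, repeat=3)}            # V2 -> V1 -> V3
common_parent = {(v1, v2, v3): bern(0.4, v1) * bern(0.8 if v1 else 0.3, v2) * bern(0.6 if v1 else 0.1, v3)
                 for v1, v2, v3 in product(BITS, repeat=3)}    # V2 <- V1 -> V3
p_v1 = {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.95}  # P(V1=1 | V2, V3)
common_child = {(v1, v2, v3): bern(0.3, v2) * bern(0.6, v3) * bern(p_v1[(v2, v3)], v1)
                for v1, v2, v3 in product(BITS, repeat=3)}     # V2 -> V1 <- V3


def marg(joint, idxs):
    """Marginal over the variables at positions `idxs`."""
    out = {}
    for cfg, p in joint.items():
        key = tuple(cfg[i] for i in idxs)
        out[key] = out.get(key, 0.0) + p
    return out


def v2_indep_v3(joint, tol=1e-9):
    """Check p(v2, v3) = p(v2) p(v3) for all values."""
    p23, p2, p3 = marg(joint, (1, 2)), marg(joint, (1,)), marg(joint, (2,))
    return all(abs(p23[(a, b)] - p2[(a,)] * p3[(b,)]) < tol
               for a, b in product(BITS, repeat=2))


def v2_indep_v3_given_v1(joint, tol=1e-9):
    """Check p(v1, v2, v3) p(v1) = p(v1, v2) p(v1, v3) for all values."""
    p1, p12, p13 = marg(joint, (0,)), marg(joint, (0, 1)), marg(joint, (0, 2))
    return all(abs(joint[(c, a, b)] * p1[(c,)] - p12[(c, a)] * p13[(c, b)]) < tol
               for c, a, b in product(BITS, repeat=3))
```

Under these assumptions the first two graphs make V_2 and V_3 dependent marginally and independent given V_1, while the common-child graph reverses both statements.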

Examples of the three cases
- SUVs, Greenhouse Gases, Global Warming
- Smoking, Lung Cancer, Bad Breath
- Genetics, Smoking, Cancer

Bayes Ball
Simple algorithm, a ball bouncing along a path in the graph, to tell if the path is blocked or not (more details in the text).
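A compact way to get the same answer is a reachability search over (node, direction) pairs. The sketch below is a common textbook-style d-separation test, not necessarily the exact Bayes-ball bookkeeping described in Jordan's text; the function name `d_separated` and the dictionary-of-parents graph encoding are assumptions made for this example.

```python
from collections import deque


def d_separated(parents, x, y, given):
    """True if x and y are d-separated given `given` in the DAG `parents`.

    `parents` maps every node to the list of its parents.
    """
    given = set(given)
    children = {n: [] for n in parents}
    for child, ps in parents.items():
        for p in ps:
            children[p].append(child)

    # Observed nodes and their ancestors: a collider passes the ball
    # only if it, or one of its descendants, is observed.
    anc, stack = set(), list(given)
    while stack:
        n = stack.pop()
        if n not in anc:
            anc.add(n)
            stack.extend(parents[n])

    # "up": ball arrived from a child; "down": ball arrived from a parent.
    queue, visited = deque([(x, "up")]), set()
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node == y and node not in given:
            return False  # found an unblocked path
        if direction == "up" and node not in given:
            queue.extend((p, "up") for p in parents[node])
            queue.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in given:  # chain: keep moving to children
                queue.extend((c, "down") for c in children[node])
            if node in anc:        # activated collider: bounce back to parents
                queue.extend((p, "up") for p in parents[node])
    return True


# Alarm network from the earlier slide: B -> A <- E, E -> R, A -> C.
alarm = {"B": [], "E": [], "A": ["B", "E"], "R": ["E"], "C": ["A"]}
```

For the alarm graph, `d_separated(alarm, "B", "E", [])` holds, while conditioning on the alarm or on the call (a descendant of the alarm) couples burglary and earthquake.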

What are implied conditional independences?

Two Views of a Family