Introduction to Probabilistic Graphical Models
1 Introduction to Probabilistic Graphical Models Kyu-Baek Hwang and Byoung-Tak Zhang, Biointelligence Lab, School of Computer Science and Engineering, Seoul National University, Seoul, Korea
2 Overview I Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering -- uncertainty and complexity -- and in particular they are playing an increasingly important role in the design and analysis of machine learning algorithms. Fundamental to the idea of a graphical model is the notion of modularity -- a complex system is built by combining simpler parts. Probability theory provides the glue whereby the parts are combined, ensuring that the system as a whole is consistent and providing ways to interface models to data. (c) 2004 SNU CSE Biointelligence Lab
3 Overview II The graph theoretic side of graphical models provides both an intuitively appealing interface by which humans can model highly interacting sets of variables and a data structure that lends itself naturally to the design of efficient general-purpose algorithms. Many of the classical multivariate probabilistic systems studied in fields such as statistics, systems engineering, information theory, pattern recognition, and statistical mechanics are special cases of the general graphical model formalism -- examples include mixture models, factor analysis, hidden Markov models, Kalman filters, and Ising models.
4 Overview III The graphical model framework provides a way to view all of these systems as instances of a common underlying formalism. This view has many advantages -- in particular, specialized techniques that have been developed in one field can be transferred between research communities and exploited more widely. Moreover, the graphical model formalism provides a natural framework for the design of new systems. --- Michael Jordan
5 Three Points of View Representation: probabilistic graphical models (PGMs) represent the probabilistic relationships among a set of random variables -- for example, the relationship between symptoms and a disease. Inference: given a PGM, one can calculate a conditional probability of interest. Learning: given a set of examples (data), one can find a plausible PGM describing the underlying process.
6 Two Main Classes of PGMs Undirected models: edges have no direction. Markov random fields, Markov networks, etc.; used in, e.g., image analysis. Directed models: edges have a direction. Bayesian networks, belief networks, etc.; intuitive and suited to causal analysis.
7 Contents Bayesian networks (causal networks, Bayesian networks, inference in Bayesian networks, learning Bayesian networks from data), applications, concluding remarks, bibliography.
8 Causal Networks Node: an event. Arc: a causal relationship between two nodes; A → B means A causes B. Causal network for the car start problem [Jensen 01]: Fuel, Clean Spark Plugs, Fuel Meter Standing, Start.
9 Reasoning with Causal Networks 1. My car does not start: this increases the certainty of no fuel and of dirty spark plugs, and increases the certainty of the fuel meter standing on empty. 2. The fuel meter stands at half: this decreases the certainty of no fuel and increases the certainty of dirty spark plugs.
10 d-separation: the Set of Rules for Reasoning Connections in causal networks: serial, diverging, and converging. Definition [Jensen 01]: two nodes in a causal network are d-separated if, for all paths between them, there is an intermediate node V such that either the connection is serial or diverging and the state of V is known, or the connection is converging and neither V nor any of V's descendants have received evidence. If A and B are d-separated, then changes in the certainty of A have no impact on the certainty of B, and vice versa.
11 d-separation in the Car Start Problem 1. Start and Fuel are dependent on each other. 2. Start and Clean Spark Plugs are dependent on each other. 3. Fuel and Fuel Meter Standing are dependent on each other. 4. Fuel and Clean Spark Plugs are conditionally dependent on each other given the value of Start (without evidence on Start, they are d-separated by the converging connection and hence independent).
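The four statements above can be checked mechanically. Below is a brute-force sketch (not from the slides) that enumerates the undirected paths of the car start network and applies the d-separation rules quoted earlier; all variable and function names are my own.

```python
# Brute-force d-separation test for the tiny car-start network.
# A path is blocked if some intermediate node is a serial/diverging
# connection with evidence, or an unobserved collider with no observed
# descendants; d-separation means every path is blocked.

edges = [("Fuel", "Start"), ("CleanSparkPlugs", "Start"), ("Fuel", "FuelMeter")]
nodes = {n for e in edges for n in e}
parents = {n: {a for a, b in edges if b == n} for n in nodes}

def descendants(n):
    out, stack = set(), [n]
    while stack:
        c = stack.pop()
        for a, b in edges:
            if a == c and b not in out:
                out.add(b)
                stack.append(b)
    return out

def paths(a, b, path):
    # Enumerate undirected simple paths (ignoring edge directions).
    if a == b:
        yield path
        return
    for x, y in edges:
        for nxt in ((y,) if x == a else (x,) if y == a else ()):
            if nxt not in path:
                yield from paths(nxt, b, path + [nxt])

def blocked(path, evidence):
    for i in range(1, len(path) - 1):
        prev, v, nxt = path[i - 1], path[i], path[i + 1]
        converging = prev in parents[v] and nxt in parents[v]
        if converging:
            if v not in evidence and not (descendants(v) & evidence):
                return True   # unobserved collider blocks the path
        elif v in evidence:
            return True       # observed serial/diverging node blocks
    return False

def d_separated(a, b, evidence=frozenset()):
    return all(blocked(p, set(evidence)) for p in paths(a, b, [a]))
```

For example, `d_separated("Fuel", "CleanSparkPlugs")` is true with no evidence, but becomes false once `{"Start"}` is observed, which is exactly the explaining-away pattern of statement 4.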
12 Probability for the Certainty in Causal Networks Basic axioms: P(A) = 1 iff A is certain; Σ_A P(A) = 1 (the summation is taken over all possible values of A); P(A ∨ B) = P(A) + P(B) iff A and B are mutually exclusive. Conditional probability: P(A | B) = P(A, B) / P(B) = P(B | A)P(A) / P(B). Each event in the causal network becomes a variable. If A and B are d-separated, then P(A | B) = P(A): A and B are independent. Likewise, A and B can be (conditionally) independent given the value of C.
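A quick numeric sanity check of the conditional-probability identity above, with made-up numbers for P(A) and P(B | A):

```python
# P(A|B) = P(A,B)/P(B) = P(B|A)P(A)/P(B), checked numerically.
p_a = 0.3
p_b_given_a = {True: 0.9, False: 0.2}        # P(B | A), illustrative values

p_ab = p_b_given_a[True] * p_a               # P(A, B) = P(B|A) P(A)
p_b = p_ab + p_b_given_a[False] * (1 - p_a)  # law of total probability
p_a_given_b = p_ab / p_b                     # Bayes' rule
```

Both routes to P(A | B) agree, which is all the identity claims.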
13 Definition: Bayesian Networks A Bayesian network consists of the following. A set of n variables X = {X_1, X_2, ..., X_n} and a set of directed edges between variables; the variables (nodes) with the directed edges form a directed acyclic graph (DAG) structure -- directed cycles are not allowed. To each variable X_i with parents Pa(X_i), there is attached a conditional probability table for P(X_i | Pa(X_i)). Modeling of continuous variables is also possible.
14 Bayesian Network Represents the Joint Probability Distribution By the d-separation property, a Bayesian network over the n variables X = {X_1, X_2, ..., X_n} represents P(X) as follows: P(X_1, X_2, ..., X_n) = Π_{i=1}^{n} P(X_i | Pa(X_i)). Given the joint probability distribution, any conditional probability can be calculated in principle.
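The factorization can be verified on the car start network. The priors 0.98 and 0.96 appear on the next slide; the table for P(St | Fu, CSP) below is an illustrative assumption, since the slide's entries are incomplete.

```python
from itertools import product

# Joint = product of each node's conditional given its parents.
p_fu = {"yes": 0.98, "no": 0.02}     # P(Fuel), from the slide
p_csp = {"yes": 0.96, "no": 0.04}    # P(CleanSparkPlugs), from the slide
p_st = {                             # P(Start | Fuel, CSP), assumed values
    ("yes", "yes"): {"yes": 0.99, "no": 0.01},
    ("yes", "no"):  {"yes": 0.01, "no": 0.99},
    ("no", "yes"):  {"yes": 0.0,  "no": 1.0},
    ("no", "no"):   {"yes": 0.0,  "no": 1.0},
}

def joint(fu, csp, st):
    # P(Fu, CSP, St) = P(Fu) P(CSP) P(St | Fu, CSP)
    return p_fu[fu] * p_csp[csp] * p_st[(fu, csp)][st]

# The factored joint must sum to 1 over all configurations.
total = sum(joint(f, c, s)
            for f, c, s in product(p_fu, p_csp, p_st[("yes", "yes")]))
```

Because each CPT row is itself a proper distribution, the product automatically defines a consistent joint distribution.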
15 Bayesian Network for the Car Start Problem Priors: P(Fu = Yes) = 0.98 and P(CSP = Yes) = 0.96. Conditional probability tables are attached for P(FMS | Fu), with FMS ∈ {Full, Half, Empty} and Fu ∈ {Yes, No}, and for P(St | Fu, CSP); the columns for Fu = No give P(St = Yes, St = No | Fu = No, CSP = Yes) = P(St = Yes, St = No | Fu = No, CSP = No) = (0, 1), i.e., the car never starts without fuel.
16 The Car Start Problem Revisited 1. No start: P(St = No) = 1 (evidence 1). Update the conditional probabilities P(Fu | St = No), P(CSP | St = No), and P(FMS | St = No). 2. The fuel meter stands at half: P(FMS = Half) = 1 (evidence 2). Update the conditional probabilities P(Fu | St = No, FMS = Half) and P(CSP | St = No, FMS = Half).
17 Calculation of the Conditional Probabilities The calculation of P(CSP | St = No, FMS = Half) proceeds as follows: P(CSP | St, FMS) = P(CSP, St, FMS) / P(St, FMS) = Σ_Fu P(Fu, CSP, St, FMS) / Σ_Fu Σ_CSP P(Fu, CSP, St, FMS). The summations are taken over all possible values of the variables. In general, calculating a conditional probability by brute-force marginalization like this is intractable.
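A direct implementation of this marginalization for the car start network. The priors are from the slides, but the remaining CPT entries are illustrative assumptions, as the slides' tables are incomplete.

```python
# Compute P(CSP | St, FMS) by summing the factored joint over Fu,
# then normalizing over CSP -- exactly the formula on the slide.
p_fu = {"yes": 0.98, "no": 0.02}
p_csp = {"yes": 0.96, "no": 0.04}
p_st = {("yes", "yes"): {"yes": 0.99, "no": 0.01},    # assumed values
        ("yes", "no"):  {"yes": 0.01, "no": 0.99},
        ("no", "yes"):  {"yes": 0.0,  "no": 1.0},
        ("no", "no"):   {"yes": 0.0,  "no": 1.0}}
p_fms = {"yes": {"full": 0.39, "half": 0.60, "empty": 0.01},  # assumed
         "no":  {"full": 0.001, "half": 0.001, "empty": 0.998}}

def joint(fu, csp, st, fms):
    return p_fu[fu] * p_csp[csp] * p_st[(fu, csp)][st] * p_fms[fu][fms]

def cond_csp(st, fms):
    # numerator: sum over Fu; denominator: sum over Fu and CSP
    num = {c: sum(joint(f, c, st, fms) for f in p_fu) for c in p_csp}
    z = sum(num.values())
    return {c: v / z for c, v in num.items()}

posterior = cond_csp("no", "half")
```

With these numbers the posterior probability of dirty spark plugs rises well above its 0.04 prior, matching the qualitative reasoning on slide 9: half a tank of fuel exonerates the fuel and shifts blame to the plugs.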
18 Initial State P(Fu), P(CSP), P(St), and P(FMS)
19 No Start P(Fu | St = No), P(CSP | St = No), and P(FMS | St = No)
20 Fuel Meter Stands at Half P(Fu | St = No, FMS = Half) and P(CSP | St = No, FMS = Half)
21 Causal Networks vs. Bayesian Networks Certainty vs. probability calculus. "A causes B" vs. "B depends on A": P(B | A), a conditional probability. Impact and dependence: d-separation corresponds to conditional independencies. Causality implies probabilistic dependence, but probabilistic dependence does not in general imply causality.
22 Equivalent Bayesian Network Structures A Bayesian network structure corresponds to a set of probability distributions. Informal definition of equivalence: two Bayesian network structures are equivalent if the set of distributions that can be represented using one of the DAGs is identical to the set of distributions that can be represented using the other.
23 Example: Two Equivalent DAGs X → Y and X ← Y. Both DAGs say only that X and Y are dependent on each other, so they form an equivalence class.
24 Verma and Pearl's Theorem Theorem [Verma and Pearl 90]: two DAGs are equivalent if and only if they have the same skeleton and the same v-structures. A v-structure X → Z ← Y: X and Y are parents of Z and are not adjacent to each other.
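The theorem turns equivalence checking into two set comparisons. A sketch (helper names are my own) for DAGs given as lists of directed edges:

```python
# Two DAGs are equivalent iff same skeleton and same v-structures.

def skeleton(edges):
    # undirected version of the edge set
    return {frozenset(e) for e in edges}

def v_structures(edges):
    adj = skeleton(edges)
    vs = set()
    for x, z in edges:
        for y, z2 in edges:
            # x -> z <- y with x and y non-adjacent is a v-structure
            if z == z2 and x != y and frozenset((x, y)) not in adj:
                vs.add((frozenset((x, y)), z))
    return vs

def equivalent(g1, g2):
    return (skeleton(g1) == skeleton(g2)
            and v_structures(g1) == v_structures(g2))

chain = [("X", "Y"), ("Y", "Z")]      # X -> Y -> Z
fork = [("Y", "X"), ("Y", "Z")]       # X <- Y -> Z
collider = [("X", "Y"), ("Z", "Y")]   # X -> Y <- Z
```

The chain and the fork encode the same independencies and are equivalent; the collider shares their skeleton but introduces a v-structure, so it is not.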
25 PDAG Representations Minimal PDAG representation of the equivalence class: the only directed edges are those that participate in v-structures. Completed PDAG representation: every directed edge corresponds to a compelled edge, and every undirected edge corresponds to a reversible edge.
26 Example: PDAG Representations (Figure: an equivalence class of two DAGs over {V, W, X, Y, Z}, together with its minimal PDAG and its completed PDAG.)
27 Inference in Bayesian Networks Infer the probability of an event given some observations [Frey 98]. The exact distribution over small groups of variables can be inferred in singly-connected networks by probability propagation. A multiply-connected network can be converted to a singly-connected one, but this is not practical, especially for large networks. Approximate inference methods: Monte Carlo approaches, variational methods, Helmholtz machines.
28 Singly-Connected Networks A singly-connected network has only a single path (ignoring edge directions) connecting any two vertices. (Figure: a singly-connected factor graph over the variables s, u, v, w, x, y, z with function vertices f_A, f_B, f_C, f_D, f_E.)
29 Factorization of the Global Distribution and Inference The example network represents the joint probability distribution as follows: P(s, u, v, w, x, y, z) = f_A(s, u, v) f_B(v, w) f_C(u, x) f_D(u, y) f_E(y, z). The probability of s given the value of z is calculated as P(s | z = z') = Σ_{u,v,w,x,y} P(s, u, v, w, x, y, z = z') / P(z = z') ∝ Σ_v Σ_u f_A(s, u, v) { [Σ_w f_B(v, w)] [Σ_x f_C(u, x)] [Σ_y f_D(u, y) f_E(y, z')] }.
30 The Generalized Forward-Backward Algorithm The generalized forward-backward algorithm is one flavor of probability propagation: 1. Convert the Bayesian network into a factor graph. 2. Arrange the factor graph as a horizontal tree with an arbitrarily chosen root vertex. 3. Beginning at the left-most level, pass messages level by level forward to the root. 4. Pass messages level by level backward from the root to the leaves. Messages represent the probability propagated through the edges of the graphical model.
31 Convert a Bayesian Network into the Factor Graph (Figure: a Bayesian network over z_1, ..., z_10 and the corresponding factor graph, with one function vertex per conditional probability.)
32 Message Passing in the Graphical Model Two types of messages: variable-to-function messages (e.g., μ_{x→A}) and function-to-variable messages (e.g., μ_{A→x}).
33 Calculation of the Message The variable-to-function message: if x is unobserved, then μ_{x→A}(x) = μ_{B→x}(x) μ_{C→x}(x); if x is observed as x', then μ_{x→A}(x') = 1 and μ_{x→A}(x) = 0 for all other values. The function-to-variable message: μ_{A→x}(x) = Σ_y Σ_z f_A(x, y, z) μ_{y→A}(y) μ_{z→A}(z).
34 Computation of the Conditional Probability After the generalized forward-backward algorithm ends, each edge in the factor graph carries its calculated message values. The probability of x given the observations v is: P(x | v) = β μ_{A→x}(x) μ_{B→x}(x) μ_{C→x}(x), where β is a normalizing constant.
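On a minimal two-variable chain the whole message-passing computation collapses to a few lines. The factor values below are made up for illustration:

```python
# Sum-product on the chain x -- f2 -- y with a prior factor f1(x).
f1 = {0: 0.6, 1: 0.4}                  # f1(x) = P(x), illustrative
f2 = {(0, 0): 0.9, (0, 1): 0.1,        # f2(x, y) = P(y | x), illustrative
      (1, 0): 0.2, (1, 1): 0.8}

def posterior_x(y_obs):
    # message y -> f2 is the observation indicator, so the
    # function-to-variable message mu_{f2->x}(x) sums out y trivially:
    mu_f2_x = {x: f2[(x, y_obs)] for x in f1}
    # combine with the prior message mu_{f1->x} and normalize (beta)
    unnorm = {x: f1[x] * mu_f2_x[x] for x in f1}
    z = sum(unnorm.values())
    return {x: v / z for x, v in unnorm.items()}
```

Observing y = 1 pulls the posterior toward x = 1, exactly as multiplying the incoming messages and normalizing prescribes.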
35 Inference in the Multiply-Connected Network Probabilistic inference in Bayesian networks (and likewise in Markov random fields and factor graphs) is in general very hard. Approximate inference: apply probability propagation directly in the multiply-connected network; Monte Carlo methods; variational inference; Helmholtz machines.
36 Learning Bayesian Networks Parametric learning: learn the local probability distribution for each node, given a DAG structure: P(X_1, X_2, ..., X_n) = Π_{i=1}^{n} P(X_i | Pa(X_i)). Structural learning: learn the DAG structure itself. Bayesian network learning = structural learning + parametric learning.
37 Four Possible Situations Given structure, complete data: ML, MAP, and Bayesian learning. Given structure, incomplete data: EM algorithm, variational methods, and Markov chain Monte Carlo (MCMC) methods. Unknown structure, complete data: greedy search, genetic algorithms, MCMC, and Bayesian learning. Unknown structure, incomplete data: structure search + EM, or MCMC.
38 Parametric Learning Learning the local probability distributions. Complete data: maximum likelihood learning, or Bayesian learning [Heckerman 96] with a Dirichlet prior P(θ_ij) = Dir(θ_ij | α_ij1, ..., α_ijr_i), whose posterior is P(θ_ij | D) = Dir(θ_ij | α_ij1 + N_ij1, ..., α_ijr_i + N_ijr_i). Incomplete data: the EM (expectation-maximization) algorithm [Heckerman 96], or Markov chain Monte Carlo methods.
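The conjugate Dirichlet update above amounts to adding observed counts to the prior pseudo-counts. A minimal sketch for a single CPT row, with made-up numbers:

```python
# Dirichlet prior + counts -> Dirichlet posterior (conjugacy),
# and the posterior mean as a smoothed parameter estimate.

def dirichlet_posterior(alpha, counts):
    # alpha_ijk + N_ijk, component-wise
    return [a + n for a, n in zip(alpha, counts)]

def posterior_mean(alpha):
    s = sum(alpha)
    return [a / s for a in alpha]

alpha = [1.0, 1.0]    # uniform prior over one row of a binary variable's CPT
counts = [8, 2]       # N_ij1, N_ij2 observed in the data (illustrative)
post = dirichlet_posterior(alpha, counts)
mean = posterior_mean(post)
```

With these numbers the posterior is Dir(9, 3) and the posterior mean is (0.75, 0.25), slightly shrunk from the raw frequency (0.8, 0.2) toward the uniform prior.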
39 Structural Learning Metric approach: use a scoring metric to measure how well a particular structure fits an observed set of cases; a search algorithm is used, and a canonical form of an equivalence class is sought. Independence approach: an independence oracle (approximated by some statistical test) is queried to identify the equivalence class that captures the independencies in the distribution from which the data were generated; the search is over PDAGs.
40 Scoring Metrics for Bayesian Networks Likelihood: L(G, θ_G; C) = P(C | G^h, θ_G), where G^h is the hypothesis that the data C were generated by a distribution that can be factored according to G. The maximum likelihood metric of G, M_ML(G, C) = max_{θ_G} L(G, θ_G; C), prefers the complete graph structure.
41 Information Criterion Scoring Metrics The Akaike information criterion (AIC) metric: M_AIC(G, C) = log M_ML(G, C) - Dim(G). The Bayesian information criterion (BIC) metric: M_BIC(G, C) = log M_ML(G, C) - (1/2) Dim(G) log N.
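Because these scores decompose over node families, each node can be scored separately. A sketch of the BIC metric for a single discrete node with at most one parent, on made-up data; the helper name and data are my own:

```python
import math
from collections import Counter

def bic_node(data, child, parent=None):
    # log maximum likelihood from empirical counts, minus
    # (1/2) * (number of free parameters) * log N  -- the BIC penalty.
    n = len(data)
    if parent is None:
        counts = Counter(row[child] for row in data)
        loglik = sum(c * math.log(c / n) for c in counts.values())
        dim = len(counts) - 1
    else:
        joint = Counter((row[parent], row[child]) for row in data)
        parent_c = Counter(row[parent] for row in data)
        loglik = sum(c * math.log(c / parent_c[p])
                     for (p, _), c in joint.items())
        dim = len(parent_c) * (len({row[child] for row in data}) - 1)
    return loglik - 0.5 * dim * math.log(n)

# Toy data: Y strongly tracks X, so the family score of Y with parent X
# should beat the parentless score despite the larger penalty.
data = ([{"X": 0, "Y": 0}] * 40 + [{"X": 1, "Y": 1}] * 40 +
        [{"X": 0, "Y": 1}] * 10 + [{"X": 1, "Y": 0}] * 10)
```

Comparing `bic_node(data, "Y", "X")` against `bic_node(data, "Y")` decides whether the edge X → Y pays for its extra parameters on this data.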
42 MDL Scoring Metrics The minimum description length (MDL) metric 1: M_MDL1(G, C) = log P(G) + M_BIC(G, C). The minimum description length (MDL) metric 2: M_MDL2(G, C) = log M_ML(G, C) - |E_G| log N - c Dim(G).
43 Bayesian Scoring Metrics A Bayesian metric: M(G, C | ξ) = log P(G^h | ξ) + log P(C | G^h, ξ) + c. The BDe (Bayesian Dirichlet & likelihood equivalence) metric [Heckerman et al. 95]: p(C, G) = p(G) p(C | G) = p(G) Π_{i=1}^{n} Π_{j=1}^{q_i} [Γ(α_ij) / Γ(α_ij + N_ij)] Π_{k=1}^{r_i} [Γ(α_ijk + N_ijk) / Γ(α_ijk)], where α_ij = Σ_k α_ijk and N_ij = Σ_k N_ijk, with Γ(1) = 1 and Γ(x + 1) = xΓ(x). The α_ijk come from the prior; the sufficient statistics N_ijk are calculated from the data D.
44 Greedy Search Algorithm for Bayesian Network Learning Generate an initial Bayesian network structure G_0. For m = 1, 2, ... until convergence: among all possible local changes to G_{m-1} (insertion of an edge, reversal of an edge, and deletion of an edge), perform the one that leads to the largest improvement in the score; the resulting graph is G_m. Stopping criterion: Score(G_{m-1}) == Score(G_m). At each iteration (when learning a Bayesian network over n variables), O(n^2) local changes must be evaluated to select the best one. Random restarts are usually adopted to escape local maxima.
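The loop above can be sketched as a generic hill-climber: `score` is a caller-supplied metric over edge sets, candidate moves are insert/delete/reverse, and a depth-first search rejects cyclic candidates. All names are my own, and the toy score in the example is purely illustrative.

```python
from itertools import permutations

def is_acyclic(edges, nodes):
    # DFS along parent links; a gray (in-progress) node seen again = cycle.
    parents = {n: {a for a, b in edges if b == n} for n in nodes}
    seen, done = set(), set()
    def visit(n):
        if n in done:
            return True
        if n in seen:
            return False
        seen.add(n)
        ok = all(visit(p) for p in parents[n])
        done.add(n)
        return ok
    return all(visit(n) for n in nodes)

def neighbors(edges, nodes):
    # All single-edge insertions, deletions, and reversals.
    es = set(edges)
    for a, b in permutations(nodes, 2):
        if (a, b) in es:
            yield es - {(a, b)}                   # delete a -> b
            yield (es - {(a, b)}) | {(b, a)}      # reverse a -> b
        elif (b, a) not in es:
            yield es | {(a, b)}                   # insert a -> b

def greedy_search(nodes, score):
    current, best = frozenset(), score(frozenset())
    while True:
        cand = max((frozenset(g) for g in neighbors(current, nodes)
                    if is_acyclic(g, nodes)),
                   key=score, default=None)
        if cand is None or score(cand) <= best:
            return current                        # local maximum reached
        current, best = cand, score(cand)
```

For instance, with a toy score that rewards the edge X → Y but charges one point per edge, the search starts from the empty graph, inserts X → Y, and then stops because no local change improves the score further.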
45 Other Approaches to Structural Learning Genetic algorithms; Markov chain Monte Carlo sampling; full Bayesian learning, which sums over all possible structures. The space of possible structures is exponential in the number of variables, so approximation is required.
46 Applications Classification (neural networks vs. PGMs); text mining (topic extraction); motion tracking; bioinformatics (gene-regulatory network construction, gene-drug dependency analysis).
47 Gene-Regulatory Network Construction Eran Segal et al., Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data, Nature Genetics 34(2).
48 Gene-Drug Dependency Analysis
49 Concluding Remarks Probabilistic graphical models: probability theory (uncertainty) + graph theory (complexity). A framework of thought for artificial intelligence, machine learning, and data mining. Representation, inference, and learning: further work is needed on all three topics. From the engineering viewpoint: implement an established theory for specific applications.
50 Bibliography [Jensen 96] Jensen, F.V., An Introduction to Bayesian Networks, Springer-Verlag, 1996. [Jensen 01] Jensen, F.V., Bayesian Networks and Decision Graphs, Springer-Verlag, 2001. [Heckerman 96] Heckerman, D., A tutorial on learning with Bayesian networks, Technical Report MSR-TR, Microsoft Research, 1996. [Pearl 88] Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann Publishers, 1988. [Spirtes et al. 00] Spirtes, P., Glymour, C., and Scheines, R., Causation, Prediction, and Search, 2nd edition, MIT Press, 2000. [Frey 98] Frey, B.J., Graphical Models for Machine Learning and Digital Communication, MIT Press, 1998. [Friedman and Goldszmidt 99] Friedman, N. and Goldszmidt, M., Learning Bayesian networks with local structure, in Learning in Graphical Models, MIT Press, 1999. [Heckerman et al. 95] Heckerman, D., Geiger, D., and Chickering, D.M., Learning Bayesian networks: the combination of knowledge and statistical data, Technical Report MSR-TR, Microsoft Research, 1995. [Verma and Pearl 90] Verma, T. and Pearl, J., Equivalence and synthesis of causal models, in Proceedings of UAI 90, 1990.
More informationLearning causal network structure from multiple (in)dependence models
Learning causal network structure from multiple (in)dependence models Tom Claassen Radboud University, Nijmegen tomc@cs.ru.nl Abstract Tom Heskes Radboud University, Nijmegen tomh@cs.ru.nl We tackle the
More information2 : Directed GMs: Bayesian Networks
10-708: Probabilistic Graphical Models 10-708, Spring 2017 2 : Directed GMs: Bayesian Networks Lecturer: Eric P. Xing Scribes: Jayanth Koushik, Hiroaki Hayashi, Christian Perez Topic: Directed GMs 1 Types
More informationProbabilistic Graphical Models for Image Analysis - Lecture 1
Probabilistic Graphical Models for Image Analysis - Lecture 1 Alexey Gronskiy, Stefan Bauer 21 September 2018 Max Planck ETH Center for Learning Systems Overview 1. Motivation - Why Graphical Models 2.
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationIntroduction to Probabilistic Graphical Models
Introduction to Probabilistic Graphical Models Franz Pernkopf, Robert Peharz, Sebastian Tschiatschek Graz University of Technology, Laboratory of Signal Processing and Speech Communication Inffeldgasse
More informationSupplementary material to Structure Learning of Linear Gaussian Structural Equation Models with Weak Edges
Supplementary material to Structure Learning of Linear Gaussian Structural Equation Models with Weak Edges 1 PRELIMINARIES Two vertices X i and X j are adjacent if there is an edge between them. A path
More informationIntroduction to Artificial Intelligence. Unit # 11
Introduction to Artificial Intelligence Unit # 11 1 Course Outline Overview of Artificial Intelligence State Space Representation Search Techniques Machine Learning Logic Probabilistic Reasoning/Bayesian
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationCausal Models with Hidden Variables
Causal Models with Hidden Variables Robin J. Evans www.stats.ox.ac.uk/ evans Department of Statistics, University of Oxford Quantum Networks, Oxford August 2017 1 / 44 Correlation does not imply causation
More informationIntroduction to Bayes Nets. CS 486/686: Introduction to Artificial Intelligence Fall 2013
Introduction to Bayes Nets CS 486/686: Introduction to Artificial Intelligence Fall 2013 1 Introduction Review probabilistic inference, independence and conditional independence Bayesian Networks - - What
More informationBayesian Networks. Characteristics of Learning BN Models. Bayesian Learning. An Example
Bayesian Networks Characteristics of Learning BN Models (All hail Judea Pearl) (some hail Greg Cooper) Benefits Handle incomplete data Can model causal chains of relationships Combine domain knowledge
More informationChapter 4 Dynamic Bayesian Networks Fall Jin Gu, Michael Zhang
Chapter 4 Dynamic Bayesian Networks 2016 Fall Jin Gu, Michael Zhang Reviews: BN Representation Basic steps for BN representations Define variables Define the preliminary relations between variables Check
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationInference in Graphical Models Variable Elimination and Message Passing Algorithm
Inference in Graphical Models Variable Elimination and Message Passing lgorithm Le Song Machine Learning II: dvanced Topics SE 8803ML, Spring 2012 onditional Independence ssumptions Local Markov ssumption
More informationDynamic Approaches: The Hidden Markov Model
Dynamic Approaches: The Hidden Markov Model Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Machine Learning: Neural Networks and Advanced Models (AA2) Inference as Message
More informationIntroduction to Causal Calculus
Introduction to Causal Calculus Sanna Tyrväinen University of British Columbia August 1, 2017 1 / 1 2 / 1 Bayesian network Bayesian networks are Directed Acyclic Graphs (DAGs) whose nodes represent random
More informationCS 484 Data Mining. Classification 7. Some slides are from Professor Padhraic Smyth at UC Irvine
CS 484 Data Mining Classification 7 Some slides are from Professor Padhraic Smyth at UC Irvine Bayesian Belief networks Conditional independence assumption of Naïve Bayes classifier is too strong. Allows
More informationExact model averaging with naive Bayesian classifiers
Exact model averaging with naive Bayesian classifiers Denver Dash ddash@sispittedu Decision Systems Laboratory, Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA 15213 USA Gregory F
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Matrix Data: Classification: Part 2 Instructor: Yizhou Sun yzsun@ccs.neu.edu September 21, 2014 Methods to Learn Matrix Data Set Data Sequence Data Time Series Graph & Network
More informationRecall from last time: Conditional probabilities. Lecture 2: Belief (Bayesian) networks. Bayes ball. Example (continued) Example: Inference problem
Recall from last time: Conditional probabilities Our probabilistic models will compute and manipulate conditional probabilities. Given two random variables X, Y, we denote by Lecture 2: Belief (Bayesian)
More informationStudy Notes on the Latent Dirichlet Allocation
Study Notes on the Latent Dirichlet Allocation Xugang Ye 1. Model Framework A word is an element of dictionary {1,,}. A document is represented by a sequence of words: =(,, ), {1,,}. A corpus is a collection
More informationJunction Tree, BP and Variational Methods
Junction Tree, BP and Variational Methods Adrian Weller MLSALT4 Lecture Feb 21, 2018 With thanks to David Sontag (MIT) and Tony Jebara (Columbia) for use of many slides and illustrations For more information,
More information10708 Graphical Models: Homework 2
10708 Graphical Models: Homework 2 Due Monday, March 18, beginning of class Feburary 27, 2013 Instructions: There are five questions (one for extra credit) on this assignment. There is a problem involves
More informationArrowhead completeness from minimal conditional independencies
Arrowhead completeness from minimal conditional independencies Tom Claassen, Tom Heskes Radboud University Nijmegen The Netherlands {tomc,tomh}@cs.ru.nl Abstract We present two inference rules, based on
More information{ p if x = 1 1 p if x = 0
Discrete random variables Probability mass function Given a discrete random variable X taking values in X = {v 1,..., v m }, its probability mass function P : X [0, 1] is defined as: P (v i ) = Pr[X =
More informationLecture 4 October 18th
Directed and undirected graphical models Fall 2017 Lecture 4 October 18th Lecturer: Guillaume Obozinski Scribe: In this lecture, we will assume that all random variables are discrete, to keep notations
More informationProbability. CS 3793/5233 Artificial Intelligence Probability 1
CS 3793/5233 Artificial Intelligence 1 Motivation Motivation Random Variables Semantics Dice Example Joint Dist. Ex. Axioms Agents don t have complete knowledge about the world. Agents need to make decisions
More informationMinimum Free Energies with Data Temperature for Parameter Learning of Bayesian Networks
28 2th IEEE International Conference on Tools with Artificial Intelligence Minimum Free Energies with Data Temperature for Parameter Learning of Bayesian Networks Takashi Isozaki 1,2, Noriji Kato 2, Maomi
More informationCAUSAL MODELS: THE MEANINGFUL INFORMATION OF PROBABILITY DISTRIBUTIONS
CAUSAL MODELS: THE MEANINGFUL INFORMATION OF PROBABILITY DISTRIBUTIONS Jan Lemeire, Erik Dirkx ETRO Dept., Vrije Universiteit Brussel Pleinlaan 2, 1050 Brussels, Belgium jan.lemeire@vub.ac.be ABSTRACT
More informationNoisy-OR Models with Latent Confounding
Noisy-OR Models with Latent Confounding Antti Hyttinen HIIT & Dept. of Computer Science University of Helsinki Finland Frederick Eberhardt Dept. of Philosophy Washington University in St. Louis Missouri,
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Brown University CSCI 295-P, Spring 213 Prof. Erik Sudderth Lecture 11: Inference & Learning Overview, Gaussian Graphical Models Some figures courtesy Michael Jordan s draft
More informationLearning Semi-Markovian Causal Models using Experiments
Learning Semi-Markovian Causal Models using Experiments Stijn Meganck 1, Sam Maes 2, Philippe Leray 2 and Bernard Manderick 1 1 CoMo Vrije Universiteit Brussel Brussels, Belgium 2 LITIS INSA Rouen St.
More informationComputational Genomics. Systems biology. Putting it together: Data integration using graphical models
02-710 Computational Genomics Systems biology Putting it together: Data integration using graphical models High throughput data So far in this class we discussed several different types of high throughput
More informationLearning Bayesian networks
1 Lecture topics: Learning Bayesian networks from data maximum likelihood, BIC Bayesian, marginal likelihood Learning Bayesian networks There are two problems we have to solve in order to estimate Bayesian
More information12 : Variational Inference I
10-708: Probabilistic Graphical Models, Spring 2015 12 : Variational Inference I Lecturer: Eric P. Xing Scribes: Fattaneh Jabbari, Eric Lei, Evan Shapiro 1 Introduction Probabilistic inference is one of
More informationScore Metrics for Learning Bayesian Networks used as Fitness Function in a Genetic Algorithm
Score Metrics for Learning Bayesian Networks used as Fitness Function in a Genetic Algorithm Edimilson B. dos Santos 1, Estevam R. Hruschka Jr. 2 and Nelson F. F. Ebecken 1 1 COPPE/UFRJ - Federal University
More informationA Decision Theoretic View on Choosing Heuristics for Discovery of Graphical Models
A Decision Theoretic View on Choosing Heuristics for Discovery of Graphical Models Y. Xiang University of Guelph, Canada Abstract Discovery of graphical models is NP-hard in general, which justifies using
More informationRepresentation of undirected GM. Kayhan Batmanghelich
Representation of undirected GM Kayhan Batmanghelich Review Review: Directed Graphical Model Represent distribution of the form ny p(x 1,,X n = p(x i (X i i=1 Factorizes in terms of local conditional probabilities
More informationDirected and Undirected Graphical Models
Directed and Undirected Graphical Models Adrian Weller MLSALT4 Lecture Feb 26, 2016 With thanks to David Sontag (NYU) and Tony Jebara (Columbia) for use of many slides and illustrations For more information,
More informationStructure Learning: the good, the bad, the ugly
Readings: K&F: 15.1, 15.2, 15.3, 15.4, 15.5 Structure Learning: the good, the bad, the ugly Graphical Models 10708 Carlos Guestrin Carnegie Mellon University September 29 th, 2006 1 Understanding the uniform
More informationLearning Causality. Sargur N. Srihari. University at Buffalo, The State University of New York USA
Learning Causality Sargur N. Srihari University at Buffalo, The State University of New York USA 1 Plan of Discussion Bayesian Networks Causal Models Learning Causal Models 2 BN and Complexity of Prob
More informationBayesian Networks to design optimal experiments. Davide De March
Bayesian Networks to design optimal experiments Davide De March davidedemarch@gmail.com 1 Outline evolutionary experimental design in high-dimensional space and costly experimentation the microwell mixture
More informationLearning Bayesian Networks Does Not Have to Be NP-Hard
Learning Bayesian Networks Does Not Have to Be NP-Hard Norbert Dojer Institute of Informatics, Warsaw University, Banacha, 0-097 Warszawa, Poland dojer@mimuw.edu.pl Abstract. We propose an algorithm for
More information