Towards an extension of the PC algorithm to local context-specific independencies detection

1 Towards an extension of the PC algorithm to local context-specific independencies detection Feb

2 Outline Background: Bayesian Networks The PC algorithm Context-specific independence: from DAGs to LDAGs The PSPC algorithm

3 Background Bayesian Networks (BNs) are a powerful tool for constructing multivariate distributions from univariate (conditional) components: B = (G, P), with G a Directed Acyclic Graph (DAG) and P a probability distribution factorizing according to G (Hammersley, Clifford 1971)

4 Background Each variable is conditionally independent of all its non-descendants in the graph given the value of all its parents: P(V) = P(X_1, ..., X_d) = ∏_{i=1}^{d} P(X_i | pa(X_i)). Main assumptions: Causal Markov Condition (CMC), Causal Faithfulness Condition (CFC). Computationally more efficient: d local small CPTs, |V| = d
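As a minimal illustration of this factorization, the sketch below builds a joint from local CPTs for a hypothetical two-node network A → B (the variable names and probability values are invented for the example):

```python
# Hypothetical network A -> B: the joint factorizes as
# P(A, B) = P(A) * P(B | A), so d small CPTs define the full joint.
p_a = {0: 0.6, 1: 0.4}                       # P(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1},          # P(B | A=0)
               1: {0: 0.2, 1: 0.8}}          # P(B | A=1)

def joint(a, b):
    """P(A=a, B=b) via the BN factorization."""
    return p_a[a] * p_b_given_a[a][b]

total = sum(joint(a, b) for a in (0, 1) for b in (0, 1))
print(total)  # 1.0 -- the factorization yields a proper distribution
```

The same pattern scales to d variables: one small CPT per node instead of one table of size exponential in d.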

5 Background Some fields of applications: Probabilistic expert systems Decision analysis Causality Data mining Complex statistical models

6 Background

7 Background G = (V, E), V = (X_1, ..., X_d) r.v.s as nodes in the graph, E ⊆ V × V, with (i, j) ∈ E representing (conditional) dependence between variables X_i and X_j

8 Background A toy example... Parents pa(D) = {A, B}; Children ch(D) = {E}; Non-descendants nd(D) = {A, B, C}. V-structures: in A → D ← B, D is a collider

9 Background V = {A, B, C, D, E} P(V) = P(A, B, C, D, E) = P(A)P(C | A)P(B)P(D | A, B)P(E | D) = P(A, C)P(A, B, D)P(D, E) / (P(A)P(D))

10 Background Markov Equivalence Classes: {C → A → D} ≡_P {C ← A ← D}, both entailing C ⊥ D | A

11 Background P(A, B, C) = P(A, B)P(B, C) / P(B)

12 Background P(A, B, C) = P(A)P(C)P(B | A, C)

13 Background {A → D → E} entails A ⊥ E | D

14 Background Learning and inference on BNs (Koller, Friedman 2009): Structure Learning*: Search-and-Score (or Bayesian) approach, Constraint-based approach*; Parameter Estimation: ML estimation, Bayesian estimation; Inference: Variable elimination, Belief Propagation, MAP estimation, sampling methods

15 The PC algorithm Spirtes P, Glymour C, Scheines R (1993, 1st ed.) Causally sufficient setting: V = O (no hidden variables, H = ∅, and no selection variables, S = ∅). Sound and complete under i) consistency of the CI statistical tests, ii) CMC, CFC

16 The PC algorithm Input: V, oracle/sample knowledge on the pattern of independencies among the variables. S1 S2 S3 S4 Output: a Completed Partially Directed Acyclic Graph (CPDAG) is returned, defining a Markov Equivalence Class

17 The PC algorithm S1: G := complete undirected graph over V. S2: the skeleton of G is inferred and a list M of unshielded triples is returned. Lemma 1 (Zhang, Spirtes 2008; Spirtes et al. 2000): X ∉ Adj(Y; G) iff ∃ S ⊆ V \ {X, Y} s.t. X ⊥ Y | S
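The skeleton phase S2 can be sketched against a CI oracle. `pc_skeleton` and `indep` are illustrative names (not from the slides); the oracle below encodes the single independence of a chain A → B → C:

```python
from itertools import combinations

def pc_skeleton(variables, indep):
    """Sketch of PC steps S1+S2: start from the complete undirected graph
    and delete an edge X-Y as soon as some conditioning set S (drawn from
    the current neighbours of X, of growing size) renders X and Y
    independent. `indep(x, y, S)` is an assumed CI oracle for X _||_ Y | S."""
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}
    size = 0
    while any(len(adj[x] - {y}) >= size for x in variables for y in adj[x]):
        for x in variables:
            for y in list(adj[x]):
                for S in combinations(sorted(adj[x] - {y}), size):
                    if indep(x, y, set(S)):
                        adj[x].discard(y); adj[y].discard(x)
                        sepset[frozenset({x, y})] = set(S)
                        break
        size += 1
    return adj, sepset

# Oracle for the chain A -> B -> C: the only independence is A _||_ C | B.
oracle = lambda x, y, S: {x, y} == {"A", "C"} and "B" in S
adj, sep = pc_skeleton(["A", "B", "C"], oracle)
print(adj)  # A-B and B-C survive; A-C is removed with sepset {B}
```

The separating sets recorded in `sepset` are exactly what S3 needs next.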

18 The PC algorithm S3: each <X, Y, Z> in M is eventually oriented as a v-structure according to: Lemma 2 (Zhang, Spirtes 2008; Spirtes et al. 2000): in a DAG G, given any unshielded triple <X, Y, Z>, Y is a collider iff ∀ S s.t. X ⊥ Z | S, Y ∉ S; Y is a non-collider iff ∀ S s.t. X ⊥ Z | S, Y ∈ S. S4: as many unoriented edges as possible are oriented according to the orientation rules provided by Zhang (2008)
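A sketch of the v-structure orientation in S3, applying Lemma 2 with the separating sets found during S2 (function and variable names are illustrative):

```python
def orient_v_structures(adj, sepset):
    """Sketch of PC step S3: for every unshielded triple <X, Y, Z>
    (edges X-Y and Y-Z present, X and Z non-adjacent), orient
    X -> Y <- Z iff Y is absent from sepset(X, Z) (Lemma 2)."""
    arrows = set()                            # directed edges (tail, head)
    for y in adj:
        for x in adj[y]:
            for z in adj[y]:
                if x < z and z not in adj[x]:  # unshielded triple
                    if y not in sepset[frozenset({x, z})]:
                        arrows.add((x, y))
                        arrows.add((z, y))
    return arrows

# Skeleton of the collider A -> C <- B: sepset(A, B) does not contain C.
adj = {"A": {"C"}, "B": {"C"}, "C": {"A", "B"}}
arrows = orient_v_structures(adj, {frozenset({"A", "B"}): set()})
print(sorted(arrows))  # [('A', 'C'), ('B', 'C')]
```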

19 The PC algorithm Conservative PC algorithm (CPC, Ramsey et al. 2013): S3 → S3', S4 → S4' (see [2] for details). CFC is relaxed. Output is an e-pattern, where unfaithful triples* are allowed. Distributions P ≡_M P' are represented by the same e-pattern! *Triples that qualify neither as v-structures nor as Markov chains

20 CSI Conditional Independence (CI): let X, Y, Z be pairwise disjoint subsets of V. X is conditionally independent of Y given Z if, ∀ (x, y, z) ∈ Val(X) × Val(Y) × Val(Z) with P(y, z) > 0, P(x | y, z) = P(x | z). Notation: X ⊥ Y | Z

21 CSI CI: X ⊥ Y | Z ⟺ P(x | y, z) = P(x | z) whenever P(y, z) > 0. Context-Specific Conditional Independence (CSI, Boutilier 1996): let X, Y, Z, C be pairwise disjoint subsets of V. X is conditionally independent of Y given Z in context C = c, where c ∈ Val(C), if ∀ (x, y, z) ∈ Val(X) × Val(Y) × Val(Z) with P(y, z, c) > 0, P(x | y, z, c) = P(x | z, c). Notation: X ⊥ Y | Z, c
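A CSI statement of this form can be verified numerically from a joint probability table. The sketch below (hypothetical table, Z = ∅ for brevity) checks P(x | y, c) = P(x | c) for every cell with positive mass:

```python
from itertools import product

def csi_holds(p, x_vals, y_vals, c, tol=1e-12):
    """Check X _||_ Y | C = c for a joint table p[(x, y, c)]:
    P(x | y, c) must equal P(x | c) whenever P(y, c) > 0."""
    def marg(x=None, y=None):
        # mass of the cells consistent with context c and the fixed values
        return sum(v for (xv, yv, cv), v in p.items()
                   if cv == c and (x is None or xv == x)
                              and (y is None or yv == y))
    for xv, yv in product(x_vals, y_vals):
        if marg(y=yv) > 0 and marg() > 0:
            if abs(marg(x=xv, y=yv) / marg(y=yv) - marg(x=xv) / marg()) > tol:
                return False
    return True

# Hypothetical joint over (X, Y, C): X _||_ Y holds in context C=0 only.
p = {(x, y, 0): 0.125 for x in (0, 1) for y in (0, 1)}
p.update({(0, 0, 1): 0.3, (0, 1, 1): 0.05, (1, 0, 1): 0.05, (1, 1, 1): 0.1})
print(csi_holds(p, (0, 1), (0, 1), 0), csi_holds(p, (0, 1), (0, 1), 1))
# True False
```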

22 CSI CI: X ⊥ Y | Z ⟺ P(x | y, z) = P(x | z) whenever P(y, z) > 0. CSI: X ⊥ Y | Z, c ⟺ P(x | y, z, c) = P(x | z, c) whenever P(y, z, c) > 0. Local CSI: X and Y are CSI given C = c, where X and C define a partition of pa(Y)

23 CSI CI: X ⊥ Y | Z ⟺ P(x | y, z) = P(x | z) whenever P(y, z) > 0. CSI: X ⊥ Y | Z, c ⟺ P(x | y, z, c) = P(x | z, c) whenever P(y, z, c) > 0. Local CSI, e.g. (Zhang, 1998): X: Weather, Y: Income, C: Profession

25 From Local CSIs to LDAGs Labelled Directed Acyclic Graphs (LDAGs, Pensar et al. 2014) account for local CSIs: G_L = (V, E, L_E), where V is the set of nodes, corresponding to the set of r.v.s; E is the set of oriented edges, with (i, j) ∈ E iff X_i ∈ pa(X_j); L_E is the set of all labels, L_E = ∪_{(i,j)∈E} L_{(i,j)}

26 LDAGs e.g. (Pensar 2014) G_L = (V, E, L_E), V = {1, 2, 3, 4}, E = {(2, 1), (3, 1), (4, 1)}. L_{(2,1)} = {(0, 1)}: X_1 ⊥ X_2 | (X_3, X_4) = (0, 1). L_{(4,1)} = {(∗, 1)} = Val(X_2) × {1}: X_1 ⊥ X_4 | X_2, X_3 = 1
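The example LDAG can be written down directly as a data structure; the encoding below is purely illustrative (a dict of label sets per edge), not Pensar et al.'s implementation, and assumes binary state spaces:

```python
# Binary state spaces are an assumption of this sketch.
val = {2: (0, 1), 3: (0, 1), 4: (0, 1)}

ldag = {
    "V": [1, 2, 3, 4],
    "E": [(2, 1), (3, 1), (4, 1)],
    # L_(i,j): configurations of pa(X_1) \ {X_i} under which X_1 _||_ X_i
    "L": {
        (2, 1): {(0, 1)},                    # X1 _||_ X2 | (X3, X4) = (0, 1)
        (4, 1): {(x2, 1) for x2 in val[2]},  # X1 _||_ X4 | X2, X3 = 1
    },
}
print(sorted(ldag["L"][(4, 1)]))  # [(0, 1), (1, 1)] -- the label (*, 1)
```

Expanding the wildcard label (∗, 1) into explicit configurations, as done for edge (4, 1), is what makes the label set a plain subset of Val(pa(X_1) \ {X_4}).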

27 Extending the PC algorithm CSPC algorithm for undirected log-linear models (Edera et al., 2013) PSPC algorithm for LDAG models

28 Extending the PC algorithm Input: V, oracle/sample knowledge on the pattern of independencies among the variables. S1 S2 S3 S4; unmark the unfaithful triples; CSeek routine (+ Orient Parents). Output (best-case scenario): a Completed Partially Labelled Directed Acyclic Graph (CPLDAG) is returned, defining a CSI-Equivalence Class (see Pensar et al., 2014)

29 Extending the PC algorithm: the CSeek routine

30 Discussion and future work Consistency and generalizations Assumptions (CSC, CI tests, from CMC+CFC to CMC+AFC to CMC+TFC to...) and related issues Computational efficiency Idea: CSeek routine applied to unfaithful triples only according to some threshold Development of the algorithm and applications Efficient inference on BNs with LDAGs (Zhang, Poole 1998, Poole 2003)

31 References Boutilier, Craig, et al., Context-specific independence in Bayesian networks, Proceedings of the Twelfth International Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers Inc., 1996. Edera, Alejandro, Federico Schluter, and Facundo Bromberg, Learning Markov networks with context-specific independences, Tools with Artificial Intelligence (ICTAI), 2013 IEEE 25th International Conference on, IEEE, 2013. Isozaki, Takashi, A robust causal discovery algorithm against faithfulness violation, Information and Media Technologies 9.1 (2014). Kalisch, Markus, and Peter Bühlmann, Estimating high-dimensional directed acyclic graphs with the PC-algorithm, The Journal of Machine Learning Research 8 (2007). Kalisch, Markus, and Peter Bühlmann, Robustification of the PC-algorithm for Directed Acyclic Graphs, Journal of Computational and Graphical Statistics 17.4 (2008).

32 References Koller, Daphne, and Nir Friedman, Probabilistic graphical models: principles and techniques, MIT Press, 2009. Lemeire, Jan, Stijn Meganck, and Francesco Cartella, Robust independence-based causal structure learning in absence of adjacency faithfulness, European Workshop on Probabilistic Graphical Models (2010): 169. Pensar, Johan, et al., Labeled directed acyclic graphs: a generalization of context-specific independence in directed graphical models, Data Mining and Knowledge Discovery 29.2 (2015). Poole, David, and Nevin Lianwen Zhang, Exploiting contextual independence in probabilistic inference, J. Artif. Intell. Res. (JAIR) 18 (2003). Ramsey, Joseph, Jiji Zhang, and Peter L. Spirtes, Adjacency-faithfulness and conservative causal inference, arXiv preprint (2012).

33 References Spirtes, Peter, Clark N. Glymour, and Richard Scheines, Causation, Prediction, and Search, MIT Press, 2000. Zhang, Jiji, and Peter Spirtes, Strong faithfulness and uniform consistency in causal inference, Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers Inc., 2003. Zhang, Jiji, and Peter Spirtes, Detection of unfaithfulness and robust causal inference, Minds and Machines 18.2 (2008). Zhang, Nevin Lianwen, Inference in Bayesian networks: the role of context-specific independence (1998). Zhang, Nevin Lianwen, and David Poole, On the role of context-specific independence in probabilistic inference, IJCAI-99: Proceedings of the 16th International Joint Conference on Artificial Intelligence, Vols 1, 2 (1999).

34 ADDITIONAL FEATURES

35 [1] Background: Main assumptions 1/3 Causal Markov Condition (CMC): given a set of (causally sufficient) r.v.s V whose causal structure is represented by a DAG G, X ⊥_G nd(X) | pa(X) ⟹ X ⊥ nd(X) | pa(X) (1). P is Markov to G whenever (1) holds; G is an I-map of P whenever (1) holds

36 [1] Background: Main assumptions 2/3 Causal Faithfulness Condition (CFC): given a set of (causally sufficient) r.v.s V whose causal structure is represented by a DAG G, the joint probability distribution P(V) is faithful to G if it holds that: if CMC does not entail X ⊥ Y | S, then X is dependent on Y conditional on S in P

37 [1] Background: Main assumptions 3/3 Two observations on the CFC assumption: it follows that whenever CMC and CFC hold, X ⊥_G nd(X) | pa(X) ⟺ X ⊥ nd(X)\pa(X) | pa(X) (2). P is faithful to G whenever (2) holds; G is a perfect I-map of P whenever (2) holds. Lebesgue measure zero argument (Meek, 1995): not too restrictive!

38 [2] PC algorithm continued 1/3 Given pointwise consistent statistical tests for the independence among variables, the PC procedure is pointwise consistent under CMC and CFC. Uniform consistency? CFC → λ-strong CFC (λ-SFC) (provided uniformly consistent statistical tests). Robins et al. (2003, 2006) on CFC's decomposability; Isozaki (2014) on weak CFC test-related violations. Complexity bounded by d²(d−1)^(k−1)/(k−1)!, with k the maximal degree of connectivity of any vertex
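The complexity bound as reconstructed above is easy to evaluate for given d and k (`pc_complexity_bound` is an illustrative helper, not part of the algorithm):

```python
from math import factorial

def pc_complexity_bound(d, k):
    """Worst-case bound on the number of CI tests quoted above:
    d^2 (d-1)^(k-1) / (k-1)!, with k the maximal vertex degree."""
    return d ** 2 * (d - 1) ** (k - 1) / factorial(k - 1)

print(pc_complexity_bound(10, 3))  # 4050.0
```

For sparse graphs (small k) the bound grows only polynomially in d, which is what makes PC usable in high dimensions.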

39 [2] PC algorithm continued 2/3 S3': let G* be the graph resulting from S1+S2 and M be the list of unshielded triples. For each <X, Y, Z> in M, for every S ⊆ Adj(X; G*) and S ⊆ Adj(Z; G*): if ∀ S s.t. X ⊥ Z | S, Y ∉ S, then orient X − Y − Z as X → Y ← Z; if ∀ S s.t. X ⊥ Z | S, Y ∈ S, then leave the triple unmarked; otherwise, mark the triple as unfaithful. S4': orientation rules are applied to unoriented unshielded triples only

40 [2] PC algorithm continued 3/3 e-patterns: a DAG G is represented by an e-pattern e-G if (i) A ∈ Adj(B; e-G) corresponds to A ∈ Adj(B; G); (ii) an edge A → B in G is marked as A → B in e-G; (iii) the colliders in G are either marked as such or as part of an unfaithful triple in e-G

41 [3] λ-strong CFC 1/3 Gaussian setting (Zhang, Spirtes 2003; Uhler et al. 2013). Discrete setting* (Rudas et al. 2015). Parametrization! (Many variations to be considered)

42 [3] λ-strong CFC 2/3 e.g. (Rudas et al. 2015), variation dependent case: V = {A, B}, a set of 2 binary r.v.s parametrized as cell probabilities within Δ_3 (2x2 CPT); φ_1 log-odds ratio, φ_2 Yule's coefficient as measures of association. Given λ > 0, P is λ-SFC to G whenever |φ_1| = |log(p_00 p_11 / (p_01 p_10))| > λ or |φ_2| = |(p_00 p_11 − p_01 p_10) / (p_00 p_11 + p_01 p_10)| > λ
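Both association measures, and the λ-SFC check built from them, can be computed directly from the four cell probabilities (the numeric values below are illustrative):

```python
import math

def lambda_strong_faithful(p00, p01, p10, p11, lam):
    """λ-strong faithfulness check for a 2x2 table of cell probabilities
    (variation dependent case): either the absolute log-odds ratio or
    Yule's coefficient must exceed λ."""
    phi1 = abs(math.log((p00 * p11) / (p01 * p10)))                 # log-odds ratio
    phi2 = abs((p00 * p11 - p01 * p10) / (p00 * p11 + p01 * p10))   # Yule's coefficient
    return phi1 > lam or phi2 > lam

print(lambda_strong_faithful(0.4, 0.1, 0.1, 0.4, lam=0.5))     # True: strong association
print(lambda_strong_faithful(0.25, 0.25, 0.25, 0.25, lam=0.5)) # False: independence
```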

43 [3] λ-strong CFC 3/3 e.g. (Rudas et al. 2015), variation independent case: V = {A, B}, a set of 2 binary r.v.s parametrized as conditional probabilities within (0, 1)^3 (2x2 CPT), with θ_1 = P(A = 0), θ_2 = P(B = 0 | A = 0), θ_3 = P(B = 0 | A = 1); φ_3, the absolute difference between conditional probabilities, as measure of association. Given λ > 0, P is λ-SFC to G whenever φ_3 = |θ_2 − θ_3| > λ

44 [4] Properties of LDAGs 1/3 Labelled Directed Acyclic Graphs (LDAGs, Pensar et al. 2014) account for local CSIs: G_L = (V, E, L_E), where V is the set of nodes, corresponding to the set of r.v.s; E is the set of oriented edges, with (i, j) ∈ E iff X_i ∈ pa(X_j); L_E is the set of all labels, L_E = ∪_{(i,j)∈E} L_{(i,j)}, with L_{(i,j)} a list of configurations of L_{(i,j)} := pa(X_j) \ X_i: x_{L_{(i,j)}} ∈ L_{(i,j)} ⊆ Val(L_{(i,j)}) is s.t. X_j ⊥ X_i | L_{(i,j)} = x_{L_{(i,j)}}

45 [4] Properties of LDAGs 2/3 Maximality; Regularity; CSI-faithfulness* (CS-LDAG: G_L(x_C)); CSI-equivalence: G_L = (V, E, L_E) and G'_L = (V, E', L'_E) belong to the same CSI-equivalence class if: G_L and G'_L share the same skeleton; G(x_V) and G'(x_V) are Markov equivalent ∀ x_V ∈ Val(V); if ∃ x_V ∈ Val(V) s.t. no label in either L_E or L'_E is satisfied, G and G' are Markov equivalent

46 [4] Properties of LDAGs 3/3 CSI-faithfulness Def. Let B be a BN and let B(c) be the model instantiated to context C = c. If X and Y are not d-separated by Z in B but are d-separated by Z in B(c), then they are CSI-separated by Z given context C = c in B, namely X ⊥_{G(C=c)} Y | Z ⟹ X ⊥ Y | Z, C = c. CSI-CFC of P to G follows from CSI-separation (Boutilier et al. 1996). Context-specific Hammersley-Clifford theorem (Edera et al. 2013)

47 [5] Further definitions: Markov Equivalence Classes Def. Two DAGs belong to the same Markov Equivalence class (ME class) whenever they entail the same conditional independence relations among the observed variables: G' = (V, E'), G'' = (V, E'') s.t. P(V; G') ≡_M P(V; G''). Elements of a ME class are represented by means of a Partially oriented DAG (PDAG), or by a Completed Partially oriented DAG (CPDAG)

48 [5] D-separation A path u = <X, ..., Y> is blocked by some subset Z ⊆ V \ {X, Y} if either: u contains a non-collider that belongs to Z; or u contains a collider W such that neither W nor any of its descendants belongs to Z, i.e. ({W} ∪ de(W)) ∩ Z = ∅. Def. X and Y are d-separated by Z iff Z blocks all paths between X and Y
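The blocking conditions above translate into a brute-force d-separation test. The sketch below enumerates every undirected path and applies the two conditions; it is illustrative and only suitable for toy graphs:

```python
def d_separated(edges, x, y, z):
    """Check whether Z d-separates X and Y in the DAG given by `edges`
    (a set of (parent, child) pairs), by enumerating all paths between
    X and Y and testing the blocking conditions."""
    nodes = {v for e in edges for v in e}
    desc = {v: set() for v in nodes}
    for v in nodes:                        # descendants, by graph search
        stack = [v]
        while stack:
            u = stack.pop()
            for (p, c) in edges:
                if p == u and c not in desc[v]:
                    desc[v].add(c); stack.append(c)
    def paths(cur, goal, visited):
        if cur == goal:
            yield [cur]; return
        for (p, c) in edges:
            for nxt in ([c] if p == cur else [p] if c == cur else []):
                if nxt not in visited:
                    for rest in paths(nxt, goal, visited | {nxt}):
                        yield [cur] + rest
    for path in paths(x, y, {x}):
        blocked = False
        for i in range(1, len(path) - 1):
            a, w, b = path[i - 1], path[i], path[i + 1]
            collider = (a, w) in edges and (b, w) in edges
            if (not collider and w in z) or \
               (collider and w not in z and not (desc[w] & z)):
                blocked = True; break
        if not blocked:
            return False                   # an active path exists
    return True

chain = {("A", "B"), ("B", "C")}           # A -> B -> C
print(d_separated(chain, "A", "C", {"B"}), d_separated(chain, "A", "C", set()))
# True False
```

On the chain, conditioning on the middle node blocks the only path; on a collider A → C ← B the opposite holds: the empty set separates A and B, while conditioning on C opens the path.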


Causal Effect Identification in Alternative Acyclic Directed Mixed Graphs Proceedings of Machine Learning Research vol 73:21-32, 2017 AMBN 2017 Causal Effect Identification in Alternative Acyclic Directed Mixed Graphs Jose M. Peña Linköping University Linköping (Sweden) jose.m.pena@liu.se

More information

Using background knowledge for the estimation of total causal e ects

Using background knowledge for the estimation of total causal e ects Using background knowledge for the estimation of total causal e ects Interpreting and using CPDAGs with background knowledge Emilija Perkovi, ETH Zurich Joint work with Markus Kalisch and Marloes Maathuis

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Lecture 4 Learning Bayesian Networks CS/CNS/EE 155 Andreas Krause Announcements Another TA: Hongchao Zhou Please fill out the questionnaire about recitations Homework 1 out.

More information

Identifiability of Gaussian structural equation models with equal error variances

Identifiability of Gaussian structural equation models with equal error variances Biometrika (2014), 101,1,pp. 219 228 doi: 10.1093/biomet/ast043 Printed in Great Britain Advance Access publication 8 November 2013 Identifiability of Gaussian structural equation models with equal error

More information

Lecture 6: Graphical Models: Learning

Lecture 6: Graphical Models: Learning Lecture 6: Graphical Models: Learning 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering, University of Cambridge February 3rd, 2010 Ghahramani & Rasmussen (CUED)

More information

Junction Tree, BP and Variational Methods

Junction Tree, BP and Variational Methods Junction Tree, BP and Variational Methods Adrian Weller MLSALT4 Lecture Feb 21, 2018 With thanks to David Sontag (MIT) and Tony Jebara (Columbia) for use of many slides and illustrations For more information,

More information

BN Semantics 3 Now it s personal! Parameter Learning 1

BN Semantics 3 Now it s personal! Parameter Learning 1 Readings: K&F: 3.4, 14.1, 14.2 BN Semantics 3 Now it s personal! Parameter Learning 1 Graphical Models 10708 Carlos Guestrin Carnegie Mellon University September 22 nd, 2006 1 Building BNs from independence

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression

More information

A graph contains a set of nodes (vertices) connected by links (edges or arcs)

A graph contains a set of nodes (vertices) connected by links (edges or arcs) BOLTZMANN MACHINES Generative Models Graphical Models A graph contains a set of nodes (vertices) connected by links (edges or arcs) In a probabilistic graphical model, each node represents a random variable,

More information

Bayesian Networks: Representation, Variable Elimination

Bayesian Networks: Representation, Variable Elimination Bayesian Networks: Representation, Variable Elimination CS 6375: Machine Learning Class Notes Instructor: Vibhav Gogate The University of Texas at Dallas We can view a Bayesian network as a compact representation

More information

Graphical models. Sunita Sarawagi IIT Bombay

Graphical models. Sunita Sarawagi IIT Bombay 1 Graphical models Sunita Sarawagi IIT Bombay http://www.cse.iitb.ac.in/~sunita 2 Probabilistic modeling Given: several variables: x 1,... x n, n is large. Task: build a joint distribution function Pr(x

More information

Machine Learning Summer School

Machine Learning Summer School Machine Learning Summer School Lecture 3: Learning parameters and structure Zoubin Ghahramani zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin/ Department of Engineering University of Cambridge,

More information

Automatic Causal Discovery

Automatic Causal Discovery Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon 1 Outline 1. Motivation 2. Representation 3. Discovery 4. Using Regression for Causal

More information

A Brief Introduction to Graphical Models. Presenter: Yijuan Lu November 12,2004

A Brief Introduction to Graphical Models. Presenter: Yijuan Lu November 12,2004 A Brief Introduction to Graphical Models Presenter: Yijuan Lu November 12,2004 References Introduction to Graphical Models, Kevin Murphy, Technical Report, May 2001 Learning in Graphical Models, Michael

More information

Learning With Bayesian Networks. Markus Kalisch ETH Zürich

Learning With Bayesian Networks. Markus Kalisch ETH Zürich Learning With Bayesian Networks Markus Kalisch ETH Zürich Inference in BNs - Review P(Burglary JohnCalls=TRUE, MaryCalls=TRUE) Exact Inference: P(b j,m) = c Sum e Sum a P(b)P(e)P(a b,e)p(j a)p(m a) Deal

More information

Learning the Structure of Linear Latent Variable Models

Learning the Structure of Linear Latent Variable Models Journal (2005) - Submitted /; Published / Learning the Structure of Linear Latent Variable Models Ricardo Silva Center for Automated Learning and Discovery School of Computer Science rbas@cs.cmu.edu Richard

More information

Learning Marginal AMP Chain Graphs under Faithfulness

Learning Marginal AMP Chain Graphs under Faithfulness Learning Marginal AMP Chain Graphs under Faithfulness Jose M. Peña ADIT, IDA, Linköping University, SE-58183 Linköping, Sweden jose.m.pena@liu.se Abstract. Marginal AMP chain graphs are a recently introduced

More information

On Learning Causal Models from Relational Data

On Learning Causal Models from Relational Data Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) On Learning Causal Models from Relational Data Sanghack Lee and Vasant Honavar Artificial Intelligence Research Laboratory

More information

The Role of Assumptions in Causal Discovery

The Role of Assumptions in Causal Discovery The Role of Assumptions in Causal Discovery Marek J. Druzdzel Decision Systems Laboratory, School of Information Sciences and Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA 15260,

More information

Causal Reasoning with Ancestral Graphs

Causal Reasoning with Ancestral Graphs Journal of Machine Learning Research 9 (2008) 1437-1474 Submitted 6/07; Revised 2/08; Published 7/08 Causal Reasoning with Ancestral Graphs Jiji Zhang Division of the Humanities and Social Sciences California

More information

Expectation Propagation in Factor Graphs: A Tutorial

Expectation Propagation in Factor Graphs: A Tutorial DRAFT: Version 0.1, 28 October 2005. Do not distribute. Expectation Propagation in Factor Graphs: A Tutorial Charles Sutton October 28, 2005 Abstract Expectation propagation is an important variational

More information

COMP538: Introduction to Bayesian Networks

COMP538: Introduction to Bayesian Networks COMP538: Introduction to Bayesian Networks Lecture 2: Bayesian Networks Nevin L. Zhang lzhang@cse.ust.hk Department of Computer Science and Engineering Hong Kong University of Science and Technology Fall

More information

BN Semantics 3 Now it s personal!

BN Semantics 3 Now it s personal! Readings: K&F: 3.3, 3.4 BN Semantics 3 Now it s personal! Graphical Models 10708 Carlos Guestrin Carnegie Mellon University September 22 nd, 2008 10-708 Carlos Guestrin 2006-2008 1 Independencies encoded

More information

Causal Inference for High-Dimensional Data. Atlantic Causal Conference

Causal Inference for High-Dimensional Data. Atlantic Causal Conference Causal Inference for High-Dimensional Data Atlantic Causal Conference Overview Conditional Independence Directed Acyclic Graph (DAG) Models factorization and d-separation Markov equivalence Structure Learning

More information

Directed and Undirected Graphical Models

Directed and Undirected Graphical Models Directed and Undirected Graphical Models Adrian Weller MLSALT4 Lecture Feb 26, 2016 With thanks to David Sontag (NYU) and Tony Jebara (Columbia) for use of many slides and illustrations For more information,

More information

Chris Bishop s PRML Ch. 8: Graphical Models

Chris Bishop s PRML Ch. 8: Graphical Models Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular

More information

Causal Structure Learning and Inference: A Selective Review

Causal Structure Learning and Inference: A Selective Review Vol. 11, No. 1, pp. 3-21, 2014 ICAQM 2014 Causal Structure Learning and Inference: A Selective Review Markus Kalisch * and Peter Bühlmann Seminar for Statistics, ETH Zürich, CH-8092 Zürich, Switzerland

More information

Statistical Approaches to Learning and Discovery

Statistical Approaches to Learning and Discovery Statistical Approaches to Learning and Discovery Graphical Models Zoubin Ghahramani & Teddy Seidenfeld zoubin@cs.cmu.edu & teddy@stat.cmu.edu CALD / CS / Statistics / Philosophy Carnegie Mellon University

More information

Data Mining 2018 Bayesian Networks (1)

Data Mining 2018 Bayesian Networks (1) Data Mining 2018 Bayesian Networks (1) Ad Feelders Universiteit Utrecht Ad Feelders ( Universiteit Utrecht ) Data Mining 1 / 49 Do you like noodles? Do you like noodles? Race Gender Yes No Black Male 10

More information

Bayesian Networks BY: MOHAMAD ALSABBAGH

Bayesian Networks BY: MOHAMAD ALSABBAGH Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional

More information

Causal Bayesian networks. Peter Antal

Causal Bayesian networks. Peter Antal Causal Bayesian networks Peter Antal antal@mit.bme.hu A.I. 4/8/2015 1 Can we represent exactly (in)dependencies by a BN? From a causal model? Suff.&nec.? Can we interpret edges as causal relations with

More information

Being Bayesian About Network Structure:

Being Bayesian About Network Structure: Being Bayesian About Network Structure: A Bayesian Approach to Structure Discovery in Bayesian Networks Nir Friedman and Daphne Koller Machine Learning, 2003 Presented by XianXing Zhang Duke University

More information

Artificial Intelligence Bayes Nets: Independence

Artificial Intelligence Bayes Nets: Independence Artificial Intelligence Bayes Nets: Independence Instructors: David Suter and Qince Li Course Delivered @ Harbin Institute of Technology [Many slides adapted from those created by Dan Klein and Pieter

More information

Causal Inference in the Presence of Latent Variables and Selection Bias

Causal Inference in the Presence of Latent Variables and Selection Bias 499 Causal Inference in the Presence of Latent Variables and Selection Bias Peter Spirtes, Christopher Meek, and Thomas Richardson Department of Philosophy Carnegie Mellon University Pittsburgh, P 15213

More information

GEOMETRY OF THE FAITHFULNESS ASSUMPTION IN CAUSAL INFERENCE 1

GEOMETRY OF THE FAITHFULNESS ASSUMPTION IN CAUSAL INFERENCE 1 The Annals of Statistics 2013, Vol. 41, No. 2, 436 463 DOI: 10.1214/12-AOS1080 Institute of Mathematical Statistics, 2013 GEOMETRY OF THE FAITHFULNESS ASSUMPTION IN CAUSAL INFERENCE 1 BY CAROLINE UHLER,

More information

Directed Graphical Models

Directed Graphical Models CS 2750: Machine Learning Directed Graphical Models Prof. Adriana Kovashka University of Pittsburgh March 28, 2017 Graphical Models If no assumption of independence is made, must estimate an exponential

More information

Total positivity in Markov structures

Total positivity in Markov structures 1 based on joint work with Shaun Fallat, Kayvan Sadeghi, Caroline Uhler, Nanny Wermuth, and Piotr Zwiernik (arxiv:1510.01290) Faculty of Science Total positivity in Markov structures Steffen Lauritzen

More information

Review: Directed Models (Bayes Nets)

Review: Directed Models (Bayes Nets) X Review: Directed Models (Bayes Nets) Lecture 3: Undirected Graphical Models Sam Roweis January 2, 24 Semantics: x y z if z d-separates x and y d-separation: z d-separates x from y if along every undirected

More information

Equivalence in Non-Recursive Structural Equation Models

Equivalence in Non-Recursive Structural Equation Models Equivalence in Non-Recursive Structural Equation Models Thomas Richardson 1 Philosophy Department, Carnegie-Mellon University Pittsburgh, P 15213, US thomas.richardson@andrew.cmu.edu Introduction In the

More information

Causal Models with Hidden Variables

Causal Models with Hidden Variables Causal Models with Hidden Variables Robin J. Evans www.stats.ox.ac.uk/ evans Department of Statistics, University of Oxford Quantum Networks, Oxford August 2017 1 / 44 Correlation does not imply causation

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Undirected Graphical Models Mark Schmidt University of British Columbia Winter 2016 Admin Assignment 3: 2 late days to hand it in today, Thursday is final day. Assignment 4:

More information