Causal Bayesian networks. Peter Antal


Causal Bayesian networks
Peter Antal
antal@mit.bme.hu
A.I., 11/25/2015

Can we represent (in)dependencies exactly by a BN? From a causal model? Sufficient and necessary?
Can we interpret edges as causal relations: with no hidden variables? in the presence of hidden variables? local models as autonomous mechanisms?
Can we infer the effect of interventions?
Optimal study design to infer the effect of interventions?

In a Bayesian network, any query corresponding to passive observations can be answered: p(Q=q | E=e), the (conditional) probability of Q=q given that E=e. Note that Q can precede E temporally.
X → Y
Specification: p(X), p(Y|X)
Joint distribution: p(X,Y)
Inferences: p(X), p(Y), p(Y|X), p(X|Y)
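As a minimal numerical sketch of the observational query above in the two-node network X → Y (the probabilities are illustrative placeholders, not values from the lecture):

```python
# Observational inference in the two-node network X -> Y.
# The numbers below are illustrative, not from the lecture.

p_x = {0: 0.7, 1: 0.3}                      # p(X)
p_y_given_x = {0: {0: 0.9, 1: 0.1},         # p(Y | X=0)
               1: {0: 0.2, 1: 0.8}}         # p(Y | X=1)

# Joint distribution p(x, y) = p(x) * p(y | x)
p_xy = {(x, y): p_x[x] * p_y_given_x[x][y]
        for x in p_x for y in (0, 1)}

# Marginal p(Y=y) = sum_x p(x, y)
def p_y(y):
    return sum(p_xy[(x, y)] for x in p_x)

# "Diagnostic" query p(X=x | Y=y) by Bayes' rule -- the query variable X
# precedes Y temporally, yet the query is still answerable.
def p_x_given_y(x, y):
    return p_xy[(x, y)] / p_y(y)

print(p_y(1))               # p(Y=1)
print(p_x_given_y(1, 1))    # p(X=1 | Y=1)
```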

Perfect intervention: do(X=x), i.e. set X to x. What is the relation of p(Q=q | E=e) and p(Q=q | do(E=e))?
X → Y
Specification: p(X), p(Y|X)
Joint distribution: p(X,Y)
Inferences: p(Y | X=x) = p(Y | do(X=x)), but p(X | Y=y) ≠ p(X | do(Y=y))
What is a formal knowledge representation of a causal model? What is the formal inference method?
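A hedged sketch of the contrast between conditioning and intervening in the same X → Y model; the numbers are again illustrative assumptions:

```python
# Intervention vs. observation in X -> Y (same illustrative numbers as above).
p_x = {0: 0.7, 1: 0.3}
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}

p_xy = {(x, y): p_x[x] * p_y_given_x[x][y] for x in (0, 1) for y in (0, 1)}
p_y1 = sum(p_xy[(x, 1)] for x in (0, 1))

# Downstream: conditioning on the cause equals intervening on it,
# p(Y=1 | X=1) = p(Y=1 | do(X=1)).
print(p_y_given_x[1][1])      # p(Y=1 | X=1) = p(Y=1 | do(X=1)) = 0.8

# Upstream: conditioning on the effect is informative about the cause ...
print(p_xy[(1, 1)] / p_y1)    # p(X=1 | Y=1) ~ 0.77
# ... but setting the effect by intervention severs nothing upstream of X,
# so p(X=1 | do(Y=1)) is just the prior p(X=1).
print(p_x[1])                 # p(X=1 | do(Y=1)) = 0.3
```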

Imaginary observations and interventions: we observed X=x, but imagine that x' had been observed, denoted X'=x'; we set X=x, but imagine that x' had been set, denoted do(X'=x'). What is the relation of
Observational: p(Q=q | E=e, X=x)
Interventional: p(Q=q | E=e, do(X=x))
Counterfactual: p(Q'=q' | Q=q, E=e, do(X=x), do(X'=x'))
O: What is the probability that the patient recovers if he takes drug x?
I: What is the probability that the patient recovers if we prescribe* drug x?
C: Given that the patient did not recover on drug x, what would the probability of recovery have been if we had prescribed* drug x' instead of x?
~C (time-shifted): Given that the patient did not recover on drug x and has not responded well**, what is the probability that the patient will recover if we change the prescribed* drug from x to x'?
*: Assume that the patient is fully compliant.
**: nor is he expected to.

The domain is defined by the joint distribution P(X_1, ..., X_n | structure, parameters).
1. Representation of parameters: small number of parameters
2. Representation of independencies: what is relevant for diagnosis
3. Representation of causal relations: what is the effect of a treatment
4. Representation of possible worlds: quantitative/qualitative? passive (observational), active (interventional), imaginary (counterfactual)

Strong association; X precedes Y temporally; plausible explanation without alternative explanations based on confounding; necessity (generally: if the cause is removed, the effect is decreased; or actually: y would not have occurred with that much probability if x had not been present); sufficiency (generally: if exposure to the cause is increased, the effect is increased; or actually: y would have occurred with larger probability if x had been present); an autonomous, transportable mechanism. The probabilistic definition of causation formalizes many of these, but not, for example, the counterfactual aspects.

I_P(X;Y|Z), or (X ⊥ Y | Z)_P, denotes that X is independent of Y given Z: P(X,Y|z) = P(X|z) P(Y|z) for all z with P(z)>0. (Almost) alternatively, I_P(X;Y|Z) iff P(X|Z,Y) = P(X|Z) for all z,y with P(z,y)>0. Other notation: D_P(X;Y|Z) =def= ¬I_P(X;Y|Z). Contextual independence: the independence holds for some, but not all, z.
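The definition can be checked mechanically on a full joint table. The following sketch assumes an illustrative three-variable joint constructed so that I_P(X;Y|Z) holds by design:

```python
from itertools import product

# Check I_P(X;Y|Z) numerically on a full joint table p(x, y, z).
# Illustrative numbers: the joint is built so X and Y are independent given Z.
p_z = {0: 0.4, 1: 0.6}
p_x_given_z = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.7, 1: 0.3}}
p_y_given_z = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}
joint = {(x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
         for x, y, z in product((0, 1), repeat=3)}

def independent_given_z(joint, tol=1e-12):
    """True iff P(X,Y|z) = P(X|z) * P(Y|z) for all z with P(z) > 0."""
    for z in (0, 1):
        pz = sum(p for (x, y, zz), p in joint.items() if zz == z)
        if pz == 0:
            continue
        for x, y in product((0, 1), repeat=2):
            pxy_z = joint[(x, y, z)] / pz
            px_z = sum(joint[(x, yy, z)] for yy in (0, 1)) / pz
            py_z = sum(joint[(xx, y, z)] for xx in (0, 1)) / pz
            if abs(pxy_z - px_z * py_z) > tol:
                return False
    return True

print(independent_given_z(joint))   # True: I_P(X;Y|Z) holds by construction
```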

The independence map (model) M of a distribution P is the set of valid independence triplets: M_P = {I_P,1(X_1;Y_1|Z_1), ..., I_P,K(X_K;Y_K|Z_K)}.
If P(X,Y,Z) is a Markov chain X → Y → Z, then M_P = {D(X;Y), D(Y;Z), I(X;Z|Y)}.
Normally/almost always: D(X;Z). Exceptionally: I(X;Z).

If P(Y,X,Z) is a naive Bayesian network X ← Y → Z, then M_P = {D(X;Y), D(Y;Z), I(X;Z|Y)}.
Normally/almost always: D(X;Z). Exceptionally: I(X;Z).

Directed acyclic graph (DAG):
nodes: random variables / domain entities
edges: direct probabilistic dependencies (or causal relations)
local models: P(X_i | Pa(X_i))
Three interpretations:
1. Causal model
2. Graphical representation of (in)dependencies: M_P = {I_P,1(X_1;Y_1|Z_1), ...}
3. Concise representation of the joint distribution: P(M,O,D,S,T) = P(M) P(O|M) P(D|O,M) P(S|D) P(T|S,M)
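A small sketch of interpretation 3: the joint is assembled as a product of the local models P(X_i | Pa(X_i)). The tiny network A → B, A → C and its numbers are placeholders, not the M, O, D, S, T example above:

```python
from itertools import product

# The joint as a product of local models P(X_i | Pa(X_i)).
# Placeholder network A -> B, A -> C; cpts[v] maps a parent assignment
# of v to the conditional distribution of v.
parents = {"A": (), "B": ("A",), "C": ("A",)}
cpts = {
    "A": {(): {0: 0.6, 1: 0.4}},
    "B": {(0,): {0: 0.8, 1: 0.2}, (1,): {0: 0.3, 1: 0.7}},
    "C": {(0,): {0: 0.5, 1: 0.5}, (1,): {0: 0.1, 1: 0.9}},
}

def joint_prob(assignment):
    """P(assignment) = prod_i P(x_i | pa(x_i)), read off from the CPTs."""
    p = 1.0
    for v, pa in parents.items():
        pa_vals = tuple(assignment[u] for u in pa)
        p *= cpts[v][pa_vals][assignment[v]]
    return p

# The factorized joint is a proper distribution: it sums to 1.
total = sum(joint_prob(dict(zip(parents, vals)))
            for vals in product((0, 1), repeat=len(parents)))
print(total)   # 1.0 (up to float rounding)
```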

I_G(X;Y|Z) denotes that X is d-separated (directed separated) from Y by Z in the directed graph G.
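d-separation can be tested with the classical reduction: restrict the graph to the ancestral set of X ∪ Y ∪ Z, moralize, delete Z, and check connectivity. The following is a pure-Python sketch on a placeholder DAG:

```python
from collections import deque
from itertools import combinations

def d_separated(parents, xs, ys, zs):
    """I_G(X;Y|Z): True iff Z d-separates X from Y in the DAG given as a
    node -> set-of-parents map."""
    xs, ys, zs = set(xs), set(ys), set(zs)

    # 1. Restrict to the ancestral set of X, Y and Z.
    keep, stack = set(), list(xs | ys | zs)
    while stack:
        v = stack.pop()
        if v not in keep:
            keep.add(v)
            stack.extend(parents.get(v, ()))

    # 2. Moralize: undirected parent-child edges, plus "marry" co-parents.
    nbrs = {v: set() for v in keep}
    for v in keep:
        ps = [p for p in parents.get(v, ()) if p in keep]
        for p in ps:
            nbrs[v].add(p)
            nbrs[p].add(v)
        for a, b in combinations(ps, 2):
            nbrs[a].add(b)
            nbrs[b].add(a)

    # 3. Delete Z and test whether X can still reach Y.
    seen = set(xs - zs)
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        if v in ys:
            return False
        for w in nbrs[v] - zs - seen:
            seen.add(w)
            queue.append(w)
    return True

# Placeholder DAG: chain X -> Z -> Y plus collider X -> W <- Y.
dag = {"X": set(), "Z": {"X"}, "Y": {"Z"}, "W": {"X", "Y"}}
print(d_separated(dag, {"X"}, {"Y"}, {"Z"}))        # True: Z blocks the chain
print(d_separated(dag, {"X"}, {"Y"}, {"Z", "W"}))   # False: conditioning on W opens the collider
```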

For certain distributions an exact representation by a Bayesian network is not possible, e.g.:
1. Intransitive Markov chain: X → Y → Z
2. Pure multivariate cause: {X, Z} → Y
3. Diamond structure: P(X,Y,Z,V) with M_P = {D(X;Z), D(X;Y), D(V;X), D(V;Z), I(V;Y|{X,Z}), I(X;Z|{V,Y}), ...}

Causal models: X_1 → X_2 → X_3 → X_4 or X_1 ← X_2 ← X_3 ← X_4.
Markov chain P(X_1, ...): M_P = {I(X_{i+1}; X_{i-1} | X_i)} (first-order Markov property). Flow of time?

p(X), p(Z|X), p(Y|Z): X → Z → Y
p(X|Z), p(Z|Y), p(Y): X ← Z ← Y
p(X|Z), p(Z), p(Y|Z): X ← Z → Y
These three share the transitive map M_P = {D(X;Z), D(Z;Y), D(X;Y), I(X;Y|Z)}.
p(X), p(Z|X,Y), p(Y): X → Z ← Y, the v-structure, with the intransitive map M_P = {D(X;Z), D(Y;Z), I(X;Y), D(X;Y|Z)}.
Often: present knowledge renders future states conditionally independent (confounding).
Ever(?): present knowledge renders past states conditionally independent (backward/atemporal confounding).
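A numerical illustration of the v-structure case ("explaining away"): X and Y are independent coins and Z is, as an illustrative choice, their logical OR. Marginally I(X;Y) holds, but conditioning on the common effect Z induces dependence, D(X;Y|Z):

```python
from itertools import product

# Explaining away in the v-structure X -> Z <- Y, with Z = X OR Y.
p = {}
for x, y in product((0, 1), repeat=2):
    z_true = int(x or y)
    for z in (0, 1):
        p[(x, y, z)] = 0.5 * 0.5 * (1.0 if z == z_true else 0.0)

def cond(target, given):
    """P(target | given); both are (x, y, z) patterns with None as wildcard."""
    def match(key, pattern):
        return all(pv is None or kv == pv for kv, pv in zip(key, pattern))
    num = sum(v for k, v in p.items() if match(k, target) and match(k, given))
    den = sum(v for k, v in p.items() if match(k, given))
    return num / den

# Marginally, X and Y are independent: I(X;Y).
print(cond((1, None, None), (None, None, None)))   # p(X=1)          = 0.5
print(cond((1, None, None), (None, 1, None)))      # p(X=1 | Y=1)    = 0.5

# Given the common effect Z they become dependent: D(X;Y|Z).
print(cond((1, None, None), (None, None, 1)))      # p(X=1 | Z=1)      = 2/3
print(cond((1, None, None), (None, 1, 1)))         # p(X=1 | Y=1, Z=1) = 1/2
```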

(Passive, observational) inference: P(Query | Observations)
Interventionist inference: P(Query | Observations, Interventions)
Counterfactual inference: P(Query | Observations, Counterfactual conditionals)

If G is a causal model, then compute p(Y | do(X=x)) by
1. deleting the edges incoming to X,
2. setting X=x,
3. performing standard Bayesian network inference.
(Example figure: a DAG over Mutation, Subpopulation, Location and Disease; does the mutation X cause the disease Y?)
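A sketch of this graph surgery on a small confounded model U → X, U → Y, X → Y, where U stands in for a hidden common cause (for instance the subpopulation in the figure); all CPT values are illustrative. Conditioning and intervening give different answers because only the surgery cuts the back-door path through U:

```python
from itertools import product

# Placeholder confounded model: U -> X, U -> Y, X -> Y (illustrative numbers).
p_u1 = 0.5                                   # p(U=1)
p_x1_given_u = {0: 0.2, 1: 0.8}              # p(X=1 | U)
p_y1_given_xu = {(0, 0): 0.1, (1, 0): 0.3,   # p(Y=1 | X, U)
                 (0, 1): 0.6, (1, 1): 0.8}

def bern(p1, v):                             # P(V=v) for a Bernoulli with P(V=1)=p1
    return p1 if v == 1 else 1.0 - p1

def joint(u, x, y):                          # observational joint p(u, x, y)
    return bern(p_u1, u) * bern(p_x1_given_u[u], x) * bern(p_y1_given_xu[(x, u)], y)

# Observational query: condition on X=1 in the original network.
num = sum(joint(u, 1, 1) for u in (0, 1))
den = sum(joint(u, 1, y) for u, y in product((0, 1), repeat=2))
print(num / den)                             # p(Y=1 | X=1)      = 0.70 here

# Interventional query: delete the edge U -> X, clamp X=1, and run ordinary
# inference in the mutilated network. This is the truncated factorization
#   p(Y=1 | do(X=1)) = sum_u p(u) p(Y=1 | X=1, u).
print(sum(bern(p_u1, u) * p_y1_given_xu[(1, u)] for u in (0, 1)))   # = 0.55 here
```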

Can we represent (in)dependencies exactly by a BN? Almost always.
Can we interpret edges as causal relations
with no hidden variables? Compelled edges as a filter.
in the presence of hidden variables? Sometimes, e.g. confounding can be excluded in certain cases.
in local models as autonomous mechanisms? With a priori knowledge, e.g. the Causal Markov Assumption.
Can we infer the effect of interventions in a causal model? Yes: graph surgery with standard inference in BNs.
Optimal study design to infer the effect of interventions? With no hidden variables: yes, in a non-Bayesian framework. In the presence of hidden variables: open issue.
Suggested reading: J. Pearl: Causal inference in statistics, 2009.