Integrating expert's knowledge in constraint learning algorithm with time-dependent exposures



1 Integrating expert's knowledge in constraint learning algorithm with time-dependent exposures
V. Asvatourian, S. Michiels, E. Lanoy
Biostatistics and Epidemiology unit, Gustave-Roussy, Villejuif, France
INSERM 1018 ONCOSTAT
5-6 October 2017, GDR
Vahé Asvatourian, Integrating expert's knowledge in constraint learning algorithm with time-dependent exposures

2 Current work
Contribution of causal models in evaluating immunotherapies from observational data
Motivations
- Multi-dimensional data: predictive variables (p) > 400, number of subjects (n) 40
- Multi-collinearity
- Biomarker measures can be repeated
- Observational data: association vs causation?

3 IDA: Estimation of causal effects in high-dimensional settings
Maathuis, M. H., Kalisch, M., & Bühlmann, P. (2009). Estimating high-dimensional intervention effects from observational data. Annals of Statistics.

4 Causal structure learning method
Score-based
- Determines, among candidate subgraphs, those that fit the data best
- The fit is measured through a score
- The purpose of the algorithm is to identify the DAG that maximizes the score
- Time-consuming in a high-dimensional context
Constraint-based
- Learns the relationships by testing (conditional) independences between pairs of variables
- Consistent in high-dimensional settings
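A minimal sketch of one (conditional) independence test of the kind a constraint-based method relies on. It assumes Gaussian variables and a Fisher z-test on (partial) correlations, which is a common choice but is not named on the slides. The helper names (pearson, partial_corr, fisher_z_pvalue) and the toy chain X -> Z -> Y are illustrative assumptions, and only a single conditioning variable is handled.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given one variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: (partial) correlation = 0.

    n is the sample size, k the size of the conditioning set.
    """
    r = max(min(r, 0.999999), -0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - k - 3) * abs(z)
    return 2 * (1 - NormalDist().cdf(stat))


# Toy data (assumed for illustration): chain X -> Z -> Y,
# so X and Y are dependent marginally but independent given Z.
rng = random.Random(0)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
z = [a + rng.gauss(0, 1) for a in x]
y = [b + rng.gauss(0, 1) for b in z]

p_marginal = fisher_z_pvalue(pearson(x, y), n, 0)
p_given_z = fisher_z_pvalue(partial_corr(x, y, z), n, 1)
print(f"p-value X vs Y: {p_marginal:.3g}; given Z: {p_given_z:.3g}")
```

A constraint-based learner would delete the edge between two variables whenever such a p-value exceeds the chosen significance level for some conditioning set.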

5 Causal structure learning method
PC-algorithm (constraint-based)
- Tests of (conditional) independences between pairs of variables
- Estimation of causal effects based on the completed partially directed acyclic graph (CPDAG) [1]
[Figure: example graph over the variables X1 to X16]
[1] Maathuis, M. H., Kalisch, M., & Bühlmann, P. (2009). Estimating high-dimensional intervention effects from observational data. Annals of Statistics.
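The skeleton phase of the PC-algorithm can be sketched as follows. This is an illustrative version under stated assumptions, not the implementation used in the talk: the function name pc_skeleton is hypothetical, and the independence test is passed in as a callback indep(i, j, S). Here the callback is a hand-written oracle for the toy chain A -> B -> C, where a real analysis would plug in a statistical test.

```python
from itertools import combinations


def pc_skeleton(nodes, indep):
    """Skeleton search: start from the complete graph and delete edges.

    indep(i, j, S) must return True when i and j are judged independent
    given the set S. Conditioning sets grow by one variable per level and
    are drawn from the current neighbours of i.
    """
    adj = {v: set(nodes) - {v} for v in nodes}
    sepset = {}
    level = 0
    # Continue while some adjacent pair still has enough other neighbours.
    while any(len(adj[i] - {j}) >= level for i in nodes for j in adj[i]):
        for i in nodes:
            for j in sorted(adj[i]):
                if j not in adj[i]:  # edge already removed at this level
                    continue
                for S in combinations(sorted(adj[i] - {j}), level):
                    if indep(i, j, set(S)):
                        adj[i].discard(j)
                        adj[j].discard(i)
                        sepset[frozenset((i, j))] = set(S)
                        break
        level += 1
    return adj, sepset


# Oracle for the chain A -> B -> C (assumed): only A and C are
# independent, and only when conditioning on B.
def chain_oracle(i, j, S):
    return {i, j} == {"A", "C"} and "B" in S


adj, sepset = pc_skeleton(["A", "B", "C"], chain_oracle)
print(adj)
print(sepset)
```

The separating sets recorded here are what the later v-structure step uses to orient edges.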

6 PC-algorithm on repeated exposures [1]
Chronologically ordered PC-algorithm
- Integration of chronological order
- Estimation of causal effects based on the completed partially directed acyclic graph (CPDAG)
[Figure: graph over the repeated exposures X_{i,t=0}, X_{i,t=1}, X_{i,t=2}]
[1] Asvatourian V, Coutzac C, Chaput N, Michiels S, Lanoy E. Estimating causal effects of repeated exposures on a binary endpoint in a high-dimensional setting. Under review.
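One way chronological order can constrain a learned graph is sketched below. The slide does not detail the mechanism, so this is only an assumed reading: an edge between measurements taken at different visits is oriented from the earlier to the later one, and edges within a visit are left undirected. The function name orient_by_time and the (name, visit) node encoding are hypothetical.

```python
def orient_by_time(edges):
    """Orient skeleton edges using visit times.

    Each node is a (name, t) tuple. An edge across visits is directed
    from the earlier to the later visit; an edge within one visit stays
    undirected because time gives no information about its direction.
    """
    directed, undirected = [], []
    for u, v in edges:
        if u[1] < v[1]:
            directed.append((u, v))
        elif u[1] > v[1]:
            directed.append((v, u))
        else:
            undirected.append((u, v))
    return directed, undirected


# Toy skeleton (assumed): two biomarkers measured at visits 0, 1, 2.
skeleton = [
    (("X1", 0), ("X1", 1)),
    (("X2", 2), ("X1", 1)),
    (("X1", 0), ("X2", 0)),
]
directed, undirected = orient_by_time(skeleton)
print(directed)
print(undirected)
```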

7 Can we remove or fix edges (a priori information)?
Some relationships between biomarkers may be known a priori
Extensions of the score-based method
- Adding the expert opinion in the score calculation
- Reliable expert's opinion, consistent with the true structure
- Small number of experts
- Low-dimensional setting
Extensions of the constraint-based method
- Few articles address adding expert's knowledge to constraint-based methods

8 PC-algorithm with a priori information on high-dimensional time-varying exposures?
[Figure: two graphs over X_{i,t=0}, X_{i,t=1}, X_{i,t=2}. Panels: PC-algo without a priori information; PC-algo with a priori information]

9 Dynamic versus non-dynamic a priori information
Giving information about time-varying biomarkers versus time-fixed biomarkers
[Figure: two graphs over X_{i,t=0}, X_{i,t=1}, X_{i,t=2}. Panels: Dynamic a priori information; Non-dynamic a priori information]

10 Assumptions
Is the true DAG dynamic or not?
[Figure: two graphs over X_{i,t=0}, X_{i,t=1}, X_{i,t=2}. Panels: Dynamic true DAG; Non-dynamic true DAG]

11 When to add a priori information?
Steps of the PC-algorithm
   Adding information about dependencies (fix or delete edges before performing PC-algo)
1. Identification of the skeleton
   Adding information about direction (orient edges)
2. Identification of the v-structures
3. Orienting the last edges
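Fixing or deleting edges before the skeleton search can be sketched as a change to the starting graph. This is a sketch under assumptions, not the procedure from the talk: the function name init_with_prior and the argument names forbidden and required are hypothetical. Forbidden pairs are removed from the initial complete graph, and required pairs are excluded from independence testing so they can never be deleted.

```python
def init_with_prior(nodes, forbidden, required):
    """Build the starting graph for a skeleton search with prior knowledge.

    forbidden: pairs the expert says are not connected (edge deleted).
    required:  pairs the expert says are connected (edge fixed).
    Returns the initial adjacency and the set of pairs that may be tested.
    """
    forbidden = {frozenset(e) for e in forbidden}
    required = {frozenset(e) for e in required}
    if forbidden & required:
        raise ValueError("a pair cannot be both forbidden and required")
    adj = {
        v: {w for w in nodes if w != v and frozenset((v, w)) not in forbidden}
        for v in nodes
    }
    all_edges = {frozenset((v, w)) for v in nodes for w in adj[v]}
    testable = all_edges - required  # fixed edges are never tested
    return adj, testable


adj0, testable = init_with_prior(
    ["A", "B", "C"],
    forbidden=[("A", "C")],
    required=[("A", "B")],
)
print(adj0)
print(testable)
```

Information about direction would enter later, after the skeleton is found, by orienting the corresponding edges before the v-structure step.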

12 Expert's knowledge elicitation
- For each pair, a priori information can be summarized by P_{A,B} = {P, P, P}
- Each expert gives a score from 0 to 10 to every pair they know, where the score represents the confidence in a causal effect
- If several scores are given to the same pair, the median of these scores is taken
- Each expert is represented by 4 parameters γ1, γ2, γ3, γ4:
  γ1  Probability of detecting the existing edges with correct direction
  γ2  Probability of detecting the existing edges with reverse direction
  γ3  Probability of correctly detecting the absent edges
  γ4  Probability of not detecting the absent edges
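The median rule for combining several experts' scores can be sketched as below. The data layout is an assumption made for illustration: each expert is a dictionary mapping an ordered pair (cause, effect) to a score, and pairs an expert does not know are simply absent. The function name aggregate_scores is hypothetical.

```python
from collections import defaultdict
from statistics import median


def aggregate_scores(expert_scores):
    """Combine expert scores per pair by taking the median.

    expert_scores: list of dicts, one per expert, mapping an ordered pair
    (cause, effect) to a confidence score between 0 and 10.
    """
    by_pair = defaultdict(list)
    for scores in expert_scores:
        for pair, score in scores.items():
            if not 0 <= score <= 10:
                raise ValueError(f"score out of range for {pair}: {score}")
            by_pair[pair].append(score)
    return {pair: median(values) for pair, values in by_pair.items()}


# Toy elicitation (assumed): three experts, two pairs.
experts = [
    {("A", "B"): 8, ("B", "C"): 2},
    {("A", "B"): 6},
    {("A", "B"): 10},
]
consensus = aggregate_scores(experts)
print(consensus)
```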

13 Simulation set-up
- p ∈ {20, 80}
- n visits ∈ {4, 8}
- Expert's level ∈ {bad, medium, good}
    Bad:    γ1: +,   γ2: +++, γ3: +,   γ4:
    Medium: γ1: ++,  γ2: ++,  γ3: ++,  γ4: ++
    Good:   γ1: +++, γ2: +,   γ3: +++, γ4: +
- Percentage of a priori information ∈ {10, 40}
- n ∈ {50, 1000}
- Number of experts = 10
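The scenario grid implied by these settings can be enumerated as follows. This sketch assumes a full factorial design, which the slide does not state, and the key names (n_visits, pct_prior, and so on) are hypothetical labels for the quantities listed above.

```python
from itertools import product

# Factor levels taken from the simulation set-up.
grid = {
    "p": [20, 80],
    "n_visits": [4, 8],
    "expert_level": ["bad", "medium", "good"],
    "pct_prior": [10, 40],
    "n": [50, 1000],
}
N_EXPERTS = 10  # fixed across scenarios

# One dictionary per scenario, assuming every combination is simulated.
scenarios = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(scenarios), "scenarios; first:", scenarios[0])
```

Under that assumption the design has 2 x 2 x 3 x 2 x 2 = 48 scenarios.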

14 Evaluation of performance
- Sensitivity: the capacity of the algorithm to recover true edges
- Specificity: the capacity of the algorithm to recover true absences of edges
- Structure similarity through the Structural Hamming Distance (SHD), a score that counts the following errors:
  - Wrong connection: an edge absent from the original graph is present in the learned graph
  - Missed edge: a true edge is not in the learned graph
  - Wrong orientation: an edge has a different orientation in the original and learned structures
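These three measures can be computed from two edge lists as sketched below. Some conventions here are assumptions rather than details from the slides: sensitivity and specificity are computed on adjacencies (ignoring direction), each pair of nodes contributes at most one unit to the SHD, and an undirected learned edge is encoded as both directions, so it counts as a wrong orientation against a directed true edge. The function name evaluate is hypothetical.

```python
from itertools import combinations


def evaluate(nodes, true_edges, learned_edges):
    """Sensitivity, specificity and SHD for directed edge sets.

    Edges are (parent, child) tuples.
    """
    true_edges, learned_edges = set(true_edges), set(learned_edges)
    tp = fp = fn = tn = wrong_orientation = 0
    for u, v in combinations(nodes, 2):
        both_ways = {(u, v), (v, u)}
        t = both_ways & true_edges
        l = both_ways & learned_edges
        if t and l:
            tp += 1
            if t != l:
                wrong_orientation += 1
        elif t:
            fn += 1  # missed edge
        elif l:
            fp += 1  # wrong connection
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "shd": fp + fn + wrong_orientation,
    }


# Toy graphs (assumed): one correct edge, one reversed, one missed, one extra.
nodes = ["A", "B", "C", "D"]
truth = [("A", "B"), ("B", "C"), ("C", "D")]
learned = [("A", "B"), ("C", "B"), ("A", "D")]
result = evaluate(nodes, truth, learned)
print(result)
```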

15 Discussion
- More simulations are ongoing
- Dynamic assumptions may not be adequate in some biological settings
- Ability to identify true edges (sensitivity) remains low
- Perspective: combining constraint-based and Bayesian methods
