Investigating mediation when counterfactuals are not metaphysical: Does sunlight exposure mediate the effect of eye-glasses on cataracts?

Brian Egleston, Fox Chase Cancer Center
Collaborators: Daniel Scharfstein, Beatriz Munoz, Sheila West, Johns Hopkins University

Public Health Goals
Cataracts are a major source of vision loss in older persons.
Promoting the use of eye-glasses when people are outdoors might reduce the incidence of cataracts by reducing the amount of sunlight that reaches the eye.

Salisbury Eye Evaluation
Population-based study of approximately 2,500 older adults in Salisbury, Maryland.
Participants were asked about their lifetime glasses use, jobs, and leisure activities.
The current study uses recalled eye-glasses use and sun exposure, together with cortical cataract status at presentation.

Data Structure and Notation
Z indicates glasses use (1 = use, 0 = no use).
Y(z) indicates the potential cataract outcome (1 = cataract, 0 otherwise) under Z = z.
M(z) is the potential ocular UV exposure in Maryland Sun Years (MSYs) under Z = z.
Full set of potential outcomes: {M(0), Y(0), M(1), Y(1)}.

Data Structure and Notation
X is a vector of confounding covariates: age, type of job in 30s, race, sex, diabetic status, and education level.
M = M(Z) and Y = Y(Z) are the observed UV and cataract outcomes.
Observed data: {Z, X, M, Y}.

Mediation
A mediator is the causal mechanism linking an exposure to an outcome.
Causal hypothesis: Eye glasses (age 31) → UV exposure (ages 31-34) → Cataracts (age 65+).

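Traditional Model
Path diagram: Eye-glass use (Z) → Ultra-Violet (M) with coefficient α; Ultra-Violet (M) → Cataracts (Y) with coefficient β; Eye-glass use (Z) → Cataracts (Y) with coefficient τ′.
\[ Y = \gamma_1 + \tau Z + \epsilon_1 \quad (1) \]
\[ M = \gamma_2 + \alpha Z + \epsilon_2 \quad (2) \]
\[ Y = \gamma_3 + \tau' Z + \beta M + \epsilon_3 \quad (3) \]
For Y as a continuous measure, $\mathrm{cov}(\epsilon_2, \epsilon_3) = 0$.
The total effect of Z on Y is τ; the direct effect is τ′.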

Traditional Model
Under Baron and Kenny, a measure of the indirect (mediated) effect is αβ.
Total effect = τ = τ′ + αβ.
Stringent assumptions are necessary to give causal meaning to τ′ and αβ.
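The decomposition of the total effect into a direct coefficient plus the product αβ can be checked numerically. The sketch below is illustrative only: it uses simulated data (not the study data), made-up coefficient values, and closed-form least-squares slopes for models (1) to (3); the name `tau_prime` stands for the coefficient of Z in model (3).

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

random.seed(0)
n = 2000
# Simulated data with hypothetical coefficients (assumption, not study values).
Z = [random.randint(0, 1) for _ in range(n)]
M = [0.5 - 0.3 * z + random.gauss(0, 0.1) for z in Z]
Y = [1.0 - 0.2 * z + 2.0 * m + random.gauss(0, 0.5) for z, m in zip(Z, M)]

vZ, vM = cov(Z, Z), cov(M, M)
cZM, cZY, cMY = cov(Z, M), cov(Z, Y), cov(M, Y)

tau = cZY / vZ                             # slope of Z in model (1)
alpha = cZM / vZ                           # slope of Z in model (2)
det = vZ * vM - cZM ** 2
tau_prime = (cZY * vM - cZM * cMY) / det   # slope of Z in model (3)
beta = (cMY * vZ - cZM * cZY) / det        # slope of M in model (3)

indirect = alpha * beta
print(f"total={tau:.4f}  direct+indirect={tau_prime + indirect:.4f}")
```

With least-squares fits of linear models the identity holds exactly in the sample; it does not by itself give the pieces a causal interpretation.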

Controlled Effects
Effects on outcomes after manipulating both Z and M(Z): Y(z, m).
Exchangeability assumptions are needed to identify controlled effects under randomization:
\[ Y(z, m) \perp M(z) \mid Z = z, \]
which implies
\[ E[Y(z, m)] = E[Y(z, m) \mid M(z) = m, Z = z]. \]
Then, $E[Y(1, m)] - E[Y(0, m)] = \tau'$.
Are controlled effects meaningful?

Natural Effects
Proposed by Robins and Pearl.
\[ \text{Total Effect} = E[Y(1)] - E[Y(0)] = \underbrace{E[Y(1, M(1))] - E[Y(0, M(1))]}_{\text{Direct Effect}} + \underbrace{E[Y(0, M(1))] - E[Y(0, M(0))]}_{\text{Indirect (Mediated) Effect}} \]
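A small simulation can make the decomposition concrete. The sketch below assumes a hypothetical structural model `y_potential` for a continuous outcome and made-up distributions for M(0) and M(1); because every potential outcome is generated, all three contrasts can be computed directly, which is exactly what real data do not allow.

```python
import random

random.seed(1)

def y_potential(z, m, u):
    """Hypothetical structural model for Y(z, m); u is individual-level noise."""
    return 1.0 - 0.2 * z + 2.0 * m + u

n = 5000
total = direct = indirect = 0.0
for _ in range(n):
    u = random.gauss(0, 0.5)
    m0 = random.uniform(0.05, 0.5)       # M(0): exposure without glasses
    m1 = m0 * random.uniform(0.2, 1.0)   # M(1): exposure with glasses
    total += y_potential(1, m1, u) - y_potential(0, m0, u)
    direct += y_potential(1, m1, u) - y_potential(0, m1, u)
    indirect += y_potential(0, m1, u) - y_potential(0, m0, u)

total, direct, indirect = total / n, direct / n, indirect / n
print(f"total={total:.4f}  direct={direct:.4f}  indirect={indirect:.4f}")
```

The cross-world term Y(0, M(1)) is available here only because the simulation generates it; no individual in an observational study reveals it.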

Natural Effects
An assumption is needed to identify natural mediational effects, in addition to the assumption necessary for controlled effects.
One such assumption: $Y(1, m) - Y(0, m) = B$, where B is a random variable that does not depend on m.
Natural effects have become the reference for assigning cause to mediational, surrogate marker, and indirect effect models (e.g., Taylor et al., 2005).

Natural Effects
Are natural effects meaningful?
How could one ever experimentally observe Y(1, M(0))?
We would need to observe a person's UV exposure in their 30s when they do not wear glasses, then go back in time and assign glasses while keeping UV exposure at the level it took under no glasses.

Proposed Causal Estimand
\[ RR(p, m) = \frac{P[Y(1) = 1 \mid P = p, M(0) = m]}{P[Y(0) = 1 \mid P = p, M(0) = m]}, \qquad P = M(1)/M(0). \]
P is the proportion of potential UV that reaches the eyes under glasses.
RR(p, m) is the relative risk of cataracts with glasses use within strata based on baseline exposure and the shielding effect of glasses.

Proposed Causal Estimand
In a case of complete mediation, we would expect that RR(1, m) = 1 for all m.
If glasses use does not change an individual's UV exposure, then glasses should not be associated with cataracts.

Proposed Causal Estimand
In a case of mediation, we would expect that
\[ 1 \geq RR(p, m) > RR(p', m) \quad \text{if } p > p'. \]
The more that glasses prevent UV exposure, the more they prevent cataracts.
This monotonicity might be broken if the principal stratum defined by {p, m} includes individuals who are very different from those in the principal stratum defined by {p′, m}.

Local Causal Inference
Strata are likely similar within neighborhoods of p for a given M(0).
After controlling for M(0), those with very different values of P might have different characteristics, but we did not expect this to be the case a priori.

Hypothesized RR(p,m)
[Figure: hypothesized RR(p, m). x-axis: MSYs under no glasses, M(0); y-axis: proportion of UV penetrating glasses, P. Regions labeled "Relative Risk = 1" and "Relative Risk Decreasing".]

Non-Metaphysical Counterfactual
M(0) is observable on everyone (Duncan et al., 1997).
\[ M = \sum_{s=1}^{12} G(s) R(s) \sum_{t=5}^{18} F(t, s) H(t, s) T_{hats}(t, s) T_{eye}(t, s) \]
M = Total UV exposure
s = Month
t = Hour of day
G(s) = Geographic correction factor
R(s) = Ocular ambient exposure ratio
F(t, s) = Fraction of time spent outdoors
H(t, s) = Global ambient exposure
T_hats(t, s) = Percent of UV penetrating hats
T_eye(t, s) = Percent of UV penetrating glasses; set to 1 to identify M(0)
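The double sum above translates directly into code, and setting T_eye to 1 gives M(0) for the same person. The sketch below is a minimal illustration: the function name `uv_exposure` and all input values are hypothetical constants chosen for readability, not the calibrated quantities from Duncan et al. (1997).

```python
def uv_exposure(G, R, F, H, T_hats, T_eye):
    """Total UV exposure: sum over months s = 1..12 and hours t = 5..18."""
    total = 0.0
    for s in range(1, 13):
        inner = sum(F(t, s) * H(t, s) * T_hats(t, s) * T_eye(t, s)
                    for t in range(5, 19))
        total += G(s) * R(s) * inner
    return total

# Hypothetical constant inputs (assumptions for illustration only).
G = lambda s: 1.0            # geographic correction factor
R = lambda s: 0.1            # ocular ambient exposure ratio
F = lambda t, s: 0.25        # fraction of time spent outdoors
H = lambda t, s: 1.0         # global ambient exposure
T_hats = lambda t, s: 0.8    # share of UV penetrating hats
T_eye = lambda t, s: 0.4     # share of UV penetrating glasses

m1 = uv_exposure(G, R, F, H, T_hats, T_eye)             # exposure with glasses
m0 = uv_exposure(G, R, F, H, T_hats, lambda t, s: 1.0)  # T_eye set to 1: M(0)
p = m1 / m0                                             # P = M(1)/M(0)
print(f"M(1)={m1:.3f}  M(0)={m0:.3f}  P={p:.2f}")
```

Because M(0) comes from the same recalled activity data with only the glasses term switched off, it is available for glasses wearers and non-wearers alike, whereas M(1) is not available for non-wearers.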

Identification of Estimand
Assumption 1: Stable Unit Treatment Value. An individual's potential outcomes are unrelated to the glasses use of other study participants, and there are only two well-defined treatment arms.
Assumption 2: $Z \perp \{Y(0), Y(1), M(1)\} \mid M(0), X$. This is the observational-study equivalent of the randomization assumption in randomized trials.

Identification of Estimand
Assumption 3: $Y(0) \perp M(1) \mid Z, M(0), X$.
If we already know someone's glasses use status, baseline UV exposure, and set of confounding covariates, then knowing the UV exposure that would occur when the person wears glasses gives us no additional information about baseline cataract outcomes.

Identification of Estimand
For a neighborhood $dp$ of $P = p$,
\[ P[Y(1) = 1 \mid P \in dp, M(0)] = E\left[ \frac{P[Y = 1 \mid Z = 1, P \in dp, M(0), X] \, P[P \in dp \mid Z = 1, M(0), X]}{E[P[P \in dp \mid Z = 1, M(0), X]]} \,\middle|\, M(0) \right] \]
\[ P[Y(0) = 1 \mid P \in dp, M(0)] = E\left[ \frac{P[Y = 1 \mid Z = 0, M(0), X] \, P[P \in dp \mid Z = 1, M(0), X]}{E[P[P \in dp \mid Z = 1, M(0), X]]} \,\middle|\, M(0) \right] \]
Use assumption 2 for the first equality, and assumptions 2 and 3 for the second.

Models
Models of primary interest:
\[ \text{logit}\, P[Y(0) = 1 \mid P, M(0)] = g_0(P, M(0); \beta_0) \]
\[ \text{logit}\, P[Y(1) = 1 \mid P, M(0)] = g_1(P, M(0); \beta_1) \]
Propensity model used for assumption 2:
\[ \text{logit}\, P[Z = 1 \mid M(0), X] = h(M(0), X; \gamma) \]

Models
Beta regression of P, since we do not observe M(1) on those who did not wear glasses:
\[ \text{logit}\, E[P \mid M(0), X] = k(M(0), X; \eta) \]
\[ E[P \mid M(0), X] = \mu(M(0), X; \eta) \]
\[ \mathrm{Var}[P \mid M(0), X] = \frac{\mu(M(0), X; \eta)\,(1 - \mu(M(0), X; \eta))}{1 + \phi} \]
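The mean and variance above match the common mean-precision parametrization of the Beta distribution, in which the shape parameters are a = μφ and b = (1 − μ)φ. The sketch below checks that correspondence numerically; the linear predictor value and φ are hypothetical, and the shape mapping is an assumption about how the slide's model is parametrized.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def beta_shapes(mu, phi):
    """Map mean mu and precision phi to Beta shape parameters (a, b)."""
    return mu * phi, (1.0 - mu) * phi

def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var

mu = expit(-0.5)   # hypothetical value of k(M(0), X; eta) mapped through expit
phi = 3.0          # hypothetical precision parameter
a, b = beta_shapes(mu, phi)
mean, var = beta_mean_var(a, b)
print(f"mean={mean:.4f}  var={var:.4f}  mu(1-mu)/(1+phi)={mu * (1 - mu) / (1 + phi):.4f}")
```

Larger φ concentrates P around its conditional mean; smaller φ spreads it over (0, 1).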

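Estimation
Maximum likelihood estimates can be used for $\beta_1$, $\gamma$, $\eta$, and $\phi$.
Unbiased estimating equation for $\beta_0$, where $g_0'$ denotes the derivative of $g_0$ with respect to $\beta_0$:
\[ U_{\beta_0}(O; \psi) = E\left[ \frac{(1 - Z)\, g_0'(P, M(0); \beta_0)\, \big(Y - \text{expit}\{g_0(P, M(0); \beta_0)\}\big)}{1 - \text{expit}\{h(M(0), X; \gamma)\}} \,\middle|\, O \right] \]
\[ = \frac{\int_0^1 (1 - Z)\, g_0'(p, M(0); \beta_0)\, Y(0)\, f(p \mid M(0), Z = 1, X; \eta, \phi)\, dp}{1 - \text{expit}\{h(M(0), X; \gamma)\}} - \frac{\int_0^1 (1 - Z)\, g_0'(p, M(0); \beta_0)\, \text{expit}\{g_0(p, M(0); \beta_0)\}\, f(p \mid M(0), Z = 1, X; \eta, \phi)\, dp}{1 - \text{expit}\{h(M(0), X; \gamma)\}} \]
where $O = \{Z, X, M(0), M, Y\}$.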

Estimation
\[ P[Y(z) = 1 \mid P = p, M(0) = m] = \frac{\exp\{g_z(p, m; \beta_z)\}}{1 + \exp\{g_z(p, m; \beta_z)\}} \]
\[ RR(p, m) = \frac{P[Y(1) = 1 \mid P = p, M(0) = m]}{P[Y(0) = 1 \mid P = p, M(0) = m]} \]
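Once the two outcome models are fitted, RR(p, m) is a ratio of two inverse-logit transforms. The sketch below assumes a simple linear form for g_z and uses made-up coefficients chosen so that RR(1, m) = 1; the slides do not specify the functional form of g_0 and g_1, so both the form and the numbers are hypothetical.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def g(p, m, beta):
    """Hypothetical linear predictor: intercept + slope in p + slope in m."""
    b_int, b_p, b_m = beta
    return b_int + b_p * p + b_m * m

def relative_risk(p, m, beta1, beta0):
    """RR(p, m) = P[Y(1)=1 | p, m] / P[Y(0)=1 | p, m]."""
    return expit(g(p, m, beta1)) / expit(g(p, m, beta0))

# Made-up coefficients (assumptions): the two predictors coincide at p = 1.
beta0 = (-2.0, 0.0, 1.0)
beta1 = (-2.5, 0.5, 1.0)

for p in (1.0, 0.8, 0.5):
    print(f"p={p:.1f}  RR={relative_risk(p, 0.2, beta1, beta0):.3f}")
```

With these coefficients the relative risk equals 1 when glasses block no UV (p = 1) and falls as p decreases, which is the pattern the complete-mediation hypothesis predicts.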

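Large Sample Theory
Stack the score equations and $U_{\beta_0}(O; \psi)$:
\[ U(O; \psi) = \big[ U_{\beta_0}(O; \psi),\ U_{\beta_1}(O; \psi),\ U_{\gamma}(O; \psi),\ U_{\eta}(O; \psi),\ U_{\phi}(O; \psi) \big] \]
Under mild regularity conditions (Huber, 1964), with $\hat{\psi}$ the estimator and $\psi^*$ the true parameter value,
\[ \sqrt{n}\,(\hat{\psi} - \psi^*) \xrightarrow{D} \text{Normal}(0, \Sigma^*) \]
\[ \Sigma^* = E\left[ \frac{\partial U(O; \psi^*)}{\partial \psi} \right]^{-1} E\left[ U(O; \psi^*)\, U(O; \psi^*)^{\top} \right] \left( E\left[ \frac{\partial U(O; \psi^*)}{\partial \psi} \right]^{-1} \right)^{\top} \]
By the δ-method,
\[ \sqrt{n}\,\big(RR(p, m; \hat{\psi}) - RR(p, m; \psi^*)\big) \xrightarrow{D} N\left( 0,\ \frac{\partial RR(p, m; \psi^*)}{\partial \psi}^{\top} \Sigma^* \, \frac{\partial RR(p, m; \psi^*)}{\partial \psi} \right) \]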
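The δ-method step can be sketched numerically: approximate the gradient of RR with respect to the parameters by finite differences, then form the quadratic form with the covariance matrix of the estimator. In the sketch below the parameter vector is reduced to two hypothetical linear-predictor values, and `cov_psi` is a made-up covariance of the estimator itself (that is, Σ divided by n), not a quantity from the study.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def rr_from_psi(psi):
    """Toy RR: psi = (g0, g1), the two linear predictors at a fixed (p, m)."""
    g0, g1 = psi
    return expit(g1) / expit(g0)

def delta_method_variance(f, psi_hat, cov_psi, eps=1e-6):
    """Approximate Var[f(psi_hat)] as grad^T cov grad, using central differences."""
    k = len(psi_hat)
    grad = []
    for j in range(k):
        up, down = list(psi_hat), list(psi_hat)
        up[j] += eps
        down[j] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return sum(grad[i] * cov_psi[i][j] * grad[j]
               for i in range(k) for j in range(k))

psi_hat = [-1.8, -2.05]                  # hypothetical estimates
cov_psi = [[0.04, 0.01], [0.01, 0.05]]   # hypothetical covariance of psi_hat
var_rr = delta_method_variance(rr_from_psi, psi_hat, cov_psi)
se_rr = math.sqrt(var_rr)
print(f"RR={rr_from_psi(psi_hat):.3f}  SE={se_rr:.3f}")
```

In practice the covariance would come from the sandwich formula for the stacked estimating equations, and an analytic gradient could replace the finite differences.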

Analysis
Table 1: Characteristics of sample

Variable                                    No Eye-glass Use   Eye-glass Use
Number of participants                      830 (42%)          1125 (58%)
Cortical cataracts                          16.1%              11.6%
Sun exposure if glasses worn, M(1)          -                  0.06
Sun exposure if glasses never worn, M(0)    0.17 (0.11)        0.16 (0.11)
Age                                         73.5 (5.0)         72.7 (4.8)
Diabetic                                    17.4%              17.2%
Male                                        54.6%              39.9%
Black                                       30.7%              22.1%
Not high school graduate                    58.0%              45.6%
Job characteristics
  Worked over water                         1.7%               1.2%
  Worked outside on land                    41.1%              28.5%
  Worked inside                             38.9%              44.2%
  Worked as homemaker                       18.3%              26.1%

Analysis: Baron and Kenny's Method
Table 2: Logistic models of cataract development (coefficients as odds ratios).

Variable               Model 1   95% CI          Model 2   95% CI
Age                    1.17      (1.07, 1.28)    1.17      (1.07, 1.28)
Age spline term        0.89      (0.78, 1.03)    0.89      (0.78, 1.03)
Diabetic               1.43      (1.02, 2.00)    1.43      (1.02, 2.00)
Male                   0.64      (0.45, 0.92)    0.63      (0.44, 0.91)
Black                  4.23      (3.13, 5.72)    4.22      (3.12, 5.71)
Not high school grad   1.10      (0.81, 1.48)    1.09      (0.81, 1.48)
Worked over water      Reference                 Reference
Worked outside         0.50      (0.20, 1.27)    0.52      (0.20, 1.32)
Worked inside          0.64      (0.25, 1.66)    0.70      (0.26, 1.90)
Worked as homemaker    0.54      (0.19, 1.51)    0.57      (0.20, 1.61)
Glasses                0.74      (0.56, 0.99)    0.78      (0.57, 1.09)
UV                     -         -               1.80      (0.30, 10.76)

Analysis
Figure 1: Estimates of P[Y(0) = 1 | P = p, M(0) = m]: probabilities of developing cataracts under no glasses within strata. [x-axis: MSYs under no glasses, M(0); y-axis: proportion of UV penetrating glasses, P.]

Analysis: Relative Risk
[Figure: relative risk RR(p, m). x-axis: MSYs under no glasses, M(0); y-axis: proportion of UV penetrating glasses, P.]

Analysis: Relative Risk P value
[Figure: P values for the relative risk. x-axis: MSYs under no glasses, M(0); y-axis: proportion of UV penetrating glasses, P; legend scale from 0.0 to 1.0.]

Analysis: Relative Risk
Figure 2: RR(p,m) and P vs. M(0) among glasses wearers; Red = Non-sunglasses users. [x-axis: MSYs under no glasses, M(0); y-axis: proportion of UV penetrating glasses, P.]

Analysis: Relative Risk
Figure 3: RR(p,m) and P vs. M(0) among glasses wearers; Red = Black. [x-axis: MSYs under no glasses, M(0); y-axis: proportion of UV penetrating glasses, P.]

Discussion
The RR is approximately 1 when P = 1.
The RR decreases as P decreases, suggesting a protective effect of glasses.
The decrease in the RR is not monotone; this might be due to differences in principal strata.
Sunglass users have higher values of P.
These results are consistent with mediation.

Discussion
The traditional method of analysis provided only marginal evidence of mediation.
Our causal estimand provides a richer analysis.

Discussion
This work presents how one might develop, identify, and estimate a scientifically meaningful causal estimand.
The results suggest that encouraging people to wear eyeglasses in mid-life can reduce cataracts later in life.

Figure 4: Boxplots of propensity of wearing glasses. [Groups: did not wear glasses, wore glasses; axis: P(Wore Glasses | X), from 0.0 to 1.0.]

Analysis: Relative Risk
Figure 5: RR(p,m) and P vs. M(0) among glasses wearers; Red = Cataracts. [x-axis: MSYs under no glasses, M(0); y-axis: proportion of UV penetrating glasses, P.]

Table 3: Characteristics of sample after weighting by estimated probability of observed glasses use.

Variable                                    No Eye-glass Use   Eye-glass Use
Number of participants                      830 (42%)          1125 (58%)
Cortical cataracts                          15.0%              12.2%
Sun exposure if glasses never worn, M(0)    0.17               0.17
Age                                         73.0               73.0
Diabetic                                    17.3%              17.4%
Male                                        46.3%              46.1%
Black                                       25.5%              25.5%
Not high school graduate                    51.1%              50.9%
Job characteristics
  Worked over water                         1.4%               1.4%
  Worked outside on land                    33.4%              33.4%
  Worked inside                             42.2%              42.3%
  Worked as homemaker                       23.1%              22.9%
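The covariate balance in Table 3 is what weighting by the probability of the observed glasses use is meant to produce. The sketch below assumes the weights are the inverse of that probability (1/e for wearers, 1/(1 − e) for non-wearers, with e the propensity score); the table caption does not spell out the exact weight, and the eight-person data set is a toy example with known propensities.

```python
def ipw_weights(z, e):
    """Inverse of the probability of the observed treatment for each person."""
    return [1.0 / ei if zi == 1 else 1.0 / (1.0 - ei) for zi, ei in zip(z, e)]

def weighted_mean(x, w, keep):
    """Weighted mean of x over the units flagged in keep."""
    num = sum(xi * wi for xi, wi, k in zip(x, w, keep) if k)
    den = sum(wi for wi, k in zip(w, keep) if k)
    return num / den

# Toy data (assumption): binary covariate x drives glasses use z.
x = [1, 1, 1, 1, 0, 0, 0, 0]
z = [1, 1, 1, 0, 1, 0, 0, 0]
e = [0.75 if xi == 1 else 0.25 for xi in x]   # known propensity P(Z=1 | x)

w = ipw_weights(z, e)
ones = [1.0] * len(x)
treated = [zi == 1 for zi in z]
untreated = [zi == 0 for zi in z]

raw_treated = weighted_mean(x, ones, treated)
raw_untreated = weighted_mean(x, ones, untreated)
bal_treated = weighted_mean(x, w, treated)
bal_untreated = weighted_mean(x, w, untreated)
print(raw_treated, raw_untreated, bal_treated, bal_untreated)
```

Before weighting the covariate mean differs between arms (0.75 vs. 0.25); after weighting both arms match the full-sample mean of 0.5, mirroring the way the two columns of Table 3 line up.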

Figure 6: Histogram of M(0). [x-axis: MSY exposure from ages 31-34 under no glasses, M(0); y-axis: frequency.]

Figure 7: Scatterplot of M(1) vs. M(0) among participants who wore glasses; jitter added. [x-axis: MSYs under no glasses, M(0); y-axis: MSYs under glasses, M(1).]

Table 4: Results from logistic model of outdoor glasses use at age 31.

Variable                  Estimate   95% CI
Intercept                 2.82       (-1.15, 6.80)
Age                       -0.03      (-0.09, 0.02)
Age spline term           0.00       (-0.09, 0.09)
Diabetic                  0.12       (-0.13, 0.36)
Male                      -0.53      (-0.78, -0.29)
Black                     0.39       (0.16, 0.61)
Not high school grad      -0.39      (-0.58, -0.19)
Worked over water         Reference
Worked outside            -0.44      (-1.24, 0.35)
Worked inside             -0.25      (-1.10, 0.59)
Worked as homemaker       -0.26      (-1.11, 0.60)
UV                        3.96       (-1.72, 9.63)
UV cubic spline term 1    -24.70     (-52.11, 2.72)
UV cubic spline term 2    59.02      (-3.02, 121.07)

Table 5: Results from Beta regression of P = M(1)/M(0).

Variable                  Estimate   95% CI
Intercept                 -0.56      (-2.98, 1.86)
Age                       -0.01      (-0.04, 0.03)
Age spline term           0.02       (-0.04, 0.08)
Diabetic                  -0.02      (-0.17, 0.13)
Male                      0.17       (0.02, 0.33)
Black                     -0.09      (-0.24, 0.06)
Not high school grad      0.02       (-0.10, 0.14)
Worked over water         Reference
Worked outside            -0.08      (-0.62, 0.47)
Worked inside             0.18       (-0.38, 0.75)
Worked as homemaker       0.02       (-0.55, 0.58)
UV                        4.40       (0.87, 7.93)
UV cubic spline term 1    -10.61     (-27.69, 6.47)
UV cubic spline term 2    19.72      (-18.94, 58.38)
φ                         3.12       (2.89, 3.35)

Analysis: Relative Risk
Figure 8: RR(p,m). [x-axis: MSYs under no glasses, M(0); y-axis: proportion of UV penetrating glasses, P; legend: RR.]