OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores



Outcome Regression and Propensity Scores (Chapter 15)

Outline
15.1 Outcome regression
15.2 Propensity scores
15.3 Propensity stratification and standardization
15.4 Propensity matching
15.5 Propensity models, structural models, predictive models

15.1 Outcome regression

Recall that in Chapter 14 we specified the structural model
$E[Y^a - Y^{a=0} \mid A = a, L] = \beta_1 a + \beta_2 a L$
This model and the g-estimation approach to inference do not require modeling the L-Y association.
Thus g-estimation is protected from bias arising from mis-specifying the L-Y association.
Suppose now that we are willing to model the L-Y association within levels of A.

15.1 Outcome regression

Consider the marginal structural model
$E[Y^a \mid L] = \beta_0 + \beta_1 a + \beta_2 a L + \beta_3 L$
The effect of quitting smoking on weight gain in each stratum of L is a function of $\beta_1$ and $\beta_2$.
The parameter $\beta_3$ is often (e.g., in a linear models course) referred to as the "main effect" of L.
The term "effect" is misleading because $\beta_3$ may not have an interpretation as the causal effect of L: e.g., we have not indexed the potential outcomes on the left side by the levels of L, there may be confounding, etc.
$\beta_3$ simply quantifies how the mean of the counterfactual $Y^{a=0}$ varies as a function of L.

15.1 Outcome regression

Because our goal is inference about the causal effect of A on Y, which is a function of $\beta_1$ and $\beta_2$, the parameters $\beta_0$ and $\beta_3$ are nuisance parameters.
An advantage of g-estimation is that we need not estimate the nuisance parameter $\beta_3$ (Fine Point 15.1).
The parameters of the structural model above can be consistently estimated by the outcome regression model
$E[Y \mid A, C = 0, L] = \alpha_0 + \alpha_1 A + \alpha_2 A L + \alpha_3 L$
assuming L is sufficient to adjust for confounding (and for selection bias due to dropout C).
Obtain $\hat\alpha_1 = \hat\beta_1$ and $\hat\alpha_2 = \hat\beta_2$.

15.1 Outcome regression

These estimates can be interpreted as conditional causal effects.
E.g., the effect estimate for those smoking 5 cigarettes/day is
$\hat{E}[Y \mid A = 1, C = 0, L = 5] - \hat{E}[Y \mid A = 0, C = 0, L = 5] = \hat\beta_1 + 5\hat\beta_2$
E.g., the effect estimate for those smoking 40 cigarettes/day is $\hat\beta_1 + 40\hat\beta_2$.
Outcome regression does not readily yield a marginal causal effect estimate unless we fit
$E[Y \mid A, C = 0, L] = \alpha_0 + \alpha_1 A + \alpha_3 L$
which is likely mis-specified; for the NHEFS data, $\hat\alpha_1 = 3.5$ (95% CI: 2.6, 4.3).
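As a short worked step, the conditional effect at a given smoking intensity follows by subtracting the two fitted means of the outcome regression model with the product term. The lowercase $l$ for a specific value of L is notation introduced here for illustration.

```latex
\begin{align*}
E[Y \mid A = 1, C = 0, L = l] &= \alpha_0 + \alpha_1 + \alpha_2 l + \alpha_3 l \\
E[Y \mid A = 0, C = 0, L = l] &= \alpha_0 + \alpha_3 l \\
E[Y \mid A = 1, C = 0, L = l] - E[Y \mid A = 0, C = 0, L = l] &= \alpha_1 + \alpha_2 l
\end{align*}
```

Setting $l = 5$ or $l = 40$ gives the two conditional effects quoted on the slide; the nuisance terms $\alpha_0$ and $\alpha_3 l$ cancel in the difference.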

15.2 Propensity Scores

Let $p(L) = \Pr[A = 1 \mid L]$; note that p(L) is close to 0 for individuals with a low probability of receiving treatment and close to 1 for those with a high probability of receiving treatment.
I.e., p(L) measures the propensity of individuals to receive treatment given the information available in the covariates; hence the name propensity score.
In a randomized trial where assignment to treatment or no treatment is equally likely, p(L) = 0.5.
In observational studies the treatment assignment/selection mechanism is unknown; therefore p(L) is unknown.

15.2 Propensity Scores

In IP weighting and g-estimation we estimated the propensity scores p(L) by logistic regression (code: Program 15.2):
$\mathrm{logit} \Pr[A = 1 \mid L] = \alpha_0 + \alpha_1 L$
Under this model, the lowest estimated propensity score was 0.053 and the highest was 0.793.
Figure 15.1 shows the distribution of the estimated propensity score in quitters A = 1 (top) and nonquitters A = 0 (bottom).
[Figure 15.1: distribution of the estimated propensity score in quitters (top) and nonquitters (bottom)]
Note: here we only consider propensity scores for dichotomous treatments. Propensity score methods, other than IP weighting and g-estimation and other related doubly-robust estimators, are difficult to generalize to non-dichotomous treatments.
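A minimal sketch of how a propensity score could be estimated with a logistic model, assuming a single covariate and simulated data. This is not the course's Program 15.2; the function names, the simulated data, and the true coefficients (-1, 1) are all hypothetical choices for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(l_values, a_values, n_iter=25):
    """Newton-Raphson fit of logit Pr[A=1|L] = alpha0 + alpha1 * L (one covariate)."""
    a0, a1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for l, a in zip(l_values, a_values):
            p = sigmoid(a0 + a1 * l)
            w = p * (1.0 - p)
            g0 += a - p            # score for the intercept
            g1 += (a - p) * l      # score for the slope
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * l
            h11 += w * l * l
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^(-1) times the score
        a0 += (h11 * g0 - h01 * g1) / det
        a1 += (h00 * g1 - h01 * g0) / det
    return a0, a1

# Simulated (hypothetical) data: one standardized covariate, true alpha = (-1, 1)
rng = random.Random(0)
l_obs = [rng.gauss(0.0, 1.0) for _ in range(2000)]
a_obs = [1 if rng.random() < sigmoid(-1.0 + 1.0 * l) else 0 for l in l_obs]

alpha0_hat, alpha1_hat = fit_logistic(l_obs, a_obs)
ps_hat = [sigmoid(alpha0_hat + alpha1_hat * l) for l in l_obs]

n_treated = sum(a_obs)
mean_ps_treated = sum(p for p, a in zip(ps_hat, a_obs) if a == 1) / n_treated
mean_ps_untreated = sum(p for p, a in zip(ps_hat, a_obs) if a == 0) / (len(a_obs) - n_treated)
print(round(alpha0_hat, 2), round(alpha1_hat, 2))
print(round(mean_ps_treated, 3), round(mean_ps_untreated, 3))
```

In this simulation the treated have a higher average estimated propensity score than the untreated, mirroring the pattern reported for quitters and nonquitters. In practice one would use an established routine for logistic regression with the full covariate vector L.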

15.2 Propensity Scores

As expected, those who quit smoking had, on average, a greater estimated probability of quitting (0.312) than those who did not quit (0.245).
Individuals with the same propensity score p(L) will not necessarily have the same covariates L.
E.g., two individuals may have p(L) = 0.2 but different levels of smoking intensity and exercise, yet they may be equally likely to quit smoking given all variables in L.
If we consider all individuals in the super-population with the same value of p(L), this group may have different values of L, but the distribution of L will be the same in the treated and the untreated (HW):
$A \perp\!\!\!\perp L \mid p(L)$
Thus the propensity score is an example of a balancing score (Technical Point 15.1).
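The balancing property can be checked exactly on a toy example. The sketch below assumes a made-up joint distribution for two binary covariates and made-up propensity scores in which two different covariate patterns share the value 1/5; none of these numbers come from the NHEFS data.

```python
from fractions import Fraction as F

# Hypothetical distribution of two binary covariates L = (L1, L2)
pr_l = {(0, 0): F(4, 10), (0, 1): F(3, 10), (1, 0): F(1, 10), (1, 1): F(2, 10)}
# Hypothetical propensity scores; (0, 1) and (1, 0) share the value 1/5
ps = {(0, 0): F(1, 10), (0, 1): F(1, 5), (1, 0): F(1, 5), (1, 1): F(1, 2)}

def dist_of_l_given(a, r):
    """Pr[L = l | A = a, p(L) = r], computed exactly by Bayes' rule."""
    stratum = [l for l in pr_l if ps[l] == r]
    joint = {l: pr_l[l] * (ps[l] if a == 1 else 1 - ps[l]) for l in stratum}
    total = sum(joint.values())
    return {l: joint[l] / total for l in stratum}

treated = dist_of_l_given(1, F(1, 5))
untreated = dist_of_l_given(0, F(1, 5))
print(treated)
print(untreated)
```

Within the stratum p(L) = 1/5 the two covariate patterns differ, yet their relative frequencies are identical among the treated and the untreated, because the common factor p(L) (or 1 - p(L)) cancels in Bayes' rule.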

15.2 Propensity Scores

Key result about propensity scores, derived by Rosenbaum and Rubin (1983): if $Y^a \perp\!\!\!\perp A \mid L$, then $Y^a \perp\!\!\!\perp A \mid p(L)$.
I.e., if L is sufficient to adjust for confounding and selection bias, then p(L) is sufficient too.
Figure 15.2 depicts the propensity score for the setting represented in Figure 7.1; p(L) is an intermediate between L and A, with a deterministic arrow from L to p(L).
[Figure 15.2: causal diagram with nodes L, p(L), A, and Y]

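15.2 Propensity Scores

Proof of the key result that if $Y^a \perp\!\!\!\perp A \mid L$, then $Y^a \perp\!\!\!\perp A \mid p(L)$:
$\Pr[A = 1 \mid Y^a = y, p(L) = r]$
$= \sum_l \Pr[A = 1, L = l \mid Y^a = y, p(L) = r]$
$= \sum_l \Pr[A = 1 \mid L = l, Y^a = y, p(L) = r] \Pr[L = l \mid Y^a = y, p(L) = r]$
$= \sum_l \Pr[A = 1 \mid L = l] \Pr[L = l \mid Y^a = y, p(L) = r]$
$= \sum_l p(l) \Pr[L = l \mid Y^a = y, p(L) = r] = r$
The third equality uses $Y^a \perp\!\!\!\perp A \mid L$ and the fact that p(L) is a function of L; the last holds because only values l with p(l) = r have positive probability given p(L) = r.
Similarly,
$\Pr[A = 1 \mid p(L) = r] = \sum_l p(l) \Pr[L = l \mid p(L) = r] = r$
Since both probabilities equal r, A is independent of $Y^a$ given p(L).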

15.3 Propensity stratification and standardization

Assume $Y^a \perp\!\!\!\perp A \mid L$, which implies $Y^a \perp\!\!\!\perp A \mid p(L)$.
Therefore we can identify causal effects within strata defined by p(L):
$E[Y^{a=1} - Y^{a=0} \mid p(L) = s] = E[Y \mid A = 1, p(L) = s] - E[Y \mid A = 0, p(L) = s]$
If L contains at least one continuous covariate or more than a few categorical covariates, then p(L) may take on many values between 0 and 1.
For the NHEFS example, individual 1089 was the only one with their particular estimated value of p(L); therefore, we cannot estimate the causal effect among individuals with that value of p(L) by comparing the treated and the untreated with that particular value.

15.3 Propensity stratification and standardization

In practice, PS stratification is carried out within strata that contain individuals with similar, but not identical, values of p(L); deciles of $\hat{p}(L)$ are a popular choice.
For the NHEFS data, there are approximately 160 individuals per decile, with wide 95% CIs.
Fewer strata (e.g., quintiles) may increase precision, but exchangeability within strata is then less likely to hold; i.e., $Y^a \perp\!\!\!\perp A \mid p(L)$ does not imply $Y^a \perp\!\!\!\perp A \mid c_1 < p(L) < c_2$.
Lunceford and Davidian (2004) show that the PS stratification-based estimator performs poorly in finite samples compared to IPW estimators.
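A minimal sketch of stratification on the estimated propensity score, assuming the scores have already been estimated. The function name `stratified_effects` and the eight-person data set are hypothetical; the function simply compares mean outcomes of the treated and the untreated within equal-sized strata of the sorted scores.

```python
def stratified_effects(ps, a, y, n_strata=10):
    """Difference in mean outcome (treated minus untreated) within PS strata."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    n = len(order)
    effects = []
    for k in range(n_strata):
        # k-th block of the individuals sorted by estimated propensity score
        idx = order[k * n // n_strata:(k + 1) * n // n_strata]
        y1 = [y[i] for i in idx if a[i] == 1]
        y0 = [y[i] for i in idx if a[i] == 0]
        if y1 and y0:
            effects.append(sum(y1) / len(y1) - sum(y0) / len(y0))
        else:
            effects.append(None)  # no treated or no untreated in this stratum
    return effects

# Toy data (hypothetical): 8 individuals split into 2 strata
ps_toy = [0.10, 0.15, 0.20, 0.25, 0.60, 0.65, 0.70, 0.75]
a_toy = [0, 1, 0, 1, 0, 1, 1, 0]
y_toy = [1, 3, 2, 4, 5, 9, 8, 4]
effects = stratified_effects(ps_toy, a_toy, y_toy, n_strata=2)
print(effects)
```

With deciles, `n_strata=10` would be used instead. A stratum returning `None` signals that the comparison cannot be made there, which is the practical face of a positivity problem.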

15.3 Propensity stratification and standardization

Alternatively, consider the outcome regression model
$E[Y \mid A, C = 0, p(L)] = \alpha_0 + \alpha_1 A + \alpha_2 p(L)$
In practice p(L) is unknown and is first estimated by, say, logistic regression.
For the NHEFS data, the estimated effect of quitting smoking on weight gain is 3.6 kg (95% CI: 2.7, 4.5).
The validity of inference from the outcome regression model above depends on correct specification of the relationship between p(L) and the mean outcome Y (e.g., in the model above we assume it is linear and there is no interaction between A and p(L)); IP weighting and g-estimation are agnostic about this relationship.
If an interaction term between A and p(L) is included, then estimating the unconditional causal effect $E[Y^{a=1}] - E[Y^{a=0}]$ requires standardization as in Chapter 13, except that here we would standardize over the distribution of p(L) instead of the distribution of L.
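The standardization step for a model with an interaction can be written out as follows. This is a sketch under stated assumptions: the coefficient $\alpha_3$ for the product term is a label introduced here (the slide's model has no such term), and the average is taken over the $n$ individuals of the study population using their estimated scores $\hat{p}(L_i)$.

```latex
\begin{align*}
E[Y \mid A, C = 0, p(L)] &= \alpha_0 + \alpha_1 A + \alpha_2\, p(L) + \alpha_3\, A\, p(L) \\
\hat{E}[Y^{a}] &= \frac{1}{n} \sum_{i=1}^{n} \hat{E}[Y \mid A = a, C = 0, \hat{p}(L_i)] \\
\hat{E}[Y^{a=1}] - \hat{E}[Y^{a=0}] &= \hat\alpha_1 + \hat\alpha_3 \cdot \frac{1}{n} \sum_{i=1}^{n} \hat{p}(L_i)
\end{align*}
```

Without the interaction ($\alpha_3 = 0$) the last line reduces to $\hat\alpha_1$, which is why no standardization is needed for the simpler model.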

15.4 Propensity Matching

There are many forms of propensity matching.
The general idea is to form a matched population in which the treated and the untreated are exchangeable because they have the same distribution of p(L).
For example, one can match the untreated to the treated: each treated individual is paired with one (or more) untreated individuals with the same propensity score value.
The subset of the original population comprised of the treated-untreated pairs (or sets) is the matched population.
Under exchangeability and positivity given p(L), association estimators in general will be consistent for causal effects in the matched population; e.g., the observed risk ratio will be consistent for the causal risk ratio in the matched population.

15.4 Propensity Matching

Again, it is often the case that no two individuals in a data set have the same (estimated) propensity score.
Therefore individuals are matched if their propensity scores are close according to some definition of closeness.
For example, treated individual 1089 might be matched to individual 1088, whose estimated PS is close to theirs.
Individuals for whom no other individual is close in terms of PS may be excluded; thus the matched population and the target superpopulation may be different.
There are numerous ways of defining closeness, and detailed descriptions of these definitions are not in the text.

15.4 Propensity Matching

Defining closeness in propensity matching entails a bias-variance tradeoff.
If the closeness criteria are too loose, individuals with relatively different values of p(L) will be matched to each other, the distribution of p(L) will differ between the treated and the untreated in the matched population, and exchangeability will not hold.
Conversely, if the closeness criteria are too tight and many individuals are excluded by the matching procedure, there will be approximate exchangeability but the effect estimate will be less precise.
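One common way to define closeness is a caliper, i.e., a maximum allowed difference in estimated propensity scores. The sketch below is one illustrative choice among many (greedy 1:1 nearest-neighbour matching without replacement); the function name `caliper_match` and the toy scores are hypothetical and not taken from the text.

```python
def caliper_match(ps_treated, ps_untreated, caliper):
    """Greedy 1:1 nearest-neighbour matching on the PS, without replacement.

    Returns (treated_index, untreated_index) pairs; treated individuals with
    no untreated individual within the caliper are left unmatched.
    """
    available = dict(enumerate(ps_untreated))
    pairs = []
    for i, p in enumerate(ps_treated):
        if not available:
            break
        # closest remaining untreated individual
        j = min(available, key=lambda k: abs(available[k] - p))
        if abs(available[j] - p) <= caliper:
            pairs.append((i, j))
            del available[j]
    return pairs

ps_t = [0.30, 0.62, 0.90]          # treated (hypothetical scores)
ps_u = [0.10, 0.28, 0.60, 0.33]    # untreated (hypothetical scores)
tight = caliper_match(ps_t, ps_u, caliper=0.05)
loose = caliper_match(ps_t, ps_u, caliper=0.60)
print(tight)
print(loose)
```

With the tight caliper the third treated individual (score 0.90) is dropped, so fewer pairs contribute to the estimate; with the loose caliper that individual is retained but paired with an untreated individual whose score is 0.33, which undermines exchangeability within the pair.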

15.4 Propensity Matching

In theory, propensity matching can be used to estimate the causal effect in a well characterized target population.
E.g., when matching each treated individual with one or more untreated individuals and excluding the unmatched untreated, one is estimating the effect in the treated (cf. Fine Point 15.2).
In practice, however, propensity matching may yield an effect estimate in a hard-to-describe subset of the study population.
E.g., under a given definition of closeness, some treated individuals cannot be matched with any untreated individuals and thus are excluded from the analysis; the effect estimate then corresponds to the subset of the population with values of the estimated PS that have successful matches.

15.4 Propensity Matching

That PS matching forces investigators to restrict the analysis to treatment groups with overlapping distributions of the estimated PS is a strength of the method.
However, interpretation can be difficult.
E.g., suppose that, based on Figure 15.1, we conclude we can only estimate the effect of smoking cessation for individuals with an estimated PS below some threshold. Who are these people?
Restriction based on real-world variables is easier to interpret.
E.g., the 2 individuals with estimated PS > 0.67 were the only ones in the study who were over age 50 and had smoked for less than 10 years; we could exclude them and explain that our effect estimate only applies to smokers under age 50 and to smokers 50 and over who had smoked for at least 10 years.

15.5 Propensity models, structural models, predictive models

Recap: in Part II of Hernán and Robins we consider propensity models and structural models.
Propensity models are models for the probability of treatment A given the variables L used to try to achieve conditional exchangeability.
They are used for matching and stratification in this chapter, for IP weighting in Chapter 12, and for g-estimation in Chapter 14.
The parameters of a propensity model are nuisance parameters.

15.5 Propensity models, structural models, predictive models

Structural models describe the relation between the treatment A and some component of the distribution (e.g., the mean) of the counterfactual outcome $Y^a$, either marginally or within levels of the variables L.
The parameters (coefficients) for treatment are not nuisance parameters; rather, they have a direct causal interpretation as the effect of treatment on the outcome.
Marginal structural models (MSMs) need not include effect modifiers unless they are of substantive interest.
Structural nested models (SNMs) require inclusion of effect modifiers, if they exist, for valid inference.

15.5 Propensity models, structural models, predictive models

Outcome regression can be used as a method for (i) causal inference or (ii) prediction.
As an example of (ii), a doctor may use a predictive model to identify individuals at high risk of disease; the parameters of these predictive models do not necessarily have a causal interpretation.
The dual use of outcome regression for causal inference and prediction is potentially confusing.
E.g., consider variable selection and the M-bias example in Figure 7.4.
[Figure 7.4: causal diagram with nodes U1, U2, L, A, and Y]

15.5 Propensity models, structural models, predictive models

Here L is predictive of the outcome, but adjusting for it introduces bias when drawing causal inferences.
Note also that propensity models need not predict treatment A well; they just need to include covariates L that guarantee exchangeability.
Including covariates that are predictive of treatment A but not necessary for exchangeability may increase the variance.
E.g., consider a two-site study where p(L) = 0.01 at one site and p(L) = 0.99 at the other site. Suppose site has no effect on the outcome, so there is no need to condition on or adjust for site. However, suppose we adjust for site by standardization. Then the variance will be very large because within each site one of the two treatment groups will be very small.
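The two-site example can be made concrete with a back-of-the-envelope variance calculation. The sketch assumes 1000 individuals per site, an outcome variance of 1 in every group, group sizes equal to their expected values, and equal standardization weights of 1/2 per site; all of these numbers are assumptions for illustration only.

```python
def var_diff_means(n1, n0, sigma2=1.0):
    """Variance of a difference in sample means with group sizes n1 and n0."""
    return sigma2 * (1.0 / n1 + 1.0 / n0)

# Site 1: p(L) = 0.01 -> about 10 treated, 990 untreated
# Site 2: p(L) = 0.99 -> about 990 treated, 10 untreated
var_site1 = var_diff_means(10, 990)
var_site2 = var_diff_means(990, 10)

# Standardized estimate: average of the two site-specific differences,
# each site weighted by its share of the population (1/2 each)
var_standardized = 0.5 ** 2 * var_site1 + 0.5 ** 2 * var_site2

# Unadjusted estimate: pool both sites (1000 treated, 1000 untreated in total)
var_unadjusted = var_diff_means(10 + 990, 990 + 10)

ratio = var_standardized / var_unadjusted
print(round(var_standardized, 4), round(var_unadjusted, 4), round(ratio, 1))
```

Under these assumptions, adjusting for site inflates the variance of the effect estimate by a factor of roughly 25, even though site is not needed for exchangeability.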



More information

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous Y : X M Y a=0 = Y a a m = Y a cum (a) : Y a = Y a=0 + cum (a) an unknown parameter. = 0, Y a = Y a=0 = Y for all subjects Rank

More information
