Estimating the Marginal Odds Ratio in Observational Studies
Travis Loux, Christiana Drake
Department of Statistics, University of California, Davis
June 20, 2011
Outline
- The Counterfactual Model
- Odds Ratios Defined
- Estimation of Odds Ratios
- The Propensity Score
- Matching on the Propensity Score
- Weighting by the Propensity Score
- Simulations
The Counterfactual Model: Definitions
We will use the following notation:
- $Y_1$: the potential binary response if the unit is exposed
- $Y_0$: the potential binary response if the unit is not exposed
- $Z$: the binary exposure (or treatment) of interest
- $X$: covariates associated with exposure and/or response
Though both $Y_1$ and $Y_0$ are well defined for every unit, we are able to observe only one of them. We call the observed response $Y$:
$$Y = Z Y_1 + (1 - Z) Y_0$$
The Counterfactual Model: A Population
A hypothetical population would then look like the following:

Unit   X_1    X_2    X_3    Z   Y_1   Y_0
1      0.09   1.80   1.86   0   1     1
2      0.51   0.25   1.62   1   0     1
3      0.48   0.69   0.35   1   0     0
4      1.48   1.76   0.47   0   0     1
5      0.55   0.50   0.82   1   1     0
...    ...    ...    ...    ... ...   ...

where the observed $Y$ for each unit (colored blue on the slide) is $Y_1$ when $Z = 1$ and $Y_0$ when $Z = 0$.
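As a small illustration of the relation $Y = Z Y_1 + (1 - Z) Y_0$, the sketch below recovers the observed response for the five listed units. The function name `observed_response` is only illustrative.

```python
def observed_response(z, y1, y0):
    """Return Y = Z*Y1 + (1 - Z)*Y0 for binary z, y1, y0."""
    return z * y1 + (1 - z) * y0

# (Z, Y1, Y0) for units 1 to 5 of the hypothetical population
units = [(0, 1, 1), (1, 0, 1), (1, 0, 0), (0, 0, 1), (1, 1, 0)]
observed = [observed_response(z, y1, y0) for z, y1, y0 in units]
print(observed)  # [1, 0, 0, 1, 1]
```

Only one of the two potential responses per unit enters `observed`; the other stays counterfactual.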
Odds Ratios: Marginal Odds Ratio
The marginal odds ratio can be obtained by comparing the odds of response in the population if everyone is exposed,
$$\mathrm{Odds}_{exp} = \frac{P(Y_1 = 1)}{P(Y_1 = 0)},$$
to the odds of response if everyone is not exposed,
$$\mathrm{Odds}_{unexp} = \frac{P(Y_0 = 1)}{P(Y_0 = 0)}.$$
Looking at the ratio $\mathrm{Odds}_{exp} / \mathrm{Odds}_{unexp}$, the marginal odds ratio is
$$\psi_{marg} = \frac{P(Y_1 = 1)\, P(Y_0 = 0)}{P(Y_1 = 0)\, P(Y_0 = 1)}$$
Odds Ratios: Crude Odds Ratio
The marginal odds ratio is often approximated by estimating the crude odds ratio:
$$\psi_{crude} = \frac{P(Y_1 = 1 \mid Z = 1)\, P(Y_0 = 0 \mid Z = 0)}{P(Y_1 = 0 \mid Z = 1)\, P(Y_0 = 1 \mid Z = 0)}$$
When confounding is present,
$$P(Y_1 = 1 \mid Z = 1) \neq P(Y_1 = 1), \qquad P(Y_0 = 1 \mid Z = 0) \neq P(Y_0 = 1),$$
so the crude and marginal odds ratios may take different values.
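The gap between the crude and marginal odds ratios can be seen in a toy population where both potential responses are known. This is a minimal sketch: the counts are hypothetical, chosen so that high-risk units are mostly exposed, and the helper names (`make_units`, `marginal_or`, `crude_or`) are illustrative.

```python
def odds_ratio(p1, p0):
    """Odds ratio comparing response probability p1 to p0."""
    return p1 * (1 - p0) / ((1 - p1) * p0)

def make_units(n, z, n_y1, n_y0):
    """n units with exposure z; the first n_y1 have Y1 = 1, the first n_y0 have Y0 = 1."""
    return [(z, int(i < n_y1), int(i < n_y0)) for i in range(n)]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def marginal_or(units):
    # Uses both potential responses of every unit.
    return odds_ratio(mean(y1 for _, y1, _ in units),
                      mean(y0 for _, _, y0 in units))

def crude_or(units):
    # Uses only what is observed: Y1 among exposed, Y0 among unexposed.
    return odds_ratio(mean(y1 for z, y1, _ in units if z == 1),
                      mean(y0 for z, _, y0 in units if z == 0))

# Hypothetical population: a high-risk group (mostly exposed)
# followed by a low-risk group (mostly unexposed).
population = (make_units(80, 1, 64, 48) + make_units(20, 0, 16, 12)
              + make_units(20, 1, 8, 4) + make_units(80, 0, 32, 16))

print(marginal_or(population), crude_or(population))
```

With these counts the marginal odds ratio is 2.25 while the crude odds ratio is about 6.61, because exposure is concentrated in the high-risk group.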
Odds Ratios: Conditional Odds Ratio
The conditional odds ratio is defined as
$$\psi_{cond}(x) = \frac{P(Y = 1 \mid Z = 1, X = x)\, P(Y = 0 \mid Z = 0, X = x)}{P(Y = 0 \mid Z = 1, X = x)\, P(Y = 1 \mid Z = 0, X = x)}$$
With the assumption of strongly ignorable treatment assignment, i.e. $(Y_1, Y_0) \perp Z \mid X$ and $0 < P(Z = 1 \mid X) < 1$, we can simplify:
$$\psi_{cond}(x) = \frac{P(Y_1 = 1 \mid X = x)\, P(Y_0 = 0 \mid X = x)}{P(Y_1 = 0 \mid X = x)\, P(Y_0 = 1 \mid X = x)}$$
Odds Ratios: Non-collapsibility
In the linear model $E(Y \mid X, Z) = \beta^T X + \delta Z$, the conditional effect of exposure is equal to the marginal effect:
$$E(Y_1 \mid X) - E(Y_0 \mid X) = (\beta^T X + \delta) - (\beta^T X) = \delta$$
$$E(Y_1) - E(Y_0) = E_X(\beta^T X + \delta) - E_X(\beta^T X) = (\beta^T \mu_X + \delta) - (\beta^T \mu_X) = \delta$$
- The linear model is collapsible
- In the logistic model, this property does not hold
- The difference between the marginal and conditional effects means estimators are needed for each
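A small numerical check can make the failure of collapsibility in the logistic model concrete. The sketch below assumes a single binary covariate that is independent of exposure (so there is no confounding) and hypothetical coefficient values; none of these numbers come from the slides.

```python
import math

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

def odds(p):
    return p / (1.0 - p)

# Hypothetical logistic model: logit P(Y_z = 1 | X = x) = beta0 + beta*x + alpha*z
beta0, beta, alpha = -1.0, 2.0, math.log(3.0)
x_values, x_probs = [0, 1], [0.5, 0.5]  # X ~ Bernoulli(0.5), independent of Z

# Average the conditional response probabilities over the distribution of X.
p1 = sum(w * expit(beta0 + beta * x + alpha) for x, w in zip(x_values, x_probs))
p0 = sum(w * expit(beta0 + beta * x) for x, w in zip(x_values, x_probs))

conditional_or = math.exp(alpha)   # the same for every x
marginal_or = odds(p1) / odds(p0)  # odds ratio after averaging over X

print(conditional_or, marginal_or)
```

Here the conditional odds ratio is 3 at every value of the covariate, yet the marginal odds ratio is roughly 2.4, even though exposure and covariate are unrelated.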
Odds Ratios: Standard Estimation
If the data follow a logistic model with
$$P(Y = 1 \mid X = x, Z = z) = \frac{\exp\{\beta_0 + \sum_{j=1}^{p} \beta_j x_j + \alpha z\}}{1 + \exp\{\beta_0 + \sum_{j=1}^{p} \beta_j x_j + \alpha z\}}$$
the conditional odds ratio is constant across $X$ with $\psi_{cond}(x) = \psi_{cond} = e^{\alpha}$.
- Logistic regression will lead to an unbiased and asymptotically efficient estimator of $\psi_{cond}(x)$
- In order for the estimate to be unbiased, $X$ must contain all predictors of $Y$, not just the confounders (Gail, Wieand, and Piantadosi, 1984)
Odds Ratios: Standard Estimation
Subclassing observations based on covariates into multiple 2 × 2 tables, we can estimate the odds ratio by the Mantel-Haenszel (MH) estimator.
If the $k$th table takes the form

          Z = 1   Z = 0
Y = 1     a_k     b_k
Y = 0     c_k     d_k
Total                     n_k

the MH estimator is defined as
$$\hat\psi_{MH} = \frac{\sum_k a_k d_k / n_k}{\sum_k b_k c_k / n_k}$$
But what odds ratio are we estimating?
Odds Ratios: Standard Estimation
- If the covariates are constant within each subclass, $\hat\psi_{MH}$ estimates the conditional odds ratio, assuming $\psi_{cond}$ is constant
  Examples: subclassification defined by categorical covariates; perfect matching on covariates
- If covariates vary within the subclasses, $\hat\psi_{MH}$ will be biased due to non-collapsibility (e.g. Greenland, Robins, and Pearl, 1999)
  Extreme case: when data are summarized in one table, $\hat\psi_{MH}$ estimates the crude odds ratio
- Cochran (1968) shows that using 5 subclasses removes approximately 90% of bias due to confounding in linear models
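The MH estimator and its one-table extreme case can be sketched in a few lines. The two tables are hypothetical counts in which each subclass has odds ratio 8/3, and the function name `mantel_haenszel` is illustrative.

```python
def mantel_haenszel(tables):
    """MH odds ratio from 2x2 tables given as (a, b, c, d) tuples.

    a, b: counts with Y = 1 for Z = 1 and Z = 0
    c, d: counts with Y = 0 for Z = 1 and Z = 0
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Hypothetical subclasses, covariates constant within each one.
strata = [(64, 12, 16, 8), (8, 16, 12, 64)]

# Summarizing all data in a single table: cell-wise sums over the subclasses.
collapsed = [tuple(sum(cell) for cell in zip(*strata))]

print(mantel_haenszel(strata))     # common within-subclass odds ratio
print(mantel_haenszel(collapsed))  # crude odds ratio
```

With these counts the stratified estimate equals 8/3, while the single collapsed table gives about 6.61, the crude odds ratio.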
The Propensity Score: Definition and Consequences
The propensity score (Rosenbaum and Rubin, 1983) is the probability of exposure conditional on covariates:
$$e(x) = P(Z = 1 \mid X = x)$$
- Populations of exposed and unexposed with the same propensity score have the same distribution of observed covariates: $X \perp Z \mid e(X)$
- Under strongly ignorable treatment assignment on $X$, $(Y_1, Y_0) \perp Z \mid e(X)$ as well
- Common uses include
  - Subclassification
  - Weighting
  - Matching
  - As a covariate in a regression model
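As one illustration of subclassification, the sketch below sorts units by propensity score, cuts them into equally sized subclasses, and tabulates a 2 × 2 table per subclass. It assumes the propensity scores are already available; the records are made-up values and the function names are illustrative.

```python
def quantile_subclasses(records, n_classes=5):
    """Split (e, z, y) records into n_classes groups of near-equal size, ordered by e."""
    ordered = sorted(records, key=lambda r: r[0])
    n = len(ordered)
    bounds = [round(k * n / n_classes) for k in range(n_classes + 1)]
    return [ordered[bounds[k]:bounds[k + 1]] for k in range(n_classes)]

def two_by_two(group):
    """Return (a, b, c, d): Y=1/Z=1, Y=1/Z=0, Y=0/Z=1, Y=0/Z=0 counts."""
    a = sum(1 for _, z, y in group if z == 1 and y == 1)
    b = sum(1 for _, z, y in group if z == 0 and y == 1)
    c = sum(1 for _, z, y in group if z == 1 and y == 0)
    d = sum(1 for _, z, y in group if z == 0 and y == 0)
    return a, b, c, d

# Made-up records: (propensity score, exposure, observed response)
records = [(i / 10 + 0.05, i % 2, (i // 2) % 2) for i in range(10)]
classes = quantile_subclasses(records)
tables = [two_by_two(group) for group in classes]
print(tables)
```

The resulting tables have the layout needed by a stratified estimator such as Mantel-Haenszel.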
Matching on the Propensity Score
In 1-to-1 matching, the MH estimator simplifies. The $k$th table has the form

          Z = 1     Z = 0
Y = 1     a_k       b_k
Y = 0     1 - a_k   1 - b_k
Total     1         1         2

and the MH estimator becomes
$$\hat\psi_{MH} = \frac{\sum_k a_k (1 - b_k)}{\sum_k (1 - a_k)\, b_k}$$
Similar simplifications hold for case-control designs.
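In the matched-pair case the estimator reduces to a ratio of discordant-pair counts, which the following sketch computes. The pairs are hypothetical and `matched_pair_or` is an illustrative name.

```python
def matched_pair_or(pairs):
    """MH odds ratio for 1-to-1 matched pairs.

    Each pair is (a_k, b_k): the binary response of the exposed unit
    and of its matched unexposed unit.
    """
    num = sum(a * (1 - b) for a, b in pairs)  # exposed responds, unexposed does not
    den = sum((1 - a) * b for a, b in pairs)  # unexposed responds, exposed does not
    if den == 0:
        raise ValueError("no pairs with a_k = 0 and b_k = 1; estimate undefined")
    return num / den

# Hypothetical matched pairs
pairs = [(1, 0), (1, 0), (0, 1), (1, 1), (0, 0), (1, 0)]
print(matched_pair_or(pairs))  # 3.0
```

Concordant pairs, (1, 1) and (0, 0), contribute nothing to either sum.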
Matching on the Propensity Score
As the number of matched pairs increases, it can be shown that
$$\sqrt{n}\,(\hat\psi_{MH} - \psi) \to N(0, \sigma^2)$$
where
$$\psi = \lim_{n \to \infty} \frac{\sum_k p_{1k}(1 - p_{0k})}{\sum_k (1 - p_{1k})\, p_{0k}}$$
and
$$p_{zk} = P(Y = 1 \mid Z = z, e(X) = e_k) \quad \text{for } z = 0, 1$$
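Matching on the Propensity Score
Assuming a logistic outcome model,
$$p_{zk} = \int \frac{\exp\{\beta^T x + \alpha z\}}{1 + \exp\{\beta^T x + \alpha z\}}\, dF_{X \mid e(X)}(x \mid e_k)$$
so
$$\psi = \lim_{n \to \infty} \frac{\sum_k \int \frac{\exp\{\beta^T x + \alpha\}}{1 + \exp\{\beta^T x + \alpha\}}\, dF_{X \mid e(X)} \int \frac{1}{1 + \exp\{\beta^T x\}}\, dF_{X \mid e(X)}}{\sum_k \int \frac{1}{1 + \exp\{\beta^T x + \alpha\}}\, dF_{X \mid e(X)} \int \frac{\exp\{\beta^T x\}}{1 + \exp\{\beta^T x\}}\, dF_{X \mid e(X)}}$$
or
$$\psi = e^{\alpha} \lim_{n \to \infty} \frac{\sum_k \int \frac{\exp\{\beta^T x\}}{1 + \exp\{\beta^T x + \alpha\}}\, dF_{X \mid e(X)} \int \frac{1}{1 + \exp\{\beta^T x\}}\, dF_{X \mid e(X)}}{\sum_k \int \frac{1}{1 + \exp\{\beta^T x + \alpha\}}\, dF_{X \mid e(X)} \int \frac{\exp\{\beta^T x\}}{1 + \exp\{\beta^T x\}}\, dF_{X \mid e(X)}}$$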
Matching on the Propensity Score
When the exposure follows the model $\mathrm{logit}\, P(Z = 1 \mid X) = \gamma^T X$ and the outcome follows the model $\mathrm{logit}\, P(Y = 1 \mid X, Z) = \beta^T X + \alpha Z$, the value of $\psi$ depends on the relationship between $\gamma^T X$ and $\beta^T X$:
- If $\beta^T X = f(\gamma^T X)$ for some $f$, then $\psi = \psi_{cond}$
  $\beta^T X$ is constant in domains defined by $e(X)$
- If $\gamma^T X$ and $\beta^T X$ are independent, then $\psi = \psi_{marg}$
  Let $H = h(X) = \beta^T X$. Then $F_{H \mid e(X)} = F_H$
- If $0 < \rho(\gamma^T X, \beta^T X) < 1$, $\psi$ falls between $\psi_{cond}$ and $\psi_{marg}$
Weighting by the Propensity Score
A simple weighted estimate is obtained by
$$\hat\psi_{IPW1} = \frac{\hat\mu_1 (1 - \hat\mu_0)}{(1 - \hat\mu_1)\, \hat\mu_0}$$
where
$$\hat\mu_1 = \frac{1}{n} \sum_i \frac{Z_i Y_i}{e(X_i)} \quad \text{and} \quad \hat\mu_0 = \frac{1}{n} \sum_i \frac{(1 - Z_i) Y_i}{1 - e(X_i)}$$
- Unbiased for $\psi_{marg}$ if $e(X)$ is correctly specified (Lunceford and Davidian, 2004)
- Extremely sensitive to extreme propensity scores (near 0 or 1)
- Either of $\hat\mu_1$ and $\hat\mu_0$ can be greater than 1, so that $\hat\psi_{IPW1} < 0$
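The simple weighted estimate, and the way one extreme propensity score can push it below zero, can be sketched as follows. The four records are made-up values with known propensity scores, and `ipw1` is an illustrative name.

```python
def ipw1(records):
    """Simple inverse-propensity-weighted odds ratio.

    records: list of (z, y, e) with exposure z, observed response y,
    and propensity score e = P(Z = 1 | X).
    Returns (mu1_hat, mu0_hat, psi_hat).
    """
    n = len(records)
    mu1 = sum(z * y / e for z, y, e in records) / n
    mu0 = sum((1 - z) * y / (1 - e) for z, y, e in records) / n
    psi = mu1 * (1 - mu0) / ((1 - mu1) * mu0)
    return mu1, mu0, psi

# Made-up data: the first unit is exposed despite a propensity score of 0.1,
# so it receives a weight of 10.
records = [(1, 1, 0.1), (0, 0, 0.5), (0, 1, 0.5), (1, 0, 0.5)]
mu1, mu0, psi = ipw1(records)
print(mu1, mu0, psi)
```

Here the estimated response probability under exposure is 2.5, which is not a valid probability, and the resulting odds ratio estimate is negative.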
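Weighting by the Propensity Score
Improvements can be made by using
$$\hat\psi_{IPW2} = \frac{\tilde\mu_1 (1 - \tilde\mu_0)}{(1 - \tilde\mu_1)\, \tilde\mu_0}$$
where
$$\tilde\mu_1 = \left( \sum_i \frac{Z_i}{e(X_i)} \right)^{-1} \sum_i \frac{Z_i Y_i}{e(X_i)} \quad \text{and} \quad \tilde\mu_0 = \left( \sum_i \frac{1 - Z_i}{1 - e(X_i)} \right)^{-1} \sum_i \frac{(1 - Z_i) Y_i}{1 - e(X_i)}$$
- Remains unbiased
- Decreases sensitivity to propensity scores near 0 and 1
- Less variance in small samples
- Always positive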
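A minimal sketch of the normalized estimator is given below, using made-up records that include one extreme weight. Because each mean is a weighted average of binary responses, it stays between 0 and 1. The name `ipw2` is illustrative.

```python
def ipw2(records):
    """Normalized inverse-propensity-weighted odds ratio.

    records: list of (z, y, e) with exposure z, observed response y,
    and propensity score e = P(Z = 1 | X).
    Returns (mu1_tilde, mu0_tilde, psi_hat).
    """
    w1 = sum(z / e for z, _, e in records)              # total weight, exposed
    w0 = sum((1 - z) / (1 - e) for z, _, e in records)  # total weight, unexposed
    mu1 = sum(z * y / e for z, y, e in records) / w1
    mu0 = sum((1 - z) * y / (1 - e) for z, y, e in records) / w0
    psi = mu1 * (1 - mu0) / ((1 - mu1) * mu0)
    return mu1, mu0, psi

# Made-up data with one extreme weight (propensity score 0.1 for an exposed unit)
records = [(1, 1, 0.1), (0, 0, 0.5), (0, 1, 0.5), (1, 0, 0.5)]
mu1, mu0, psi = ipw2(records)
print(mu1, mu0, psi)
```

On these records the normalized means are about 0.83 and 0.5, giving an odds ratio estimate of 5.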
Simulations: Mechanics
- Created population of 2 million units:
  - Standard normal covariates $X = (X_1, X_2, X_3)$
  - Exposure $Z$: $\mathrm{logit}\, P(Z = 1 \mid X) = \gamma^T X$
  - Potential outcome $Y_1$: $\mathrm{logit}\, P(Y_1 = 1 \mid X, Z) = \beta^T X + \alpha Z$
  - Potential outcome $Y_0$: $\mathrm{logit}\, P(Y_0 = 1 \mid X, Z) = \beta^T X$
- Took 10,000 independent samples of 2,000 observations each
  - Crude and logistic estimation
  - Subclassified and matched on PS
  - Weighted by inverse propensity score, $0.005 < e(X) < 0.995$
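One replicate of such a simulation might be sketched as below. Several choices are assumptions rather than details from the slides: the values of $\gamma$, $\beta$ and $\alpha$, the use of the true propensity score instead of an estimated one, drawing $Y_1$ and $Y_0$ independently given $X$, sampling directly from the model rather than from a fixed population, and reading the bounds on $e(X)$ as a rule for dropping units outside them.

```python
import math
import random

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def draw_sample(rng, n, gamma, beta, alpha):
    """Draw n units as (e, z, y) from the exposure and outcome models."""
    sample = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]  # standard normal covariates
        e = expit(dot(gamma, x))                     # true propensity score
        z = int(rng.random() < e)
        y1 = int(rng.random() < expit(dot(beta, x) + alpha))
        y0 = int(rng.random() < expit(dot(beta, x)))
        sample.append((e, z, z * y1 + (1 - z) * y0))
    return sample

def odds_ratio(p1, p0):
    return p1 * (1 - p0) / ((1 - p1) * p0)

def crude_or(sample):
    exposed = [y for _, z, y in sample if z == 1]
    unexposed = [y for _, z, y in sample if z == 0]
    return odds_ratio(sum(exposed) / len(exposed),
                      sum(unexposed) / len(unexposed))

def ipw2_or(sample, lo=0.005, hi=0.995):
    # Keep only units whose propensity score lies strictly inside (lo, hi).
    kept = [(e, z, y) for e, z, y in sample if lo < e < hi]
    w1 = sum(z / e for e, z, _ in kept)
    w0 = sum((1 - z) / (1 - e) for e, z, _ in kept)
    mu1 = sum(z * y / e for e, z, y in kept) / w1
    mu0 = sum((1 - z) * y / (1 - e) for e, z, y in kept) / w0
    return odds_ratio(mu1, mu0)

rng = random.Random(2011)
# Hypothetical parameter values, not taken from the slides.
gamma, beta, alpha = [0.5, 0.5, 0.0], [0.5, 0.0, 0.5], math.log(3.0)
sample = draw_sample(rng, 2000, gamma, beta, alpha)
print(crude_or(sample), ipw2_or(sample))
```

Repeating the last three lines over many seeds would give sampling distributions of the two estimators for one choice of $\gamma$ and $\beta$.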
Simulations: Strong Correlation
[Figure: OR estimates by method (Crude, Logistic Reg, 5 PS classes, PS matched, IPW1, IPW2); axis: OR Estimate; $\mathrm{Cor}(\gamma^T X, \beta^T X) = 1$]
Reference values: Crude: 6.924; Conditional: 3.004; Marginal: 1.85
Simulations: Weak Correlation
[Figure: OR estimates by method (Crude, Logistic Reg, 5 PS classes, PS matched, IPW1, IPW2); axis: OR Estimate; $\mathrm{Cor}(\gamma^T X, \beta^T X) = 0.0008$]
Reference values: Conditional: 3.015; Marginal: 1.604; Crude: 1.602
Simulations: Moderate Correlation
[Figure: OR estimates by method (Crude, Logistic Reg, 5 PS classes, PS matched, IPW1, IPW2); axis: OR Estimate; $\mathrm{Cor}(\gamma^T X, \beta^T X) = 0.6413$]
Reference values: Crude: 4.068; Conditional: 2.996; Marginal: 1.802
Simulations: Correlation and Matching Bias
Using
$$\text{scaled bias} = \frac{\hat{E}(\hat\psi_{MH}) - \psi_{marg}}{\psi_{cond} - \psi_{marg}}$$
for 28 simulations with various $\gamma$ and $\beta$:
[Figure: scaled bias (Marginal OR = 0; Conditional OR = 1) against $\mathrm{cor}(\gamma^T X, \beta^T X)$; series: Conditional OR = 2 and Conditional OR = 4]
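The scaled bias places the average matched estimate on a scale where 0 corresponds to the marginal odds ratio and 1 to the conditional odds ratio. A minimal sketch, using hypothetical input values:

```python
def scaled_bias(mean_estimate, psi_marg, psi_cond):
    """(mean estimate - marginal OR) / (conditional OR - marginal OR)."""
    return (mean_estimate - psi_marg) / (psi_cond - psi_marg)

# Hypothetical values: an average matched estimate halfway between the two targets
print(scaled_bias(2.4, psi_marg=1.8, psi_cond=3.0))
```

With these inputs the scaled bias is 0.5; an average estimate equal to the marginal odds ratio gives 0, and one equal to the conditional odds ratio gives 1.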
Conclusion
- Matching on the propensity score leads to an estimate which is consistent for neither the conditional nor the marginal odds ratio
- Inverse propensity weighting yields an estimate which is unbiased for the marginal odds ratio
- Can adjustments be made to improve variability?
References
Cochran, W.G. (1968). The Effectiveness of Adjustment by Subclassification in Removing Bias in Observational Studies. Biometrics 24, 295-313.
Gail, M.H., Wieand, S., and Piantadosi, S. (1984). Biased Estimates of Treatment Effect in Randomized Experiments with Nonlinear Regressions and Omitted Covariates. Biometrika 71, 431-444.
Greenland, S., Robins, J.M., and Pearl, J. (1999). Confounding and Collapsibility in Causal Inference. Statistical Science 14, 29-46.
Lunceford, J.K. and Davidian, M. (2004). Stratification and Weighting via the Propensity Score in Estimation of Causal Treatment Effects: A Comparative Study. Statistics in Medicine 23, 2937-2960.
Rosenbaum, P.R. and Rubin, D.B. (1983). The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika 70, 41-55.