Methods for Handling Missing Data

1 Methods for Handling Missing Data Joseph Hogan Brown University MDEpiNet Conference Workshop October 22, 2018 Hogan (MDEpiNet) Missing Data October 22, / 160

2 Course Overview I 1 Introduction and Background Introduce case studies Missing data mechanisms Review and critique of commonly-used methods 2 Case Study 1: Growth Hormone Study Analysis using mixture models Setting up sensitivity analysis Inference about treatment effects Hogan (MDEpiNet) Missing Data October 22, / 160

3 Course Overview II 3 Case Study 2: Smoking cessation study Exploratory analysis for long sequence of binary data Analysis via IPW methods under MAR Comparative analysis via GEE, ML, LOCF Hogan (MDEpiNet) Missing Data October 22, / 160

4 INTRODUCTION AND BACKGROUND Hogan (MDEpiNet) Missing Data October 22, / 160

5 Study 1: Growth Hormone Study NIH-funded trial to study effect of rhGH for increasing muscle strength in elderly About 240 patients randomized to one of 4 arms: Placebo rhGH Exercise + Placebo (EP) Exercise + rhGH (EG) Primary outcome Quadriceps strength, in ft-lbs of torque Measured at baseline, 6 months, 12 months Our analysis Mean quad strength at 12 months Compare EP and EG arms only, for illustration Hogan (MDEpiNet) Missing Data October 22, / 160

6 Summary Statistics: Growth Hormone Study [Table: mean (SD) quad strength and number observed n_k at months 0, 6, and 12, by treatment arm (EP, EG, All); numeric entries not preserved in the transcription] Hogan (MDEpiNet) Missing Data October 22, / 160

7 Questions to be addressed What is the mean quad strength at 12 months, among all individuals who initiated therapy? What is the treatment effect at 12 months, among all individuals who initiated therapy? These are questions about the full data, or the data we intended to observe but did not. The main difficulty is that a significant proportion of the data are missing. Hogan (MDEpiNet) Missing Data October 22, / 160

8 Study 2: Smoking Cessation Study NIH-funded study to reduce smoking among sedentary women Roughly 300 individuals randomized to two arms: Supervised exercise vs. Wellness education program Primary outcome Weekly smoking status over 12 weeks Treatment comparison Smoking rate at week 12 following baseline Analysis issues Binary outcomes Mean has some structure as a function of time Large number of repeated measures Hogan (MDEpiNet) Missing Data October 22, / 160

9 Smoking Cessation Study: Summaries Hogan (MDEpiNet) Missing Data October 22, / 160

10 Smoking Cessation Study: Summaries Hogan (MDEpiNet) Missing Data October 22, / 160

11 Basics of inference with incomplete data Formulate the precise question you want to answer Define the quantity you want to estimate Ascertain what information is available in the data And... what information is unavailable Apply a statistical method to estimate quantity of interest Apply statistical principles to quantify uncertainty Sampling variability Uncertainty due to missing data or untestable assumptions Hogan (MDEpiNet) Missing Data October 22, / 160

12 Course objectives Develop an understanding of Mechanisms that lead to missing data Biases missing data may cause Methods of addressing missing data Use examples to illustrate the methods, and provide understanding of how they work Hogan (MDEpiNet) Missing Data October 22, / 160

13 A word about the data examples They are relatively simple in nature (stylized) They are designed to promote understanding of the methods The idea is that when you apply the methods on more complex problems, you will have a feel for how and why they work (and how and why they don t) In real life, data analysis problems can be much harder than the ones we are using You will have to do further research to implement these methods on complex datasets Hogan (MDEpiNet) Missing Data October 22, / 160

14 Some reasons for missing data Refusal to respond Drop out of a study (patient decision) Removal from a study (researcher / doctor decision) Death Administrative reasons (funding, etc) Hogan (MDEpiNet) Missing Data October 22, / 160

15 Defining the estimation target First, some notation needed Y = outcome variable (e.g., CD4 count) R = response indicator: R = 1 if Y observed, R = 0 if Y missing X = covariates of direct interest V = auxiliary covariates (available, not of direct interest) Hogan (MDEpiNet) Missing Data October 22, / 160

16 Defining the estimation target with incomplete data Possible targets of estimation Full-data parameter: Mean outcome among all individuals intended to be in the sample, whether or not they are observed: µ = E(Y) Observed-data parameter: Mean response among all individuals whose outcome was observed: µ_1 = E(Y | R = 1) Hogan (MDEpiNet) Missing Data October 22, / 160

17 Defining the estimation target with incomplete data Full-data parameter: Regression parameters among all individuals intended to be in the sample: E(Y | X) = X^T β Observed-data parameter: Regression parameters among individuals with observed outcome: E(Y | X, R = 1) = X^T β_1 Important questions to ask: When does µ = µ_1? When does β = β_1? Hogan (MDEpiNet) Missing Data October 22, / 160

18 Illustration using univariate mean Consider a univariate sample Data: n units targeted m units respond (m < n) Y_1, ..., Y_m observed; Y_{m+1}, ..., Y_n missing R_1 = R_2 = ... = R_m = 1 and R_{m+1} = ... = R_n = 0 X_1, ..., X_n is a baseline covariate, observed on everyone Target of inference: µ = E(Y) (full-data parameter) Hogan (MDEpiNet) Missing Data October 22, / 160

19 Data excerpt from Growth Hormone trial V Y R [1,] 35.0 NA 0 [2,] [3,] [4,] [5,] 68.6 NA 0 [6,] [7,] 39.0 NA 0 [8,] 52.0 NA 0 [9,] [10,] [11,] V = baseline quad strength Y = quad strength at one year Hogan (MDEpiNet) Missing Data October 22, / 160

20 What can be estimated? What we can estimate: µ_1 = E(Y | R = 1) What we cannot estimate (without making assumptions): µ = E(Y) The main reason we cannot estimate µ is the decomposition µ = E(Y) = E(Y | R = 1)P(R = 1) + E(Y | R = 0)P(R = 0), whose second term involves E(Y | R = 0), the mean among those with missing outcomes. Hogan (MDEpiNet) Missing Data October 22, / 160

21 The need for assumptions to estimate full-data parameters Cannot estimate parameters for parts of the data that are missing Hence need assumptions about the missing data These are called missing data mechanisms Under most circumstances, these assumptions cannot be tested This motivates the need to: State the assumptions unambiguously so others can critique them Carry out sensitivity analysis wherever possible Hogan (MDEpiNet) Missing Data October 22, / 160

22 Missing data mechanisms Classification of association between R and Y MCAR Missing completely at random MAR Missing at random MNAR Missing not at random These are sometimes defined conditionally on covariates X or, in the case of repeated measures, on the data history up to a specific time point. More later. Hogan (MDEpiNet) Missing Data October 22, / 160

23 Missing data mechanisms Joint distribution of two random variables MDM for univariate samples MDM for multivariate sample, where interest is in regression Have model covariates only Have model covariates and auxiliary information Hogan (MDEpiNet) Missing Data October 22, / 160

24 Statistical independence The notation X ⊥ Y means that the random variable X is independent of the random variable Y. Implications of independence Joint distribution can be factored: f(x, y) = f(x) f(y) Conditional distributions and expectations: f(x | y) = f(x), E(X | Y) = E(X) i.e., knowing Y does not influence the distribution or expectation of X Hogan (MDEpiNet) Missing Data October 22, / 160

25 Joint distribution of two random variables To characterize inference from incomplete data, we are always sampling at least two variables, Y and R Usually we are interested in some aspect of f(y), such as the mean or median. Denote this as θ. For example θ = E(Y) = ∫ y f(y) dy But to carry out inference, we need to make assumptions about the joint distribution of Y and R, denoted by f(y, r) Hogan (MDEpiNet) Missing Data October 22, / 160

26 Joint distribution of two random variables The joint distribution can be decomposed into conditional distributions as follows Mixture factorization: f(y, r) = f(r) f(y | r) Selection factorization: f(y, r) = f(y) f(r | y) We will focus on the selection factorization for now. Hogan (MDEpiNet) Missing Data October 22, / 160

27 Selection factorization The selection factorization describes the joint distribution in terms of The distribution or model for the variable of interest, f(y) The distribution of the response indicators and their dependence on y, written as f(r | y): the missing data mechanism, or selection mechanism This allows us to characterize different types of missing data mechanisms formally. Hogan (MDEpiNet) Missing Data October 22, / 160

28 MDM for univariate sampling Missing values of Y are missing completely at random (MCAR) if R ⊥ Y, or equivalently if f(r | y) = f(r) For univariate samples, this is also classified as missing at random (MAR). More on this distinction later. Hogan (MDEpiNet) Missing Data October 22, / 160

29 MDM for univariate sampling Missing values of Y are missing not at random (MNAR) if there exists at least one value of y such that f(r | y) ≠ f(r) Or in words, if the probability of response is systematically higher/lower for particular values of y. Hogan (MDEpiNet) Missing Data October 22, / 160

30 MDM for univariate sampling Under MAR, methods applied to the observed data only will generally yield valid inferences about the population. Estimates will be consistent Standard errors may be larger than if you had the full data Under MNAR, methods applied to observed data only generally will not yield valid inferences Hogan (MDEpiNet) Missing Data October 22, / 160
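A minimal R simulation sketch (an editorial illustration, not from the slides) of the point above: the complete-case mean is approximately unbiased under MCAR but not under MNAR. The outcome distribution and selection probabilities are arbitrary choices for illustration.
set.seed(1)
n <- 1e5
Y <- rnorm(n, mean = 50, sd = 10)
# MCAR: response does not depend on Y
R.mcar <- rbinom(n, 1, 0.6)
# MNAR: higher Y means lower chance of being observed
p.mnar <- plogis(3 - 0.05 * Y)
R.mnar <- rbinom(n, 1, p.mnar)
c(full.data = mean(Y),
  cc.mcar   = mean(Y[R.mcar == 1]),   # close to the full-data mean
  cc.mnar   = mean(Y[R.mnar == 1]))   # systematically too low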

31 MDM for multivariate sampling: regression Consider the setting where we are interested in the regression of Y on X. Let µ(X) = E(Y | X). Assume there are no other covariates available The model we have in mind is g{µ(X)} = β_0 + β_1 X Here, the function g tells you the type of regression model you are fitting (linear, logistic, etc.) The full data are (Y_1, X_1, R_1), (Y_2, X_2, R_2), ..., (Y_n, X_n, R_n) Hogan (MDEpiNet) Missing Data October 22, / 160

32 MDM for multivariate sampling: regression In cases like this, we can define missing data mechanisms relative to the objective of inference. The Y's are missing at random if Y ⊥ R | X In words, the MDM is a random deletion mechanism within distinct levels of X Another way to write this: f(r | y, x) = f(r | x) The deletion mechanism depends on X, but within levels of X it does not depend on Y Hogan (MDEpiNet) Missing Data October 22, / 160

33 Examples of MAR in regression Let Y denote blood pressure, X denote gender (1 = F, 0 = M). The regression model of interest is E(Y | X) = β_0 + β_1 X, so that β_0 = mean BP among men, β_1 = mean difference. Let's assume men have higher BP on average. Randomly delete BP for 20% of men and 40% of women. R does depend on Y, but only through X: men have higher BP and men are less likely to be deleted, so those with higher BP are less likely to be deleted. Within levels of X, the deletion mechanism is completely random. Hogan (MDEpiNet) Missing Data October 22, / 160

34 MAR in regression: some practical issues Revisit the MAR condition. If R ⊥ Y | X, this also means f(y | x, r) = f(y | x), or f(y | x, r = 1) = f(y | x, r = 0) The relationship between Y and X is the same whether R = 1 or R = 0. Consequence is that a regression fit to those with R = 1 gives valid estimates of the regression parameters. (Standard errors will be higher relative to having all the data.) Hogan (MDEpiNet) Missing Data October 22, / 160

35 MAR in regression: some practical issues Under MAR, the inferences are still valid even if the X distribution is different between those with missing and observed Y's Question you have to ask to (subjectively) assess MAR: Is the missing data mechanism a random deletion of Y's among people who have the same X values? Equivalent formulation of this question: Is the relationship between X and Y the same among those with missing and observed Y values? Hogan (MDEpiNet) Missing Data October 22, / 160

36 MAR for regression when auxiliary variables are available In some cases we have information on more than just the X variables In a clinical trial we may be interested in E(Y X ) when X is a treatment group, but we have collected lots of baseline covariates V. In a longitudinal study, we may be interested in the mean outcome at the last measurement time, but we have accumulated information on the outcome at previous measurement times. When auxiliary information is available, we can use it in some cases to make MAR more plausible. Here MAR has a slightly different formulation. Hogan (MDEpiNet) Missing Data October 22, / 160

37 MAR with auxiliary covariates The relationship of interest is g{µ(x )} = β 0 + β 1 X The full data are (Y 1, X 1, V 1, R 1 ), (Y 2, X 2, V 2, R 2 ),..., (Y n, X n, V n, R n ) where Y is observed when R = 1 and is missing when R = 0. Hogan (MDEpiNet) Missing Data October 22, / 160

38 MAR with auxiliary covariates Values of Y are missing at random (MAR) if Y ⊥ R | (X, V) Two equivalent ways to write this are: f(r | x, v, y) = f(r | x, v) and f(y | x, v, r = 1) = f(y | x, v, r = 0) The first says that within distinct levels defined by (X, V), missingness in Y is a random deletion mechanism The second says that the relationship between Y and (X, V) is the same whether Y is missing or not Hogan (MDEpiNet) Missing Data October 22, / 160

39 MAR with auxiliaries: example Return to our BP example, but now assume V denotes income level. Recall, we are interested in the coefficient β_1 from E(Y | X) = β_0 + β_1 X and not the coefficient α_1 from E(Y | X, V) = α_0 + α_1 X + α_2 V Hogan (MDEpiNet) Missing Data October 22, / 160

40 Missing data mechanisms for longitudinal data Need to define some notation for longitudinal data Y_j = value of Y at time j R_j = 1 if Y_j observed, 0 otherwise Ȳ_j = (Y_1, Y_2, ..., Y_j) = outcome history up to time j X̄_j = covariate history up to time j H_j = (X̄_j, Ȳ_{j-1}) Allows us to define MCAR, MAR, MNAR for longitudinal data Hogan (MDEpiNet) Missing Data October 22, / 160

41 Missing data mechanisms for longitudinal data: MAR Missing at random (MAR) If interest is in marginal means such as E(Y_j), MAR means R_j ⊥ Y_j | (R_{j-1} = 1, H_j) Interpretation: Among those in follow-up at time j, missingness is independent of the outcome Y_j conditional on the previously observed Y's. Missingness does not depend on present or future Y's, given the past. Hogan (MDEpiNet) Missing Data October 22, / 160

42 Missing data mechanisms for longitudinal data: MAR Implications 1 Selection mechanism: [R_j | R_{j-1} = 1, H_J] = [R_j | R_{j-1} = 1, H_j] Can model the selection probability as a function of the observed past 2 Imputation mechanism: [Y_j | R_j = 0, R_{j-1} = 1, H_j] = [Y_j | R_j = 1, R_{j-1} = 1, H_j] Can impute missing Y_j using a model for the observed Y_j Critical: Must correctly specify the observed-data model Hogan (MDEpiNet) Missing Data October 22, / 160

43 LOCF We can characterize LOCF in this framework It is an imputation mechanism Missing Y_j is set equal to the most recently observed value of Y Missing value filled in with probability one (no variance) Formally, [Y_j | R_j = 0, H_j] = Y_{j*} with probability one, where j* = max{k < j : R_k = 1} Not an MAR mechanism in general Conditional distribution of missing Y_j not equal to that for observed Y_j Hogan (MDEpiNet) Missing Data October 22, / 160
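A small R sketch of LOCF as a deterministic fill-forward rule (an editorial illustration, not from the slides), assuming a wide-format matrix Ymat with one row per subject, one column per time, and NA after dropout; shown only to make the definition concrete.
locf <- function(y) {
  for (j in 2:length(y)) {
    if (is.na(y[j])) y[j] <- y[j - 1]   # carry the last observed value forward
  }
  y
}
Ymat.locf <- t(apply(Ymat, 1, locf))    # apply the rule row by row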

44 Random effects and parametric models Assume a joint distribution for the repeated measures Model applies to the full data, hence cannot be checked Example: Multivariate normal (Y_1, ..., Y_J)^T ~ N(µ, Σ), where µ (J × 1) = E(Y) and Σ (J × J) = var(Y) Special case: Random effects model Particular way of structuring mean and variance Hogan (MDEpiNet) Missing Data October 22, / 160

45 Random effects and parametric models When do these models yield valid inference? Most parametric models valid if MAR holds All parts of the model are correctly specified These models have an implied distribution for the conditionals [Y_j | Y_1, ..., Y_{j-1}] Under MAR, the implied distribution applies to those with complete and incomplete data, but... Parametric assumptions cannot be checked empirically Hogan (MDEpiNet) Missing Data October 22, / 160

46 GEE I Assume a mean and variance structure for the repeated measures Not necessarily a full parametric model Assumed variance structure is the working covariance With complete data Inferences are most efficient when covariance correctly specified Correct inference about time-specific means even if covariance mis-specified Reason: all information about time-specific means is already observed Hogan (MDEpiNet) Missing Data October 22, / 160

47 GEE II With incomplete data Information about time-specific means relies on imputation of missing observations These imputations come from the conditional distribution [Y_j | Y_1, ..., Y_{j-1}] The form of the conditional distribution depends on the working covariance Implication: Correct inference about time-specific means only when both mean and covariance are correctly specified Can get different treatment effects with different working covariances Hogan (MDEpiNet) Missing Data October 22, / 160

48 Dependence of estimates on working covariance From Hogan et al., 2004 Statistics in Medicine Hogan (MDEpiNet) Missing Data October 22, / 160

49 Structure of case studies 1 Introduce modeling approach 2 Relate modeling approach to missing data hierarchy 3 Illustrate on simple cases 4 Include a treatment comparison 5 Discussion of key points from case study Hogan (MDEpiNet) Missing Data October 22, / 160

50 CASE STUDY I: MIXTURE MODEL ANALYSIS OF GROWTH HORMONE TRIAL Hogan (MDEpiNet) Missing Data October 22, / 160

51 Outline of analysis Objective: Compare EG to EP at month 12 Variable: Y3 Estimation of E(Y 3 ) for EG arm only Ignoring baseline covariates Using information from baseline covariate Y1 MAR and MNAR (sensitivity analysis) Treatment comparisons Expand to longitudinal case MAR using regression imputation MNAR sensitivity analysis Hogan (MDEpiNet) Missing Data October 22, / 160

52 Estimate E(Y ) from univariate sample Y3 R [1,] NA 0 [2,] [3,] [4,] [5,] NA 0 [6,] [7,] NA 0 [8,] NA 0 [9,] [10,] [11,] [12,] [13,] NA 0 [14,] Hogan (MDEpiNet) Missing Data October 22, / 160

53 Estimating E(Y) from univariate sample Model: E(Y | R = 1) = µ_1, E(Y | R = 0) = µ_0 (not identifiable) Target of estimation: E(Y) = µ_1 P(R = 1) + µ_0 P(R = 0) Question: what to assume about µ_0? In a sense, we are going to impute a value for µ_0, or impute values of the missing Y's that will lead to an estimate of µ_0. Hogan (MDEpiNet) Missing Data October 22, / 160

54 Parameterizing departures from MAR Target of estimation: E_{µ_0}(Y) = µ_1 P(R = 1) + µ_0 P(R = 0) = µ_1 + (µ_0 - µ_1) P(R = 0) Suggests sensitivity parameter ∆ = ∆(µ_0) = µ_0 - µ_1. Leads to E_∆(Y) = µ_1 + ∆ P(R = 0) Features of this format: Centered at MAR (∆ = 0) ∆ cannot be estimated from observed data Can vary ∆ for sensitivity analysis Allows Bayesian approach by placing a prior on ∆ Hogan (MDEpiNet) Missing Data October 22, / 160

55 Estimation under MNAR Recall model E_∆(Y) = µ_1 + ∆ P(R = 0) Estimate known quantities: n_1 = Σ_i R_i, n_0 = Σ_i (1 - R_i), P̂(R = 0) = n_0/(n_1 + n_0), µ̂_1 = (1/n_1) Σ_{i: R_i = 1} Y_i Have one unknown quantity: ∆ = µ_0 - µ_1 Hogan (MDEpiNet) Missing Data October 22, / 160

56 Estimation under MAR Plug into the model: Ê_∆(Y) = µ̂_1 + ∆ P̂(R = 0) Interpretation Under MAR (∆ = 0), Ê_∆(Y) = µ̂_1: the estimator is the observed-data mean Under MNAR (∆ ≠ 0), shift the observed-data mean by ∆ P̂(R = 0); the shift is proportional to the fraction of missing observations Hogan (MDEpiNet) Missing Data October 22, / 160
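A short R sketch of this ∆-adjusted estimator, assuming vectors Y (with NA for missing values) and R (response indicator) are available; the grid of ∆ values is illustrative only.
delta.grid <- seq(-15, 15, by = 5)       # range chosen for illustration
mu1.hat <- mean(Y[R == 1])               # observed-data mean
p0.hat  <- mean(R == 0)                  # estimated P(R = 0)
E.delta <- mu1.hat + delta.grid * p0.hat # E.hat_Delta(Y) over the grid
cbind(delta = delta.grid, E.delta = E.delta)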

57 [Plot: y3.delta versus delta, i.e., Ê_∆(Y_3) as a function of ∆] Ȳ^[R=1] = 88.3, P̂(R = 0) = 0.42 Hogan (MDEpiNet) Missing Data October 22, / 160

58 Using information from baseline covariates Y1 Y3 R [1,] 35.0 NA 0 [2,] [3,] [4,] [5,] 68.6 NA 0 [6,] [7,] 39.0 NA 0 [8,] 52.0 NA 0 [9,] [10,] [11,] [12,] [13,] 63.4 NA 0 [14,] Hogan (MDEpiNet) Missing Data October 22, / 160

59 General model for Y Objective: Inference for E(Y_3) Model General form: [Y_3 | Y_1, R = 1] ~ F_1(y_3 | y_1), [Y_3 | Y_1, R = 0] ~ F_0(y_3 | y_1) The general form encompasses all possible models that can be assumed for the observed and missing values of Y_3 The model F_0 cannot be estimated from data Hogan (MDEpiNet) Missing Data October 22, / 160

60 The model under MAR and MNAR Under MAR, F_0 = F_1. Suggests the following strategy: Fit a model for F_1 using observed data; call it F̂_1 Use this model to impute missing values of Y_3 Under MNAR, F_0 ≠ F_1. Suggests the following strategy: Parameterize a model so that F_0 is related to F_1 through a sensitivity parameter ∆ Generically write this as F_0 = h(F_1, ∆) Use the fitted F̂_1, together with ∆, to impute missing Y_3 Hogan (MDEpiNet) Missing Data October 22, / 160

61 Regression parameterization of F_1 and F_0 Take the case of MAR first MAR implies [Y_3 | Y_1, R = 1] = [Y_3 | Y_1, R = 0] Assume regression model for [Y_3 | Y_1, R = 1]: E(Y_3 | Y_1, R = 1) = α_1 + β_1 Y_1 Assume a model for [Y_3 | Y_1, R = 0] of similar form: E(Y_3 | Y_1, R = 0) = α_0 + β_0 Y_1 Cannot estimate its parameters (α_0, β_0) from observed data Hogan (MDEpiNet) Missing Data October 22, / 160

62 Regression parameterization of F_1 and F_0 Recall model E(Y_3 | Y_1, R = 1) = α_1 + β_1 Y_1, E(Y_3 | Y_1, R = 0) = α_0 + β_0 Y_1 Link the models. One way to do this: β_0 = β_1 + ∆_β, α_0 = α_1 + ∆_α Under MAR: ∆_α = ∆_β = 0 Hogan (MDEpiNet) Missing Data October 22, / 160

63 Caveats to using this (or any!) approach Recall model E(Y_3 | Y_1, R = 1) = α_1 + β_1 Y_1, E(Y_3 | Y_1, R = 0) = α_0 + β_0 Y_1 More general version of the missing-data model: E(Y_3 | Y_1, R = 0) = g(Y_1; θ) Do we know the form of g? Do we know the value of θ? Do we know that Y_1 is sufficient to predict Y_3? We are assuming we know all of these things. Hogan (MDEpiNet) Missing Data October 22, / 160

64 Estimation of E(Y_3) under MAR 1 Fit the model E(Y_3 | Y_1, R = 1) = α_1 + β_1 Y_1; obtain α̂_1, β̂_1. 2 For those with R = 0, impute the predicted value via Ŷ_3i = Ê(Y_3 | Y_1i, R_i = 0) = α̂_1 + β̂_1 Y_1i 3 Estimate the overall mean as the mixture Ê(Y_3) = (1/n) Σ_i {R_i Y_3i + (1 - R_i) Ŷ_3i} Hogan (MDEpiNet) Missing Data October 22, / 160
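A minimal R sketch of these three steps, assuming a data frame dat with columns Y1, Y3 (NA when missing), and R (1 if Y3 observed); the names are illustrative.
fit  <- lm(Y3 ~ Y1, data = dat, subset = (R == 1))   # step 1: fit among R = 1
yhat <- predict(fit, newdata = dat)                  # step 2: predicted values for everyone
Y3.completed <- ifelse(dat$R == 1, dat$Y3, yhat)     # keep observed, impute missing
E.Y3.mar <- mean(Y3.completed)                       # step 3: mixture estimate of E(Y3)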

65 Regression imputation under MAR [Scatter plot: y3[r == 1] versus y1[r == 1]] Hogan (MDEpiNet) Missing Data October 22, / 160

66 Sample means and imputed means under MAR [Table: n, mean Y_1, and mean Y_3 by response group (R = 1, R = 0) and under MAR imputation; numeric entries not preserved] Hogan (MDEpiNet) Missing Data October 22, / 160

67 Some intuition behind this (simple) estimator Could base the estimate purely on the regression model: E(Y_3) = E_{Y_1,R}{E(Y_3 | Y_1, R)} = E_R[E_{Y_1|R}{E(Y_3 | Y_1, R)}] = E_R[α_1 + β_1 E(Y_1 | R)] = α_1 + β_1 E(Y_1) Plug in the estimators for each term May be more efficient when the regression model is correct Hogan (MDEpiNet) Missing Data October 22, / 160

68 Some more details... If we don't want to use the regression model for Y_3, we can write E(Y_3) = E(Y_3 | R = 1)P(R = 1) + E(Y_3 | R = 0)P(R = 0), where the term in red is E(Y_3 | R = 0) = E_{Y_1|R=0}{E(Y_3 | Y_1, R = 0)} = E_{Y_1|R=0}(α_1 + β_1 Y_1 | R = 0) = α_1 + β_1 E(Y_1 | R = 0) Hence Ê(Y_3) = Ȳ_3^[R=1] P̂(R = 1) + (α̂_1 + β̂_1 Ȳ_1^[R=0]) P̂(R = 0) Hogan (MDEpiNet) Missing Data October 22, / 160

69 Inference and treatment comparisons SE and CI: bootstrap Draw bootstrap sample Carry out imputation procedure Repeat for lots of bootstrap samples (say B) Base SE and CI on the B bootstrapped estimators Why not multiple imputation? Estimators are linear Bootstrap takes care of missing data uncertainty here Treatment comparisons coming later Hogan (MDEpiNet) Missing Data October 22, / 160
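A bootstrap sketch for this imputation estimator, reusing the hypothetical data frame dat (columns Y1, Y3, R) from the earlier sketch; B is the number of bootstrap samples.
impute.mean <- function(d) {
  fit  <- lm(Y3 ~ Y1, data = d, subset = (R == 1))
  yhat <- predict(fit, newdata = d)
  mean(ifelse(d$R == 1, d$Y3, yhat))
}
B <- 2000
boot.est <- replicate(B, impute.mean(dat[sample(nrow(dat), replace = TRUE), ]))
se.boot  <- sd(boot.est)                          # bootstrap SE
ci.boot  <- quantile(boot.est, c(0.025, 0.975))   # percentile CI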

70 Estimation of E(Y_3) under MNAR Recall model E(Y_3 | Y_1, R = 1) = α_1 + β_1 Y_1, E(Y_3 | Y_1, R = 0) = α_0 + β_0 Y_1 More general version of the model: E(Y_3 | Y_1, R = 1) = g_1(Y_1; θ_1), E(Y_3 | Y_1, R = 0) = g_0(Y_1; θ_1, ∆) = h{g_1(Y_1; θ_1), ∆} The h function relates the missing-data and observed-data models. The user needs to specify the form of h The parameter ∆ should not be estimable from observed data Vary ∆ in a sensitivity analysis Hogan (MDEpiNet) Missing Data October 22, / 160

71 Regression-based specification under MNAR Specify observed-data model E(Y_3 | Y_1, R = 1) = g_1(Y_1; θ_1) = α_1 + β_1 Y_1 Specify missing-data model E(Y_3 | Y_1, R = 0) = h{g_1(Y_1; θ_1), ∆} = ∆ + g_1(Y_1; θ_1) = ∆ + (α_1 + β_1 Y_1) Many other choices are possible Here, add a constant ∆ to the MAR imputation Have MAR when ∆ = 0, and MNAR otherwise Hogan (MDEpiNet) Missing Data October 22, / 160

72 Estimation of E(Y_3) under MNAR 1 Fit the model E(Y_3 | Y_1, R = 1) = α_1 + β_1 Y_1; obtain α̂_1, β̂_1. 2 For those with R = 0, impute the predicted value via Ŷ_3i = Ê(Y_3 | Y_1i, R_i = 0) = ∆ + α̂_1 + β̂_1 Y_1i 3 Estimate the overall mean as the mixture Ê(Y_3) = (1/n) Σ_i {R_i Y_3i + (1 - R_i) Ŷ_3i} = Ȳ_3^[R=1] P̂(R = 1) + (∆ + α̂_1 + β̂_1 Ȳ_1^[R=0]) P̂(R = 0) Hogan (MDEpiNet) Missing Data October 22, / 160

73 Sensitivity analysis based on varying ∆ What should be the anchor point? Usually appropriate to anchor the analysis at MAR Examine effect of MNAR by varying ∆ away from 0 How to select a range for ∆? Will always be specific to the application. Ensure that the range is appropriate to context (see upcoming example) Can use a data-driven range for ∆, e.g. based on SD Reporting final inferences Stress test approach Inverted sensitivity analysis: find values of ∆ that would change substantive conclusions Average over plausible values of ∆ Hogan (MDEpiNet) Missing Data October 22, / 160

74 Calibrating ∆ How should the range and scale of ∆ be chosen? Direction ∆ > 0: dropouts have higher mean ∆ < 0: dropouts have lower mean Range and scale Residual variation in outcome quantified by the SD of the regression error: [Y_3 | Y_1, R = 1] = α_1 + β_1 Y_1 + e, var(e) = σ^2 Suggests scaling ∆ in units of σ Will illustrate in the longitudinal case Hogan (MDEpiNet) Missing Data October 22, / 160

75 Moving to longitudinal setting Set-up for single treatment arm Illustrate ideas with analysis of GH data Compare treatments Illustrate sensitivity analysis under MNAR Discuss how to report results Hogan (MDEpiNet) Missing Data October 22, / 160

76 Longitudinal case: notation Assume the missing data pattern is monotone K = dropout time = Σ_j R_j E_k(Y_j) = E(Y_j | K = k) When j > k, cannot estimate E_k(Y_j) from the data Hogan (MDEpiNet) Missing Data October 22, / 160

77 Longitudinal model with J = 3: Set up E_k(Y_1) = E(Y_1 | K = k) identified for k = 1, 2, 3 For the other means (columns j = 2 and j = 3), we have K = 1: E_1(Y_2 | Y_1), E_1(Y_3 | Y_1, Y_2); K = 2: E_2(Y_2 | Y_1), E_2(Y_3 | Y_1, Y_2); K = 3: E_3(Y_2 | Y_1), E_3(Y_3 | Y_1, Y_2) Components in red (those with j > k) cannot be estimated. Need assumptions Hogan (MDEpiNet) Missing Data October 22, / 160

78 Longitudinal model with J = 3: MAR (columns j = 2 and j = 3) K = 1: ωE_2(Y_2 | Y_1) + (1 - ω)E_3(Y_2 | Y_1), E_3(Y_3 | Y_1, Ŷ_2); K = 2: E_2(Y_2 | Y_1), E_3(Y_3 | Y_1, Y_2); K = 3: E_3(Y_2 | Y_1), E_3(Y_3 | Y_1, Y_2) Here, ω is a weight such that 0 ≤ ω ≤ 1 Hogan (MDEpiNet) Missing Data October 22, / 160

79 Longitudinal model with J = 3: MNAR (columns j = 2 and j = 3) K = 1: ωE_2(Y_2 | Y_1) + (1 - ω)E_3(Y_2 | Y_1) + ∆_1, E_3(Y_3 | Y_1, Ŷ_2) + ∆_2; K = 2: E_2(Y_2 | Y_1), E_3(Y_3 | Y_1, Y_2) + ∆_3; K = 3: E_3(Y_2 | Y_1), E_3(Y_3 | Y_1, Y_2) Hogan (MDEpiNet) Missing Data October 22, / 160

80 Procedure with several longitudinal measures Start by imputing those with missing data at j = 2 1 Fit the model E(Y_2 | Y_1, R_2 = 1) = α^(2) + β_1^(2) Y_1; obtain α̂^(2), β̂_1^(2) This is a model that combines those with K = 2 and K = 3 2 Impute missing Y_2 as before: Ŷ_2i = ∆ + α̂^(2) + β̂_1^(2) Y_1i Hogan (MDEpiNet) Missing Data October 22, / 160

81 Procedure with several longitudinal measures Now impute those with missing data at j = 3 1 Fit the model E(Y_3 | Y_1, Y_2, R_3 = 1) = α^(3) + β_1^(3) Y_1 + β_2^(3) Y_2; obtain α̂^(3), β̂_1^(3), β̂_2^(3) 2 Impute missing Y_3 as follows: For those with Y_1, Y_2 observed, Ŷ_3i = ∆ + α̂^(3) + β̂_1^(3) Y_1i + β̂_2^(3) Y_2i For those with only Y_1 observed, Ŷ_3i = ∆ + α̂^(3) + β̂_1^(3) Y_1i + β̂_2^(3) Ŷ_2i Hogan (MDEpiNet) Missing Data October 22, / 160

82 Side note Recall the imputation for those with only Y_1 observed: Ŷ_3i = ∆ + α̂^(3) + β̂_1^(3) Y_1i + β̂_2^(3) Ŷ_2i This is really just using information from the observed Y_1, because Ŷ_2i = ∆ + α̂^(2) + β̂_1^(2) Y_1i Hence the imputation is from the (linear) model for E(Y_3 | Y_1) that is implied by the other imputation models. Hogan (MDEpiNet) Missing Data October 22, / 160

83 Calibration of ∆ At each time point j, ∆ is actually a multiplier of the residual SD for the observed-data regression [Y_j | Y_1, ..., Y_{j-1}, R = 1] For example, the imputation model at j = 3 is actually Ŷ_3i = ∆ σ_3 + α̂^(3) + β̂_1^(3) Y_1i + β̂_2^(3) Y_2i, where σ_3^2 = var(Y_3 | Y_1, Y_2, R_3 = 1) Generally will suppress this for clarity Hogan (MDEpiNet) Missing Data October 22, / 160

84 Procedure with several longitudinal measures Final step: compute the estimate of E(Y_3): Ê_∆(Y_3) = (1/n) Σ_i {R_3i Y_3i + (1 - R_3i) Ŷ_3i(∆)} Based on the imputations, this turns out to be a weighted average of Ȳ_3^[K=3], Ȳ_2^[K=2], and Ȳ_1^[K=1]. Weights depend on dropout rates at each time, coefficients in the imputation models, and the sensitivity parameter(s) Hogan (MDEpiNet) Missing Data October 22, / 160
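A sketch of this sequential imputation for J = 3 in R, assuming monotone dropout and a data frame dat with Y1 always observed, Y2 and Y3 possibly missing, and indicators R2, R3. For simplicity a single constant ∆ is added to every imputation; the slides instead scale ∆ by the residual SD at each time point.
Delta <- 0    # Delta = 0 gives the MAR analysis; vary for sensitivity analysis
# impute missing Y2 from the model fit among those still in follow-up at j = 2
fit2 <- lm(Y2 ~ Y1, data = dat, subset = (R2 == 1))
Y2.c <- ifelse(dat$R2 == 1, dat$Y2, Delta + predict(fit2, newdata = dat))
# impute missing Y3, feeding in the (possibly imputed) Y2
fit3 <- lm(Y3 ~ Y1 + Y2, data = dat, subset = (R3 == 1))
dat3 <- transform(dat, Y2 = Y2.c)
Y3.c <- ifelse(dat$R3 == 1, dat$Y3, Delta + predict(fit3, newdata = dat3))
E.Y3.delta <- mean(Y3.c)    # estimate of E(Y3) at this value of Delta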

85 Analysis components Fitted regression models at each time point Can check validity of imputation Contour plots: vary ∆ separately by treatment Treatment effect estimates p-values Summary table Treatment effect, SE, p-value These are computed using bootstrap Hogan (MDEpiNet) Missing Data October 22, / 160

86 Hogan (MDEpiNet) Missing Data October 22, / 160

87 Hogan (MDEpiNet) Missing Data October 22, / 160

88 Hogan (MDEpiNet) Missing Data October 22, / 160

89 Hogan (MDEpiNet) Missing Data October 22, / 160

90 Case Study 2: Chronic Schizophrenia Major breakthroughs have been made in the treatment of patients with psychotic symptoms. However, side effects associated with some medications have limited their usefulness. RIS-INT-3 (Marder and Meibach, 1994; Chouinard et al., 1993) was a multi-center study designed to assess the effectiveness and adverse experiences of four fixed doses of risperidone compared to haloperidol and placebo in the treatment of chronic schizophrenia. Hogan (MDEpiNet) Missing Data October 22, / 160

91 RIS-INT-3 Patients were required to have a PANSS (Positive and Negative Syndrome Scale) score between 60 and 120. Prior to randomization, a one-week washout phase (all anti-psychotic medications discontinued). If acute psychotic symptoms occurred, patients were randomized to a double-blind treatment phase, scheduled to last 8 weeks. Patients randomized to one of 6 treatment groups: risperidone 2, 6, 10 or 16 mg, haloperidol 20 mg, or placebo. Dose titration occurred during the first week of the double-blind phase. Hogan (MDEpiNet) Missing Data October 22, / 160

92 RIS-INT-3 Patients scheduled for 5 post-baseline assessments at weeks 1, 2, 4, 6, and 8 of the double-blind phase. Primary efficacy variable: PANSS score Patients who did not respond to treatment and discontinued therapy, or those who completed the study, were eligible to receive risperidone in an open-label extension study. 521 patients randomized to receive placebo (n = 88), haloperidol 20 mg (n = 87), risperidone 2 mg (n = 87), risperidone 6 mg (n = 86), risperidone 10 mg (n = 86), or risperidone 16 mg (n = 87). Hogan (MDEpiNet) Missing Data October 22, / 160

93 Dropout and withdrawal Only 49% of patients completed the 8 week treatment period. The most common reason for discontinuation was insufficient response. Other main reasons included: adverse events, uncooperativeness, and withdrawal of consent. Hogan (MDEpiNet) Missing Data October 22, / 160

94 Dropout and Withdrawal
                    Placebo    Haloperidol  Risp 2mg   Risp 6mg   Risp 10mg  Risp 16mg
                    (n = 88)   (n = 87)     (n = 87)   (n = 86)   (n = 86)   (n = 87)
Completed           27 (31%)   36 (41%)     36 (41%)   53 (62%)   48 (56%)   54 (62%)
Withdrawn           61 (69%)   51 (59%)     51 (59%)   33 (38%)   38 (44%)   33 (38%)
  Lack of Efficacy  51 (58%)   36 (41%)     41 (47%)   12 (14%)   25 (29%)   18 (21%)
  Other             10 (11%)   15 (17%)     10 (11%)   21 (24%)   13 (15%)   15 (17%)
Hogan (MDEpiNet) Missing Data October 22, / 160

95 Central Question What is the difference in the mean PANSS scores at week 8 between risperidone at a specified dose level vs. placebo in the counterfactual world in which all patients were followed to that week? Hogan (MDEpiNet) Missing Data October 22, / 160

96 Sample means and imputed means under MAR [Table: N, INS, HOST, EPS, baseline Y_0, and µ̂ by response status (R = 1, R = 0) for Placebo and Risperidone; entries not identifiable without assumptions shown as ??; numeric values not preserved] Hogan (MDEpiNet) Missing Data October 22, / 160

97 Sample means and imputed means under MAR [Table: same layout as the previous slide, with the previously unidentified means filled in under MAR imputation; numeric values not preserved] Hogan (MDEpiNet) Missing Data October 22, / 160

98 Regression imputation under MAR Placebo Hogan (MDEpiNet) Missing Data October 22, / 160

99 Regression imputation under MAR Risperidone Hogan (MDEpiNet) Missing Data October 22, / 160

100 Sample means by dropout time: Aggregated data [Plot: PANSS versus visit] Hogan (MDEpiNet) Missing Data October 22, / 160

101 Extrapolation of means when ν = 0 (MAR) [Plot: PANSS versus visit] Hogan (MDEpiNet) Missing Data October 22, / 160

102 Extrapolation of means when ν = -1 [Plot: PANSS versus visit] Hogan (MDEpiNet) Missing Data October 22, / 160

103 Extrapolation of means when ν = +1 [Plot: PANSS versus visit] Hogan (MDEpiNet) Missing Data October 22, / 160

104 Summary of means [Table: E(Y_5 | R_5 = 1), and E(Y_5 | R_5 = 0) and E(Y_5) at ν = -1, 0, +1, for Placebo and Risperidone; numeric entries not preserved] Hogan (MDEpiNet) Missing Data October 22, / 160

105 Summary Full-data mean parameterized as a mixture of observed- and missing-data means Implemented using imputation We used regression imputation, but this is only one possibility Regression models need not be linear More complex models may require more complex imputation procedures Key features of this approach Missing data distribution indexed by sensitivity parameter that cannot be estimated from data Separates testable from untestable assumptions Easy to assess effect of departures from MAR Hogan (MDEpiNet) Missing Data October 22, / 160

106 Summary Model parameterization Need to limit the number of ∆'s to make inferences manageable Need a sensible scale and range for the ∆'s Scope of sensitivity analysis should be specified as part of the trial protocol to avoid reliance on post-hoc analyses Inference about treatment effects Sensitivity analysis provides a range of conclusions Can use as a stress test: under what MNAR scenario would our conclusions change? Can also use Bayesian formulations that average results over a prior for the sensitivity parameter (Daniels & Hogan, 2008) Hogan (MDEpiNet) Missing Data October 22, / 160

107 CASE STUDY II: INVERSE PROBABILITY WEIGHTING METHODS Hogan (MDEpiNet) Missing Data October 22, / 160

108 Inverse Probability Weighting General idea Consider estimating E(Y) from a sample of data If we had all the data, we would use the sample mean Ê(Y) = (1/n)(Y_1 + Y_2 + ... + Y_n) Problem: Only some of the Y's are observed The observed Y's may not be a random draw from the full sample Hogan (MDEpiNet) Missing Data October 22, / 160

109 Inverse Probability Weighting General idea The solution: use a weighted mean Probability of being observed: π_i = P(R_i = 1) Weighted mean: Ê_IPW(Y) = (Σ_i R_i Y_i / π_i) / (Σ_i R_i / π_i) Issues: Probability of being observed may depend on individual characteristics X May also depend on the actual (but unobserved) outcome Y Hogan (MDEpiNet) Missing Data October 22, / 160

110 Inverse Probability Weighting under MAR Recall MAR: Y ⊥ R | X MAR implies [R | X, Y] = [R | X] IPW theory Define π(X_i) = P(R = 1 | X_i) Assume MAR Assume π(X_i) > 0 for all i Then the weighted estimator Ê_IPW(Y) = (Σ_i R_i Y_i / π(X_i)) / (Σ_i R_i / π(X_i)) is a consistent estimate of E(Y) Hogan (MDEpiNet) Missing Data October 22, / 160

111 Inverse Probability Weighting under MAR Remains true when π(X_i) is replaced by a consistent estimator π̂(X_i). To estimate π(X_i), must specify a model such as logit π(X_i) = X_i^T β In this case, π̂(X_i) = exp(X_i^T β̂) / {1 + exp(X_i^T β̂)} Can use other models as well, but the model must yield consistent estimates of π(X_i) for IPW to give a valid estimator. Hogan (MDEpiNet) Missing Data October 22, / 160
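A minimal R sketch of this estimator with a single covariate, assuming vectors Y (NA when missing), R, and X; the weight model is the logistic regression above.
w.model <- glm(R ~ X, family = binomial(link = "logit"))      # model for P(R = 1 | X)
pi.hat  <- fitted(w.model)                                    # estimated response probabilities
obs     <- (R == 1)
E.ipw   <- sum(Y[obs] / pi.hat[obs]) / sum(1 / pi.hat[obs])   # weighted mean of respondents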

112 Example: Estimate E(Y_3) from GH Data Treat Y_1 as the covariate MAR assumption is [R | Y_3, Y_1] = [R | Y_1] Assume logit model for selection: logit π(Y_1) = γ_0 + γ_1 Y_1 Compute the weighted estimator as above Hogan (MDEpiNet) Missing Data October 22, / 160

113 Fitted model π̂(y_1) [Plot: estimated response probability (prob) versus y1] Hogan (MDEpiNet) Missing Data October 22, / 160

114 Relative weights: Plot of Y_3 vs 1/π̂(Y_1) [Scatter plot: y3[r == 1] versus wt[r == 1]] Hogan (MDEpiNet) Missing Data October 22, / 160

115 Compare estimators Complete cases: 88 Imputation under MAR: 79 IPW under MAR: 80 Hogan (MDEpiNet) Missing Data October 22, / 160

116 IPW: Longitudinal case To illustrate, we assume monotone missingness (as in the GH trial) Target of inference: E(Y_3). The weighted estimator is Ê_IPW(Y_3) = (Σ_i R_3i Y_3i / π_3i) / (Σ_i R_3i / π_3i) How to construct the response probabilities π_3i? Hogan (MDEpiNet) Missing Data October 22, / 160

117 Longitudinal case: no covariates R_3 = 1 is equivalent to the joint event R_1 = 1, R_2 = 1, R_3 = 1 Hence P(R_3 = 1) = P(R_1 = 1, R_2 = 1, R_3 = 1) = P(R_3 = 1 | R_2 = 1, R_1 = 1) P(R_2 = 1 | R_1 = 1) P(R_1 = 1) = φ_3 φ_2 φ_1 Notation: φ_j = P(R_j = 1 | R_1 = R_2 = ... = R_{j-1} = 1) Hogan (MDEpiNet) Missing Data October 22, / 160

118 Longitudinal case with covariates Recall the longitudinal version of MAR: [R_j | R_{j-1} = 1, H_J] = [R_j | R_{j-1} = 1, H_j] In words, missingness at j depends only on the observable past history of X and Y. Implication: can use the observable history in models for φ_j Example: logit φ_j(H_j) = X_{i1}^T β + X_{ij}^T γ + θ Y_{i,j-1} Hogan (MDEpiNet) Missing Data October 22, / 160

119 Procedure for inference about E(Y_J) 1 Formulate and fit models for φ_j(H_ij) = P(R_ij = 1 | R_{i,j-1} = 1, H_ij) 2 Compute the estimated value of π_J(H_iJ) = P(R_iJ = 1 | H_iJ) as π̂_J(H_iJ) = φ̂_1(H_i1) φ̂_2(H_i2) ... φ̂_J(H_iJ) 3 Compute the weighted mean Ê_IPW(Y_J) = (Σ_i R_iJ Y_iJ / π̂_J(H_iJ)) / (Σ_i R_iJ / π̂_J(H_iJ)) Hogan (MDEpiNet) Missing Data October 22, / 160
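A sketch of these three steps in R for J = 3 under monotone dropout, assuming a data frame dat with Y1, Y2, Y3 and response indicators R2, R3 (R1 = 1 for everyone); the hazard-of-response models below use only the observed history, as the MAR assumption allows.
fit.phi2 <- glm(R2 ~ Y1, family = binomial, data = dat)                          # phi_2(H_2)
fit.phi3 <- glm(R3 ~ Y1 + Y2, family = binomial, data = dat, subset = (R2 == 1)) # phi_3(H_3)
phi2.hat <- predict(fit.phi2, newdata = dat, type = "response")
phi3.hat <- predict(fit.phi3, newdata = dat, type = "response")
pi3.hat  <- phi2.hat * phi3.hat      # estimated P(R3 = 1 | history); phi_1 = 1 here
obs <- (dat$R3 == 1)
E.ipw.Y3 <- sum(dat$Y3[obs] / pi3.hat[obs]) / sum(1 / pi3.hat[obs])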

120 Inverse Probability Weighting under MAR Practical issues to consider Stability of weights Estimated response probabilities near zero (very large weights) lead to bias and inefficiency Need to check the histogram of weights Can use stabilized weights Fit of weight models No guarantees here Can use lack-of-fit diagnostics to weed out poor-fitting models Selection of weight models Poses a more serious problem with respect to final inferences Not good to pick the weight model that gives the lowest p-value! Pre-specify weight covariates that are related to missingness and outcome Hogan (MDEpiNet) Missing Data October 22, / 160

121 Analysis of Smoking Cessation Data via IPW Specify outcome model Select baseline covariates Specify and fit weight model Fit weighted longitudinal regression for treatment comparison Hogan (MDEpiNet) Missing Data October 22, / 160

122 Outcome model Outcome and treatment Y_j = quit status (1 if yes, 0 if no) Z = 1 if exercise, 0 if wellness θ_j = P(Y_j = 1) Model: constant quit rate up to week 4; separate treatment quit rates after week 4, but constant over time logit θ_j = γ_0 1(j ≤ 4) + (γ_1 + βZ) 1(j > 4) β = treatment log odds ratio Hogan (MDEpiNet) Missing Data October 22, / 160
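A hedged R sketch of fitting this outcome model by weighted logistic regression, assuming a long-format data frame ctq with columns y, z, j (week), and a precomputed inverse-probability weight wt for each observed record (these names are assumptions); quasibinomial is used only to allow non-integer weights, and robust (sandwich or bootstrap) standard errors would still be needed for inference.
ctq$after4 <- as.numeric(ctq$j > 4)    # indicator 1(j > 4)
ipw.fit <- glm(y ~ 0 + I(1 - after4) + after4 + I(after4 * z),
               family = quasibinomial(link = "logit"),
               data = ctq, weights = wt)
coef(ipw.fit)    # gamma0, gamma1, and beta (treatment log odds ratio)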

123 Hogan (MDEpiNet) Missing Data October 22, / 160

124 Exploratory analysis Ascertain whether previous Y's belong in the weight model This shows strong correlation between R_j and Y_{j-1} Hogan (MDEpiNet) Missing Data October 22, / 160

125 Exploratory analysis Hogan (MDEpiNet) Missing Data October 22, / 160

126 Covariates for selection model Hogan (MDEpiNet) Missing Data October 22, / 160

127 Check weight distribution at each time Hogan (MDEpiNet) Missing Data October 22, / 160

128 Summary of results Treatment effect relatively robust in odds ratio scale Arm-specific cessation rates very different Tx effect not robust in other scales Hogan (MDEpiNet) Missing Data October 22, / 160

129 Hogan (MDEpiNet) Missing Data October 22, / 160

130 Multiple Imputation Hogan (MDEpiNet) Missing Data October 22, / 160

131 Overview Imputing missing data from a parametric model Build around an example using GH data Goal 1: Estimate E(Y3 ) in GH data Goal 2: Estimate treatment effect in CTQ data Process of imputation Model specification Drawing imputed values from the model Combining observed and imputed information Standard error estimation Sensitivity analysis (CTQ data) Hogan (MDEpiNet) Missing Data October 22, / 160

132 Recall excerpt from GH data id tx Y1 Y2 Y3 R Hogan (MDEpiNet) Missing Data October 22, / 160

133 The strategy behind multiple imputation Setting: Full data for an individual is (Y, R, X, V) Objective: Interested in some feature of f(y) or f(y | x) Assumptions: 1 MAR, in that f(y | x, v, r = 0) = f(y | x, v, r = 1) 2 Model for f(y | x, v, r = 1) has known form Hogan (MDEpiNet) Missing Data October 22, / 160

134 The strategy behind multiple imputation 1 Fit a model for f(y | x, v, r = 1); if it is a parametric model, this means estimating the parameters α in the model f(y | x, v, r = 1, α) 2 For each person having R = 0, take a draw of Y | X, V from the fitted model. That means, for person i having R_i = 0, plug in their values of X_i and V_i, and draw a value of Y_i from the fitted model. Example coming soon. 3 Do this several times for each individual, so that each person has multiple draws of Y_i. Can call these Ŷ_i^(1), Ŷ_i^(2), ..., Ŷ_i^(K). Now have K filled-in datasets. Hogan (MDEpiNet) Missing Data October 22, / 160

135 The strategy behind multiple imputation 4 Perform the analysis you would have carried out had the data been complete. If you are interested in the parameter θ, this gives you K parameter estimates, θ̂^(1), θ̂^(2), ..., θ̂^(K) Hogan (MDEpiNet) Missing Data October 22, / 160

136 The strategy behind multiple imputation 5 Now need an estimate and standard error The estimate is the sample mean θ̄ = (1/K) Σ_{j=1}^K θ̂^(j) The (estimate of) variance of θ̄ combines the between- and within-imputation variance, var(θ̄) = (1 + 1/K) (1/(K-1)) Σ_{j=1}^K (θ̂^(j) - θ̄)^2 + (1/K) Σ_{j=1}^K var(θ̂^(j)), where var(θ̂^(j)) = {s.e.(θ̂^(j))}^2 Hogan (MDEpiNet) Missing Data October 22, / 160

137 Analysis 1: Use MI to estimate E(Y_3) As with single imputation, we treat Y_1 as an auxiliary variable to use in the imputation model Missing data assumption: MAR f(y_3 | y_1, r = 1) = f(y_3 | y_1, r = 0) Specify a parametric imputation model for f(y_3 | y_1, r = 1): Y_3 = β_0 + β_1 Y_1 + e, e ~ N(0, σ^2) This implies that the model f(y_3 | y_1, r = 1) is a normal distribution with mean and variance E(Y_3 | Y_1) = β_0 + β_1 Y_1, var(Y_3 | Y_1) = σ^2 Hogan (MDEpiNet) Missing Data October 22, / 160

138 Applying MI to the GH data 1 Fit a model for f(y_3 | y_1, r_3 = 1)
fitted.model = lm(Y3 ~ Y1, subset = (R3 == 1))
beta.hat = fitted.model$coefficients
sigma.sq = anova.lm(fitted.model)$"Mean Sq"[2]
sigma = sqrt(sigma.sq)
For the GH data, this means fitting the regression model specified above. The estimated parameters are β̂_0 = 15.6, β̂_1 = 0.19, σ̂ = 20.9 Hogan (MDEpiNet) Missing Data October 22, / 160

139 Applying MI to the GH data 2 For each person having R = 0, take a draw of Y_3 | Y_1 from the fitted model. That means, for person i having R_i = 0, plug in their value of Y_1i, and draw a value of Y_3 from the fitted model. So we draw imputed values of the missing Y_3 from Y_3 ~ N(β̂_0 + β̂_1 Y_1i, σ̂^2):
X.matrix = cbind(rep(1, length(Y1)), Y1)
Y3.mean = X.matrix %*% beta.hat
[Output excerpt: columns Y1, R3, Y3 (NA when missing), and Y3.mean; numeric rows not preserved]
Hogan (MDEpiNet) Missing Data October 22, / 160

140 Applying MI to GH data 3 Do this several times for each individual, so that each person has multiple draws of Y_3. Can call these Ŷ_3i^(1), Ŷ_3i^(2), ..., Ŷ_3i^(K). Now have K filled-in datasets.
# Draw 10 imputations of Y3
K = 10
n = length(Y3)
Y3.imp = matrix(0, nrow=n, ncol=K)
for (j in 1:K) {
  # this line imputes a value for each person
  Y3.imp[,j] = rnorm(n=n, mean=Y3.mean, sd=sigma)
  # this line replaces the values with observed Y3 where Y3 is observed
  Y3.imp[R3==1,] = Y3[R3==1]
}
Hogan (MDEpiNet) Missing Data October 22, / 160

141 Excerpt from imputed data Y1 R3 Y3 Y3.mean === FIRST 4 IMPUTATIONS === [1,] NA [2,] [3,] [4,] [5,] NA [6,] [7,] NA Hogan (MDEpiNet) Missing Data October 22, / 160

142 Applying MI to GH data 4 Perform the analysis you would have carried out had the data been complete. If you are interested in the parameter θ, this gives you K parameter estimates, θ (1), θ (2),..., θ (K) Hogan (MDEpiNet) Missing Data October 22, / 160

143 Applying MI to GH data
# Step 4: Calculate E(Y3) for each replicated dataset
# We will do this on the matrix Y3.imp, calculating column means and s.e.
Y3.bar.imp = apply(Y3.imp, 2, mean)
Y3.sd.imp = apply(Y3.imp, 2, sd)
Y3.se.imp = Y3.sd.imp / sqrt(n)
Hogan (MDEpiNet) Missing Data October 22, / 160

144 Applying MI to GH data > # Here are the means and s.e. from each imputed dataset > cbind(y3.bar.imp, Y3.se.imp) Y3.bar.imp Y3.se.imp [1,] [2,] [3,] [4,] [5,] [6,] [7,] [8,] [9,] [10,] Hogan (MDEpiNet) Missing Data October 22, / 160

145 Apply MI to GH data 5 The (estimate of) variance of θ̄ combines the between- and within-imputation variance, var(θ̄) = (1 + 1/K) (1/(K-1)) Σ_{j=1}^K (θ̂^(j) - θ̄)^2 + (1/K) Σ_{j=1}^K var(θ̂^(j))
# Step 5: Calculate overall mean and SE
theta.hat = mean(Y3.bar.imp)
var.w = mean( Y3.se.imp^2 )
var.b = (1 + 1/K) * sd( Y3.bar.imp )^2
se.theta.hat = sqrt(var.w + var.b)
# missing information
miss.info = var.b / (var.w + var.b)
c(theta.hat, se.theta.hat, miss.info)
[Output: theta.hat, se.theta.hat, and miss.info; numeric values not preserved]
Hogan (MDEpiNet) Missing Data October 22, / 160

146 Comparing results from three methods [Table: Ê(Y_3) and s.e. for observed data only, IPW, regression imputation with bootstrap s.e., and multiple imputation; most numeric entries not preserved (one surviving entry: 6.8)] Hogan (MDEpiNet) Missing Data October 22, / 160

147 Example 2: CTQ Data Goal of analysis: treatment comparison between wellness and exercise Available data: Y = indicator of quit status at week 12 (1 if yes, 0 if no) X = treatment indicator V = auxiliary covariates measured at baseline R = indicator of whether Y is observed Model of interest: logit{Pr(Y = 1 | X)} = β_0 + β_1 X Hogan (MDEpiNet) Missing Data October 22, / 160

148 Example 2: CTQ Data Auxiliary covariates V = (F, W): F = Fagerstrom index of nicotine dependence (1-10) W = weight at baseline Recall: the model of interest does not involve F, W: logit{Pr(Y = 1 | X)} = β_0 + β_1 X Hogan (MDEpiNet) Missing Data October 22, / 160

149 CTQ Data Imputation model under MAR logit{P(Y = 1 | X, F, W)} = α_0 + α_1 X + α_2 F + α_3 W S_i = α_0 + α_1 X_i + α_2 F_i + α_3 W_i φ_i = exp(S_i) / {1 + exp(S_i)} Y_i ~ Ber(φ_i) Imputation involves drawing multiple values of Y from the appropriate Bernoulli distribution. Hogan (MDEpiNet) Missing Data October 22, / 160
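A sketch of one such Bernoulli imputation draw in R, assuming a hypothetical one-record-per-subject data frame ctq1 with columns y (NA when missing), z, fs, basewt, and r, and the fitted logistic imputation model imp.model shown a couple of slides below; repeat the draw K times and combine with Rubin's rules as in the GH example.
phi.hat <- predict(imp.model, newdata = ctq1, type = "response")  # P(Y = 1 | X, F, W)
y.imp   <- ctq1$y
miss    <- (ctq1$r == 0)
y.imp[miss] <- rbinom(sum(miss), size = 1, prob = phi.hat[miss])  # draw the missing Y's
# refit the model of interest on the completed data
fit.k <- glm(y.imp ~ z, family = binomial(link = "logit"), data = ctq1)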

150 Apply MI to CTQ Data Data excerpt > ctq id week y z weight j r fs basewt NA 1 NA NA 1 NA Hogan (MDEpiNet) Missing Data October 22, / 160

151 Apply MI to CTQ Data Fit treatment model to observed data only logit{p(y = 1 X )} = β 0 + β 1 X > # fit regression model of smoking to z with observed data > model.0 = glm(y ~ z, family=binomial(link="logit") ) > summary(model.0) Call: glm(formula = y ~ z, family = binomial(link = "logit")) Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) *** z Null deviance: on 186 degrees of freedom (90 observations deleted due to missingness) Hogan (MDEpiNet) Missing Data October 22, / 160

152 Apply MI to CTQ Data 2 Fit imputation model logit{p(y = 1 X, F, W )} = α 0 + α 1 X + α 2 F + α 3 W = S > imp.model = glm(y ~ z + basewt + fs, family = binomial) > summary(imp.model) glm(formula = y ~ z + basewt + fs, family = binomial) Estimate Std. Error z value Pr(> z ) (Intercept) z basewt * fs *** Hogan (MDEpiNet) Missing Data October 22, / 160


More information

Bootstrapping Sensitivity Analysis

Bootstrapping Sensitivity Analysis Bootstrapping Sensitivity Analysis Qingyuan Zhao Department of Statistics, The Wharton School University of Pennsylvania May 23, 2018 @ ACIC Based on: Qingyuan Zhao, Dylan S. Small, and Bhaswar B. Bhattacharya.

More information

A Flexible Bayesian Approach to Monotone Missing. Data in Longitudinal Studies with Nonignorable. Missingness with Application to an Acute

A Flexible Bayesian Approach to Monotone Missing. Data in Longitudinal Studies with Nonignorable. Missingness with Application to an Acute A Flexible Bayesian Approach to Monotone Missing Data in Longitudinal Studies with Nonignorable Missingness with Application to an Acute Schizophrenia Clinical Trial Antonio R. Linero, Michael J. Daniels

More information

IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM

IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS 776 1 12 IPW and MSM IP weighting and marginal structural models ( 12) Outline 12.1 The causal question 12.2 Estimating IP weights via modeling

More information

Combining Non-probability and Probability Survey Samples Through Mass Imputation

Combining Non-probability and Probability Survey Samples Through Mass Imputation Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science

UNIVERSITY OF TORONTO Faculty of Arts and Science UNIVERSITY OF TORONTO Faculty of Arts and Science December 2013 Final Examination STA442H1F/2101HF Methods of Applied Statistics Jerry Brunner Duration - 3 hours Aids: Calculator Model(s): Any calculator

More information

Figure 36: Respiratory infection versus time for the first 49 children.

Figure 36: Respiratory infection versus time for the first 49 children. y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects

More information

Analyzing Pilot Studies with Missing Observations

Analyzing Pilot Studies with Missing Observations Analyzing Pilot Studies with Missing Observations Monnie McGee mmcgee@smu.edu. Department of Statistical Science Southern Methodist University, Dallas, Texas Co-authored with N. Bergasa (SUNY Downstate

More information

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis. Estimating the Mean Response of Treatment Duration Regimes in an Observational Study Anastasios A. Tsiatis http://www.stat.ncsu.edu/ tsiatis/ Introduction to Dynamic Treatment Regimes 1 Outline Description

More information

A weighted simulation-based estimator for incomplete longitudinal data models

A weighted simulation-based estimator for incomplete longitudinal data models To appear in Statistics and Probability Letters, 113 (2016), 16-22. doi 10.1016/j.spl.2016.02.004 A weighted simulation-based estimator for incomplete longitudinal data models Daniel H. Li 1 and Liqun

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

Global Sensitivity Analysis of Randomized Trials with Missing Data

Global Sensitivity Analysis of Randomized Trials with Missing Data Global Sensitivity Analysis of Randomized Trials with Missing Data FDA Shortcourse Daniel Scharfstein Johns Hopkins University dscharf@jhu.edu May 8, 2017 1 / 149 Funding Acknowledgments FDA PCORI 2 /

More information

Analysis of Incomplete Non-Normal Longitudinal Lipid Data

Analysis of Incomplete Non-Normal Longitudinal Lipid Data Analysis of Incomplete Non-Normal Longitudinal Lipid Data Jiajun Liu*, Devan V. Mehrotra, Xiaoming Li, and Kaifeng Lu 2 Merck Research Laboratories, PA/NJ 2 Forrest Laboratories, NY *jiajun_liu@merck.com

More information

Introduction to mtm: An R Package for Marginalized Transition Models

Introduction to mtm: An R Package for Marginalized Transition Models Introduction to mtm: An R Package for Marginalized Transition Models Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington 1 Introduction Marginalized transition

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Mixed-Effects Pattern-Mixture Models for Incomplete Longitudinal Data. Don Hedeker University of Illinois at Chicago

Mixed-Effects Pattern-Mixture Models for Incomplete Longitudinal Data. Don Hedeker University of Illinois at Chicago Mixed-Effects Pattern-Mixture Models for Incomplete Longitudinal Data Don Hedeker University of Illinois at Chicago This work was supported by National Institute of Mental Health Contract N44MH32056. 1

More information

Advanced Quantitative Methods: limited dependent variables

Advanced Quantitative Methods: limited dependent variables Advanced Quantitative Methods: Limited Dependent Variables I University College Dublin 2 April 2013 1 2 3 4 5 Outline Model Measurement levels 1 2 3 4 5 Components Model Measurement levels Two components

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information

Methods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures

Methods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures Methods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures Ruth Keogh, Stijn Vansteelandt, Rhian Daniel Department of Medical Statistics London

More information

Hypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal

Hypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal Hypothesis testing, part 2 With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal 1 CATEGORICAL IV, NUMERIC DV 2 Independent samples, one IV # Conditions Normal/Parametric Non-parametric

More information

Longitudinal analysis of ordinal data

Longitudinal analysis of ordinal data Longitudinal analysis of ordinal data A report on the external research project with ULg Anne-Françoise Donneau, Murielle Mauer June 30 th 2009 Generalized Estimating Equations (Liang and Zeger, 1986)

More information

Integrated approaches for analysis of cluster randomised trials

Integrated approaches for analysis of cluster randomised trials Integrated approaches for analysis of cluster randomised trials Invited Session 4.1 - Recent developments in CRTs Joint work with L. Turner, F. Li, J. Gallis and D. Murray Mélanie PRAGUE - SCT 2017 - Liverpool

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression:

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression: Biost 518 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture utline Choice of Model Alternative Models Effect of data driven selection of

More information

MISSING or INCOMPLETE DATA

MISSING or INCOMPLETE DATA MISSING or INCOMPLETE DATA A (fairly) complete review of basic practice Don McLeish and Cyntha Struthers University of Waterloo Dec 5, 2015 Structure of the Workshop Session 1 Common methods for dealing

More information

Modelling Dropouts by Conditional Distribution, a Copula-Based Approach

Modelling Dropouts by Conditional Distribution, a Copula-Based Approach The 8th Tartu Conference on MULTIVARIATE STATISTICS, The 6th Conference on MULTIVARIATE DISTRIBUTIONS with Fixed Marginals Modelling Dropouts by Conditional Distribution, a Copula-Based Approach Ene Käärik

More information

Global Sensitivity Analysis of Randomized Trials with Missing Data

Global Sensitivity Analysis of Randomized Trials with Missing Data Global Sensitivity Analysis of Randomized Trials with Missing Data ASA Webinar Daniel Scharfstein Johns Hopkins University dscharf@jhu.edu September 20, 2016 1 / 163 Funding Acknowledgments FDA PCORI 2

More information

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities

More information

Chapter 11. Correlation and Regression

Chapter 11. Correlation and Regression Chapter 11. Correlation and Regression The word correlation is used in everyday life to denote some form of association. We might say that we have noticed a correlation between foggy days and attacks of

More information

Time Invariant Predictors in Longitudinal Models

Time Invariant Predictors in Longitudinal Models Time Invariant Predictors in Longitudinal Models Longitudinal Data Analysis Workshop Section 9 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

A comparison of fully Bayesian and two-stage imputation strategies for missing covariate data

A comparison of fully Bayesian and two-stage imputation strategies for missing covariate data A comparison of fully Bayesian and two-stage imputation strategies for missing covariate data Alexina Mason, Sylvia Richardson and Nicky Best Department of Epidemiology and Biostatistics, Imperial College

More information

6. Fractional Imputation in Survey Sampling

6. Fractional Imputation in Survey Sampling 6. Fractional Imputation in Survey Sampling 1 Introduction Consider a finite population of N units identified by a set of indices U = {1, 2,, N} with N known. Associated with each unit i in the population

More information

McGill University. Faculty of Science. Department of Mathematics and Statistics. Statistics Part A Comprehensive Exam Methodology Paper

McGill University. Faculty of Science. Department of Mathematics and Statistics. Statistics Part A Comprehensive Exam Methodology Paper Student Name: ID: McGill University Faculty of Science Department of Mathematics and Statistics Statistics Part A Comprehensive Exam Methodology Paper Date: Friday, May 13, 2016 Time: 13:00 17:00 Instructions

More information

(3) Review of Probability. ST440/540: Applied Bayesian Statistics

(3) Review of Probability. ST440/540: Applied Bayesian Statistics Review of probability The crux of Bayesian statistics is to compute the posterior distribution, i.e., the uncertainty distribution of the parameters (θ) after observing the data (Y) This is the conditional

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

Summary and discussion of The central role of the propensity score in observational studies for causal effects

Summary and discussion of The central role of the propensity score in observational studies for causal effects Summary and discussion of The central role of the propensity score in observational studies for causal effects Statistics Journal Club, 36-825 Jessica Chemali and Michael Vespe 1 Summary 1.1 Background

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk Causal Inference in Observational Studies with Non-Binary reatments Statistics Section, Imperial College London Joint work with Shandong Zhao and Kosuke Imai Cass Business School, October 2013 Outline

More information

Chapter 4. Parametric Approach. 4.1 Introduction

Chapter 4. Parametric Approach. 4.1 Introduction Chapter 4 Parametric Approach 4.1 Introduction The missing data problem is already a classical problem that has not been yet solved satisfactorily. This problem includes those situations where the dependent

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The

More information

Latent Variable Model for Weight Gain Prevention Data with Informative Intermittent Missingness

Latent Variable Model for Weight Gain Prevention Data with Informative Intermittent Missingness Journal of Modern Applied Statistical Methods Volume 15 Issue 2 Article 36 11-1-2016 Latent Variable Model for Weight Gain Prevention Data with Informative Intermittent Missingness Li Qin Yale University,

More information

Alexina Mason. Department of Epidemiology and Biostatistics Imperial College, London. 16 February 2010

Alexina Mason. Department of Epidemiology and Biostatistics Imperial College, London. 16 February 2010 Strategy for modelling non-random missing data mechanisms in longitudinal studies using Bayesian methods: application to income data from the Millennium Cohort Study Alexina Mason Department of Epidemiology

More information

Logistic Regression - problem 6.14

Logistic Regression - problem 6.14 Logistic Regression - problem 6.14 Let x 1, x 2,, x m be given values of an input variable x and let Y 1,, Y m be independent binomial random variables whose distributions depend on the corresponding values

More information

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Linear

More information

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

More information

E(Y ij b i ) = f(x ijβ i ), (13.1) β i = A i β + B i b i. (13.2)

E(Y ij b i ) = f(x ijβ i ), (13.1) β i = A i β + B i b i. (13.2) 1 Advanced topics 1.1 Introduction In this chapter, we conclude with brief overviews of several advanced topics. Each of these topics could realistically be the subject of an entire course! 1. Generalized

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model EPSY 905: Multivariate Analysis Lecture 1 20 January 2016 EPSY 905: Lecture 1 -

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

,..., θ(2),..., θ(n)

,..., θ(2),..., θ(n) Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.

More information

BIOS 2083: Linear Models

BIOS 2083: Linear Models BIOS 2083: Linear Models Abdus S Wahed September 2, 2009 Chapter 0 2 Chapter 1 Introduction to linear models 1.1 Linear Models: Definition and Examples Example 1.1.1. Estimating the mean of a N(μ, σ 2

More information

Models for binary data

Models for binary data Faculty of Health Sciences Models for binary data Analysis of repeated measurements 2015 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen 1 / 63 Program for

More information

Comparison of methods for repeated measures binary data with missing values. Farhood Mohammadi. A thesis submitted in partial fulfillment of the

Comparison of methods for repeated measures binary data with missing values. Farhood Mohammadi. A thesis submitted in partial fulfillment of the Comparison of methods for repeated measures binary data with missing values by Farhood Mohammadi A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Biostatistics

More information

Discussion on Fygenson (2007, Statistica Sinica): a DS Perspective

Discussion on Fygenson (2007, Statistica Sinica): a DS Perspective 1 Discussion on Fygenson (2007, Statistica Sinica): a DS Perspective Chuanhai Liu Purdue University 1. Introduction In statistical analysis, it is important to discuss both uncertainty due to model choice

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

Today. HW 1: due February 4, pm. Aspects of Design CD Chapter 2. Continue with Chapter 2 of ELM. In the News:

Today. HW 1: due February 4, pm. Aspects of Design CD Chapter 2. Continue with Chapter 2 of ELM. In the News: Today HW 1: due February 4, 11.59 pm. Aspects of Design CD Chapter 2 Continue with Chapter 2 of ELM In the News: STA 2201: Applied Statistics II January 14, 2015 1/35 Recap: data on proportions data: y

More information

An Empirical Comparison of Multiple Imputation Approaches for Treating Missing Data in Observational Studies

An Empirical Comparison of Multiple Imputation Approaches for Treating Missing Data in Observational Studies Paper 177-2015 An Empirical Comparison of Multiple Imputation Approaches for Treating Missing Data in Observational Studies Yan Wang, Seang-Hwane Joo, Patricia Rodríguez de Gil, Jeffrey D. Kromrey, Rheta

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information