Analyzing Pilot Studies with Missing Observations

Monnie McGee (mmcgee@smu.edu)
Department of Statistical Science, Southern Methodist University, Dallas, Texas

Co-authored with N. Bergasa (SUNY Downstate Medical Center), I. Ginsburg, and D. Engler (Columbia Presbyterian Medical Center)

University of Texas at Dallas, April 19, 2005
Outline

1. Motivation: Gabapentin Study
2. Analysis with a Mixed-Effects Model
3. Other Important Facts about the Data
4. Dealing with the Real Data
5. Conclusions and Future Explorations
Gabapentin Study

- Protocol called for 16 subjects in pre-post format
- Half randomized to receive Gabapentin
- Main outcomes: Hourly Scratching Activity (HSA) & Visual Analogue Score (VAS)
- Two quantitations: baseline and after 6 weeks
- Quantitations required a 48-hour stay in the hospital
Mixed Effects Model Analysis

y_{ijk} = α_i + β_j + γ_{ij} + ε_{ijk}

- y_{ijk} is the response for the i-th group and the j-th quantitation on the k-th subject
- α_i, i = 1, 2, represents the effect of treatment group
- β_j, j = 1, 2, is the effect of the j-th quantitation
- γ_{ij} is the interaction effect between group and quantitation
- ε_{ijk} ~ N(0, σ²I)
- The random effect is due to different initial levels of response for each subject on each quantitation
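As a concrete illustration of the model, here is a minimal Python sketch (not the study's actual analysis; the effect sizes, σ, and cell counts below are made up). It simulates responses from y_{ijk} = α_i + β_j + γ_{ij} + ε_{ijk} and recovers the interaction as a difference of cell-mean differences:

```python
import random
import statistics

random.seed(42)

# Hypothetical effect sizes, chosen only for illustration.
alpha = {1: 0.0, 2: -1.0}                          # treatment-group effects
beta = {1: 0.0, 2: -0.5}                           # quantitation effects
gamma = {(1, 1): 0.0, (1, 2): 0.0,
         (2, 1): 0.0, (2, 2): -0.8}                # interaction effects
sigma = 1.0
n = 2000                                           # observations per cell

# Simulate y_ijk = alpha_i + beta_j + gamma_ij + eps_ijk and estimate the
# cell means mu_ij = alpha_i + beta_j + gamma_ij.
cell_means = {}
for i in (1, 2):
    for j in (1, 2):
        ys = [alpha[i] + beta[j] + gamma[(i, j)] + random.gauss(0, sigma)
              for _ in range(n)]
        cell_means[(i, j)] = statistics.mean(ys)

# Under this baseline coding, the difference of differences estimates
# the interaction gamma_22.
interaction = (cell_means[(2, 2)] - cell_means[(2, 1)]) \
    - (cell_means[(1, 2)] - cell_means[(1, 1)])
print(round(interaction, 2))
```

With a large per-cell count the difference of differences lands close to the simulated γ_{22}; the real analysis instead fits all effects jointly by (restricted) maximum likelihood.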
LME Results for HSA

Effect          Num DF   Den DF   F Value   Pr > F
Constant        1        861      24.81     < 0.0001
Group           1        13       4.47      0.0543
Quant           1        861      8.76      0.0032
Group × Quant   1        861      1.39      0.2390

Log Likelihood: 5517.502
Show Me the Data!

- Excel spreadsheet of the data
- Graphical display of HSA and VAS
Issues with the Data

- Lots of NAs in the spreadsheet!
- Entire pre and/or post assessments missing for 4 subjects
- A priori difference in gabapentin and placebo groups
- Very small sample size
- Disparate beginning times
- HSA and VAS normalization
- Detection limit for HSA; finite scale for VAS
Types of Missingness

- Missing Completely at Random (MCAR): the probability of an observation being missing does not depend on observed or unobserved measurements.
  Pr(R | y_o, y_m) = Pr(R)
- Missing at Random (MAR): the probability of an observation being missing, given the observed data, does not depend on the unobserved data.
  Pr(R | y_o, y_m) = Pr(R | y_o)
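The two definitions can be made concrete with a small simulation (a sketch with made-up rates, not from the study): under MCAR the missingness rate is unrelated to the observed value, while under MAR it changes with the observed value:

```python
import random

random.seed(3)

# Toy pairs (y_obs, y_mis); y_mis may later be deleted.
pairs = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(10000)]

# MCAR: constant 30% chance of deletion, regardless of any value.
mcar = [(yo, None if random.random() < 0.3 else ym) for yo, ym in pairs]

# MAR: 60% chance of deletion, but only when the OBSERVED value is positive.
mar = [(yo, None if (yo > 0 and random.random() < 0.6) else ym)
       for yo, ym in pairs]

def miss_rate(data, cond):
    """Fraction of missing y_mis among rows whose y_obs satisfies cond."""
    rows = [(yo, ym) for yo, ym in data if cond(yo)]
    return sum(ym is None for _, ym in rows) / len(rows)

# MCAR: roughly equal rates on both sides; MAR: rates differ sharply.
print(round(miss_rate(mcar, lambda yo: yo > 0), 2),
      round(miss_rate(mcar, lambda yo: yo <= 0), 2))
print(round(miss_rate(mar, lambda yo: yo > 0), 2),
      round(miss_rate(mar, lambda yo: yo <= 0), 2))
```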
Types of Missingness (cont'd)

- Missing Not at Random (MNAR): the probability of an observation being missing depends on the value of the missing observation itself.
- "In most situations, the true mechanism is probably MNAR." (Carpenter & Kenward, 2005)
Missingness in Gabapentin Data

- Due to the severity of missingness in hours 24-48 of each quantitation, only the first 24 hours of data were used.
- Two subjects' pre-treatment data are missing due to equipment malfunction.
- Intermittent data are missing due to eating, sleeping, showering, etc. during the hospital stay.
- Some data may be missing due to severity of scratching or severity of illness (two subjects with missing post-treatment measurements).
- Our mechanism is mostly MAR.
Now What?

Fill in the missing values and rerun the mixed model:

- Mean-filled values
- Regression-mean imputation
- Last Observation Carried Forward (LOCF)
- Hot Deck (or Cold Deck) Imputation
- Likelihood-based imputation
- Time series approach (Pfeffermann and Nathan, 2002)

NB: Most results pertaining to inference are asymptotic results.
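The two simplest fill-in rules can be sketched in a few lines (toy hourly data, not the study's HSA series, with None marking a missing observation):

```python
series = [12.0, None, 15.0, None, None, 9.0]

def mean_fill(xs):
    """Replace each missing value with the mean of the observed values."""
    observed = [x for x in xs if x is not None]
    m = sum(observed) / len(observed)
    return [m if x is None else x for x in xs]

def locf_fill(xs):
    """Last Observation Carried Forward: repeat the most recent observed value."""
    filled, last = [], None
    for x in xs:
        if x is not None:
            last = x
        filled.append(last)
    return filled

print(mean_fill(series))  # mean of 12, 15, 9 is 12.0
print(locf_fill(series))
```

Both produce a complete series, which is precisely what makes them dangerous: the filled values are treated as real observations by any downstream model.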
Results: Mean-Filled Values

Effect          Num DF   Den DF   F Value   Pr > F
Constant        1        607      101.88    < 0.0001
Group           1        13       1.47      0.2468
Quant           1        607      39.08     < 0.0001
Group × Quant   1        607      21.00     < 0.0001

Log Likelihood: 1181.314
Results: LOCF-Filled Values

Effect          Num DF   Den DF   F Value   Pr > F
Constant        1        581      89.29     < 0.0001
Group           1        13       0.907     0.3584
Quant           1        581      38.39     < 0.0001
Group × Quant   1        581      21.32     < 0.0001

Log Likelihood: 1124.822
Summary Thus Far

- Carpenter and Kenward (2005) call mean replacement and LOCF "unprincipled" methods.
- Both lead to biased estimates of parameters.
- Simple mean imputation tends to dilute associations.
- LOCF distorts the mean and covariance structure, even for a single time point, even under MCAR.
- Regression mean imputation can generate unbiased estimates, but the variance is still typically underestimated.
- Can't replace entire quantitations with mean or LOCF.
Nearest Neighbor Hot Deck Imputation

- Let y_i = (y_{i1}, ..., y_{iK}) be a K × 1 complete-data vector of outcomes.
- Let y_i = (y_{obs,i}, y_{mis,i}), where y_{obs,i} is the observed part and y_{mis,i} is the missing part of y_i.
- Then ŷ_{it} = y_{lt} + (ȳ_{obs,i} − ȳ_{obs,l}), where ȳ_{obs,i} is the mean of the observed values for subject i.
- Subject l is called the donor.
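A minimal sketch of the fill-in rule ŷ_{it} = y_{lt} + (ȳ_{obs,i} − ȳ_{obs,l}). One interpretive assumption is made here: ȳ_{obs,l} is taken over the donor's values at the recipient's observed time points, which is one plausible reading of the formula:

```python
import statistics

def nnhd_impute(recipient, donor):
    """Fill recipient's missing entries with y_hat_it = y_lt + (ybar_i - ybar_l).

    ybar_i is the mean of the recipient's observed values; ybar_l is taken
    over the donor's values at those same time points (an assumption of
    this sketch)."""
    obs_times = [t for t, y in enumerate(recipient) if y is not None]
    ybar_i = statistics.mean(recipient[t] for t in obs_times)
    ybar_l = statistics.mean(donor[t] for t in obs_times)
    return [donor[t] + (ybar_i - ybar_l) if y is None else y
            for t, y in enumerate(recipient)]

recipient = [10.0, None, 14.0, None]
donor = [8.0, 9.0, 12.0, 11.0]
# ybar_i = 12, ybar_l = (8 + 12)/2 = 10, so the shift is +2.
print(nnhd_impute(recipient, donor))  # [10.0, 11.0, 14.0, 13.0]
```

The donor's shape over time is preserved; only its level is shifted to match the recipient.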
Choosing a Donor

- We want a donor that is "close" to the subject whose observations are missing.
- "Close" is defined by a metric, e.g.
  d(i, j) = max_k |x_{ik} − x_{jk}|,
  where x_i = (x_{i1}, ..., x_{iK})ᵀ are the values of K appropriately scaled covariates for a unit i at which y_i is missing.
Donors for TS Data

Suppose subject i is missing a value at time t. The closest donor is the subject l that minimizes

d(i, j) = Σ_{t=1}^{T} |x_{it} − x_{jt}|

over all candidates j = 1, ..., n − 1.
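Donor selection under this L1-type distance is a one-liner in spirit (a sketch on toy series, assuming all candidates are fully observed over the shared time points):

```python
def closest_donor(target, candidates):
    """Return the index of the candidate minimizing sum_t |x_it - x_jt|."""
    def distance(cand):
        return sum(abs(a - b) for a, b in zip(target, cand))
    return min(range(len(candidates)), key=lambda j: distance(candidates[j]))

target = [5.0, 6.0, 7.0]
candidates = [[1.0, 1.0, 1.0],   # distance 15
              [5.5, 5.5, 7.5],   # distance 1.5
              [9.0, 9.0, 9.0]]   # distance 9
print(closest_donor(target, candidates))  # 1
```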
Hot Deck Results

Effect          Num DF   Den DF   F Value   Pr > F
Constant        1        703      75.48     < 0.0001
Group           1        13       0.760     0.2468
Quant           1        703      37.61     < 0.0001
Group × Quant   1        703      16.18     < 0.0001

Log Likelihood: 1405.141
A Modification

- Hot Deck Imputation provides us with only one data set, which we take as the "real" data.
- Multiple Imputation provides us with multiple data sets, which we can use to estimate uncertainty about the correct nonresponse model.
- BUT: MI can be complicated.
- Estimate multiple data sets using NNHDI with additive noise.
Modified NNHDI Results

Results for 3 imputations of NNHDI with additive N(0, 29) noise.

Effect          Imputation   Num DF   Den DF   F Value   Pr > F
Group           A            1        13       5.15      0.0409
                B            1        13       7.27      0.0183
                C            1        13       7.07      0.0196
Quant           A            1        703      24.86     < 0.0001
                B            1        703      22.50     < 0.0001
                C            1        703      20.58     < 0.0001
Group × Quant   A            1        703      23.51     < 0.0001
                B            1        703      47.92     < 0.0001
                C            1        703      52.74     < 0.0001

Log Likelihoods: A: 805.89, B: 817.74, C: 814.94
Nonresponse Uncertainty

Let θ̂_d and W_d, d = 1, ..., D, be D complete-data estimates and their associated variances for θ. Then the combined estimate is

θ̄_D = (1/D) Σ_{d=1}^{D} θ̂_d,

and the average within-imputation variance is

W̄_D = (1/D) Σ_{d=1}^{D} W_d.
More Uncertainty

The between-imputation variance is

B_D = (1/(D − 1)) Σ_{d=1}^{D} (θ̂_d − θ̄_D)².

The total variability is

T_D = W̄_D + ((D + 1)/D) B_D,

and γ̂_D = (1 + 1/D) B_D / T_D is an estimate of the fraction of information about θ due to nonresponse (Little and Rubin, pp. 86-87).
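The combining rules above translate directly into code. A minimal sketch with D = 3 imputations (the estimates and variances below are toy numbers, not the Gabapentin results):

```python
def pool_imputations(estimates, variances):
    """Combine D complete-data estimates via the rules above."""
    D = len(estimates)
    theta_bar = sum(estimates) / D                    # pooled estimate
    W = sum(variances) / D                            # within-imputation var
    B = sum((t - theta_bar) ** 2
            for t in estimates) / (D - 1)             # between-imputation var
    T = W + (D + 1) / D * B                           # total variability
    gamma = (1 + 1 / D) * B / T                       # nonresponse fraction
    return theta_bar, W, B, T, gamma

# Three imputations, as in the modified NNHDI analysis (toy numbers).
est = [1.2, 1.0, 1.1]
var = [0.30, 0.28, 0.32]
theta_bar, W, B, T, gamma = pool_imputations(est, var)
print(round(theta_bar, 3), round(T, 3))
```

Note that T_D exceeds W̄_D whenever the imputations disagree, which is exactly how multiple imputation propagates nonresponse uncertainty into the standard errors.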
Uncertainty Calculations

For the Gabapentin Data:

Effect        θ̄_D      W̄_D    B_D     T_D     γ̂_D
Group         -0.862    31.4    0.081   31.6    0.003
Quant         -0.065    3.90    0.078   4.01    0.026
Interaction   0.694     8.4     0.149   8.56    0.023
Power and Size

- Case 1: Pretest/posttest study with one normally distributed random variable (σ² = 1) and data MCAR
- Case 2: Case 1 with chunks of missing data (wave nonresponse)
- Case 3: Wave nonresponse for longitudinal data with no correlation, analyzed with a mixed model
- Case 4: Same as Case 3 with AR(1) structure in the data

Compared size and power for 10%, 30%, and 50% missing values.
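Case 1 can be sketched as a small Monte Carlo study (not the simulation actually run for these slides). One simplification is made: a fixed critical value near t_{0.975,29} is used even though MCAR deletion makes the complete-pair count vary slightly from trial to trial:

```python
import random
import statistics

random.seed(1)
T_CRIT = 2.045  # approx. two-sided 5% critical value for t with 29 df

def paired_t_rejects(mu_d, n=30, p_miss=0.10):
    """One simulated trial: does the paired t-test on complete pairs reject?"""
    diffs = [random.gauss(mu_d, 1.0) for _ in range(n)]
    complete = [d for d in diffs if random.random() > p_miss]  # MCAR deletion
    m = len(complete)
    t = statistics.mean(complete) / (statistics.stdev(complete) / m ** 0.5)
    return abs(t) > T_CRIT

def power(mu_d, reps=2000):
    """Rejection rate over many trials: size when mu_d = 0, power otherwise."""
    return sum(paired_t_rejects(mu_d) for _ in range(reps)) / reps

print(power(0.0))  # size: should be near 0.05
print(power(2.0))  # power: should be near 1
```

This mirrors the N = 30, 10% missing cell of the Case 1 table: listwise deletion under MCAR keeps the size near the nominal level and costs mainly power.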
Case 1: A Simple Paired t-test

                     N = 10                    N = 30
% Missing   None    10%     30%      None    10%     30%
μ_d = 0     0.050   0.056   0.091    0.050   0.051   0.062
μ_d = 2     0.977   0.938   0.712    1       1       0.999
μ_d = 5     1       1       0.987    1       1       1
Case 2: Paired t-test with Wave Nonresponse

                 N = 10              N = 30
% Missing   30%     50%     10%     30%     50%
μ_d = 0     0.052   0.053   0.050   0.050   0.051
μ_d = 2     0.662   0.341   0.999   0.995   0.904
Case 3: Longitudinal WN Data

                         N = 10          N = 30
Scenario    Effect    30%     50%     30%     50%
μ_d = 0     Group     0.014   0.015   0.032   0.049
μ_d = 0     Quant     0.053   0.049   0.051   0.035
μ_d = 2     Group     0.009   0.007   0.014   0.012
μ_d = 2     Quant     1       1       1       1
Case 4: Longitudinal AR(1) Data

                         N = 10          N = 30
Scenario    Effect    30%     50%     30%     50%
φ_1 = φ_2   Group     0.046   0.046   0.049   0.049
φ_1 = φ_2   Quant     0.243   0.228   0.216   0.188
φ_1 ≠ φ_2   Group     0.050   0.050   0.049   0.050
φ_1 ≠ φ_2   Quant     0.328   0.304   0.265   0.219
The Real Issue

- How good are the parameter estimates under the above scenarios?
- Results about estimation in the literature are asymptotic.
- The literature suggests a transformation that makes normality more accurate for small samples.
- Searle (1970) gives information matrices for mixed effects models with unbalanced data.
- There is a large literature on efficiency for various experimental designs in the presence of missing observations.
Remaining Issues

- Automating the choice of "like" individuals for replacement values
- Variance of the random perturbation
- Generating data substitutions from models
- Calculating efficiencies, bias, and variance
- Detection limits, a priori differences in groups, normalization, etc.
References

1. Carpenter, James and Kenward, Mike (2005). Economic and Social Research Council Missing Data Website. http://www.missingdata.org.uk.
2. Little, Roderick J.A. and Rubin, Donald B. (2002). Statistical Analysis with Missing Data (2nd edition). New York: Wiley-Interscience.
3. Pfeffermann, Danny and Nathan, Gad (2002). "Imputation for Wave Nonresponse: Existing Methods and a Time Series Approach," in Survey Nonresponse (Robert M. Groves, Don A. Dillman, John L. Eltinge, and Roderick J.A. Little, eds.). New York: Wiley, Chapter 28.
4. Prescott, P. and Mansson, R.A. (2002). "Efficiency of Pairwise Treatment Comparisons in Incomplete Block Experiments Subject to the Loss of a Block of Observations." Communications in Statistics: Theory and Methods, 31, 449-462.
5. Searle, S. R. (1970). "Large Sample Variances of Maximum Likelihood Estimators of Variance Components Using Unbalanced Data." Biometrics, 26, 505-524.
A Priori Difference in Groups

- Reassign subjects to groups at random, regardless of true assignment
- Calculate two-sample t-tests for each assignment
- 1000 replications of 10000 assignments
- Results: percentage of p-values < 0.05

Data        Min     Median   Max
Original    2.97    3.35     4.2
Mean Repl   0.07    0.21     0.38
LOCF Repl   11.6    12.5     13.4
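The reassignment procedure can be sketched as a randomization check (toy baseline values, not the study data, and a single replication rather than 1000 × 10000): when the 16 subjects truly come from one distribution, roughly 5% of random 8/8 splits should yield a "significant" two-sample t-test:

```python
import random
import statistics

random.seed(7)
T_CRIT = 2.145  # two-sided 5% critical value for t with 14 df

# Hypothetical baseline summaries for 16 subjects, drawn from a single
# distribution so there is truly no group difference.
baseline = [random.gauss(50.0, 10.0) for _ in range(16)]

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / \
        (sp2 * (1 / na + 1 / nb)) ** 0.5

def pct_significant(n_assignments=2000):
    """Percentage of random 8/8 reassignments whose t-test rejects at 5%."""
    hits = 0
    for _ in range(n_assignments):
        random.shuffle(baseline)
        if abs(pooled_t(baseline[:8], baseline[8:])) > T_CRIT:
            hits += 1
    return 100.0 * hits / n_assignments

pct = pct_significant()
print(pct)  # near 5 when the groups really are exchangeable
```

Against this benchmark, the table above is interpretable: the original data sit below 5% (no a priori difference detectable), while LOCF replacement inflates the rate past 11%, manufacturing apparent group differences.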