Shared Random Parameter Models for Informative Missing Data

1 Shared Random Parameter Models for Informative Missing Data Dean Follmann NIAID NIAID/NIH p.

2 A General Set-up
Longitudinal data $(Y_{ij}, X_{ij}, R_{ij})$
$Y_{ij}$ = outcome for person $i$ on visit $j$
$R_{ij}$ = 1 if observed, 0 if missing
$D_i = R_{i+}$ = dropout time
$X_{ij}$ = covariate, e.g. time on study $t_{ij}$
P(missing) determined by a coin flip: MCAR. Easy.
P(missing) depends on observed data: MAR. Doable.
P(missing) depends on unobserved value of the missing data: MNAR. Ambitious.

3 Shared Parameter Model
Assume that each person draws a random effect $(b_{0i}, b_{1i})$ from a $N(0, \Sigma)$.
$Y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{0i} + b_{1i} t_{ij} + e_{ij}$
Interested in the overall slope. What if faster decliners tend to drop out?
Simple model: $P(R_{ij} = 0 \mid R_{i,j-1} = 1) = \Phi(\alpha_j + \theta b_{1i})$
At visit $j$, each person decides whether to drop out based on their own coin, which depends on $b_{1i}$.
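
The dropout mechanism on this slide can be sketched by simulation. The block below is only an illustration under assumed values: the parameter values, the constant $\alpha_j = \alpha$, and the function name `simulate_completer_slopes` are all hypothetical. It draws random slopes, applies the probit dropout coin at each visit, and compares the mean slope of completers with that of everyone.

```python
import numpy as np

def simulate_completer_slopes(n=20000, visits=5, beta1=-1.0, alpha=-1.0,
                              theta=-1.0, sd_slope=0.5, seed=0):
    """Simulate dropout driven by the random slope b_1i.

    Returns each subject's true slope (beta1 + b_1i) and a flag for
    whether the subject stayed through the last visit.
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    b1 = rng.normal(0.0, sd_slope, size=n)      # random slopes b_1i
    in_study = np.ones(n, dtype=bool)
    for _ in range(1, visits):                  # dropout possible at visits 2..J
        # Probit coin: drop out when a latent N(0,1) draw falls below
        # alpha + theta * b_1i, so P(drop | still in) = Phi(alpha + theta * b_1i).
        latent = rng.normal(size=n)
        drops = latent < alpha + theta * b1
        in_study &= ~drops
    return beta1 + b1, in_study

slopes, completed = simulate_completer_slopes()
print("mean slope, everyone:  ", slopes.mean())
print("mean slope, completers:", slopes[completed].mean())
print("mean slope, dropouts:  ", slopes[~completed].mean())
```

With $\theta < 0$, subjects with steeper declines are more likely to leave, so the completers' mean slope is flatter than the population slope.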

4 $b_i$ governs subject $i$'s slope & P(dropout)
[Figure: yearly drop in Y and probability of dropout, with points labeled h and s]

5 Data patterns generated from h and s
[Figure: outcome versus years since randomization, with trajectories labeled h and s]

6 Culling of patients in a 2 visit trial
[Figure: distribution of slopes; density versus slopes, with curves labeled obs]

7 Likelihood
We can always write $f(y_i, r_i, b_i) = g(y_i \mid b_i, r_i)\, m(r_i \mid b_i)\, h(b_i)$
Key: $g(y_i \mid b_i, r_i) = g(y_i \mid b_i)$
Allows us to essentially eliminate the density for the missing $y$s.
$f(y_i^o, r_i) = \int_b \int_{y_i^m} f(y_i, r_i, b)\, dy_i^m\, db = \int_b f(y_i^o \mid b)\, m(r_i \mid b)\, h(b)\, db$
Maximum likelihood requires specialized software and can be difficult if the dimension of $b_i$ is large.

8 How MNAR?
MNAR: Probability of missingness depends on $y_i^m$.
Let's work on R for a simple model with $b$ scalar.
$P(R = 0 \mid y^m, y^o) = \dfrac{\int P(R = 0 \mid b)\, g(y^m, y^o \mid b)\, h(b)\, db}{\int g(y^m, y^o \mid b)\, h(b)\, db} = \int P(R = 0 \mid b)\, h(b \mid y^o, y^m)\, db$
$h(b \mid y^o, y^m)$ can be viewed as an Empirical Bayes posterior distribution.
Posteriors depend on all the data. Intuitively, $y^m$ should improve the guess about $b$.
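
A small worked case may make the dependence on $y^m$ concrete. The sketch below assumes a random-intercept model $y_j = \beta_0 + b + e_j$ with $b \sim N(0, \tau^2)$, $e_j \sim N(0, \sigma^2)$, known parameters, and $P(R = 0 \mid b) = \Phi(\alpha + \theta b)$. None of these values come from the slides, and the function names are hypothetical. Under those assumptions the posterior of $b$ is normal and the integral has the closed form $\Phi\big((\alpha + \theta\mu)/\sqrt{1 + \theta^2 v}\big)$, with $\mu$ and $v$ the posterior mean and variance.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def p_dropout_given_y(y_obs, y_mis, alpha=-1.0, theta=-0.5,
                      beta0=0.0, tau2=1.0, sigma2=1.0):
    """P(R = 0 | y_obs, y_mis) for a scalar normal random intercept b.

    Parameter defaults are illustrative assumptions, not estimates.
    """
    y = list(y_obs) + list(y_mis)
    # Normal-normal posterior of b given all the data (observed and missing).
    post_var = 1.0 / (len(y) / sigma2 + 1.0 / tau2)
    post_mean = post_var * sum(v - beta0 for v in y) / sigma2
    # Integral of Phi(alpha + theta * b) against N(post_mean, post_var).
    return norm_cdf((alpha + theta * post_mean) / sqrt(1.0 + theta**2 * post_var))

y_obs = [0.2, -0.1]
for y_mis in (-2.0, 0.0, 2.0):
    print(y_mis, round(p_dropout_given_y(y_obs, [y_mis]), 4))
```

Holding $y^o$ fixed, the printed probability changes with the unobserved value, which is what makes the model MNAR; setting `theta=0` removes the dependence.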

9 Choice of covariates
Without missing data, covariate selection requires familiar judgment:
Clinical trials: who cares.
Observational studies: control confounding, what's of interest.
In shared parameter models, covariates impact the selection probabilities:
Suppose $E[Y \mid \text{man}] = 10$, $E[Y \mid \text{woman}] = -10$.
Fred's true mean is 10, typical for a man but good overall. Assume $\theta < 0$.
Adjust: Fred's $b_i = 0$, typical P(dropout).
Don't adjust: Fred's $b_i = 10$, lower P(dropout).

10 Choice of covariates: Model
Consider two possible models for mean response.
(1) $Y_{ij} = \beta_0 + b_{0i} + e_{ij}$
(2) $Y_{ij} = \beta_0 + \beta_1 I(\text{man}) + b_{0i} + e_{ij}$
Want the sickest overall to drop out?
Under (1) use $P(R_{ij} = 0 \mid R_{i,j-1} = 1, b_{0i}) = \Phi(\alpha_j + \theta b_{0i})$
Under (2) use $P(R_{ij} = 0 \mid R_{i,j-1} = 1, b_{0i}) = \Phi(\alpha_j + \theta(\beta_1 I(\text{man}) + b_{0i}))$

11 A Simpler Approach
Unweighted analysis. Suppose each subject has their own mean:
$Y_{ij} = \beta_0 + b_{0i} + e_{ij}$
Then no matter how P(dropout) depends on $b_{0i}$, an unbiased estimate of $\beta_0$ is $\sum_{i=1}^{n} \bar{Y}_i / n$
"One Subject, One Vote"
Simple fix for a clinical trial: compare two unweighted estimates.
Related to the Within Cluster Resampling approach to correct informative cluster size.
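
The contrast between "one subject, one vote" and pooling every observation can be checked by simulation. The sketch below uses an arbitrary, assumed dropout rule (subjects with $b_{0i} < 0$ contribute one visit, everyone else six) and assumed parameter values. It is meant only to illustrate the unbiasedness claim, not to reproduce any analysis from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
n, max_visits, beta0 = 50000, 6, 10.0          # illustrative values

b = rng.normal(0.0, 1.0, size=n)               # subject deviations b_0i
# Informative dropout (assumed rule): low-b subjects are seen once,
# everyone else is seen at every visit.
m = np.where(b < 0, 1, max_visits)
y = beta0 + b[:, None] + rng.normal(0.0, 1.0, size=(n, max_visits))
observed = np.arange(max_visits)[None, :] < m[:, None]

pooled_mean = y[observed].mean()                       # one observation, one vote
subject_means = (y * observed).sum(axis=1) / m         # Ybar_i over observed visits
unweighted_mean = subject_means.mean()                 # one subject, one vote

print("pooled mean:    ", pooled_mean)
print("unweighted mean:", unweighted_mean)
```

The pooled mean over-represents subjects who stay longer, so it drifts away from $\beta_0$; the average of subject means does not.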

12 An Inconvenient Truth
Endless two group study with $m = \infty$ intended observations.
$Y_{ij} = b_{0i} + e_{ij}$ with $b_{0i} \sim N(0, 1)$, $e_{ij} \sim N(0, 100)$.
In group 0, $m_i = 1$ if $b_{0i} < 0$ but $m_i = \infty$ if $b_{0i} > 0$.
In group 1 no missing data.
Sample mean of the $\bar{Y}_i$s has the same expectation in the two groups.
In group 0 the $\bar{Y}_i$s are like a 50:50 mixture of a $TN^{+}(0,1)$ & $N(0, 100)$.
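
A quick simulation shows what the unweighted estimate costs in this setup. The sketch assumes that $N(0, 100)$ denotes variance 100 and that a subject followed forever has $\bar{Y}_i = b_{0i}$ exactly; the sample size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200000

# Group 1: no missing data, so with endless follow-up Ybar_i equals b_0i.
ybar1 = rng.normal(0.0, 1.0, size=n)

# Group 0: subjects with b_0i > 0 are followed forever (Ybar_i = b_0i);
# subjects with b_0i < 0 contribute a single observation b_0i + e_i1.
b0 = rng.normal(0.0, 1.0, size=n)
e1 = rng.normal(0.0, 10.0, size=n)     # sd 10, assuming N(0, 100) is a variance
ybar0 = np.where(b0 > 0, b0, b0 + e1)

print("means:    ", ybar0.mean(), ybar1.mean())
print("variances:", ybar0.var(), ybar1.var())
```

Both groups have mean near zero, but the group 0 subject means are far more variable, so the unweighted comparison is unbiased yet inefficient.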

13 An inconvenient truth
[Figure: histograms of ybar0 and ybar1; vertical axis is frequency]

14 Approximate Conditional Approach
Numerical integration of the complete likelihood can be hard.
Let's try to make things simple by factoring differently:
$f(y_i^o, r_i, b_i) = f(y_i^o \mid b_i, r_i)\, h(b_i \mid r_i)\, m(r_i) = f(y_i^o \mid b_i)\, h(b_i \mid r_i)\, m(r_i)$
$f(y_i^o, r_i) = \int f(y_i^o \mid b_i)\, h(b_i \mid r_i)\, db_i \; m(r_i)$
$m()$ can be estimated by the empirical distribution.
Random effects conditional on $r_i$ can be approximated.

15 Approximate Conditional Approach
Twist is to modify the distribution $h(b_i \mid r_i)$, e.g. $b_{0i} = b_{0i}^{*} + \omega d_i$:
$Y_{ij} = \beta_0 + \beta_1 X_i + b_{0i}^{*} + \omega d_i + e_{ij} = \beta_0 + \beta_1 X_i + \beta_2 D_i + b_{0i}^{*} + e_{ij}$
Uncondition at the end:
$E[Y \mid X = 1] - E[Y \mid X = 0] = E[E[Y \mid X = 1, D]] - E[E[Y \mid X = 0, D]] = \beta_1 + \beta_2(\bar{D}_1 - \bar{D}_0)$

16 Heckman's model
Suppose $Y$ is the wage of a plumber, and $Y^{*}$ the perceived utility of being a plumber.
$Y_i = x_i\beta + b_i + e_{i1}$
$Y_i^{*} = w_i\alpha + b_i + e_{i2}$
with $b_i, e_{i1}, e_{i2}$ independent mean-zero normal with variances $\tau^2, \sigma_1^2, \sigma_2^2$.
People with $Y_i^{*} > 0$ choose to be plumbers; $Y_i$ is missing for nonplumbers.
Can show that $P(R_i = 1) = P(Y_i^{*} > 0) = \Phi\big(w_i\alpha / \sqrt{\tau^2 + \sigma_2^2}\big)$

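17 Heckman's model
Heckman showed $E[b_i \mid R_i = 1] \propto \lambda\big(-w_i\alpha / \sqrt{\tau^2 + \sigma_2^2}\big)$, where $\lambda()$ is the Mills ratio, and $\mathrm{var}[b_i \mid R_i = 1] = c$.
Fix up the mean for the observed plumbers, $R_i = 1$:
$Y_i = x_i\beta + \omega\lambda\big(-w_i\alpha / \sqrt{\tau^2 + \sigma_2^2}\big) + \epsilon$
$\omega = \sigma_b^2 / \sqrt{\sigma_2^2 + \sigma_b^2} > 0$ (with $\sigma_b^2 = \tau^2$): people with a knack for plumbing choose it.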
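
The selection correction can be checked numerically. The sketch below assumes the convention $\lambda(z) = \phi(z)/(1 - \Phi(z))$, so that $\lambda(-c) = \phi(c)/\Phi(c)$, and uses made-up values for $w_i\alpha$, $\tau^2$ and $\sigma_2^2$. It compares the closed-form $E[b_i \mid R_i = 1] = \omega\,\phi(c)/\Phi(c)$, with $c = w_i\alpha/\sqrt{\tau^2 + \sigma_2^2}$, against a Monte Carlo average; the function names are hypothetical.

```python
import numpy as np
from math import erf, exp, pi, sqrt

def norm_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def norm_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def selected_mean_of_b(w_alpha, tau2, sigma2_sq):
    """Closed-form E[b | Y* > 0] = omega * phi(c) / Phi(c)."""
    s = sqrt(tau2 + sigma2_sq)
    c = w_alpha / s
    omega = tau2 / s                      # sigma_b^2 / sqrt(sigma_2^2 + sigma_b^2)
    return omega * norm_pdf(c) / norm_cdf(c)

# Monte Carlo check with illustrative (assumed) values.
rng = np.random.default_rng(3)
w_alpha, tau2, sigma2_sq, n = 0.5, 1.0, 2.0, 400000
b = rng.normal(0.0, sqrt(tau2), size=n)
e2 = rng.normal(0.0, sqrt(sigma2_sq), size=n)
plumber = w_alpha + b + e2 > 0            # R_i = 1 when Y*_i > 0
mc_mean = b[plumber].mean()
formula = selected_mean_of_b(w_alpha, tau2, sigma2_sq)
print("Monte Carlo:", mc_mean, " formula:", formula)
```

The positive selected mean is the "knack for plumbing" effect: those who choose the trade have, on average, a larger $b_i$.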

18 IPPB Trial
Intermittent positive pressure breathing versus standard compressor nebulizer therapy.
IPPB = short-term mechanical ventilation to force meds into the lung.
Primary endpoint: rate of change in FEV$_1$.
n = 984, 3 years of follow-up, measured every 3 months, 39% dropout.

19 IPPB group: estimates by dropout time
[Figure: panels A and B showing intercept and slope estimates against dropout time (dropout visit)]

20 IPPB Model
Naive model: for each group we fit
$Y_{ij} = \beta_0 + \beta_1 t_j + b_{0i} + b_{1i} t_j + e_{ij}$, with $e_{ij}$ iid $N(0, \sigma^2)$, $b_i$ iid $N(0, \Sigma)$
Shared Parameter Model: as above but with a connection
$P(R_{ij} = 0 \mid R_{i,j-1} = 1) = \Phi(\alpha + \theta_0 b_{0i} + \theta_1 b_{1i})$
Approximate Conditional Model: given $D_i = R_{i+}$,
$Y_{ij} = \beta_0 + \beta_1 t_j + \beta_2 D_i + \beta_3 t_j D_i + b_{0i} + b_{1i} t_j + e_{ij}$

21 Model Estimates 1
A. Random Effects Model
[Table: estimate and se of each $\beta$ parameter (two rows) for the Standard and IPPB groups]

22 Model Estimates 2
B. Shared Parameter Model
[Table: estimate and se of the $\beta$ parameters (two rows), $\alpha$, and the $\theta$ parameters (two rows) for the Standard and IPPB groups]

23 Model Estimates 3
C. Conditional Model
[Table: estimate and se of the $\beta$ parameters (four rows) for the Standard and IPPB groups]

24 Summary of IPPB Trial
[Table: estimate and SE by model: Naive, Shared, Approx Cond, Unweighted]
IPPB not helpful beyond standard nebulizer therapy.
Dropout appears related to intercept, not slope.
All analyses similar.

25 Epilepsy Study
Wanted to see if Felbamate reduced seizure frequency.
n = 40 patients titrated off meds, given drug/placebo & followed for 17 days.
11/19 Placebo and 8/21 Felbamate patients dropped out.
Number of seizures recorded each day.

26 Seizure Rates by Dropout
[Figure: average daily seizure frequency versus days in study; plotting symbols o (placebo) and x]

27 Model
Let $Y_{ij}$ denote the seizure count for patient $i$ on day $j$. Assume Poisson($\lambda_{ij}$).
Seizure rate seemed constant over time. Within each group
$\log(\lambda_{ij}) = \beta + b_i$ with $b_i \sim N(0, \sigma^2)$
Shared Parameter Model: also assume
$P(R_{ij} = 0 \mid R_{i,j-1} = 1, b_i) = 1/(1 + \exp(-\gamma - \theta b_i))$
Approximate Conditional Model:
$\log(\lambda_{ij}) = \beta + \omega \log(d_i) + b_i$
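
One patient's contribution to the shared parameter likelihood is a one-dimensional integral over $b_i$, which can be approximated by Gauss-Hermite quadrature. The sketch below is not the software used in the talk: the function name `patient_likelihood`, the parameter values, and the convention that a patient "stays" before each observed day after the first and "drops" after the last observed day are all assumptions made for illustration.

```python
import numpy as np
from math import lgamma

def patient_likelihood(counts, dropped, beta, sigma, gamma, theta, n_nodes=40):
    """Marginal likelihood of one patient's observed counts and dropout pattern.

    counts  : daily seizure counts on the days the patient was observed
    dropped : True if the patient dropped out after the last observed day
    """
    counts = np.asarray(counts, dtype=float)
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * x                  # nodes rescaled to N(0, sigma^2)
    lam = np.exp(beta + b)                        # seizure rate at each node
    log_fact = sum(lgamma(c + 1.0) for c in counts)
    log_pois = counts.sum() * np.log(lam) - len(counts) * lam - log_fact
    hazard = 1.0 / (1.0 + np.exp(-gamma - theta * b))   # P(drop | still in, b)
    # Stayed before each observed day after the first; then dropped, if applicable.
    log_drop = (len(counts) - 1) * np.log1p(-hazard)
    if dropped:
        log_drop = log_drop + np.log(hazard)
    # Gauss-Hermite: integral against N(0, sigma^2) = sum(w * f(b)) / sqrt(pi).
    return float(np.sum(w * np.exp(log_pois + log_drop)) / np.sqrt(np.pi))

# Illustrative call with made-up data and parameter values.
print(patient_likelihood([2, 0, 3], dropped=True,
                         beta=0.5, sigma=0.8, gamma=-1.0, theta=0.7))
```

When `theta` is 0 the dropout factor no longer involves $b_i$ and comes out of the integral, which recovers the ordinary random effects likelihood times a dropout term.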

28 Estimates
[Table: Est and se under the Random Effects, Shared Parameter, and Conditional models; rows: $\beta$ (two rows), $\sigma_F$, $\sigma_P$, $\omega_F$, $\omega_P$, $\gamma$ (two rows), $\theta_F$, $\theta_P$]

29 Treatment Effect
For the conditional model, need to uncondition:
$E_F[\log(\hat\lambda)] - E_P[\log(\hat\lambda)] = \hat\beta_1 + \hat\omega_F \bar{D}_F - \hat\omega_P \bar{D}_P$
Estimate = -1.58 (p > .05) versus (p < .05) for shared parameter model?
$\hat\omega_F$ significant & substantial. $\hat\theta_F$ NS & small.

30 Summary
Shared parameter & naive models similar, with significant effect of Felbamate.
Conditional model: seizures & dropout related in Felbamate group.
Inconsistency of models required investigation.
Performed simulations: conditional model more robust, with better small sample properties.

31 Extensions
Appealing to embed the shared parameter model in a larger class of models.
Allow perturbations of the common $b_{ic}$:
$Y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{ic} + b_{iy} + e_{ij}$
$P(R_{ij} = 0 \mid R_{i,j-1} = 1) = \Phi(\alpha_j + \theta(b_{ic} + b_{ir}))$
Allow smooth mean functions, more general error distributions, richer random effects distributions.
Tradeoff between flexibility and burden of estimation.

32 Conclusions
Shared Parameter Models are a form of MNAR. They require unexaminable assumptions.
Model fitting can be involved; the approximate conditional linear model is simpler to fit, but needs unconditioning.
Extensions/flexible modeling make sense.
When confronting missing data, it is appealing to try different methods.

33 A Book
