Pattern Mixture Models for the Analysis of Repeated Attempt Designs


Biometrics 71, December 2015
DOI: /biom

Michael J. Daniels (1,*), Dan Jackson (2,**), Wei Feng (3,***), and Ian R. White (2,****)

1 Department of Integrative Biology, Department of Statistics & Data Sciences, University of Texas, Austin, TX
2 Medical Research Council Biostatistics Unit, Cambridge Institute of Public Health, Cambridge, U.K.
3 Department of Statistics, University of Florida, Gainesville, Florida

* mjdaniels@austin.utexas.edu
** daniel.jackson@mrc-bsu.cam.ac.uk
*** fengwei@ufl.edu
**** ian.white@mrc-bsu.cam.ac.uk

Summary. It is not uncommon in follow-up studies to make multiple attempts to collect a measurement after baseline. Recording whether these attempts are successful or not provides useful information for the purposes of assessing the missing at random (MAR) assumption and facilitating missing not at random (MNAR) modeling. This is because measurements from subjects who provide these data after multiple failed attempts may differ from those who provide the measurement after fewer attempts. This type of continuum of resistance to providing a measurement has hitherto been modeled in a selection model framework, where the outcome data are modeled jointly with the success or failure of the attempts given these outcomes. Here, we present a pattern mixture approach to model this type of data. We re-analyze the repeated attempt data from a trial that was previously analyzed using a selection model approach. Our pattern mixture model is more flexible and more transparent in terms of parameter identifiability than the models that have previously been used to model repeated attempt data, and it allows for sensitivity analysis. We conclude that our approach to modeling this type of data provides a fully viable alternative to the more established selection model.

Key words: Nonignorable missingness; Repeated attempt model; Selection model; Sensitivity analysis.

1. Introduction

It is not uncommon in follow-up studies to make multiple attempts to collect a measurement after baseline (e.g., Wood et al., 2006; Jackson et al., 2012). Here, we refer to this type of design as a repeated attempt design (RAD) and the corresponding statistical models as repeated attempt models (RAM). Information about the multiple attempts made to obtain outcome data has the potential to provide some information about the unobserved responses. This has been exploited in several papers, in the context of selection models (Alho, 1990; Wood et al., 2006; Jackson et al., 2010, 2012). In these selection models the information about the repeated attempts is thought to describe a continuum of resistance to providing data (Lin and Schaeffer, 1995). Evidence for this type of resistance can be informally assessed by tabulating numerical summaries of outcome data, such as the mean outcome, by the number of attempts made to obtain these data. Assuming that a large value for the response variable is a favorable outcome, a negative association between the mean outcome and the number of attempts made to obtain outcome data provides evidence that those with less favorable outcomes are resistant. This would suggest that those who do not provide data after many failed attempts, and therefore are highly resistant, may have very unfavorable outcomes, and this would almost certainly invalidate a statistical analysis which assumes data are missing at random (MAR).

The advantage of the existing selection models is that they exploit the RAD to identify all the parameters in the full data model. Selection models that describe the marginal probability that outcome data are observed, rather than the marginal probability that each attempt to obtain outcome data is successful, are very sensitive to outliers and the distributional assumptions made, because these models are very weakly identified (Kenward, 1998).
However, the selection model becomes much more strongly identifiable when it is used to describe each attempt (Jackson et al., 2012) in situations where multiple attempts to obtain data are made. This identifiability can also be considered a disadvantage of the RAM because this approach does not allow the type of sensitivity parameter defined by Daniels and Hogan (2008); such parameters are recommended for the analysis of missing data (National Research Council, 2010) and importantly, when varied, do not impact the fit of the model to the observed data. As such, these parameters allow examination of the sensitivity of inferences to unverifiable assumptions about the missingness.

To our knowledge, there has been no previous work using a pattern mixture model for repeated attempt data. Pattern mixture models (Little, 1993, 1995) have been advocated as an approach to handle missing data that easily allows for sensitivity parameters due to their direct connection to the extrapolation factorization (Daniels and Hogan, 2000, 2008). Here we describe how this type of model can be adapted to incorporate the repeated attempt information; as such, we propose a pattern mixture model RAM as a competitor to the selection model RAM.

The Authors. Biometrics published by Wiley Periodicals, Inc. on behalf of International Biometric Society. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

We motivate this work based on the QUATRO trial (Gray et al., 2006), a single-blind, multi-center randomized controlled trial of the effectiveness of adherence therapy for participants with schizophrenia. The trial included 409 participants in four centers: Amsterdam (the Netherlands), Leipzig (Germany), London (United Kingdom) and Verona (Italy). Participants were recruited from June 2002 to October 2003 from people under the care of mental health services and were individually randomized to receive either adherence therapy (intervention) or health education (control). The inclusion and exclusion criteria are described in detail by Gray et al. (2006). Assessments were undertaken at baseline and at a follow-up of 52 weeks. The objective of this multicenter trial was to assess the impact of adherence therapy on self-reported quality-of-life of people with severe mental illness. The investigators made multiple attempts to collect the quality-of-life outcome (as many as nine attempts) but there were still individuals whose response could not be collected. In the treatment and control arms, 29 out of 204 (14%) and 13 out of 205 (6%) subjects failed to provide outcome data at the end of the trial, respectively. This imbalance in the amount of missing data by treatment arm, in conjunction with the concern that those with less favorable outcomes may be less likely to provide data, motivated both the previous MNAR modeling of Jackson et al. (2010) and the methods described here. The intuition that underlies our modeling is that we could produce treatment arm-specific plots which show the mean quality of life scores against the number of attempts.
To impute the outcomes for those who do not provide outcome data, we could fit regression lines to each of these plots and then extrapolate, where the extent of the extrapolation reflects how resistant the non-responders are thought to be. We develop a statistically principled and more general version of this idea, where we fully take into account all the various sources of uncertainty in a Bayesian pattern mixture model.

We first briefly review the use of selection models for the RAD in Section 2. We then introduce a pattern mixture model for this design that allows intuitive sensitivity parameters in Section 3. We show connections between the selection model and pattern mixture model formulations in Section 4. We reanalyze the QUATRO data using these models in Section 5. Section 6 contains a discussion and open issues.

2. Notation, Target of Inference, and Review of Repeated Attempt Models

2.1. Notation

The outcome of interest (here, self reported QoL at 52 weeks) will be denoted as $Y$, and the set of baseline covariates (here, baseline QoL and center) as $X$. $R$ will denote the number of attempts until the outcome is successfully collected; we assume up to $K$ attempts, where $R = K + 1$ corresponds to the outcome not being collected after the maximum number of attempts. We assume a randomized trial where $Z$ denotes randomization to the intervention of interest (here, adherence therapy); an extension to an observational study is mentioned in Section.

2.2. Inferential Goal

The quantity of interest here, $\theta$, is the treatment effect on the means, unconditional on $X$ and $R$,

\[ \theta = E(Y \mid Z = 1) - E(Y \mid Z = 0). \tag{1} \]

In the existing RAM-SM (described in Section 2.3), this parameter is specified directly in the model; see equation (3) below. For the RAM-PMM (introduced in Section 3), this parameter is not directly specified in the model; computation of this parameter requires evaluation of the double integral in (5) below.
In the QUATRO data, $\theta$ is the effect of adherence therapy on self reported QoL at 52 weeks.

2.3. Repeated Attempt Selection Model (RAM-SM)

The RAM-SM was originally proposed by Alho (1990). A logistic regression is used to model the probability that each attempt to obtain outcome data is successful, where an MNAR model is obtained by using the (possibly missing) outcome data as a covariate in this logistic regression. The joint likelihood of the response and the repeated attempts data is obtained by modeling the marginal distribution of the outcome data $Y$ and the missing data mechanism of the RAM (Wood et al., 2006); the latter describes the probability that the attempts are successful given $Y$. The key identifying assumption is that the covariate effect associated with the outcome $Y$ is common across all attempts. To date, the RAM has only been developed for an incomplete univariate outcome. An example of the missing data mechanism in a RAM is given by the model

\[ \operatorname{logit} P(R = k \mid R \ge k, Z = z, X = x, Y = y) = \lambda_{0k} + \gamma_k z + \lambda_k x + \delta y, \tag{2} \]

where $R = k$ if the $k$th attempt is successful. We assume that all attempts are independent, so $\lambda_{0k}$ is the log odds, for participants with $X = 0$, $Y = 0$, and $Z = 0$, that the $k$th attempt is successful given no success on the previous $k - 1$ attempts. Model (2) can be thought of as a discrete time survival model, or a stratified logistic regression, where we also handle the unobserved outcomes (Jackson et al., 2012). The term $\lambda_k x$ allows the probability that attempts are successful to depend on covariates, where the covariate effects may or may not depend on $k$. The term $\delta y$ permits MNAR models; the MAR assumption is equivalent to assuming that $\delta = 0$. The key identifying assumption is that the covariate effect of $Y$ in (2) is constant across all attempts, that is, $\delta$ does not depend on $k$. By jointly modeling $Y$, and then the attempts given $Y$ using (2), a selection model approach has been adopted.
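The discrete time survival reading of model (2) can be illustrated with a short sketch. This is not the authors' code: the function name `attempt_distribution` and all parameter values are hypothetical, and a single scalar covariate is assumed. Each attempt's success probability is the inverse logit of the linear predictor in (2), and the probability of pattern $R = k$ is that hazard times the probability that all earlier attempts failed.

```python
import math


def expit(u):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-u))


def attempt_distribution(lam0, gamma, lam_x, delta, z, x, y):
    """Return [P(R=1), ..., P(R=K), P(R=K+1)] implied by model (2).

    lam0, gamma, lam_x are length-K lists of attempt-specific coefficients;
    delta is the common coefficient of the outcome y (delta = 0 gives MAR).
    """
    probs = []
    still_trying = 1.0  # P(R >= k): every earlier attempt failed
    for l0, g, lx in zip(lam0, gamma, lam_x):
        hazard = expit(l0 + g * z + lx * x + delta * y)
        probs.append(still_trying * hazard)
        still_trying *= 1.0 - hazard
    probs.append(still_trying)  # pattern K + 1: outcome never collected
    return probs


# Hypothetical coefficients for K = 3 attempts (illustration only).
lam0 = [-0.5, 0.0, -1.0]
gamma = [0.0, 0.0, 0.0]
lam_x = [0.0, 0.0, 0.0]

mar_low = attempt_distribution(lam0, gamma, lam_x, 0.0, z=1, x=0.0, y=30.0)
mar_high = attempt_distribution(lam0, gamma, lam_x, 0.0, z=1, x=0.0, y=50.0)
mnar_low = attempt_distribution(lam0, gamma, lam_x, 0.04, z=1, x=0.0, y=30.0)
mnar_high = attempt_distribution(lam0, gamma, lam_x, 0.04, z=1, x=0.0, y=50.0)
```

With `delta = 0` the pattern probabilities do not change with `y`, while a positive `delta` makes subjects with lower outcomes more likely to end in the never-observed pattern, which is the continuum of resistance idea.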
If only one attempt is made for all subjects then the RAM reduces to a standard selection model. The model specified for the response $Y$ is

\[ Y \mid Z = z, X = x \sim N\{\mu(z, x), \sigma^2(z, x)\}, \tag{3} \]

where in the example $\mu(z, x) = \beta_0 + \theta z + \beta x$ and $\sigma^2(z, x) = \sigma^2$.

Alho (1990) originally proposed using a modified likelihood to fit the RAM but two new estimation methods have subsequently been developed by Wood et al. (2006). The first of these methods uses the EM algorithm to fit the model using the full likelihood and the second uses a Bayesian approach and the software WinBUGS (Lunn et al., 2002). The full likelihood was also used by Jackson et al. (2012) to fit the RAM, but without using the EM algorithm. It can be difficult to assess the fit of this selection model RAM to the observed data because the observed data likelihood is not available in closed form (though it can be evaluated numerically using one-dimensional Gaussian quadrature to integrate out the missing data).

Previous work involving fitting a RAM to the QUATRO data was performed using WinBUGS (see Jackson et al. (2010) for full details). Briefly, no strong evidence that data are MNAR was found. As such, the MAR assumption made by Gray et al. (2006) does not appear unreasonable, though the estimated treatment effect was slightly reduced (but it remained statistically insignificant).

3. Repeated Attempt Pattern Mixture Model (RAM-PMM)

3.1. Model

In our pattern mixture formulation of the RAM, denoted as RAM-PMM, the patterns are defined by the values of $R$. For pattern $R = k$ ($k = 1, \ldots, K + 1$) and arm $z = 0, 1$, we consider the following model for the conditional distribution of $Y$,

\[ Y \mid Z = z, X = x, R = k \sim N\{\mu(z, x, k), \sigma^2(z, x, k)\}, \]

so that we allow the mean and variance of the outcome data to depend on covariates and patterns. In all our modeling, we assume the particular function $\mu(z, x, k)$ for $k \le K$,

\[ \mu(z, x, k) = \alpha_z^{(k)} + \beta_1 x, \tag{4} \]

and a constant variance, $\sigma^2(z, x, k) = \sigma^2$. More complex forms of the mean and variance are possible with richer data; we return to this issue in the discussion.
So far, we have not discussed anything about the identification of the distribution of the missing data, specifically, $\mu(z, x, K + 1)$. Different assumptions about $\mu(z, x, K + 1)$ in Section 3.2 will allow us to explore a range of possibilities in a sensitivity analysis.

We then specify a model for the conditional distribution of the pattern indicator (similar to the RAM-SM specification in equation (2) but not conditioning on $Y$). An example of this type of model is

\[ \operatorname{logit}\{\pi_k(x)\} = \lambda_{0k} + \lambda_z z + \lambda_x x, \]

where $\pi_k(x) = P(R = k \mid R \ge k, Z = z, X = x)$.

Finally, we specify a model for $[X \mid Z] = [X]$ (by randomization). This factorization respects the fact, which is sometimes overlooked in pattern mixture models, that the distribution of the baseline outcome (which is included in $X$) does not depend on $Z$. Note that this model can be specified parametrically or using Bayesian nonparametrics (more in the discussion). Here, we assume a parametric model for this distribution; note that in our example, $X$ is composed of the baseline outcome, $Y_0$, and indicators of center. We assume that, for each center, the distribution of the baseline outcome is normal with mean and variance depending on center. It is easy to assess the fit of this model to the observed data as its distribution is modeled directly in the pattern mixture framework.

The quantity of interest here, $\theta$, given in (1) can be written as follows,

\[ \theta = E[Y \mid Z = 1] - E[Y \mid Z = 0] = \iint \mu(1, x, k)\, dF(k \mid x, z = 1)\, dF(x) - \iint \mu(0, x, k)\, dF(k \mid x, z = 0)\, dF(x), \tag{5} \]

where $F(k \mid x, z)$ and $F(x)$ were specified above. All integrals are double integrals over $x$ and $k$ (the first integral is a sum over the discrete $k$). The parameter $\theta$ can be computed using Monte Carlo (MC) integration in WinBUGS. Details can be found in Section.

3.2. Priors

In the proposed mixture model, not all parameters are identified by the observed data. One of the main contributions of our approach is the form of the priors for the unidentified parameters, which is described below.
First we outline priors for the identified parameters.

Identified parameters. The parameters indexing (and identified by) the observed data comprise $(\{\alpha_z^{(k)} : k = 1, \ldots, K\}, \beta_1, \sigma, \lambda)$. For the regression parameters, we use diffuse normal priors. For the variance component, $\sigma^2$, we use a vague inverse gamma prior.

Unidentified parameters. The parameters $\alpha_z^{(K+1)}$ are not identified by the observed data (or modeling assumptions) and would be classified as sensitivity parameters (Daniels and Hogan, 2008). To identify these parameters, we exploit the repeated attempt design and assume a functional relationship between the intercept parameters for the observed outcomes, $\{\alpha_z^{(k)} : k = 1, \ldots, K\}$, and the number of attempts ($k$). In particular, we specify a prior for $\alpha_z^{(K+1)}$ conditional on $\bar{\alpha}_z^{(K)} = (\alpha_z^{(1)}, \ldots, \alpha_z^{(K)})^T$, i.e., $p(\alpha_z^{(K+1)} \mid \bar{\alpha}_z^{(K)})$. We center this prior at its prediction based on implicitly fitting the regression

\[ \alpha_z^{(k)} = h_z(k; \zeta_z) + \epsilon_k, \quad k = 1, \ldots, K. \]

In what follows, we set $h_z(k; \zeta_z) = \zeta_{0z} + \zeta_{1z} k$, $k = 1, \ldots, K$ (a linear regression with $\alpha_z^{(k)}$ as the dependent variable and pattern ($k$) as the independent variable) and compute the least squares estimate of $(\zeta_{0z}, \zeta_{1z})$ to obtain

\[ \alpha_z^{(K+1)} \mid \bar{\alpha}_z^{(K)} \sim N\{\hat{\zeta}_{0z} + \hat{\zeta}_{1z}(K + C), \tau^2\}, \]

where $\hat{\zeta}_{jz}$ are functions of $\{\alpha_z^{(1)}, \ldots, \alpha_z^{(K)}\}$. The linear relationship of $\alpha_z^{(k)}$ for $k = 1, \ldots, K$ in $k$ is the key assumption. We assume that the intercepts in (4) follow a linear trend over patterns that provide outcome data and that we can extrapolate from this, following the intuition that we described in the introduction. Here the sensitivity parameters are $C$ and $\tau$, where $C$ represents how far we should extrapolate the linear trend to describe the missing outcome data (i.e., how resistant are those that have not provided outcome data by the $K$th attempt) and $\tau$ represents our uncertainty in the precision of the extrapolation. In what follows, the sensitivity parameter $\tau$ is fixed at zero and we focus on $C$.

This approach does not put any modeling restrictions on the observed data, but still attempts to use information in an intuitive manner from the repeated attempt design. Also note that it is not necessary to assume a linear form of $h_z(k; \zeta_z)$; any functional relationship (subject to having enough patterns/attempts) is possible. We return to this in the discussion.

3.3. Connections to Priors and Sensitivity Parameters in a Two-Pattern Mixture Model

It is not uncommon for information on the number of attempts to be collected but not used in the modeling. The pattern mixture model approach in that situation would only have two patterns, corresponding to $Y$ being observed or missing. Implicitly, the first pattern would be formed by combining the first $K$ successful patterns (where outcome was observed) from our RAM-PMM into a single pattern. We can define the corresponding intercept for this combined pattern as a function of the parameters in our repeated attempt model to be $\alpha_z$, where $\alpha_z = \sum_{k=1}^{K} p_k \alpha_z^{(k)}$ and $p_k = E_x\{P(R = k \mid Z = z, X = x)\}$; the quantity $P(R = k \mid z, x)$ can be computed recursively from the $\pi_k$'s.
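The extrapolation prior of Section 3.2 can be sketched for one treatment arm as follows. This is an illustration under stated assumptions rather than the authors' WinBUGS code: the function name `extrapolated_center` and the intercept values are hypothetical, and $\tau$ is taken to be zero so that the intercept of the unobserved pattern equals the center of its prior.

```python
def extrapolated_center(alphas, C):
    """Least squares line through (k, alpha^(k)), k = 1..K, evaluated at K + C.

    Returns (zeta0_hat, zeta1_hat, center), where center is the prior mean
    of alpha^(K+1) for sensitivity parameter C.
    """
    K = len(alphas)
    ks = list(range(1, K + 1))
    k_bar = sum(ks) / K
    a_bar = sum(alphas) / K
    sxx = sum((k - k_bar) ** 2 for k in ks)
    sxy = sum((k - k_bar) * (a - a_bar) for k, a in zip(ks, alphas))
    zeta1 = sxy / sxx               # slope over observed patterns
    zeta0 = a_bar - zeta1 * k_bar   # intercept
    return zeta0, zeta1, zeta0 + zeta1 * (K + C)


# Hypothetical pattern-specific intercepts for K = 3 observed patterns.
alphas = [42.0, 41.0, 37.0]
zeta0, zeta1, center_c0 = extrapolated_center(alphas, C=0)
_, _, center_c3 = extrapolated_center(alphas, C=3)
```

Because the slope is estimated only from patterns with observed outcomes, changing `C` moves the center for the unobserved pattern without altering the fitted line itself.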
The typical approach in a two pattern model would be to specify the conditional mean of $\alpha_z^{(K+1)}$ as $E(\alpha_z^{(K+1)} \mid \alpha_z) = \alpha_z + \eta_z$, where $\eta_z$ is a sensitivity parameter. The value of $\eta_z$ implied by the model and priors for our RAM-PMM is

\[ \eta_z = \hat{\zeta}_{0z} + \hat{\zeta}_{1z}(K + C) - \alpha_z. \tag{6} \]

If the investigator wanted to do an analysis with a two-pattern model, (6) could help calibrate $\eta_z$ based on the repeated attempt model. We do this in Section.

3.4. Computations in WinBUGS

Models can be specified, and the posterior sampled using MCMC, in WinBUGS (see the Supplementary Materials for code). For continuous components in $X$, the integral in (1) can be computed in WinBUGS by using the following trick:
(1) Create $L$ units with a missing outcome and missing covariates.
(2) Compute the mean of these $L$ outcomes for each $Z$ at each iteration.
This trick implicitly does a Monte Carlo (MC) integration over the distribution of $X$ at each iteration. $L$ is chosen such that the error in the MC integration for the quantities of interest is negligible. Corresponding code can be found in the Supplementary Materials.

4. Some Connections Between the Parameters of RAM-PMM and RAM-SM

4.1. SM Corresponding to the RAM-PMM and Direct Theoretical Connections

In the following, we describe the selection model derived from the RAM-PMM. For simplicity, we only consider one covariate and suppress the dependence of the intercept on $z$. In addition, we assume the coefficient for this covariate is constant across patterns, as is the residual variance,

\[ Y \mid x, R = k \sim N(\alpha^{(k)} + \beta_1 x, \sigma^2), \quad k = 1, \ldots, K + 1. \]

The implied selection model has the link function

\[ \log \frac{P(D = k + 1 \mid x, y)}{P(D = k \mid x, y)}. \]

The main observation here is that the implied selection model corresponds to a different link function than that used in the RAM-SM and obviously a different distribution for $Y$ (i.e., a mixture of normals). There are also some direct theoretical connections between the RAM-SM and RAM-PMM.
Setting the parameter $\zeta_1$ equal to zero (so that $\alpha^{(k)}$ does not depend on $k$) implies MAR for the RAM-PMM, since then we have the same (normal) distribution in each pattern and thus the marginal distribution of $y$ is normal (not a mixture of normals). This model is equivalent to RAM-SM with $\delta = 0$.

4.2. Empirical Connections

We also assess connections empirically between the RAM-SM and RAM-PMM specified in this paper. We simulated data under the RAM-SM (using parameter values based on the QUATRO data) to assess how the value of $\delta$ impacts the derived pattern specific conditional means, $\alpha^{(k)}$, as a function of $k$ (see Figure 1); as such, we do not actually need to fit the RAM-PMM and the sensitivity parameter $C$ does not need to be specified. We see that there is a monotone non-decreasing pattern in the pattern specific means for odds ratios larger than one and a monotone non-increasing pattern for odds ratios less than one. We also simulated data under many different true values for $\lambda_{0j}$ (not shown); for all scenarios examined, the pattern specific means were monotone non-increasing (non-decreasing) depending on the sign of $\delta$.

To summarize, the RAM-SM and RAM-PMM try to exploit the repeated attempts in similar ways. However, the RAM-PMM allows for sensitivity analysis and more transparency in the model specification for the observed data and parameter identifiability; sensitivity analysis is an essential component of inference for missing data in randomized trials.

5. Analysis of QUATRO

The main analysis of QUATRO (Gray et al., 2006) was a complete-case analysis using a linear regression model, where individuals with missing data at baseline or follow-up were excluded. Specifically, the final quality-of-life score was regressed on randomized group, adjusted for the baseline score and center. This gave an estimated intervention effect of -0.4

(intervention minus control) with a 95% confidence interval of (-2.6, 1.8); negative values correspond to a harmful effect of intervention. These results do not allow for the missing data (although the sensitivity analyses in Gray et al. (2006) did do so).

Figure 1. Pattern specific means under RAM-SM with different values of OR = exp($\delta$) and other parameters based on the QUATRO data. The ORs in the plots are for a one standard deviation change in $Y$. OR = 1.36 is the estimate from the QUATRO data. Panels: OR = 1.36, 0.5, 0.67, 1, 1.5, 2.

There were more missing final quality-of-life scores in the treatment group (Table 1). Up to 9 attempts were made to collect the 52 week outcome for participants. However, given the sparsity of subjects with 3 to 9 attempts on each arm, for our analysis we merged those subjects into one pattern (see Table 2). We see an overall decreasing outcome mean with the number of attempts, which both the RAM-SM and the RAM-PMM try to exploit. For our analysis, the number of attempts, $R$, takes values in $\{1, 2, 3, 4\}$ (i.e., $K = 3$). $R = 4$ corresponds to the pattern that $Y$ is not observed even after all attempts. Individuals with $Y$ missing, but fewer than three attempts, have $R$ censored; there are 8 and 20 subjects censored respectively on the two arms (Table 2). Covariates $X$ are indicators of the four centers and the baseline response.

Table 1. QUATRO data: counts (outcome means) by number of attempts (k) and randomized group (Z)

                       Y observed after k attempts                                                                      Y not
                       k=1        k=2        k=3       k=4       k=5       k=6       k=7       k=8        k=9       observed
Control (n = 205)      77 (42.4)  94 (41.3)  7 (38.7)  7 (34.7)  3 (34.2)  2 (32.9)  1 (40.7)  1 (62.98)  0 (NA)    13
Treatment (n = 204)    73 (40.7)  90 (40.2)  7 (38.6)  1 (45.7)  3 (35.0)  0 (NA)    0 (NA)    1 (30.3)   0 (NA)    29

The results from the analysis performed by Jackson et al. (2010) using a RAM-SM are not directly comparable to those obtained in Section 5 because Jackson et al. modeled all nine attempts. Undertaking the modeling of all nine attempts in our pattern mixture framework is not feasible because only a small proportion of participants (28/409) receive more than three attempts.

5.1. Results

5.1.1. Ignorable analysis. We start with an ignorable analysis which assumes the missingness is MAR and does not explicitly model the number of attempts. For the MCMC algorithm, we ran iterations with a burn-in of 1000 iterations. We define $m(z, x) = E(Y \mid Z = z, X = x)$. We compute the treatment effect (marginalized over $X$) as

\[ \theta = E(Y \mid Z = 1) - E(Y \mid Z = 0) = \int m(1, x)\, dF(x) - \int m(0, x)\, dF(x). \]

The marginal treatment effect $\theta$ has a posterior mean of -0.4 with 95% credible interval of (-2.5, 1.8); thus, the effect of the adherence therapy on 52 week QoL is minimal with a credible interval that overlaps zero, providing little (if any) evidence of a beneficial effect of the intervention. The treatment effect is smaller in magnitude than suggested by Table 2 because of the covariate adjustment (results not shown). These results are in excellent agreement with previous ignorable analyses that adjust for the same covariates used here by Jackson et al. (2010) and Gray et al. (2006).

5.1.2. RAM-PMM nonignorable analyses. For the MCMC algorithm, we again ran iterations with a burn-in of 1000 iterations. For the MC integration, we set $L = 2000$, which made the MC error negligible; increasing $L$ to 3000 made no substantive difference in the posterior mean of $\theta$. We vary $C$ between 0 (meaning that missing subjects are comparable to the last responders) and 3 (meaning that missing subjects differ from the last responders as much as the last responders differ from the first responders). However, other choices can be made, including negative values of $C$; we discuss this further in Section 6.
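The Monte Carlo step behind $\theta$ can be sketched outside WinBUGS for a single posterior draw. Everything in this block is an assumption made for illustration: the function name `arm_mean_mc`, the parameter values, and the use of one standard normal covariate for $F(x)$. The sketch also averages the conditional means $\mu(z, x, k)$ instead of simulated outcomes, which targets the same expectation. The last entry of each intercept list plays the role of the extrapolated intercept for the unobserved pattern with $\tau = 0$.

```python
import math
import random


def expit(u):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-u))


def arm_mean_mc(alphas, beta1, lam0, lam_z, lam_x, z, L, rng):
    """Monte Carlo estimate of E(Y | Z = z) under the pattern mixture model.

    alphas has K + 1 entries: K observed-pattern intercepts followed by the
    intercept assumed for the never-observed pattern.
    """
    total = 0.0
    for _ in range(L):
        x = rng.gauss(0.0, 1.0)   # assumed F(x): standard normal covariate
        k = len(lam0)             # 0-based index K, i.e. pattern K + 1
        for j, l0 in enumerate(lam0):
            if rng.random() < expit(l0 + lam_z * z + lam_x * x):
                k = j             # attempt j + 1 succeeded
                break
        total += alphas[k] + beta1 * x  # conditional mean mu(z, x, k)
    return total / L


# Hypothetical values standing in for one posterior draw (K = 3).
rng = random.Random(2015)
lam0 = [0.0, 0.5, -0.5]
alphas_control = [42.0, 41.0, 37.5, 35.0]
alphas_treated = [41.0, 40.0, 37.5, 33.0]
m0 = arm_mean_mc(alphas_control, 0.5, lam0, -0.3, 0.0, z=0, L=20000, rng=rng)
m1 = arm_mean_mc(alphas_treated, 0.5, lam0, -0.3, 0.0, z=1, L=20000, rng=rng)
theta = m1 - m0
```

In a full analysis this calculation would be repeated at every MCMC iteration and for each value of $C$, with the last intercept recomputed from the extrapolation prior each time.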
For all values of $C$ considered (see Table 3), representing different degrees of resistance, we observed a larger negative effect of adherence therapy on self-reported 52 week QoL (with $\theta$ ranging from -0.6 to -0.8) than in the ignorable analysis and wider credible intervals (that still cover zero). The estimated effect of adherence therapy increases in magnitude with $C$, which corresponds to those without QoL observed after the maximum number of attempts having poorer QoL than those observed after 3 or fewer attempts. This can also be seen by the slopes of the priors, $\zeta_{10}$ and $\zeta_{11}$, with posterior means (95% credible intervals) of -1.9 (-4.8, 0.97) and -1.7 (-5.2, 1.7), respectively; note that both are negative with the slope for those on adherence therapy more extreme (doing worse as the number of attempts increases). We point out these slopes do not depend on the values of the sensitivity parameter ($C$) considered. Posterior means of all the parameters are given in the Supplementary Materials.

Table 2. QUATRO data: counts (outcome means) by number of attempts (k) and randomized group (Z) after merging 3 to 9 attempts into one pattern

                       Y observed                          Y missing
                       R=1        R=2        R=3         R<4    R=4
Control (n = 205)      77 (42.4)  94 (41.3)  21 (37.4)   8      5
Treatment (n = 204)    73 (40.7)  90 (40.2)  12 (37.6)   20     9

5.1.3. A (standard) two-pattern model. We fit a standard two-pattern (outcome observed or not) mixture model under (nonignorable) MAR which gave essentially the same results as the ignorable analysis (not shown). Finally, we fit a MNAR two pattern model with sensitivity parameters, $\eta_z$, specified as in Section 3.3. The values of the sensitivity parameters $\eta_z$ for $C = 0, 1, 2, 3$ were $(\eta_0 = 2.96, \eta_1 = 2.3)$, $(\eta_0 = 4.8, \eta_1 = 3.9)$, $(\eta_0 = 6.8, \eta_1 = 5.6)$, and $(\eta_0 = 8.6, \eta_1 = 7.3)$, respectively. As expected, the results closely match the RAM-PMM analysis (in terms of posterior means), but with less uncertainty. For example, for $C = 1$, the posterior mean and credible interval for $\theta$ was -0.6 (-2.8, 1.6); for $C = 3$, -0.9 (-3.1, 1.4).
The decrease in uncertainty is expected since there are fewer patterns (and thus parameters). Overall, there was not strong evidence of a beneficial effect of the adherence therapy intervention under any of the PMM formulations.

Table 3. Posterior summaries for the RAM-PMM for $\theta$ and the treatment specific means. C is the sensitivity parameter.

C    Parameter       Mean    95% CI
0    θ               -0.6    (-2.9, 1.7)
     E(Y | Z = 0)    40.9    (39.2, 42.5)
     E(Y | Z = 1)    40.2    (38.4, 42.1)
1    θ               -0.7    (-3.1, 1.8)
     E(Y | Z = 0)    40.7    (39.1, 42.4)
     E(Y | Z = 1)    40.0    (38.0, 42.1)
2    θ               -0.7    (-3.5, 2.0)
     E(Y | Z = 0)    40.5    (38.8, 42.3)
     E(Y | Z = 1)    39.8    (37.4, 42.1)
3    θ               -0.8    (-3.8, 2.2)
     E(Y | Z = 0)    40.4    (38.5, 42.2)
     E(Y | Z = 1)    39.6    (36.9, 42.2)

5.2. Comparing the Results to Those Obtained Using RAM-SMs

The Stata module alho, available at the website of the last author, was used to fit a RAM-SM that is conceptually similar to the RAM-PMM used here. This module requires complete covariates and so we used mean imputation to impute missing baseline quality of life scores (imputing missing baseline values in this way in randomized trials is not a source of bias (White and Thompson, 2005)). All participants who provided outcome data after more than 3 attempts were placed in the $R = 3$ pattern. Those who did not provide outcome data were placed in the $R = 4$ pattern, regardless of the number of attempts made to obtain their outcome data. A standard linear regression model was assumed for the final quality of life scores, where the covariates were the treatment group, the baseline quality of life scores and center indicators. The following model was assumed for the RAM-SM missing data mechanism,

\[ \operatorname{logit}\{P(R = k \mid R \ge k, Z = z, X = x, Y = y)\} = \lambda_{0k} + \gamma z + \lambda_k x + \delta_1 y + \delta_2 z y, \tag{7} \]

where the covariates $X$ are the baseline quality of life score and center effects. The parameter $\delta_2$ allows the relationship between the outcome and the number of attempts to differ by treatment arm (related to $\zeta_1$ in the PMM); $\delta_2 = 0$ would roughly correspond to $\zeta_1$ being the same for both treatments. This parameter is important because the treatment contrast is very sensitive to the missing data mechanism differing by randomized arm (White et al., 2007).

The alho module gave maximum likelihood estimates of $\hat{\delta}_1 = 0.041$ (0.008, 0.074) and $\hat{\delta}_2 = 0.026$ (-0.011, 0.063), and the estimated treatment effect, $\theta$, was -1.5 (-3.9, 0.8). From this RAM-SM we obtain some evidence that the final quality-of-life scores had an effect in model (7), where the estimates of $\delta_1$ and $\delta_2$ suggest that participants with better final quality-of-life scores are more likely to report them, particularly in the treatment group.
This is consistent with the slopes, $\zeta_{1j}$, of the RAM-PMM. The estimated treatment effect is notably larger in magnitude than that from the RAM-PMM analysis but still does not achieve statistical significance. In fact, the treatment effect for the RAM-SM would correspond to an implausibly extreme value of $C > 10$ in the RAM-PMM, which would likely lead us to question the validity of the RAM-SM results here. However, there are alternative explanations, including whether a better fitting RAM-PMM (or RAM-SM) would necessitate such an extreme $C$ for roughly compatible results. Estimates of all the parameters can be found in the Supplementary Materials.

5.3. A Comparison of the RAM-SM and RAM-PMM Model Fits

We compare the relative fit of the RAM-SM and RAM-PMM using the BIC based on the observed data likelihood,

\[ \mathrm{BIC} = -2 \operatorname{loglik} + p \log n, \]

where $p$ is the number of parameters in each model (18 for the RAM-PMM and 16 for the RAM-SM). Note that to compute the BIC for the RAM-PMM, we fit a simple frequentist (equivalent) model to the observed data using linear and logistic regressions, with missing baseline data filled in using mean imputation (as was done in the RAM-SM analysis) and censoring treated as in the RAM-SM. The BIC for RAM-SM is and for the RAM-PMM is , indicating better fit for the RAM-SM here. However, we accept the better fit of the RAM-SM with caution given the extreme value of $C$ implied for the models considered and the ad hoc adjustments (described above) needed to make the BIC comparison. We discuss this further in Section.

6. Conclusions/Discussion

We have proposed a pattern mixture model for a repeated attempt design that includes sensitivity parameters; we have also made comparisons with the existing selection models for repeated attempt designs. In the QUATRO study, we found minimal evidence of a significant effect of the intervention (adherence therapy) on 52 week self-reported QoL with the models considered.
For our analysis of the QUATRO data, the RAM-SM provided a better fit to the data than the RAM-PMM as measured by the BIC. In fact, the log likelihood for the SM was larger than for the PMM; this is likely due to subtle differences between the models, including the pattern-specific distributions implied by the selection model, the different forms of the missing data mechanism for the two approaches (cf. Section 4.1) and the ad hoc adjustments needed for the RAM-SM comparison (cf. Section 5.3) here (using the Stata alho module). This is despite the fact, as seen in the simulations in Section 4, that the pattern-specific behavior of the RAM-SM is similar to the linear model used to extrapolate the missing pattern in the RAM-PMM. However, we recommend consideration of the RAM-PMM in general as it allows for sensitivity analysis, unlike the RAM-SM, handles the missing data similarly to the RAM-SM in terms of a continuum of resistance, and does not have the issue of a potentially large impact on inferences of modeling choices in the missing data mechanism (though this impact does not seem to be as large as in ordinary selection models (Ng, 2013)). We saw this undesirable sensitivity as the RAM-SM results here corresponded to an unreasonably extreme value of C and were quite different from related RAM-SMs fit to the QUATRO data in Jackson et al. (2012). In fact, the extreme value of C suggests that very extreme values of the QoL were needed to make the full data response look normal.

There are a variety of extensions to the current models. The choice of linearity for the functional form of the conditional means for extrapolation was made due to there only being three patterns in the QUATRO data example. More complex forms for $h(k; \zeta)$ can easily be accommodated (though this choice is restricted by the number of attempts/patterns).
In the QUATRO data, we only consider positive values for C based on the concept of a continuum of resistance, and we recommend the maximum number of attempts as a default upper bound for C. However, negative values can be accommodated as appropriate for other datasets (where, for example, it is thought that those unobserved after the maximum number of attempts are more similar to those observed after few attempts). We could also consider more complex forms for the mean and variance functions, $\mu(z, x, k)$ and $\sigma^2(z, x, k)$. We assumed a parametric form for the distribution of the covariates, X; more flexible specifications could easily be developed using Bayesian nonparametric models for the distribution of X (as well as for the response, Y, model). For repeated attempts with sparse patterns, we can adapt the ideas from the work of Roy (2003) and Roy and Daniels (2008) to combine patterns in a data-dependent way. We are also working on proving the monotonicity of the pattern-specific means observed in Section 4.2. Finally, the approach here was developed for a randomized trial. Extension to observational studies would require some minor adjustments, including a definition of X as all required confounders as opposed to covariates that potentially impact missingness.

7. Supplementary Materials

The supplementary materials contain WinBUGS code for the models fit in Section 5.1 and parameter estimates for models fit in Sections and 5.2, and are available with this paper at the Biometrics website on Wiley Online Library.

Acknowledgements

MJD was partially supported by US NIH grants CA85295 and CA DJ and IRW are employed by the UK Medical Research Council [Unit Programme number U ].

References

Alho, J. M. (1990). Adjusting for nonresponse bias using logistic regression. Biometrika 77.
Daniels, M. J. and Hogan, J. W. (2000). Reparameterizing the pattern mixture model for sensitivity analyses under informative dropout. Biometrics 56.
Daniels, M. J. and Hogan, J. W. (2008). Missing data in longitudinal studies: Strategies for Bayesian modeling and sensitivity analysis, volume 109 of Monographs on Statistics and Applied Probability. Boca Raton, FL: Chapman & Hall/CRC.
Gray, R., Leese, M., Bindman, J., Becker, T., Burti, L., David, A., et al. (2006). Adherence therapy for people with schizophrenia: European multicentre randomised controlled trial. The British Journal of Psychiatry 189.
Jackson, D., Mason, D., White, I. R., and Sutton, S. (2012). An exploration of the missing data mechanism in an internet based smoking cessation trial. BMC Medical Research Methodology 12, 157.
Jackson, D., White, I. R., and Leese, M. (2010). How much can we learn about missing data?: An exploration of a clinical trial in psychiatry. Journal of the Royal Statistical Society: Series A (Statistics in Society) 173.
Kenward, M. G. (1998). Selection models for repeated measurements with non-random dropout: An illustration of sensitivity. Statistics in Medicine 17.
Lin, I.-F. and Schaeffer, N. C. (1995). Using survey participants to estimate the impact of nonparticipation. Public Opinion Quarterly 59.
Little, R. J. (1993). Pattern-mixture models for multivariate incomplete data. Journal of the American Statistical Association 88.
Little, R. J. (1995). Modeling the drop-out mechanism in repeated-measures studies. Journal of the American Statistical Association 90.
Lunn, D., Best, N., Thomas, A., Wakefield, J., and Spiegelhalter, D. (2002). Bayesian analysis of population PK/PD models: General concepts and software. Journal of Pharmacokinetics and Pharmacodynamics 29.
National Research Council (2010). The Prevention and Treatment of Missing Data in Clinical Trials. Washington, D.C.: The National Academies Press.
Ng, Y. L. (2013). Using repeated contact attempts to move beyond the missing at random assumption. PhD thesis, University of Cambridge.
Roy, J. (2003). Modeling longitudinal data with nonignorable dropouts using a latent dropout class model. Biometrics 59.
Roy, J. and Daniels, M. J. (2008). A general class of pattern mixture models for nonignorable dropout with many possible dropout times. Biometrics 64.
White, I. R., Carpenter, J., Evans, S., and Schroter, S. (2007). Eliciting and using expert opinions about dropout bias in randomised controlled trials. Clinical Trials 4.
White, I. R. and Thompson, S. G. (2005). Adjusting for partially missing baseline measurements in randomized trials. Statistics in Medicine 24.
Wood, A. M., White, I. R., and Hotopf, M. (2006). Using number of failed contact attempts to adjust for non-ignorable nonresponse. Journal of the Royal Statistical Society: Series A (Statistics in Society) 169.

Received December Revised May Accepted May 2015.
