DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS


Ivy Liu and Dong Q. Wang
School of Mathematics, Statistics and Computer Science, Victoria University of Wellington, New Zealand
Corresponding author: Ivy Liu

Key Words: clinical trials; cumulative odds ratio; diagnostics; influence measure; Mantel-Haenszel estimator; ordinal response; proportional odds model; random effects; strata deletion.

ABSTRACT

For stratified clinical trials with an ordinal response, this article considers two diagnostic strategies for evaluating the heterogeneity of the ordinal odds ratios across centers in proportional odds models. The first strategy treats the centers as a sample from a population and uses a proportional odds model with random effects to model the heterogeneity among the ordinal log odds ratios. It gives an overall view of the heterogeneity across centers. The second strategy applies an influence measure to the Mantel-Haenszel type estimators of the ordinal odds ratios. It shows the detail of the heterogeneity in each center. Finally, a multi-center clinical trial example illustrates both strategies.

1. INTRODUCTION

This article considers diagnostic methods for stratified multi-center clinical trials when the response variable has a natural ordering (e.g., better, unchanged, and worse). In a clinical trial, the odds ratio is commonly used to describe the relationship between treatments and response. When the response is ordinal, the ordinal odds ratios that summarize the relationship across different centers are considered based on a proportional odds model.

Diagnostic methods play an important role in categorical data analysis. For instance, Grizzle and Williams (1972) explored a general diagnostic approach to the analysis of categorical data for fitting a loglinear model. Pregibon (1981) developed diagnostic measures for a maximum likelihood fit of a logistic regression model. Andersen (1992) discussed diagnostics for categorical data in the Goodman association model. Wang, Critchley and Liu (2004) considered a diagnostic method for a contingency table assuming a clustered sampling model. A well-known diagnostic measure, Cook's distance, is widely used for linear regression models (Cook and Weisberg, 1982) and multivariate analysis (Wang and Critchley, 2000; Wang, Critchley and Smith, 2003).

This article uses two diagnostic strategies for evaluating the heterogeneity of the ordinal odds ratios across centers in proportional odds models. The strategies are the following:

(1) Using a proportional odds model (McCullagh, 1980) with random effects that treats the centers as a sample from a population (Hartzel, Liu and Agresti, 2001). The model permits heterogeneity in the conditional associations between treatments and response. The estimate of the mean of the association effects describes the average relationship between treatments and response across centers. The variance estimate for the association effects describes the heterogeneity of the relationship across centers.

(2) Using Mantel-Haenszel (Mantel and Haenszel, 1959) type estimators to evaluate the ordinal odds ratios in a proportional odds model (Liu and Agresti, 1996; Liu, 2003). This technique summarizes the association effects across the centers using ordinal odds ratios. We then propose an influence measure to show the detail of the heterogeneity of the ordinal odds ratio in each center.

When each center has very few observations, the second strategy is preferable to the first, because the fitting algorithm for the proportional odds model with random effects often has computational problems for sparse data. However, the first strategy gives an overall view of the heterogeneity, while the second gives the detail of the heterogeneity in each center.

Section 2 introduces the proportional odds model used for fitting stratified clinical trials data with an ordinal response. It also presents the Mantel-Haenszel type estimators of the ordinal odds ratios that describe the conditional associations between treatments and response. Section 3 discusses the first strategy, which determines the level of heterogeneity using a proportional odds model with random effects. Section 4 proposes the second strategy, in which an influence measure is used to explore the detail of the heterogeneity of the ordinal log odds ratios across centers. Section 5 gives an example using these strategies.

2. ORDINAL ODDS RATIOS IN A PROPORTIONAL ODDS MODEL

When the response is ordinal (e.g., better, unchanged, and worse), the most popular model uses logits of cumulative probabilities. The model is often called the proportional odds model (McCullagh, 1980). For instance, in a clinical trial, one can use the model to compare $r$ treatments on a $c$-level ordinal response for data from $K$ clinical centers. Let $\pi_{ijk}$ denote the probability that the response is at level $j$ when a patient at center $k$ received treatment $i$, where $\pi_{i1k} + \cdots + \pi_{ick} = 1$ for all $i = 1, \dots, r$ and $k = 1, \dots, K$. The cumulative probabilities are denoted by $\pi^*_{ijk} = \pi_{i1k} + \cdots + \pi_{ijk}$ for all $i = 1, \dots, r$, $j = 1, \dots, c-1$, and $k = 1, \dots, K$. The proportional odds model considered has the form

$$\log\left(\frac{\pi^*_{ijk}}{1 - \pi^*_{ijk}}\right) = \alpha_j + \gamma_k + \beta_i, \quad i = 1, \dots, r,\ j = 1, \dots, c-1,\ k = 1, \dots, K, \tag{1}$$

with the constraint $\gamma_K = \beta_r = 0$. The model reduces to an ordinary logit model when the response is binary.

This article uses an example provided by Merck Research Laboratories for a double-blind, parallel-group preliminary clinical study conducted at 21 centers, where patients suffering from asthma were randomly assigned to 3 different treatments (2mg active drug, 10mg active drug, and placebo). At the end of the study, the doctors described the patients' change in condition using an ordinal scale from better to worse (1 to 4). Table 1 shows the results of the doctors' evaluations by treatment, presented as 21 separate $3 \times 4$ tables. Such a study might use many clinics because of the time it takes each clinic center to recruit many patients. Therefore, the three-way table might have many strata but few observations per stratum, which leads to a sparse data set.

In the proportional odds model (1), the parameters $\{\beta_i\}$ describe the treatment effects given clinics. For each clinic center, the odds that the evaluation for treatment $i$ falls at or below any fixed level are $\exp(\beta_i)$ times the odds for treatment $r$. We refer to the ordinal odds ratio $\exp(\beta_i)$ as the cumulative odds ratio, since the model is based on the cumulative probabilities.

When fitting the model, the maximum likelihood (ML) estimators of $\{\beta_i\}$ perform badly for sparse data. Liu and Agresti (1996) and Liu (2003) have shown that the ML method tends to overestimate the effects. This is not surprising, because the same happens in the binary response case (Andersen, 1980, p. 244; Ghosh, 1995). The standard asymptotic properties of ML estimators fail when the number of nuisance parameters (such as $\{\gamma_k\}$) grows at the same rate as the sample size (Neyman and Scott, 1948). Alternatively, the Mantel-Haenszel (MH) type method provides better estimators for $\{\beta_i\}$ (Liu and Agresti, 1996; Liu, 2003).

Liu (2003) extended the ordinary MH estimator (Mantel and Haenszel, 1959) of a common odds ratio for several $2 \times 2$ tables to the case of several $r \times c$ tables. Like the ordinary MH estimator, the extended MH-type estimator is consistent under both the ordinary asymptotics, in which the number of strata is fixed, and the sparse asymptotics, in which the number of strata grows with the sample size. Such an estimator is referred to as a dually consistent estimator.

We assume that each $r \times c$ table is formed by $r$ independent multinomial samples $\{X_{ijk},\ j = 1, \dots, c\}$, $i = 1, \dots, r$, with sample sizes denoted by $(n_{1k}, n_{2k}, \dots, n_{rk})$. The total sample size in each stratum is denoted by $N_k = n_{1k} + \cdots + n_{rk}$, for $k = 1, \dots, K$. Let the cumulative counts be $X^*_{ijk} = X_{i1k} + \cdots + X_{ijk}$, and let

$$R^{ih}_{jk} = \frac{X^*_{ijk}\,(n_{hk} - X^*_{hjk})}{N_k} \quad \text{and} \quad S^{ih}_{jk} = \frac{X^*_{hjk}\,(n_{ik} - X^*_{ijk})}{N_k}.$$

Liu (2003) proposed the MH type estimator of $\beta_i$ as

$$L_i = \frac{1}{r}\left[\sum_{h=1,\, h \neq i}^{r} L_{ih} - \sum_{h=1}^{r-1} L_{rh}\right], \quad i = 1, \dots, r-1, \tag{2}$$

where

$$L_{ih} = \log\left(\sum_{k=1}^{K}\sum_{j=1}^{c-1} R^{ih}_{jk}\right) - \log\left(\sum_{k=1}^{K}\sum_{j=1}^{c-1} S^{ih}_{jk}\right), \quad i = 1, \dots, r,\ h = 1, \dots, r,\ \text{and } i \neq h.$$
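As a rough illustration of estimator (2), the following Python sketch computes the pairwise terms $L_{ih}$ and combines them into $L_i$. Because $L_{ih}$ estimates $\beta_i - \beta_h$, the combination in (2) targets $\beta_i$ under the constraint $\beta_r = 0$. The function name `mh_cumulative_log_odds` and the `(K, r, c)` array layout are assumptions made for this sketch, not part of Liu (2003). The sketch also assumes that every center has at least one patient and that all pooled sums are positive.

```python
import numpy as np

def mh_cumulative_log_odds(counts):
    """MH-type estimates L_1, ..., L_{r-1} from counts of shape (K, r, c).

    counts[k, i, j] is the number of patients at center k on treatment i
    with response level j (a hypothetical layout chosen for this sketch).
    """
    counts = np.asarray(counts, dtype=float)
    _, r, c = counts.shape
    n = counts.sum(axis=2)                       # n_ik, shape (K, r)
    N = n.sum(axis=1)                            # N_k, shape (K,)
    cum = counts.cumsum(axis=2)[:, :, : c - 1]   # X*_ijk for j = 1..c-1
    rest = n[:, :, None] - cum                   # n_ik - X*_ijk
    # R[i, h] = sum_k sum_j X*_ijk (n_hk - X*_hjk) / N_k.
    # The pooled S^{ih} sum equals R[h, i], so no second array is needed.
    R = np.einsum("kij,khj,k->ih", cum, rest, 1.0 / N)
    with np.errstate(divide="ignore", invalid="ignore"):
        pair = np.log(R) - np.log(R.T)           # pair[i, h] = L_ih
    np.fill_diagonal(pair, 0.0)                  # L_ii does not enter (2)
    row_sums = pair.sum(axis=1)                  # sum over h != i of L_ih
    return (row_sums[: r - 1] - row_sums[r - 1]) / r
```

For a single $2 \times 2$ table the function reduces to the ordinary log odds ratio, and for several $2 \times 2$ tables to the log of the ordinary MH estimator, which is a convenient sanity check.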

Furthermore, the dually consistent variance and covariance estimators for $\{L_i\}$ follow immediately from Greenland (1989) and Liu (2003) as

$$\widehat{\mathrm{Cov}}(L_i, L_h) = \frac{1}{r^2}\left(U^{+}_{ih} - U^{+}_{ir} - U^{+}_{rh} + U^{+}_{rr}\right), \qquad \widehat{\mathrm{Var}}(L_i) = \frac{1}{r^2}\left(U_{i++} - 2U^{+}_{ir} + U_{r++}\right), \tag{3}$$

where $U_{ihg} = 0$ if $i = h$ or $i = g$, $U^{+}_{ih} = U_{i++}$ if $i = h$, and $U^{+}_{ih} = U^{+}_{hi} = U_{+ih} - U_{ih+} - U_{hi+} + U_{ihh}$ if $i \neq h$. The form of $U_{ihg}$ is given in Appendix A. Liu (2003) gave the proof of dual consistency for the variance and covariance estimators. For sparse data such as that in Table 1, the MH-type estimators (2) of the cumulative log odds ratios $\{\beta_i\}$ perform better than the ML estimators.

3. PROPORTIONAL ODDS MODEL WITH RANDOM EFFECTS

When comparing treatments on an ordinal response with stratified data, it is common to use Model (1), which assumes no interaction between the treatment effects and the strata. That is, the cumulative odds ratios used to describe the conditional association between treatment and response are the same for each clinic center. To check the homogeneity of the cumulative odds ratios, we might use ordinary Pearson or likelihood-ratio goodness-of-fit tests for the corresponding model. Such tests are appropriate only under the ordinary asymptotics, which is not the case for Table 1.

In a clinical trial conducted to compare treatments among several clinical centers, the true treatment effects might vary because of other factors, such as the age of subjects at different centers. Therefore, it is sensible to estimate the degree of heterogeneity of the cumulative odds ratios across the centers. Hartzel, Liu and Agresti (2001) used random effects terms to describe the variability of the cumulative log odds ratios for two treatments. When the strata are a sample, such as a sample of clinical centers, it is natural to treat the strata effects or the cumulative log odds ratios as random effects across strata.
Consider a proportional odds model with random effects terms,

$$\log\left(\frac{\pi^*_{ijk}}{1 - \pi^*_{ijk}}\right) = \alpha_j + c_k + b_{ik}, \quad i = 1, \dots, r,\ j = 1, \dots, c-1,\ k = 1, \dots, K, \tag{4}$$

where $b_{rk} = 0$ for all $k$ and $(b_{1k}, \dots, b_{r-1,k})$ is a vector of correlated random effects. In the model, $\{c_1, \dots, c_K\}$ are independent observations from a $N(\gamma, \sigma_c)$ distribution and $\{b_{i1}, \dots, b_{iK}\}$ are independent observations from a $N(\beta_i, \sigma_{\beta_i})$ distribution for all $i = 1, \dots, r-1$. The model is a special case of a multivariate generalized linear mixed model for ordinal responses (Tutz and Hennevogl, 1996).

To obtain the likelihood function, we construct the usual product of multinomials and then integrate out the random effects with respect to the random effects distribution. Let $u_k = (c_k, b_{1k}, \dots, b_{r-1,k})$ be the random effects, having a multivariate normal distribution with mean $\beta = (\gamma, \beta_1, \dots, \beta_{r-1})$ and covariance matrix $\Sigma$. That is, $u_k \sim N[\beta, \Sigma]$. Let $g(u; \beta, \Sigma)$ denote the multivariate normal density function with mean $\beta$ and covariance matrix $\Sigma$. The likelihood function has the form

$$L(\alpha, \beta, \Sigma) = \prod_{k=1}^{K} \int \left[\prod_{i=1}^{r} \prod_{j=1}^{c} (\pi_{ijk})^{X_{ijk}}\right] g(u_k; \beta, \Sigma)\, du_k,$$

where, following Model (4),

$$\pi_{i1k} = \frac{\exp(\alpha_1 + c_k + b_{ik})}{1 + \exp(\alpha_1 + c_k + b_{ik})},$$

$$\pi_{ijk} = \frac{\exp(\alpha_j + c_k + b_{ik})}{1 + \exp(\alpha_j + c_k + b_{ik})} - \frac{\exp(\alpha_{j-1} + c_k + b_{ik})}{1 + \exp(\alpha_{j-1} + c_k + b_{ik})}, \quad j = 2, \dots, c-1,$$

$$\pi_{ick} = 1 - \frac{\exp(\alpha_{c-1} + c_k + b_{ik})}{1 + \exp(\alpha_{c-1} + c_k + b_{ik})}.$$

To fit Model (4), one can use the procedure of Hedeker and Gibbons (1994), who maximized the likelihood after approximating the integrals by Gauss-Hermite quadrature. Alternatively, PROC NLMIXED in SAS, which fits generalized linear mixed models using adaptive Gauss-Hermite quadrature, can also fit proportional odds models with random effects. See Hartzel, Liu and Agresti (2001) for details of the SAS code. However, fitting the proportional odds model with random effects may not be suitable for highly sparse data. The fitting algorithm often fails to converge when the data are sparse and there are many random effects. In our experience, the fitting algorithm of SAS PROC NLMIXED is often successful when the number of random effects is less than 3.
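To make the integration step concrete, here is a minimal Python sketch of Gauss-Hermite quadrature for one center's contribution to the log-likelihood. It is a simplified stand-in, not the Hedeker and Gibbons procedure or the PROC NLMIXED implementation: only the center effect $c_k$ is random, the treatment effects are held fixed at their means, the quadrature is non-adaptive, and $\sigma_c$ is treated as a standard deviation. All function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def category_probs(alpha, eta):
    """Response probabilities pi_1..pi_c for increasing cutpoints alpha and offset eta."""
    cum = 1.0 / (1.0 + np.exp(-(np.asarray(alpha, dtype=float) + eta)))
    return np.diff(np.concatenate(([0.0], cum, [1.0])))

def center_loglik(counts_k, alpha, c_k, beta):
    """Multinomial log-likelihood kernel of one center given its effect c_k.

    counts_k has shape (r, c); beta has length r with beta[r-1] = 0.
    """
    return sum(
        np.sum(row * np.log(category_probs(alpha, c_k + b)))
        for row, b in zip(np.asarray(counts_k, dtype=float), beta)
    )

def marginal_loglik(counts_k, alpha, gamma, sigma_c, beta, n_nodes=20):
    """Log of the integral of the kernel over c_k ~ N(gamma, sigma_c^2)."""
    nodes, weights = hermgauss(n_nodes)          # rule for the weight exp(-x^2)
    vals = np.array([
        center_loglik(counts_k, alpha, gamma + np.sqrt(2.0) * sigma_c * x, beta)
        for x in nodes
    ])
    m = vals.max()                               # stabilise the exponentials
    return m + np.log(np.sum(weights * np.exp(vals - m)) / np.sqrt(np.pi))
```

Summing `marginal_loglik` over centers gives the objective that a general-purpose optimizer would maximize. With correlated random treatment effects, as in Model (4), the one-dimensional rule would have to be replaced by a multidimensional one, which is where the computational difficulties mentioned above arise.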

For Model (4), we describe the conditional association between treatments and response using the estimates of the means $\{\beta_1, \beta_2, \dots, \beta_{r-1}\}$. The degree of heterogeneity of the cumulative log odds ratios across the centers is given by the estimates of the standard deviations $\{\sigma_{\beta_1}, \sigma_{\beta_2}, \dots, \sigma_{\beta_{r-1}}\}$.

4. A DELETION INFLUENCE MEASURE

There are several ways to check the homogeneity of the cumulative odds ratios or to describe the heterogeneity among them. For instance, we might use an overall goodness-of-fit test to check the homogeneity. In the proportional odds model with random effects (4), we can use the variance estimators to indicate the level of heterogeneity among the cumulative log odds ratios across centers. When homogeneity does not hold, the next question is which pieces of data contribute most to the heterogeneity. Even when the heterogeneity is minor in magnitude, and perhaps not even significant in the sample according to a statistical test, it is still of interest to examine the detail of the heterogeneity of the cumulative log odds ratios among centers.

Suppose we replace $\{\beta_i\}$ by $\{\beta_{ik}\}$ in the proportional odds model (1). This model allows the cumulative odds ratio to vary among centers. The variability in $\{\beta_{i1}, \beta_{i2}, \dots, \beta_{iK}\}$ indicates the heterogeneity of the conditional associations between treatments and response across centers for all $i = 1, \dots, r-1$. However, when there are few observations in a particular center (say center $k$), the ML estimate of $\{\beta_{1k}, \beta_{2k}, \dots, \beta_{r-1,k}\}$ is not reliable. The estimate can approach either negative infinity or positive infinity because of the many zero counts within the center. A similar problem occurs if we use the Mantel-Haenszel type estimator (2) to estimate the cumulative odds ratios for each center. For the example in Table 1, each center has very few patients.
It is not appropriate to estimate the center-specific cumulative odds ratios, because the treatment effects obtained separately from each center provide little information. Consequently, the detail of the heterogeneity is not available from the estimates of the center-specific cumulative odds ratios.

Instead, this article proposes an influence measure that determines the strength of the heterogeneity of the cumulative log odds ratios across different centers based on the MH type estimators (2). The proposed influence measure has a form similar to Cook's distance (Cook and Weisberg, 1982) for ordinary linear regression models. For linear regression models, Cook's distance is a common diagnostic used to assess the goodness-of-fit of the model and to identify possibly aberrant data points. In general, Cook's distance $D_k$ has the form

$$D_k = (\hat{\theta} - \hat{\theta}_{(k)})^T\, \widehat{\mathrm{Cov}}(\hat{\theta})^{-1}\, (\hat{\theta} - \hat{\theta}_{(k)}),$$

where $\hat{\theta}$ is the vector of estimates based on all data points and $\hat{\theta}_{(k)}$ is the vector of estimates when the $k$th data point is deleted.

For stratified clinical trials, we are interested in which clinical center has a strong influence on the treatment effects. A clinical center is said to be influential if its deletion produces a relatively large value of the influence measure. For multi-center clinical trials, deleting one center (say the $k$th) changes the estimates $\{L_i\}$ to the estimates $\{L_{i(k)}\}$. The proposed influence measure is defined as

$$C_k = (L - L_{(k)})^T\, \widehat{\mathrm{Cov}}(L)^{-1}\, (L - L_{(k)}), \quad k = 1, \dots, K, \tag{5}$$

where the vector $L = (L_1, L_2, \dots, L_{r-1})$ contains the MH estimates of the cumulative log odds ratios and $\widehat{\mathrm{Cov}}(L)$ is the estimate of the covariance matrix of $(L_1, L_2, \dots, L_{r-1})$. The form of $\{L_i\}$ is given in (2), and their variance and covariance estimates are given in (3). The effect $C_k$ of deletion is a quadratic form measuring the overall difference between the vectors $L$ and $L_{(k)}$. A relatively large value of $C_k$ implies that the $k$th clinical center has a high influence on the cumulative log odds ratios.

5. EXAMPLE

For the example in Table 1, we use an MH type method to estimate the cumulative odds ratios. We also describe the heterogeneity of the cumulative odds ratios among centers based on the two strategies discussed in Sections 3 and 4.

5.1 Cumulative Odds Ratios Estimates in a Proportional Odds Model

Using the MH type estimates (2) to describe the conditional relationship between the treatments and the patients' change in condition given each center, we have $L_{13} = 0.640$ and $L_{23} = 1.063$ with standard error estimates of ... and ..., respectively. That is, the odds of improving using the active drug with dose 2mg (or 10mg) are estimated to be $\exp(0.640) = 1.90$ (or $\exp(1.063) = 2.90$) times the odds for the placebo, given each center. In comparison, the ordinary ML estimates of $\hat{\beta}_1$ and $\hat{\beta}_2$ for model (1) equal ... with an estimated standard error of 0.343, and ... with an estimated standard error of 0.355, respectively. We tend to obtain a larger parameter estimate from the ML fitting when the number of observations in each center is small.

5.2 First Diagnostics Strategy Using a Random Effects Model

Using the proportional odds model with random effects (4), the fitting algorithm fails to converge because of the sparseness of the data and because the model has more than 2 random effects. To illustrate the strategy discussed in Section 3 in this example, we combine response levels 3 and 4 to make the data less sparse. Since $r = 3$ in this example, there are 3 random effects in model (4), namely $(c_k, b_{1k}, b_{2k})$. To reduce the number of random effects, we let $\{b_{2k}\}$ degenerate to its mean $\beta_2$. We treat $\{c_1, c_2, \dots, c_K\}$ as independent observations from a $N(\gamma, \sigma_c)$ distribution and $\{b_{11}, b_{12}, \dots, b_{1K}\}$ as independent observations from a $N(\beta_1, \sigma_{\beta_1})$ distribution. The random effects $(c_k, b_{1k})$ are correlated with covariance $\sigma_{c\beta}$. Using SAS PROC NLMIXED, the estimate of $\beta_1$ is 0.517 with a standard error of ..., and the estimate of the variability of $\{b_{11}, b_{12}, \dots, b_{1K}\}$ is $\hat{\sigma}_{\beta_1} = $ ... with a standard error of .... Using the proportional odds model with random effects, the odds of improving using the active drug with dose 2mg are estimated to be $\exp(0.517) = 1.68$ times the odds for the placebo across centers.
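The conversions from log odds ratios to odds ratios quoted above are plain exponentials. The short Python check below reproduces them from the reported estimates; the dictionary labels are ours and are only for illustration.

```python
import math

# Reported cumulative log odds ratios (active drug versus placebo).
reported_log_odds_ratios = {
    "MH, 2mg vs placebo": 0.640,
    "MH, 10mg vs placebo": 1.063,
    "random effects model, 2mg vs placebo": 0.517,
}

# Exponentiate each estimate to obtain the cumulative odds ratio.
odds_ratios = {
    label: round(math.exp(value), 2)
    for label, value in reported_log_odds_ratios.items()
}

for label, value in odds_ratios.items():
    print(f"{label}: {value:.2f}")
```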
Although the standard deviation $\sigma_{\beta_1}$ is not significantly different from zero, the strength of the heterogeneity is large compared to the level of $\hat{\beta}_1$. The estimate of $\beta_2$ is 1.185, with a standard error of 0.325; that is, the odds of improving using the active drug with dose 10mg are estimated to be $\exp(1.185) = 3.27$ times the odds for the placebo. Compared with the MH type estimates of the cumulative odds ratios, model (4) produces similar estimates of the conditional associations between treatments and response. Furthermore, the random effects model gives an overall estimate of the level of heterogeneity in the treatment effects, namely $\hat{\sigma}_{\beta_1}$.

5.3 Second Diagnostics Strategy Using the Influence Measure

Alternatively, we can consider the influence measure (5) for the cumulative log odds ratios in each center. Table 2 lists the influence measure $C_k$ and the estimates of the cumulative log odds ratios when each of the centers is deleted. The notation $\{L_{i(k)}\}$ denotes the cumulative log odds ratios when the $k$th center is deleted, for $i = 1, 2$ and $k = 1, \dots, 21$. It is clear from Figure 1 that clinical centers 13, 15, 3, and 17 have a larger influence measure than the others. This might suggest that the cumulative log odds ratios for those centers differ from the others. The difference could be due to error or to other factors (e.g., age), which are worth investigating.

ACKNOWLEDGMENT

We would like to thank Megan Clark for helpful comments.

BIBLIOGRAPHY

Andersen, E. B. (1980). Discrete statistical methods with social science applications. New York: North-Holland.

Andersen, E. B. (1992). Diagnostics in categorical data analysis. J. R. Statist. Soc. B 54(3):

Atkinson, A. C. (1985). Plots, transformations and regression. Oxford: Clarendon.

Cook, R. D. and Weisberg, S. (1982). Residuals and influence in regression. New York: Chapman and Hall.

Ghosh, M. (1995). Inconsistent MLE for the Rasch model. Statist. Probab. Lett. 23:

Greenland, S. (1989). Generalized Mantel-Haenszel estimators for K 2 × J tables. Biometrics 45:

Hartzel, J., Liu, I-M. and Agresti, A. (2001). Describing heterogeneous effects in stratified ordinal contingency tables, with application to multi-center clinical trials. Comput. Statist. Data Anal. 35:

Hedeker, D. and Gibbons, R. D. (1994). A random-effects ordinal regression model for multilevel analysis. Biometrics 50:

Liu, I-M. and Agresti, A. (1996). Mantel-Haenszel-type inference for cumulative odds ratios with a stratified ordinal response. Biometrics 52:

Liu, I. (2003). Describing ordinal odds ratios for stratified r × c tables. Biometrical J. 45:

Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. J. Natl. Cancer Inst. 22:

McCullagh, P. (1980). Regression models for ordinal data. J. R. Statist. Soc. B 42:

Neyman, J. and Scott, E. L. (1948). Consistent estimates based on partially consistent observations. Econometrica 16:1-22.

Pregibon, D. (1981). Logistic regression diagnostics. Ann. Statist. 9:

Tutz, G. and Hennevogl, W. (1996). Random effects in ordinal regression models. Comput. Statist. Data Anal. 22:

Wang, D. Q. and Critchley, F. (2000). Multiple deletion measures and conditional influence in regression models. Commun. Statist. Theory Meth. 29:

Wang, D. Q., Critchley, F. and Liu, I. (2004). Diagnostics analysis and perturbations in a clustered sampling model. Commun. Statist. Theory Meth. 33:

Wang, D. Q., Critchley, F. and Smith, P. J. (2003). The multiple sets of deletion measures and masking in regression. Commun. Statist. Theory Meth. 32:

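APPENDIX A

The form of $U_{ihg}$ is as follows:

$$U_{ihg} = \frac{\sum_{k=1}^{K}\sum_{j=1}^{c-1} \hat{\phi}^{ih}_{jk}(\hat{\theta}_{ih}) + 2\sum_{k=1}^{K}\sum_{j<s} \hat{\phi}^{ih}_{jsk}(\hat{\theta}_{ih})}{\hat{\theta}_{ih}^{2}\left[\sum_{k=1}^{K}\sum_{j=1}^{c-1} S^{ih}_{jk}\right]^{2}}, \quad \text{if } i \neq h = g,$$

$$U_{ihg} = \frac{\sum_{k=1}^{K}\sum_{j=1}^{c-1} \hat{\phi}^{ihg}_{jk}(\hat{\theta}_{ih}) + \sum_{k=1}^{K}\sum_{j \neq s} \hat{\phi}^{ihig}_{jsk}(\hat{\theta}_{ih}, \hat{\theta}_{ig})}{\hat{\theta}_{ih}\,\hat{\theta}_{ig}\left(\sum_{k=1}^{K}\sum_{j=1}^{c-1} S^{ih}_{jk}\right)\left(\sum_{k=1}^{K}\sum_{j=1}^{c-1} S^{ig}_{jk}\right)}, \quad \text{if } i \neq h \neq g,$$

where

$$\hat{\theta}_{ia} = \frac{\sum_{k}\sum_{j} R^{ia}_{jk}}{\sum_{k}\sum_{j} S^{ia}_{jk}} \quad \text{for all } a \neq i,$$

and

ˆφ ih jk(θ) = 1 [(n Nk 2 ik Xijk)X hjkθ (n ik Xijk)(n hk Xhjk) (Xijk + X hjk )θ + (n hk Xhjk )X ijk ]

ˆφ ih jsk (θ) = 1 [(n Nk 2 ik Xisk )X 2 hjk θ2 + (n ik Xisk )(n hk Xhsk ) (Xijk + Xhjk)θ + (n hk Xhsk)X isk]

ˆφ ihg jk (θ) = 1 [(n Nk 2 gk XijkX hjk n ik XhjkX gjk)θ +(n hk n gk Xijk n gkxijk X hjk )]

ˆφ ihig jsk (θ ih, θ ig ) = n ik θ ih N 2 k n ik θ ig N 2 k [ X hjk (n gk X gsk )] if j < s [ X gsk (n hk X hjk )] If j > s.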

Figure 1: Influence Measure for Table 1 (vertical axis: influence measure; horizontal axis: clinical center)

Table 1: Doctors' Evaluations of patients suffering from asthma. Columns: Center; Drug (2mg, 10mg, Placebo); Response (1 to 4). Note: Response is scaled from better (1) to worse (4).

Table 2: Influence Measure. Columns: center $k$; $C_k$; $L_{1(k)}$; $L_{2(k)}$.
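The $C_k$ column of Table 2 follows directly from definition (5). A minimal Python sketch is given below, assuming that the full-data estimate $L$, the leave-one-center-out estimates $L_{(k)}$, and the covariance estimate from (3) have already been computed. The function name `deletion_influence` and the argument layout are hypothetical.

```python
import numpy as np

def deletion_influence(L, L_deleted, cov):
    """Influence measure C_k of (5) for every deleted center.

    L         : full-data estimates, shape (r-1,)
    L_deleted : estimates with center k removed, shape (K, r-1)
    cov       : estimated covariance matrix of L, shape (r-1, r-1)
    """
    L = np.asarray(L, dtype=float)
    L_deleted = np.asarray(L_deleted, dtype=float)
    cov = np.asarray(cov, dtype=float)
    diffs = L[None, :] - L_deleted               # rows are L - L_(k)
    # Solve cov @ x = diff for each center instead of inverting cov explicitly.
    solved = np.linalg.solve(cov, diffs.T)       # shape (r-1, K)
    return np.einsum("ki,ik->k", diffs, solved)  # quadratic form per center
```

Sorting the centers by the returned values gives the ranking that a plot such as Figure 1 displays.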
