Repeated ordinal measurements: a generalised estimating equation approach

David Clayton
MRC Biostatistics Unit, 5 Shaftesbury Road, Cambridge CB2 2BW
April 7, 1992

Abstract

Cumulative logit and related regression models for ordered categorical data may be expressed as generalised linear models for correlated binary responses. These may be fitted using the generalised estimating equation approach of Liang and Zeger (1986), which yields results nearly identical to maximum likelihood while offering further flexibility. The approach also generalises to deal with repeated ordinal measurements on the same subject, such as those commonly observed in medical cross-over experiments.

Keywords: Generalised estimating equations, ordinal data, repeated measures, cross-over trials.

1 Background: generalised estimating equations

In a generalised linear model (GLM), a response vector $y$ of length $N$ has expectation vector $\mu$ whose elements are related to those of a linear predictor $\eta$ by the link function $g(\cdot)$, so that $g(\mu_i) = \eta_i$. The linear predictor is given by the linear model

$$\eta = X\beta$$

where $X$ is a matrix whose rows, $x_i$, are vectors of covariates for each observational unit. In the original formulation of GLMs, the responses are assumed to be independent with variances $\phi V(\mu_i)$, where $V(\cdot)$ is the variance function and $\phi$ the scale factor.

Estimation of the regression coefficients, $\beta$, is by solution of the estimating equations

$$X^t W e = 0 \qquad (1)$$

where $e$ is the vector of scaled residuals, $e_i = g'(\mu_i)(y_i - \mu_i)$, and $W$ is a diagonal matrix of weights such that

$$[W_{ii}]^{-1} = \phi V(\mu_i)\,[g'(\mu_i)]^2 .$$

It is well known that these estimating equations lead to maximum likelihood estimates of $\beta$ when the distribution of responses is drawn from the exponential family. In other cases, the resulting estimates are referred to as maximum quasi-likelihood estimates. In either case, under the assumptions set out above, the asymptotic variance of the estimates is estimated by $(X^t W X)^{-1}$, evaluated with $\beta$ at its estimated value, $\hat\beta$.

More recently, it has been recognised that these estimating equations lead to consistent (if not fully efficient) estimates of $\beta$ even when the variance function $V(\cdot)$ is mis-specified. However, in these circumstances the above variance estimate is incorrect and it is necessary to use an alternative robust estimate of the form

$$(X^t W X)^{-1}\, S\, (X^t W X)^{-1} .$$

Here $S$ is the sum of squares and products (SSP) matrix for the individual contributions to the estimating equation. More precisely, if $x_i$ is the $i$th row of $X$, the contribution of subject $i$ to the estimating equations is given by

$$u_i = W_{ii}\, e_i\, x_i$$

so that equation (1) may be rewritten as $\sum_i u_i = 0$; then

$$S = \sum_i u_i u_i^t .$$

In the papers by Liang, Zeger and collaborators (for example, Liang and Zeger, 1986) this idea has been further developed. If the vector $y$, of length $NM$, represents $M$ repeated measurements on $N$ different individuals, then the model may be extended to allow for correlation between repeated measurements of the same individual with, say,

$$\mathrm{Corr}(y_{ij}, y_{ik}) = (\rho_{jk})_i .$$
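As an illustration of the independence case described above, the following is a minimal sketch, not the paper's own software, of solving equation (1) for a logistic model with an intercept and one covariate, and of forming both the naive estimate $(X^t W X)^{-1}$ and the robust estimate $(X^t W X)^{-1} S (X^t W X)^{-1}$. The function names are hypothetical. The data are the follow-up counts of Table 1 collapsed at 30 minutes (89 of 119 active and 60 of 120 placebo subjects at or below the cutpoint). For the logit link with $\phi = 1$, $W_{ii} = \mu_i(1-\mu_i)$ and $u_i = (y_i - \mu_i)x_i$.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def score_pieces(beta, x, y):
    """Return (sum of u_i, X^t W X, S = sum of u_i u_i^t) at beta."""
    total = [0.0, 0.0]
    info = [[0.0, 0.0], [0.0, 0.0]]
    ssp = [[0.0, 0.0], [0.0, 0.0]]
    for xi, yi in zip(x, y):
        row = (1.0, xi)                      # intercept and covariate
        mu = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * xi)))
        w = mu * (1.0 - mu)                  # W_ii for the logit link
        u = [(yi - mu) * r for r in row]     # u_i = W_ii e_i x_i
        for a in range(2):
            total[a] += u[a]
            for b in range(2):
                info[a][b] += w * row[a] * row[b]
                ssp[a][b] += u[a] * u[b]
    return total, info, ssp

def fit_logistic(x, y, n_iter=25):
    """Fisher scoring for equation (1); returns beta, naive and robust covariances."""
    beta = [0.0, 0.0]
    for _ in range(n_iter):
        total, info, _ = score_pieces(beta, x, y)
        step = inv2(info)
        beta = [beta[a] + step[a][0] * total[0] + step[a][1] * total[1]
                for a in range(2)]
    _, info, ssp = score_pieces(beta, x, y)
    naive = inv2(info)
    robust = matmul2(matmul2(naive, ssp), naive)
    return beta, naive, robust

# Treatment indicator and the binary response "asleep within 30 minutes"
x = [1] * 119 + [0] * 120
y = [1] * 89 + [0] * 30 + [1] * 60 + [0] * 60
beta, naive, robust = fit_logistic(x, y)
print(beta[1], math.sqrt(naive[1][1]), math.sqrt(robust[1][1]))
```

In this saturated two-group example the fitted means equal the observed proportions, so $S$ coincides with $X^t W X$ and the two standard errors agree. They diverge once the responses are correlated or the variance function is wrong, which is the situation the robust estimate is designed for.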

(Usually the correlation structure is assumed to be constant across subjects, so that the $i$ subscript may be dropped.) Efficient estimation of $\beta$ may be achieved by solution of estimating equations of the same general form as discussed above, with the modification that the $W$ matrix should now be block-diagonal as a result of the correlation between repeated measures. Since it may be difficult in practice to specify the correct variance and correlation structures, Liang and Zeger recommend the use of a convenient working approximation to $\rho$, and of a robust estimate for the variance of $\hat\beta$. This is obtained in the same way as before, the matrix $S$ now representing the empirical SSP matrix for the $N$ contributions of individual subjects to the estimating equation, $u_i = \sum_j u_{ij}$.

2 Ordered categorical responses

Perhaps the most popular method for the analysis of ordered categorical data is that based upon the cumulative logit regression model. This was first proposed by Snell (1964) and further generalised by McCullagh (1980) to allow link functions other than the logit. McCullagh's description of the model was in terms of an underlying latent continuous response, stratified at unknown cutpoints. For a response with $C$ categories we need $C-2$ parameters to represent these cutpoints (since the boundary between the first two categories can be taken as zero without loss of generality).

An alternative view of the class of models is that they hold that, over the $C-1$ different ways of collapsing the response into a binary one, the quantal regression equations are unchanged, save in their intercepts. Any of the usual binary regression links (logit, probit, complementary log-log, etc.) are available. With this view of the model, the extra $C-2$ cutpoint parameters represent differences between the intercepts of the $C-1$ binary regressions.

This latter view of the logit version of the model prompted Clayton (1974) to propose, for the two-sample problem, a modified version of the Mantel-Haenszel estimate of the common odds ratio in a stack of $2 \times 2$ contingency tables. The possible collapses of the ordinal response yield $C-1$ such tables and these provide $C-1$ correlated estimates of the common odds ratio. Clayton showed that, although the optimal weights for pooling these estimates are rather complicated, use of weights which are optimal under the null hypothesis provides a convenient practical method. This method was based on two main ideas:

1. the treatment of the ordinal response as $C-1$ correlated binary responses, and
2. the use of weights which are locally optimal around the null.

The first (and, to a lesser extent, the second) of these ideas is carried through in the present proposal. Thus, if the ordered categorical response of the $i$th subject, $y_i$, is coded $1, \ldots, C$, then we may create an expanded vector of binary responses, $y$, of length $N(C-1)$ and indexed by $i$ and $j$ so that, for $j = 1, \ldots, C-1$,

$$y_{ij} = I(y_i \le j).$$

The Snell-McCullagh model relates the expectation of this vector, $\mu$, via a link function to the linear predictor vector, $\eta$, with elements also indexed by $i$ and $j$:

$$\eta_{ij} = \theta_j + x_i^t \beta .$$

This model may be fitted using generalised estimating equations. An expanded design matrix, $X$, is created by repeating each row of the original design matrix $C-1$ times, corresponding to the $C-1$ possible collapses, and appending $C-2$ columns of dummy variables to allow for differences in intercepts of the $C-1$ possible binary regressions. The binomial variance function correctly specifies the variances of the elements of $y$. The correlations of responses are simple functions of their expectations, $\mu$:

$$\mathrm{Corr}(y_{ij}, y_{ik}) = \frac{\mu_{i,\min(j,k)} - \mu_{ij}\mu_{ik}}{\sqrt{\mu_{ij}(1-\mu_{ij})\,\mu_{ik}(1-\mu_{ik})}} .$$

A working correlation matrix, constant for all $i$, is provided by the estimate under the null hypothesis of homogeneity of response. This is obtained by substituting the marginal cumulative proportions for $\mu_{ij}$, $j = 1, \ldots, C-1$, in the above expression. Notice that, in contrast with the method of Clayton (1974), the weighting scheme uses estimates under the null hypothesis only for correlations between the elements of $y$; their variances are dealt with correctly. However, if software allowed, there would be no need even for this inaccuracy, which arises solely from a requirement to specify a common correlation structure across subjects.
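The expansion described above can be sketched as follows. This is an illustrative construction under stated assumptions, not the code used for the paper: the function name `expand_ordinal` is hypothetical, the intercept is written as an explicit leading column, and the $C-2$ dummy columns are taken to mark cutpoints $2, \ldots, C-1$ (a corner constraint on the first cutpoint).

```python
def expand_ordinal(responses, covariates, n_categories):
    """Expand ordinal responses coded 1..C into C-1 binary rows per subject.

    Each returned row is (y_ij, design_row), where y_ij = I(y_i <= j) and
    design_row = [intercept] + [C-2 cutpoint dummies] + [covariates of subject i].
    """
    rows = []
    for y_i, x_i in zip(responses, covariates):
        for j in range(1, n_categories):            # cutpoints j = 1..C-1
            y_ij = 1 if y_i <= j else 0
            # dummy k is 1 only for cutpoint k; cutpoint 1 is the reference
            dummies = [1 if j == k else 0 for k in range(2, n_categories)]
            rows.append((y_ij, [1] + dummies + list(x_i)))
    return rows

# One subject in category 2 of 4, with a single covariate equal to 1
for y_ij, design_row in expand_ordinal([2], [[1]], 4):
    print(y_ij, design_row)
```

Each subject contributes $C-1$ rows that share the same covariate values, which is why the rows must be treated as a correlated cluster rather than as independent observations.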

Table 1: Time to falling asleep for 239 subjects

                Time (minutes)
Treatment    < 20    20–30    30–60    > 60
Active         40       49       19      11
Placebo        31       29       35      25

An example

The main purpose of this paper is to exploit the natural generalisation of this approach to deal with repeated ordinal measurements within the same subject. Before proceeding to this, however, a comparison with the method of maximum likelihood in the simpler case serves to demonstrate the efficiency of the method, and some practical advantages.

Table 1 reproduces a dataset which has been analysed elsewhere in the literature (Francom, Chuang and Landis, 1989; Agresti, 1989). The data concern time to falling asleep, coded into 4 ordered categories, for $N = 239$ subjects, 119 receiving active treatment and 120 placebo. Measurements were made pre-treatment and on a follow-up occasion after treatment, but for this first analysis only the follow-up data are shown.

For the GEE analysis, each subject contributes three binary response variables coding whether time to falling asleep was (a) $\le 20$ minutes, (b) $\le 30$ minutes, or (c) $\le 60$ minutes. The corresponding marginal proportions are 0.2971, 0.6234 and 0.8494, so that the working correlation matrix is

$$\begin{pmatrix} 1.000 & 0.505 & 0.274 \\ & 1.000 & 0.542 \\ & & 1.000 \end{pmatrix}$$

If treatment is coded into a vector $z$, with $z_i = 0$ indicating placebo and $z_i = 1$ indicating active treatment, the Snell-McCullagh model is

$$\eta_{ij} = \mu + \theta_j + \beta z_i$$

where the cutpoint parameters, $\theta_j$, are subject to a linear constraint such as the corner constraint $\theta_1 = 0$. Alternatively, in the syntax introduced by Wilkinson and Rogers (1973) and further developed in computer programs such as GLIM, this model can be written

1 + Cutpoint + Treatment
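The marginal proportions and working correlations quoted above can be checked directly from the counts in Table 1. The sketch below uses hypothetical helper names and applies the correlation formula of Section 2 with the pooled cumulative proportions substituted for $\mu_{ij}$.

```python
import math

def cumulative_proportions(counts):
    """Cumulative proportions at the C-1 cutpoints of a vector of C counts."""
    total = sum(counts)
    running, props = 0, []
    for c in counts[:-1]:
        running += c
        props.append(running / total)
    return props

def working_correlation(mu):
    """Corr(y_ij, y_ik) = (mu_min(j,k) - mu_j mu_k) / sqrt(mu_j(1-mu_j) mu_k(1-mu_k))."""
    m = len(mu)
    corr = [[1.0] * m for _ in range(m)]
    for j in range(m):
        for k in range(m):
            if j != k:
                num = mu[min(j, k)] - mu[j] * mu[k]
                den = math.sqrt(mu[j] * (1 - mu[j]) * mu[k] * (1 - mu[k]))
                corr[j][k] = num / den
    return corr

active = [40, 49, 19, 11]     # Table 1, active treatment
placebo = [31, 29, 35, 25]    # Table 1, placebo
pooled = [a + p for a, p in zip(active, placebo)]

mu = cumulative_proportions(pooled)
rho = working_correlation(mu)
print([round(m, 4) for m in mu])
for row in rho:
    print([round(r, 3) for r in row])
```

Pooling the two treatment groups before computing the proportions is what makes this the estimate under the null hypothesis of homogeneity of response.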

Using GEE with a logistic link and binomial variance function, the treatment effect is estimated as $\hat\beta = +0.762$ with asymptotic standard error (ASE) 0.239 (note that the positive coefficient indicates a shift to the left in the response distribution). Full maximum likelihood yielded $\hat\beta = 0.761$ with an ASE of 0.238.

An unexpected benefit of the GEE approach is that the cutpoint parameters enter simply as terms in the linear model. The assumption of constancy of treatment effect across cutpoints may be tested by inclusion of a Cutpoint.Treatment interaction term in the model. A single degree of freedom test for trend of treatment effect across cutpoints can be carried out by fitting the model

$$\eta_{ij} = \mu + \theta_j + \beta z_i + \gamma j z_i .$$

In the present example, this yields $\hat\gamma = 0.369$ with ASE 0.222. There is, therefore, some suggestion of failure of the Snell-McCullagh model, with a tendency for the treatment effect to increase with shift of cutpoint to the right. This impression is also suggested by inspection of the odds ratios for the three cutpoints: cutting at 20 minutes gives an odds ratio of $(40 \times 89)/(31 \times 79) = 1.45$, cutting at 30 minutes gives an odds ratio of $(89 \times 60)/(60 \times 30) = 2.97$, and cutting at 60 minutes gives an odds ratio of $(108 \times 25)/(95 \times 11) = 2.58$.

The ability to include such interaction terms represents a genuine extension of the Snell-McCullagh approach. Although the representation of treatment effect with a single parameter requires us to assume no interaction between treatment and cutpoint, there is no such requirement for other explanatory variables of less direct interest. Thus, the proportional odds assumption may be maintained for the effect of interest (treatment), but relaxed for the effects of other powerful disturbing influences.

3 Repeated ordinal response data

The extension of the method to deal with repeated ordinal measurements in the same subject is natural.
Such repeated ordinal measurements occur frequently in cross-over trials (see, for example, Jones and Kenward, 1989), and in experiments which incorporate a pre-treatment baseline measurement. The analysis of such data by maximum likelihood is difficult. Incorporation of a random subject effect in the linear model leads to an intractable likelihood, as do other approaches to modelling the association structure.

By contrast, the GEE approach is straightforward. Each ordinal measurement contributes a block of derived binary response variables so that, if there are $R$ repeated measurements, the binary response vector is of length $NR(C-1)$. Explanatory variables may be constant within a subject, in which case each value must be repeated $R(C-1)$ times in the design matrix, or may vary from occasion to occasion, requiring each value to be repeated $C-1$ times. The model will include effects for cutpoint, occasion, other explanatory variables, and (possibly) their interactions.

Two methods have been considered for calculating a working correlation matrix:

1. to calculate working correlations between binary responses representing different cutpoints of the same measurement as in Section 2, and to ignore all others (that is, to set them to zero), and
2. to estimate the correlation structure as a free $R(C-1) \times R(C-1)$ matrix.

The second suggestion requires estimation of the correlation structure, and this is an active research area. In this paper the approach suggested by Liang and Zeger (1986) is used. In later work (Liang, Zeger and Qaqish, 1992) this was termed GEE1 to distinguish it from the (rather more efficient) approach of Prentice and Zhao (1991), which they termed GEE2.

An example

Table 2 shows the sleep data in more detail, including both pre-treatment and follow-up measurements. The extended analysis simultaneously models pre-treatment and follow-up responses by expanding each subject's responses into 6 binary indicators. If, as before, we index subjects by $i$ and cutpoints by $j$, and further index pre-treatment and follow-up responses by $t = 0$ and $t = 1$ respectively, then a model for treatment effect is

$$\eta_{ijt} = \mu + \theta_j + \beta z_i + \gamma t + \delta z_i t$$

or, in the Wilkinson and Rogers syntax,

1 + Cutpoint + Treatment + Occasion + Treatment.Occasion

The parameter of interest in this model is the interaction parameter, $\delta$.

Table 2: Time to falling asleep for 239 subjects, pre-treatment and at follow-up

                                   Follow-up occasion
Treatment   Initial occasion   < 20   20–30   30–60   > 60
Active      < 20                  7       4       1      0
            20–30                11       5       2      2
            30–60                13      23       3      1
            > 60                  9      17      13      8
Placebo     < 20                  7       4       2      1
            20–30                14       5       1      0
            30–60                 6       9      18      2
            > 60                  4      11      14     22

Our first working correlation structure is

$$\rho_1 = \begin{pmatrix}
1.000 & 0.566 & 0.291 & 0 & 0 & 0 \\
 & 1.000 & 0.515 & 0 & 0 & 0 \\
 & & 1.000 & 0 & 0 & 0 \\
 & & & 1.000 & 0.505 & 0.274 \\
 & & & & 1.000 & 0.542 \\
 & & & & & 1.000
\end{pmatrix}$$

The bottom right section of this matrix is the same as that used in Section 2, and the top left section is calculated in the same way from the marginal cumulative proportions for the pre-treatment measurement (0.1088, 0.2762 and 0.5900). Correlations between pre-treatment and follow-up responses are ignored.

In the second approach, the correlation matrix was estimated from the data, as

$$\rho_2 = \begin{pmatrix}
1.000 & 0.593 & 0.306 & 0.202 & 0.174 & 0.118 \\
 & 1.000 & 0.509 & 0.410 & 0.301 & 0.169 \\
 & & 1.000 & 0.300 & 0.346 & 0.340 \\
 & & & 1.000 & 0.486 & 0.259 \\
 & & & & 1.000 & 0.504 \\
 & & & & & 1.000
\end{pmatrix}$$

Note that these two matrices agree quite closely except for those elements set to zero in the former.
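The first working correlation structure can be reproduced from the counts in Table 2. The sketch below uses hypothetical helper names: it pools the two treatment groups, takes row margins for the pre-treatment measurement and column margins for follow-up, applies the null-hypothesis correlation formula of Section 2 within each occasion, and places the two $3 \times 3$ blocks on the diagonal with zeros elsewhere.

```python
import math

# Table 2 counts: rows = initial occasion, columns = follow-up occasion
active = [[7, 4, 1, 0], [11, 5, 2, 2], [13, 23, 3, 1], [9, 17, 13, 8]]
placebo = [[7, 4, 2, 1], [14, 5, 1, 0], [6, 9, 18, 2], [4, 11, 14, 22]]

def cumulative_margins(tables, axis):
    """Pooled cumulative proportions over rows (axis=0) or columns (axis=1)."""
    size = len(tables[0])
    totals = [0] * size
    for table in tables:
        for r in range(size):
            for c in range(size):
                totals[r if axis == 0 else c] += table[r][c]
    n = sum(totals)
    return [sum(totals[: j + 1]) / n for j in range(size - 1)]

def null_correlation(mu):
    """Correlations between the cutpoint indicators of one ordinal measurement."""
    m = len(mu)
    return [[1.0 if j == k else
             (mu[min(j, k)] - mu[j] * mu[k])
             / math.sqrt(mu[j] * (1 - mu[j]) * mu[k] * (1 - mu[k]))
             for k in range(m)] for j in range(m)]

def block_diagonal(first, second):
    """Place two square blocks on the diagonal, with zeros elsewhere."""
    a, b = len(first), len(second)
    top = [row + [0.0] * b for row in first]
    bottom = [[0.0] * a + row for row in second]
    return top + bottom

pre_mu = cumulative_margins([active, placebo], axis=0)
post_mu = cumulative_margins([active, placebo], axis=1)
rho1 = block_diagonal(null_correlation(pre_mu), null_correlation(post_mu))
for row in rho1:
    print([round(r, 3) for r in row])
```

The estimated matrix $\rho_2$ is not reproduced here, since it depends on residuals from the fitted model rather than on the margins alone.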

Table 3: Estimates of the Treatment.Occasion interaction parameter

Method        Estimate   Naive ASE   Robust ASE
GEE(ρ1)          0.701       0.336        0.242
GEE(ρ2)          0.677       0.244        0.241
EWLS             0.65        0.25 (single ASE)

The estimates of the interaction parameter $\delta$ obtained from these two analyses are given in Table 3. Also shown is the estimate obtained by Agresti (1989), who fitted the same model to these data using empirically weighted least squares (EWLS). For the GEE analyses, two ASEs are given. The first ("naive") estimate is the appropriate diagonal element of $(X^t W X)^{-1}$ and requires that the working correlation matrix and the variance function are both correct. The second is the robust estimate, which allows for mis-specification of either or both of these. The variance function cannot fail to be correctly specified since, for any response $y$ taking on values 0 or 1,

$$\mathrm{Var}(y) = E(y)\,[1 - E(y)].$$

It is therefore not surprising that in the second GEE analysis, which estimates the correlation structure from the data, the naive and robust ASEs agree closely. In the first analysis, the naive ASE is incorrect owing to the mis-specification of the working correlation matrix. However, the robust ASE is very close to that obtained in the second analysis. It would seem, therefore, that the loss of efficiency due to using an incorrect working correlation structure is negligible. Agresti's estimates using EWLS were only published to two decimal places, but seem to agree quite closely with the GEE analyses.

In no analysis was the treatment effect estimated more precisely than in our earlier analysis, which discarded the pre-treatment baseline measurement. This, of course, is not surprising if we consider the analogous analysis for measurements on a continuous interval scale. In that case, we only gain from using the baseline data if the between-subject component of variance exceeds the within-subject error variance, that is, if the correlation between pre-treatment and follow-up measurements exceeds 0.5.
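The 0.5 threshold in the continuous analogy can be made explicit with a short derivation. The variance-components model below is an assumption introduced for illustration (the symbols $a_i$, $\tau$, $\epsilon_{it}$, $\sigma_b^2$ and $\sigma_w^2$ do not appear in the paper). It compares an analysis of the follow-up measurement alone with an analysis of the change from baseline, which is the contrast that the Treatment.Occasion interaction corresponds to.

```latex
% Assumed model: t = 0 pre-treatment, t = 1 follow-up
y_{it} = a_i + \tau z_i t + \epsilon_{it}, \qquad
\mathrm{Var}(a_i) = \sigma_b^2, \quad \mathrm{Var}(\epsilon_{it}) = \sigma_w^2 .

% Follow-up only versus change from baseline
\mathrm{Var}(y_{i1}) = \sigma_b^2 + \sigma_w^2, \qquad
\mathrm{Var}(y_{i1} - y_{i0}) = 2\sigma_w^2 .

% The change score is less variable exactly when
2\sigma_w^2 < \sigma_b^2 + \sigma_w^2
\iff \sigma_b^2 > \sigma_w^2
\iff \rho = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2} > \tfrac{1}{2} .
```

The largest pre-treatment to follow-up correlation in $\rho_2$ is 0.410, which, on this analogy, is consistent with the baseline data bringing no gain in precision.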

4 Discussion

The generalised estimating equation method proposed by Liang and Zeger provides an invaluable new tool for the applied statistician. The treatment of ordinal response data described here demonstrates the flexibility of the method and its ability to provide a unified approach to seemingly unrelated problems. Now that software is becoming available, it is increasingly attractive to use this general technique in preference to more specialised (and limited) programs. This paper has shown that

1. the Snell-McCullagh model for ordinal response data may be treated as a special instance of marginal models for repeated binary responses,
2. the GEE method of estimation is nearly as efficient as full maximum likelihood,
3. the approach allows extension of the model to include interactions between cutpoints and explanatory variables, and
4. the approach extends to deal with repeated ordinal measurements.

Some problems remain. In particular, the performance of the method for repeated measurements in small samples requires further investigation, particularly in view of its potential application in cross-over trials. In this context, the adequacy of the robust ASE requires further study. The alternative is to estimate the correlation structure and use the naive ASE, but estimation of a large number of correlations from a small sample is potentially hazardous. A further possibility is to model the correlation structure more parsimoniously in terms of the expected values and, perhaps, one further parameter expressing the strength of association between pre-treatment and follow-up measurements. It must be expected, however, that whatever approach turns out to be preferable, generalised estimating equation methods will prove better in small samples than the empirically weighted least squares approach which is currently their main competitor.
Software

The computations described in this paper were carried out in S using the gee() function written by Vincent Carey and available on STATLIB. The maximum likelihood analysis of the follow-up data was carried out using SAS PROC LOGISTIC. Agresti's (1989) analysis used SAS PROC CATMOD.

Acknowledgements

I am grateful to the associate editor and to the referees for their constructive criticism of an earlier version.

References

Agresti, A. (1989) A survey of models for repeated ordered categorical response data. Statistics in Medicine, 8, 1209–1224.

Clayton, D.G. (1974) Some odds ratio statistics for the analysis of ordered categorical data. Biometrika, 61, 525–531.

Francom, S.F., Chuang, C. and Landis, J.R. (1989) A log-linear model for ordinal data to characterize differential change among treatments. Statistics in Medicine, 8, 571–582.

Jones, B. and Kenward, M.G. (1989) The Design and Analysis of Cross-over Trials. Chapman and Hall, London.

Liang, K.-Y. and Zeger, S.L. (1986) Longitudinal data analysis using generalized linear models. Biometrika, 73, 13–22.

Liang, K.-Y., Zeger, S.L. and Qaqish, B. (1992) Multivariate regression analyses for categorical data (with discussion). Journal of the Royal Statistical Society, Series B, 54, 3–40.

McCullagh, P. (1980) Regression models for ordinal data (with discussion). Journal of the Royal Statistical Society, Series B, 42, 109–142.

Prentice, R.L. and Zhao, L.P. (1991) Estimating equations in means and covariances of multivariate discrete and continuous responses. Biometrics, 47, 825–840.

Snell, E.J. (1964) A scaling procedure for ordered categorical data. Biometrics, 20, 592–607.

Wilkinson, G.N. and Rogers, C.E. (1973) Symbolic description of factorial models for analysis of variance. Applied Statistics, 22, 392–399.