Class Notes. Examining Repeated Measures Data on Individuals


Ronald Heck
Week 12: Class Notes

Generalized linear mixed models (GLMM) also provide a means of incorporating longitudinal designs with categorical outcomes into situations where there are clustered data structures. One of the attractive properties of the GLMM is that it allows for linear as well as non-linear models under a single framework that addresses issues of clustering. It is possible to fit models with outcomes resulting from various probability distributions, including normal (or Gaussian), inverse Gaussian, gamma, Poisson, multinomial, binomial, and negative binomial, through an appropriate link function g(). At level 1, repeated observations (e.g., students' proficiency status in math, students' enrollment over successive semesters in college, changes in clinical or health status) are nested within individuals, perhaps with additional time-varying covariates. At level 2, we can define variables describing differences between individuals (e.g., treatment groups, participation status, subject background variables and attitudes).

Generalized Estimating Equations

Alternatively, by using the Generalized Estimating Equations (GEE) approach, we can examine a number of categorical measurements nested within individuals (i.e., individuals represent the clusters), but where individuals themselves are considered to be independent and randomly sampled from a population of interest. More specifically, in this latter type of model, the pairs of dependent and independent variables $(Y_i, X_i)$ for individuals are assumed to be independent and identically distributed (Ziegler, Kastner, & Blettner, 1998) rather than clustered within organizations. GEE is used to characterize the marginal expectation of a set of repeated measures (i.e., the average response for observations sharing the same covariates) as a function of a set of study variables.
As a result, the important point is that the growth parameters are not assumed to vary randomly across individuals (or higher groups) as in a typical random-coefficients (or mixed) model. This is an important distinction between the two types of models to keep in mind; that is, while random-coefficient models explicitly address variation across individuals as well as clustering among subjects in higher-order groups, GEE models assume simple random sampling of subjects representing a population as opposed to a set of higher-order groups. Hence, GEE models provide what are called population-average results; that is, they model the marginal expectation as a function of the explanatory variables. In contrast, typical multilevel models provide unit-specific results. Regression coefficients based on population averages (GEE) will be generally similar to unit-specific (random-effect model) coefficients but smaller in size (Raudenbush & Bryk, 2002). This distinction does not arise in models with continuous outcomes and identity link functions. For example, for a GEE model, the odds ratio is the average estimate in the population; that is, the expected change in the odds for a unit change in X in the population. In contrast, in random-effect (unit-specific) models, the odds ratio will be the subject-specific effect for a particular level of clustering (i.e., the person or unit of clustering) given a unit change in X.
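The claim that population-average coefficients tend to be smaller than unit-specific ones can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it assumes a random-intercept logistic model with a normally distributed intercept, and the values of `beta0`, `beta1`, and `sigma` are hypothetical, not estimates from the reading data. It averages the subject-specific probabilities over the random intercept and then compares the implied marginal slope with the subject-specific slope.

```python
import math


def inv_logit(eta):
    """Map a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p):
    """Map a probability to log odds."""
    return math.log(p / (1.0 - p))


def marginal_prob(eta, sigma, n_points=2001, width=8.0):
    """Average subject-specific probabilities over u ~ Normal(0, sigma^2).

    Uses a simple normalized grid approximation to the integral.
    """
    if sigma == 0:
        return inv_logit(eta)
    lo = -width * sigma
    step = 2.0 * width * sigma / (n_points - 1)
    total = 0.0
    weight_sum = 0.0
    for k in range(n_points):
        u = lo + k * step
        w = math.exp(-0.5 * (u / sigma) ** 2)  # unnormalized normal density
        total += w * inv_logit(eta + u)
        weight_sum += w
    return total / weight_sum


# Hypothetical values chosen only for illustration.
beta0, beta1, sigma = 0.5, 1.0, 2.0

unit_specific_slope = beta1
population_average_slope = (
    logit(marginal_prob(beta0 + beta1, sigma)) - logit(marginal_prob(beta0, sigma))
)

print("unit-specific slope:     ", unit_specific_slope)
print("population-average slope:", round(population_average_slope, 3))
```

With no between-subject variation (`sigma = 0`) the two slopes coincide; as `sigma` grows, the population-average slope shrinks toward zero.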

We first begin with a within- and between-subjects model estimated using the GEE (or fixed-effect) approach. GEE was developed to extend GLM further by accommodating repeated categorical measures, logistic regression, and various other models for time series or other correlated data where relationships between successive measurements on the same individual are assumed to influence the estimation of model parameters (Horton & Lipsitz, 1999; Liang & Zeger, 1986; Zeger, Liang, & Albert, 1988). The GEE analytic approach handles a number of different types of categorical outcomes, their associated sampling distributions, and corresponding link functions. It is suitable where the repeated observations are nested within individuals over time, but the individuals are considered to be a random sample of a population. One scenario is where individuals are randomly assigned to treatment conditions that unfold over time.

If the outcome is a count, we can make use of an additional exposure parameter (referred to as an offset term), which, as you will recall, is a "structural" predictor that can be added to the model. Its coefficient is not estimated by the model but is assumed to have the value 1.0; thus, the values of the offset are simply added to the linear predictor of the dependent variable. This extra parameter can be especially useful in Poisson regression models, where each case may have different levels of exposure to the event of interest.

At present in IBM SPSS, the GEE approach only accommodates a two-level data hierarchy (measurements nested in individuals). If we intend to add a group-level variable, we would need to use GENLIN MIXED to specify the group structure.

Students' Proficiency in Reading Over Time

Consider a study to examine students' likelihood to be proficient in reading over time and to assess whether their background might affect their varying patterns of meeting proficiency or not.
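Returning briefly to the offset term described above for count outcomes, the following minimal sketch shows what "a coefficient fixed at 1.0" means in practice for a Poisson model with a log link. The function name, the coefficient values, and the use of the log of exposure as the offset are illustrative assumptions, not output from any model in these notes.

```python
import math


def expected_count(exposure, x, beta0, beta1):
    """Expected count under a log-link Poisson model with an offset.

    The offset log(exposure) enters the linear predictor with its
    coefficient fixed at 1.0; it is added, not estimated.
    """
    offset = math.log(exposure)
    eta = offset + beta0 + beta1 * x
    return math.exp(eta)


# Hypothetical coefficients for illustration only.
b0, b1 = -1.0, 0.3

one_unit = expected_count(exposure=1.0, x=2, beta0=b0, beta1=b1)
two_units = expected_count(exposure=2.0, x=2, beta0=b0, beta1=b1)

print(one_unit, two_units)
```

Because the offset coefficient is fixed at 1.0, doubling the exposure doubles the expected count while leaving the estimated rate per unit of exposure unchanged.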
We may first be interested in answering whether a change takes place over time in students' likelihood to be proficient. This concern addresses whether the probability of a student being proficient is the same or different over the occasions of measurement. The assumption is that if we can reject the hypothesis that the likelihood of being proficient is the same over time, it implies that a change in individuals has taken place. In this situation, occasions of measurement are assumed to be nested within subjects but independent between subjects. We may have a number of research questions we are interested in examining, such as the following: What is the probability of students being proficient in reading over time? Do probabilities of being proficient change over time? What do students' trends look like over time? Are there between-individual variables that explain students' likelihood to be proficient over time?

Vertical Alignment of Data Within Individuals

The data in this study consist of 2,228 individuals who were measured on four occasions regarding their proficiency in reading. To examine growth within and between individuals using GEE (or GENLIN MIXED), the data must first be organized differently (see Chapter 2 in the text). The time-related observations must be organized vertically, which will require four lines for each subject, since there are four repeated observations regarding proficiency. You will recall that an intercept is defined as the level of Y when X (Time) is 0. For categorical outcomes, the time variable functions to separate contrasts between time points, for example, between a baseline measurement and the end of a treatment intervention, or to examine change over a particular time period. This coding pattern for Time (0, 1, 2, 3) identifies the intercept in the model as students' initial (time 1) proficiency status (i.e., since it is coded 0, and the intercept represents the individual's status when the other predictors are 0). This is the most common type of coding for models involving individual change.

There are several important steps that must be specified in conducting the analysis. Users identify the type of outcome and appropriate link function, define the regression model, select the correlation structure between repeated measures, and select either model-based or robust standard errors.

There are a number of different ways to notate the models. We will let $Y_{it}$ be the dichotomous response at time $t$ ($t = 1, 2, \ldots, T$) for individual $i$ ($i = 1, 2, \ldots, N$), where we assume the observations of different individuals are independent, but we allow for an association between the repeated measures for the same subject. This will allow us later in the chapter to add the subscript $j$ to define random effects of individuals nested within groups such as classrooms or schools. We assume the following marginal regression model for the expected value of $Y_{it}$:

$g(E[Y_{it}]) = x_{it}'\beta$

where $x_{it}$ is a $(p + 1) \times 1$ vector of covariates (the prime denotes its transpose) for the $i$th subject on the $t$th measurement occasion ($t = 1, 2, \ldots, T$), $\beta$ represents the corresponding regression parameters, and $g()$ refers to one of several corresponding link functions, depending on the measurement of Y. This suggests that the data can be summarized by the vector $y_i$ and the matrix $X_i$. The slope can be interpreted as the rate of change in the population-averaged $Y_i$ with $X_i$ (Zeger et al., 1988). Typically, the parameters $\beta$ are constant for all $t$ (Ziegler et al., 1998).
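The vertical (long) layout described earlier, with one line per subject per occasion and Time coded 0 through 3, can be sketched as follows. This is a minimal illustration, not the procedure used in the text: the two records are made up, and the wide-format column names `readprof1` to `readprof4` are hypothetical.

```python
# Two made-up subjects in wide format: one row per subject,
# one column per measurement occasion (column names are hypothetical).
wide = [
    {"id": 1, "readprof1": 1, "readprof2": 1, "readprof3": 0, "readprof4": 1},
    {"id": 2, "readprof1": 0, "readprof2": 0, "readprof3": 1, "readprof4": 1},
]


def to_long(rows, n_occasions=4, stem="readprof"):
    """Restructure wide rows into one row per subject per occasion.

    Time is coded 0, 1, ..., n_occasions - 1 so that the model
    intercept refers to the first measurement occasion.
    """
    long_rows = []
    for row in rows:
        for t in range(n_occasions):
            long_rows.append(
                {"id": row["id"], "time": t, "readprof": row[f"{stem}{t + 1}"]}
            )
    return long_rows


long_rows = to_long(wide)
for r in long_rows:
    print(r)
```

Each subject now contributes four lines, and the `id` column is what identifies the repeated measures as belonging to the same individual.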
Where the data are dichotomous, the marginal mean (a probability) is most commonly modeled via the logit link (i.e., whether a child is proficient or not at time $t$). The coefficients are then interpreted as log odds. For the Bernoulli case (i.e., where the number of trials is 1), Y has a binomial distribution with probability of success $\pi$ and variance $\pi(1 - \pi)$. For binary data with the logit link function, we have the familiar

$\eta = \log[\pi/(1 - \pi)] = x'\beta$

where $\eta$ is the underlying transformed predictor of Y, in this case, the log of the odds $\pi/(1 - \pi)$. It should again be noted that the model represents a ratio of the probability of the event coded 1 occurring versus the probability of the event coded 0 occurring at a particular time point. There is no residual variance parameter ($\varepsilon_i$), as the variance is a function of the expected value $\pi$ and therefore cannot be uniquely defined.
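As a small illustration of the logit link and the Bernoulli variance just described, the sketch below converts a probability to log odds and back and computes $\pi(1 - \pi)$. The helper names are illustrative, and the probability 0.68 is used simply because it is the overall proportion proficient mentioned in these notes.

```python
import math


def logit(p):
    """Link function: probability -> log odds."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse link: log odds -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def bernoulli_variance(p):
    """Variance of a 0/1 outcome; determined entirely by the mean."""
    return p * (1.0 - p)


p = 0.68
eta = logit(p)

print("log odds:", round(eta, 3))
print("back-transformed probability:", round(inv_logit(eta), 3))
print("variance:", round(bernoulli_variance(p), 4))
```

Because the variance is fixed once the mean is known, there is nothing left over for a separate residual variance parameter to capture.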

In the first model, we specify the repeated measures outcomes in two parameters that describe the intercept and time-related slope as follows:

$\log\left(\frac{\pi_{it}}{1 - \pi_{it}}\right) = \beta_0 + \beta_1(\text{time}_{it})$

where time is coded to indicate the interval between successive measurements, $\beta_0$ is an intercept, and $\beta_1$ describes the rate of change on a logit scale in the fraction of positive responses in the population of subjects per unit of time, rather than the typical change for an individual subject. As the above equation suggests, $\beta_0$ is the log odds of response when time is 0 (i.e., initial status). In this case, $\beta_1$ is the change in log odds associated with a one-year interval. The model assumes there are no between-subject random effects; therefore, there are two parameters to estimate. Since this is a single-level model, for convenience we'll drop the subscripts referring to the predictors.

Correlation Structures Between Repeated Measures

It is possible to specify several different types of correlation structures to describe the within-subject dependencies over time. Because the correct structure is rarely known ahead of time, and different choices can make some difference in the model's parameter estimates, the structure is chosen to improve efficiency. It often takes a bit of preliminary work to determine the optimal working correlation matrix for a particular data structure. Examples of GEE correlation/covariance structure specifications include independence, exchangeable, autoregressive, stationary m-dependent, and unstructured. The independence matrix assumes that the repeated measurements are uncorrelated; however, this will not be the case in most instances. Generally, in longitudinal models the successive measurements are correlated at least to some extent.
An exchangeable (or compound symmetry) covariance (or correlation) matrix assumes homogeneous correlations between elements; that is, the correlations are assumed to be the same over time. This can sometimes be difficult to support in a longitudinal study. The autoregressive, or AR(1), matrix assumes the repeated measures have a first-order autoregressive structure. This implies that the correlation between any two adjacent elements is equal to $\rho$ (rho), to $\rho^2$ for elements separated by a third, and so on, with $\rho$ constrained such that $-1 < \rho < 1$. An m-dependent matrix assumes consecutive measurements have a common correlation coefficient, pairs of measurements separated by a third have a common correlation coefficient, and so on, through pairs of measurements separated by $m - 1$ other measurements. Measurements with greater separation are assumed to be uncorrelated. When choosing this structure, specify a value of $m$ less than the order of the working correlation matrix. Where measurements are not evenly spaced, it may be reasonable to consider a model where the correlation is a function of the time between observations (i.e., m-dependent or autoregressive).
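To make these working correlation structures concrete, the sketch below builds each one as a small matrix for four measurement occasions. It is only an illustration: the function names and the correlation values passed in are hypothetical, and no estimation is performed.

```python
def independence(n):
    """Repeated measures assumed uncorrelated: identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def exchangeable(n, rho):
    """Same correlation rho between every pair of occasions."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """First-order autoregressive: correlation rho ** lag."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def m_dependent(n, rhos):
    """One correlation per lag up to m = len(rhos); zero beyond lag m."""
    m = len(rhos)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            lag = abs(i - j)
            if lag == 0:
                row.append(1.0)
            elif lag <= m:
                row.append(rhos[lag - 1])
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


# Hypothetical correlation values for four occasions.
for name, mat in [
    ("independence", independence(4)),
    ("exchangeable", exchangeable(4, 0.3)),
    ("AR(1)", ar1(4, 0.5)),
    ("2-dependent", m_dependent(4, [0.5, 0.2])),
]:
    print(name)
    for row in mat:
        print("  ", row)
```

Note how the AR(1) correlations decay with the lag, while the 2-dependent matrix drops to exactly zero for occasions more than two steps apart.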

Finally, an unstructured correlation (or covariance) matrix provides a separate coefficient for each covariance. As with cross-sectional models, we have found that model estimates can vary slightly according to the matrix structure specified.

Standard Errors and Estimation

Model-based standard errors are based on the correlational structure chosen. Hence, they may be inconsistent if the correlation structure is incorrectly specified. They are usually a little smaller than the robust standard errors (SEs). For smaller numbers of clusters, model-based SEs are generally preferred over robust SEs. In contrast, robust standard errors vary only slightly depending on the choice of hypothesized correlational structure among the repeated measures; that is, the estimates are consistent even if the correlational structure is specified incorrectly. The robust SE approach uses a sandwich estimator based on an approximation to maximum likelihood. Because of this, there can be occasions when one approach will converge and the other may not. Robust standard errors are often preferred when the number of clustered observations is large. We will estimate our models in this example using robust standard errors since we have a considerable amount of data. Once again, we note that users should keep in mind that GEE uses a type of quasi-likelihood estimation (as opposed to full information ML), which can make direct model comparison based on fit statistics that depend on the real likelihood (e.g., deviance, AIC, BIC) not very accurate (Hox, 2010).

Table 1. Model Information
Dependent Variable: readprof (a)
Probability Distribution: Binomial
Link Function: Logit
Subject Effect 1: Id
Within-Subject Effect 1: Time
Working Correlation Matrix Structure: Exchangeable
a. The procedure models 1 as the response, treating 0 as the reference category.
Table 1 provides information about how the model is defined (e.g., probability distribution and link function, number of effects in the model, type of correlation matrix used to describe within-subject structure). As the output shows, the distribution is binomial and a logit link function is used to transform Y. The working correlation structure is exchangeable, which is the same as compound symmetry. This implies that the correlations are the same over each time interval. We can subsequently investigate whether this is a viable assumption for these data. Next, we can observe how many of the total cases for the dependent variable (reading proficiency) are coded 1 (proficient) versus 0 (not proficient). As Table 2 suggests, across the four time periods, an average of 68% of the individuals were proficient and 32% were not.

Table 2. Reading Proficiency Information
Columns: N, Percent. Rows: Dependent Variable readprof (0, 1), Total. [Cell values not preserved in this copy.]

If we did not include the time variable, the log odds intercept would be approximately 0.755 (not tabled), which would be the grand mean log odds coefficient across the four time periods. We can translate the corresponding odds back to the predicted population probability that Y = 1 [odds/(1 + odds)], which would be 2.128/3.128, or 0.680, which fits with the Table 2 estimate.

Next in Table 3 are the fixed-effect results for the intercept and the time-related predictor. The estimated intercept log odds coefficient is 0.838, which, because of the coding of the time variable (i.e., 0, 1, 2, 3), can be interpreted as the log odds of being proficient at the start of the study. The intercept represents the predicted log odds when any variables in the model are 0. If we exponentiate the log odds, we obtain the corresponding odds of approximately 2.31. This suggests individuals are almost 2.3 times more likely to be proficient than non-proficient at the beginning of the study (.70/.30 ≈ 2.3).

Table 3. Parameter Estimates
Columns: Parameter, B, Std. Error, 95% Wald Confidence Interval (Lower, Upper), Hypothesis Test (Wald Chi-Square, df, Sig.), Exp(B), 95% Wald Confidence Interval for Exp(B) (Lower, Upper). Rows: (Intercept), Time, (Scale) 1. [Cell values not preserved in this copy.]
Dependent Variable: readprof
Model: (Intercept), time

Regarding the time variable, the coefficient suggests that over each interval students' likelihood of being proficient decreases significantly (log odds = -0.055, p < .001). We can translate this into a predicted probability by adding it to the intercept. Initially (i.e., at time = 0), the log odds of being proficient is 0.838. For the second interval (time = 1), the estimated log odds will then be 0.838 + (-0.055) = 0.783.
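The arithmetic of adding the time coefficient to the intercept and converting the result to a probability can be sketched as follows. The two coefficients (0.838 and -0.055) are the rounded values quoted in the text; the variable names are illustrative, and results computed from rounded coefficients may differ slightly in the third decimal from the software output.

```python
import math

intercept = 0.838   # log odds of being proficient at time = 0 (from the text)
time_slope = -0.055  # change in log odds per interval (from the text)

log_odds = []
probs = []
for t in range(4):
    eta = intercept + time_slope * t      # linear predictor on the logit scale
    p = 1.0 / (1.0 + math.exp(-eta))      # inverse logit
    log_odds.append(eta)
    probs.append(p)
    print(f"time={t}  log odds={eta:.3f}  probability={p:.3f}")
```

Under this linear specification the predicted probability declines by a similar amount at every interval, which is what makes it a "linear trend" on the logit scale.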
We could then estimate the new probability as 0.69, which is obtained as follows: $1/[1 + e^{-(0.783)}]$, which reduces to 1/1.457. Note this estimate is slightly different from the actual observed probability in Table 4 below, since there was no actual change that took place between time 0 and time 1. The odds ratio suggests the odds of being proficient are multiplied by .947 (or reduced by 5.3%) over the first interval. We can see that in this situation an assumed negative linear time trend in the probability of being proficient does not quite fit the data optimally.

Table 4. Proportion of proficient students (readprof)
Columns: Time, Mean, N, Std. Deviation. Rows: Time 0 through Time 3, Total. [Cell values not preserved in this copy.]

In this case, we might decide to code the data somewhat differently to obtain results that model the trend a bit better. We might wish to treat the time-related variable as ordinal (1, 2, ..., C) rather than scale. If we make this change, we will have C - 1 estimates, since one category will serve as the reference group. In this case, we will specify descending for the factor category order so that the first category (Time = 0) will serve as the reference group. This is the same as creating a series of C - 1 dummy variables for a categorical factor and specifying them in the model.

Table 5. Model 1.2 Parameter Estimates
Columns: Parameter, B, Std. Error, Hypothesis Test (Wald Chi-Square, df, Sig.), Exp(B), 95% Wald Confidence Interval for Exp(B) (Lower, Upper). Rows: (Intercept), [time=3], [time=2], [time=1], [time=0] 0 (a), (Scale) 1. [Cell values not preserved in this copy.]
Dependent Variable: readprof
Model: (Intercept), time (ordinal)
a. Set to zero because this parameter is redundant.

The intercept log odds is now 0.840. This is only slightly different from the last table. If we calculate the predicted probability of being proficient initially (Time = 0), we see it will be $1/(1 + e^{-(0.840)})$, or 1/1.432 = 0.698. Note we can also use the odds to estimate the probability (2.315/3.315). This probability is consistent with the observed proportion in Table 4. We can see further that at Time = 1, there was little change in log odds units regarding students' probability of being proficient (log odds = 0.002, p = .904). At Time 2 (log odds of approximately -0.23, p < .001) and Time 3 (log odds = -0.106, p < .001), however, students were significantly lower in probability of being proficient relative to their proficiency status at Time 0.
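The equivalence between treating time as a categorical factor and entering C - 1 dummy variables can be sketched as below. The helper name and the dictionary keys are illustrative only; Time = 0 is used as the reference category to match the text.

```python
def dummy_code(value, levels, reference):
    """Return C - 1 indicator variables for a categorical value.

    The reference level gets no indicator of its own; it is the
    category for which every indicator equals 0.
    """
    return {f"time={lvl}": int(value == lvl) for lvl in levels if lvl != reference}


levels = [0, 1, 2, 3]
for t in levels:
    print(t, dummy_code(t, levels, reference=0))
```

With four occasions this yields three indicators, so the intercept carries the log odds at Time 0 and each indicator's coefficient is a contrast with Time 0.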
Regarding the odds ratios (OR), we can interpret the nonsignificant relationship at Time = 1 as indicating there was no significant change in the odds of being proficient at Time 1 (OR = 1.002, p = .904). In contrast, the odds of being proficient at Time = 2 versus Time 0 are multiplied by 0.795 (or reduced by 20.5%) compared to the initial level. At Time = 3, the odds of being proficient versus Time 0 (i.e., the initial status intercept) are multiplied by 0.899 (or reduced by about 10%).

We can estimate the probability of being proficient at Time 3 in several ways. We can add the two log odds coefficients (0.840 + (-0.106) = 0.734). This will provide the log odds of being proficient at Time 3. The exponentiated slope can be interpreted as the change in the odds that Y = 1 relative to the reference category (i.e., Time 0). If we exponentiate the log odds ($e^{0.734}$), we obtain odds of 2.08. We can then calculate the probability of being proficient at Time 3 as 2.08/3.08 = 0.675, which is consistent with the observed proportion in Table 4. Alternatively, we can represent the new odds as a product (2.315 * 0.899 = 2.08); that is, we multiply the odds at Time 0 by the odds ratio comparing Time 3 with Time 0 (0.899), which provides the new odds (2.08) and leads to the same probability. Applying this approach for Time = 2, we have 2.315 * 0.795 = 1.840, which gives a probability of 1.84/2.84 = 0.648. This estimate of the probability that Y = 1 is consistent with the observed proportion in Table 4. We can see that defining the time trend as categorical in this instance provides some benefits in representing more accurately the change in the probability of being proficient that takes place between each measurement.

Adding a Predictor

We can next add one or more between-subjects predictors, but the outcome parameters are treated as fixed; that is, the slopes cannot vary across individuals in the sample. We provide an example where we add gender (female coded 1; male coded 0) to the model. We can define this model as follows:

$\log[\pi/(1 - \pi)] = \beta_0 + \beta_1\,\text{time} + \beta_2\,\text{female}$

We will do this one in class and compare time defined as interval and ordinal.
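The two routes to a predicted probability under the categorical-time model (multiplying odds, or adding log odds and exponentiating) can be checked with a short sketch. The baseline odds (2.315) and the odds ratios (1.002, 0.795, 0.899) are the rounded values quoted in the text; the variable names are illustrative.

```python
import math

baseline_odds = 2.315  # odds of being proficient at Time 0 (from the text)
odds_ratio = {0: 1.0, 1: 1.002, 2: 0.795, 3: 0.899}  # versus Time 0

probs = {}
for t, ratio in odds_ratio.items():
    # Route 1: multiply the baseline odds by the odds ratio.
    odds = baseline_odds * ratio
    # Route 2: add the log odds, then exponentiate.
    odds_via_logs = math.exp(math.log(baseline_odds) + math.log(ratio))
    assert math.isclose(odds, odds_via_logs)
    probs[t] = odds / (1.0 + odds)
    print(f"Time {t}: odds={odds:.3f}  probability={probs[t]:.3f}")
```

Both routes agree because exponentiating a sum of log odds is the same as multiplying the corresponding odds.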
References

Horton, N. J., & Lipsitz, S. R. (1999). Review of software to fit Generalized Estimating Equation (GEE) regression models. The American Statistician, 53.

Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New York: Routledge.

Liang, K.-Y., & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73(1).

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage Publications.

Zeger, S. L., & Liang, K.-Y. (1986). Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 42(1).

Ziegler, A., Kastner, C., & Blettner, M. (1998). The Generalised Estimating Equations: An annotated bibliography. Biometrical Journal, 40(2).
