ST 732, Midterm Solutions Spring 2019


Please sign the following pledge certifying that the work on this test is your own:

I have neither given nor received aid on this test.

Signature:          Printed Name:

There are FOUR questions, most with multiple parts. For each part of each question, please write your answers in the space provided. If you need more space, continue on the back of the page and indicate clearly where on the back you have continued your answer. Scratch paper is available from the instructor; just ask. You are allowed ONE (1) SHEET of NOTES (front and back). Calculators are NOT allowed (you will not need one). NOTHING should be on your desk but this test paper, your one page of notes, and any scratch paper given to you by the instructor.

Points for each part of each problem are given in the left margin. TOTAL POINTS = 100.

If you are asked to provide an expression, you need not carry out the algebra to simplify the expression (unless you want to do so). In all problems, all symbols and notation are defined exactly as they are in the class notes.

NOTE: My answers are MUCH MORE DETAILED than I expected yours to be.

1. An education researcher has conducted a study in children in the fifth grade to evaluate four self-guided instructional programs. 400 fifth grade children were enrolled in the study, and each child was randomly assigned to one of the four programs, 100 children per program. At baseline (week 0), the time (in minutes) it took the child to complete a reasoning task was recorded. Each child then began his/her instructional program. At weeks 2, 4, 6, 8, and 10 following initiation of his/her program, the child repeated the reasoning task, and the completion time was recorded. The data are shown below, with the sample means at baseline and at all subsequent weeks thereafter superimposed (boldface lines).

[Figure: completion time (min) versus week, one panel for each of Programs 1-4.]

The researcher hoped to show that there are differences in effectiveness of the programs, with more effective programs showing a greater decrease in completion time over the study period, so that the pattern of change in completion time is possibly different across programs. She also hypothesized that the rate of change of completion time over the study period may not be constant for at least some of the programs in that the rate of decrease in completion time increases over time. The main goals of the study were thus

(i) To determine if the pattern of change of mean completion time is not the same for all programs

(ii) To determine if the rate of change of mean completion time is not constant for at least one of the programs.
The education researcher hopes to address these and other questions based on the following model, which is popular in the literature in her field, and the standard assumptions made for it:

\[ Y_{hlj} = \mu_{lj} + b_{hl} + e_{hlj} = \mu + \tau_l + \gamma_j + (\tau\gamma)_{lj} + b_{hl} + e_{hlj}, \qquad (1) \]

where $Y_{hlj}$ is the completion time for the $h$th child assigned to the $l$th program at the $j$th week, $j = 1, \ldots, 6$; $l = 1, \ldots, 4$ indexes Programs 1-4, respectively; and the terms on the right hand side of (1) are as defined in the course notes.

Here are the sample covariance matrices $V_l$ and associated correlation matrices $\Gamma_l$ based on the data for each program ($l = 1, 2, 3, 4$).

[The numerical entries of $V_1, \ldots, V_4$ and $\Gamma_1, \ldots, \Gamma_4$, and all entries of the output below other than the p-values shown, are not legible in this copy.]

And here is selected output of an analysis based on (1):

Source          DF   Type III SS   Mean Square   F Value   Pr > F
program                                                    <.0001
Error
week                                                       <.0001
week*program                                               <.0001
Error(week)

Mauchly's Criterion   DF   Chi-Square   Pr > ChiSq
                                        <.0001

week_n represents the nth degree polynomial contrast for week

Contrast Variable: week_1

Source          DF   Type III SS   Mean Square   F Value   Pr > F
Mean                                                       <.0001
program                                                    <.0001
Error

Contrast Variable: week_2

Source          DF   Type III SS   Mean Square   F Value   Pr > F
Mean                                                       <.0001
program                                                    <.0001
Error

Contrast Variables: week_3, week_4, week_5

Source          DF   Type III SS   Mean Square   F Value   Pr > F
Mean
program
Error

[For week_3, week_4, and week_5, the numerical entries, including the p-values, are not legible in this copy.]

Define

\[ M = \begin{pmatrix} \mu_{11} & \mu_{12} & \cdots & \mu_{16} \\ \mu_{21} & \mu_{22} & \cdots & \mu_{26} \\ \mu_{31} & \mu_{32} & \cdots & \mu_{36} \\ \mu_{41} & \mu_{42} & \cdots & \mu_{46} \end{pmatrix}. \]

[5 points] (a) Based on the information you have, do you feel it is possible to obtain reliable inference on the researcher's question (i), to determine if the pattern of change of mean completion time is not the same for all programs, using model (1)? If so, describe how in terms of the matrix $M$, and present a formal statement of the result. If not, explain why not.

The best answer is NO, as it seems that the assumptions required to ensure valid inferences are likely to be violated. If all of the assumptions required to ensure validity of tests based on (1) were satisfied, we would address (i) by the test of parallelism, which in the ANOVA table is given in the week*program row. These assumptions include: (i) the covariance matrix of a data vector is the same for all individuals, regardless of group; and (ii) the common covariance matrix is compound symmetric with the same variance at all time points (although this can be relaxed to a common Type H structure).

It is not clear if the assumption of a common covariance structure for all programs is violated or not, but there are features that might cause some worry. The variances along the diagonals of each sample covariance matrix seem reasonably similar over time, but there is some suggestion that the magnitude of variance might be different, as the variances seem larger for programs 2-4 than they do for program 1. Even if a common covariance structure were reasonable, the assumption of compound symmetry (or at least Type H) seems questionable; for all programs, the sample correlations seem to damp out over time. Moreover, under the assumption of a similar covariance structure for all programs, the test of sphericity under that assumption (Mauchly's Criterion) strongly rejects the null hypothesis that the structure is of Type H. In short, the evidence available does not seem to support the key assumptions needed to validate the usual test of parallelism.

(b) Give an expression in terms of the elements of $M$ that formalizes the researcher's question (ii), to determine if the rate of change of mean completion time is not constant for at least one of the programs, or explain why this is not possible.

This is a simpler version of Problem 3(b) on Homework 1. The relevant null hypothesis is that the rate of change of mean completion time is constant for all programs, so that the means for each program lie on a straight line. Here, the time points are equally spaced, so if the rate of change is constant for all programs, it must be that for each program $l = 1, \ldots, 4$

\[ \mu_{l2} - \mu_{l1} = \mu_{l3} - \mu_{l2} = \mu_{l4} - \mu_{l3} = \mu_{l5} - \mu_{l4} = \mu_{l6} - \mu_{l5}. \]

Thus, we can formalize the researcher's question (ii) by the null hypothesis that this holds for all programs. Most of you expressed this in the form $MU = 0$.

[5 points] (c) Based on the information you have, do you feel you can obtain any reliable insights relevant to question (ii)? If so, explain why and describe how you would do this and state the result(s). If not, explain why not.

This is a more open-ended question, and I gave full credit for any sensible answer. We want to gain some kind of insight on whether or not there is any evidence that the rate of change of mean completion time is not constant for at least one program. Some of you said that this is not possible on grounds that the assumptions of compound symmetry (or at least Type H) are violated.
This is not unreasonable, thinking in terms of the representation in (b). On the other hand, we know that specialized within-unit tests do not require the covariance matrix to be compound symmetric/Type H; these tests are valid as long as the covariance matrix is the same for all programs. We have been provided information for tests corresponding to orthogonal polynomial contrasts, so some of you suggested referring to this. The evidence on whether or not the covariance structure is the same for all groups is inconclusive. If you were willing to accept that there is not sufficient evidence to say it is different, you may have been willing to view the orthogonal polynomial contrast tests as reliable. In this case, there is strong evidence of quadratic effects. The week_2 test of whether or not there is a quadratic component of the relationship over time averaged across groups (Mean) strongly rejects the null hypothesis of no quadratic effect averaged across programs. This could be interpreted as suggesting that the rate of change for at least one program is not constant. The week_2 test of program, which addresses whether or not the quadratic component of the relationship over time is different for the 4 programs, also strongly rejects the null hypothesis that it is the same. This could be interpreted as suggesting that there must be a quadratic effect, and thus nonconstant rate of change, for at least one program, making the quadratic component different. Of course, if you felt the evidence you have casts the assumption of a common covariance structure in sufficient doubt, you may not have been willing to consider these observations reliable.
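The constant-rate condition in part (b) can be checked numerically. The sketch below is only an illustration, not the form required on the exam: it builds one possible $6 \times 4$ matrix $U$ of second differences so that $MU = 0$ exactly when every row of $M$ is linear in the equally spaced weeks. The function name `second_difference_matrix` and the mean profiles are hypothetical.

```python
import numpy as np

def second_difference_matrix(n_times: int) -> np.ndarray:
    """Return the (n_times x (n_times - 2)) matrix U whose k-th column
    computes mu_k - 2*mu_{k+1} + mu_{k+2} when a row of means multiplies U."""
    U = np.zeros((n_times, n_times - 2))
    for k in range(n_times - 2):
        U[k, k] = 1.0
        U[k + 1, k] = -2.0
        U[k + 2, k] = 1.0
    return U

weeks = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])
U = second_difference_matrix(len(weeks))

# Hypothetical mean profiles (illustrative numbers only, not study results).
linear_means = 300.0 - 5.0 * weeks
curved_means = 300.0 - 5.0 * weeks - 0.8 * weeks**2

M_linear = np.vstack([linear_means] * 4)                  # all four programs linear
M_mixed = np.vstack([linear_means] * 3 + [curved_means])  # program 4 curved

print(np.allclose(M_linear @ U, 0.0))  # True: constant rate for every program
print(np.allclose(M_mixed @ U, 0.0))   # False: program 4 has a nonconstant rate
```

Equal adjacent first differences are equivalent to zero second differences, which is why the columns of `U` carry the pattern (1, -2, 1).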

2. Consider the education study in the previous problem. Here are the data again:

[Figure: completion time (min) versus week, one panel for each of Programs 1-4.]

Again, the goals are

(i) To determine if the pattern of change of mean completion time is not the same for all programs

(ii) To determine if the rate of change of mean completion time is not constant for at least one of the programs.

[12 points] (a) Based on all information you have, propose a statistical model different from that in (1) in which both (i) and (ii) can be addressed. Briefly state any assumptions you incorporate in the model. (There is more space on the next page.)

These are clearly population-averaged questions, so you could have either posited a population-averaged model directly or posited a linear mixed effects model to induce a population-averaged model; either is a reasonable approach. A quadratic model seems appropriate given the visual evidence. Here is a population-averaged model in which (i) and (ii) can be addressed. Letting $Y_{ij}$ be completion time at the $j$th week, $t_j = 0, 2, 4, 6, 8, 10$, on the $i$th subject, $j = 1, \ldots, 6$, $i = 1, \ldots, 400$, and $\delta_{il} = 1$ if $i$ is in program $l$ and $= 0$ otherwise, $l = 1, 2, 3, 4$, a plausible model is

\[ Y_{ij} = \beta_0 + (\beta_{11}\delta_{i1} + \beta_{12}\delta_{i2} + \beta_{13}\delta_{i3} + \beta_{14}\delta_{i4})\, t_j + (\beta_{21}\delta_{i1} + \beta_{22}\delta_{i2} + \beta_{23}\delta_{i3} + \beta_{24}\delta_{i4})\, t_j^2 + \epsilon_{ij}, \]

where, because of randomization, we have assumed a common mean response at baseline (you may have allowed this to vary by program as well); and $a_i = (\delta_{i1}, \delta_{i2}, \delta_{i3}, \delta_{i4})^T$. Based on the sample information above, a general assumption is $E(\epsilon_{ij} \mid a_i) = 0$, and, with $\epsilon_i = (\epsilon_{i1}, \ldots, \epsilon_{i6})^T$,

\[ \mathrm{var}(\epsilon_i \mid a_i) = V_l = T_l^{1/2} \Gamma_l T_l^{1/2}, \]

where $l = 1, 2, 3$, or $4$ depending on $a_i$. Here, $\epsilon_{ij}$ represents the aggregate deviation from the mean completion time at week $j$ for $Y_{ij}$ due to among- and within-individual sources.

Based on the observations in the last problem, we might be willing to assume that the overall correlation matrix $\Gamma_l$ is the same for all programs and that the variances $\mathrm{var}(\epsilon_{ij} \mid a_i) = \sigma_l^2$ are constant over time but perhaps different by program (at least for program 1). Under these conditions, we might take $T_l = \sigma_l^2 I_6$, $l = 1, \ldots, 4$, and take $\Gamma_l$ to be an AR(1) correlation matrix with correlation parameter $\alpha$ the same for each program. We might also be willing to make the assumption that $Y_i \mid a_i$ under these conditions is normal with moments implied by these specifications, although that is not absolutely necessary.

(b) For your model in (a), write down a vector $\beta$ that collects all parameters that characterize mean completion time under the 4 programs. Then provide a matrix $L$ such that you can address question (i), to determine if the pattern of change of mean completion time is not the same for all programs over the study period, through an expression of the form $L\beta$.

Here, $\beta = (\beta_0, \beta_{11}, \beta_{12}, \beta_{13}, \beta_{14}, \beta_{21}, \beta_{22}, \beta_{23}, \beta_{24})^T$. The pattern of change of mean completion time would be the same for all programs over the study period if both linear and quadratic coefficients of week are the same for all programs; i.e., if

\[ \beta_{11} = \beta_{12} = \beta_{13} = \beta_{14} \quad \text{and} \quad \beta_{21} = \beta_{22} = \beta_{23} = \beta_{24}. \]

Thus, an appropriate $L$ matrix is, for example,

\[ L = \begin{pmatrix} 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & -1 \end{pmatrix}. \]

The question then can be addressed by testing the null hypothesis that $L\beta = 0$.
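The covariance specification suggested at the end of part (a), $V_l = T_l^{1/2}\Gamma T_l^{1/2}$ with $T_l = \sigma_l^2 I_6$ and a common AR(1) correlation matrix, can be written out concretely. As a rough sketch only, the code below builds these matrices for the six equally spaced weeks; the function name `ar1_covariance` and the values of $\alpha$ and $\sigma_l$ are made up for illustration and are not estimates from the study.

```python
import numpy as np

def ar1_covariance(sigma: float, alpha: float, n_times: int = 6) -> np.ndarray:
    """V = T^{1/2} Gamma T^{1/2} with T = sigma^2 * I and
    Gamma[j, k] = alpha ** |j - k| (AR(1), equally spaced times)."""
    idx = np.arange(n_times)
    gamma = alpha ** np.abs(idx[:, None] - idx[None, :])
    t_half = sigma * np.eye(n_times)  # square root of sigma^2 * I
    return t_half @ gamma @ t_half

# Hypothetical values: a common correlation parameter and a smaller
# standard deviation for program 1 than for programs 2-4.
alpha = 0.7
sigmas = {1: 20.0, 2: 30.0, 3: 30.0, 4: 30.0}
V = {prog: ar1_covariance(s, alpha) for prog, s in sigmas.items()}

print(np.round(V[1][0], 2))  # first row: sigma_1^2 * alpha^k for lags k = 0, ..., 5
```

Under AR(1) the correlation decays geometrically with the lag, which is one way to represent sample correlations that damp out over time.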

[5 points] (c) In terms of $\beta$ you defined in (b), provide a matrix $L$ that allows you to address question (ii), to determine if the rate of change of mean completion time is not constant for at least one of the programs, through an expression of the form $L\beta$.

The rate of change of mean completion time will be constant for all programs if $\beta_{21} = \beta_{22} = \beta_{23} = \beta_{24} = 0$. Thus, an appropriate $L$ matrix to address this question is

\[ L = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}. \]

The question then can be addressed by testing the null hypothesis that $L\beta = 0$.

(d) In terms of $\beta$ you defined in (b), provide a matrix $L$ that allows you to represent the rate of change in mean completion time for Program 4 at the midpoint of the study (week 5) in terms of an expression of the form $L\beta$.

The rate of change of mean completion time for any program is the derivative of the mean model with respect to time. For program 4, the rate of change at any time $t$ is thus $\beta_{14} + 2\beta_{24} t$. The rate of change at week $t = 5$ is thus $\beta_{14} + 10\beta_{24}$, which can be represented as $L\beta$ with

\[ L = (0, 0, 0, 0, 1, 0, 0, 0, 10). \]
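The contrast matrices in parts (b) through (d) can be assembled and sanity-checked in a few lines. This is a minimal sketch under the parameter ordering $\beta = (\beta_0, \beta_{11}, \ldots, \beta_{14}, \beta_{21}, \ldots, \beta_{24})^T$; the variable names and the parameter values are hypothetical, and the program-1-versus-others form used for (b) is just one of several equivalent choices.

```python
import numpy as np

# Parameter order: (b0, b11, b12, b13, b14, b21, b22, b23, b24).
# Program 1 minus each other program: rows (1,-1,0,0), (1,0,-1,0), (1,0,0,-1).
diffs = np.hstack([np.ones((3, 1)), -np.eye(3)])

# (b): equal linear AND equal quadratic coefficients across programs.
L_parallel = np.zeros((6, 9))
L_parallel[:3, 1:5] = diffs
L_parallel[3:, 5:9] = diffs

# (c): all four quadratic coefficients equal zero.
L_constant_rate = np.hstack([np.zeros((4, 5)), np.eye(4)])

# (d): derivative of the Program 4 mean at week 5 is b14 + 2 * 5 * b24.
L_rate_p4_week5 = np.zeros((1, 9))
L_rate_p4_week5[0, 4] = 1.0
L_rate_p4_week5[0, 8] = 10.0

# Hypothetical parameter values, for illustration only.
beta = np.array([300.0, -2.0, -4.0, -4.0, -6.0, 0.0, -0.1, -0.1, -0.5])

print(L_parallel @ beta)       # nonzero entries: patterns differ across programs
print(L_constant_rate @ beta)  # nonzero entries: some program has curvature
print(L_rate_p4_week5 @ beta)  # [-11.]: Program 4 rate of change at week 5
```

Each row of an $L$ matrix encodes one linear restriction, so the number of rows matches the degrees of freedom of the corresponding test as long as the rows are linearly independent.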

3. The data shown below are from a study in 180 male subjects who had coronary artery bypass graft (CABG) surgery in the past year. Recent research suggests that such subjects can benefit from lowering their low-density lipoprotein (LDL) cholesterol levels to no more than 70 mg/dl and ideally to 40 mg/dl or below; LDL levels of more than 100 mg/dl are considered unacceptably high in such patients. All subjects in this study had baseline LDL levels of at least 110 mg/dl. Subjects were randomly assigned with equal probability to receive one of three treatment regimens: a standard dose of a popular statin drug, torsuvastatin; a very high dose of torsuvastatin; or a very powerful injectable agent, Trupatha. LDL cholesterol levels were to be measured at baseline (month 0), prior to initiation of study treatment, and then at months 0.5, 1, 2, 3, and 6 thereafter. Recorded for each subject was an indicator of whether or not the subject's body mass index (BMI) was 25 or more at baseline (0 = BMI < 25, 1 = BMI ≥ 25). Individuals with BMI ≥ 25 are considered to be overweight. Also recorded for each subject was an indicator of whether or not the subject suffered from hypertension (0 = no, 1 = yes). Here are the data, with a loess smooth superimposed on each plot.

[Figure: LDL cholesterol (mg/dl) versus month, one panel for each regimen (Standard Dose, High Dose, Injectable).]

As is often the case, many participants dropped out of the study before completion: although all subjects have the baseline and 0.5 month LDL measurements, only 164, 136, 124, and 101 returned at 1, 2, 3, and 6 months, respectively.

The investigators had the following questions:

(i) Is mean baseline LDL cholesterol level associated with being overweight (BMI ≥ 25) and/or suffering from hypertension?

(ii) Is the typical or mean pattern of change of LDL cholesterol not the same for at least one of the three treatment regimens among overweight subjects (BMI ≥ 25)? Among normal weight subjects (BMI < 25)?

(iii) Is mean LDL cholesterol at 6 months not the same for at least one of the three treatment regimens for overweight subjects (BMI ≥ 25) in this population? For normal weight subjects (BMI < 25)?

[12 points] (a) Can you propose a statistical model in which all of questions (i)-(iii) can be addressed? If so, write down the model and briefly state any assumptions you incorporate in the model. If not, state why not, and write down a model in which at least one of the three questions can be addressed (state which one(s)). Describe (briefly) any assumptions you incorporate in the model.

Some of these questions are clearly subject-specific and some are population-averaged, so a linear mixed effects model, which allows both types of questions to be addressed, is the way to go. Letting $Y_{ij}$ be the LDL cholesterol measurement on subject $i$ at the $j$th time $t_{ij}$, $j = 1, \ldots, n_i$ (different for each $i$ due to possible missing observations/dropout), it is natural from the plot to model individual-specific trajectories as straight lines, i.e., take the individual-level model to be

\[ Y_{ij} = \beta_{0i} + \beta_{1i} t_{ij} + e_{ij}, \quad \beta_i = (\beta_{0i}, \beta_{1i})^T, \]

which we could also write as $Y_i = C_i \beta_i + e_i$ as in the notes. Define $\delta_{il} = 1$ if $i$ was randomized to regimen $l$ and $= 0$ otherwise, $l = 1, 2, 3$, where standard dose = 1, high dose = 2, injectable = 3. Let $o_i = 0$ if $i$ is not overweight (BMI < 25) and $o_i = 1$ if $i$ is overweight (BMI ≥ 25), and let $h_i = 0$ if $i$ does not suffer from hypertension and $h_i = 1$ if he does.
A population model that allows the above questions to be addressed is then

\[ \beta_{0i} = \beta_{00} + \beta_{01} o_i + \beta_{02} h_i + b_{0i}, \]
\[ \beta_{1i} = (\beta_{11} + \beta_{21} o_i)\delta_{i1} + (\beta_{12} + \beta_{22} o_i)\delta_{i2} + (\beta_{13} + \beta_{23} o_i)\delta_{i3} + b_{1i}, \qquad (2) \]

where $b_i = (b_{0i}, b_{1i})^T$ is the vector of individual-specific random effects. We could equally well have parameterized $\beta_{1i}$ as

\[ \beta_{1i} = \{\beta_{11}(1 - o_i) + \beta_{21} o_i\}\delta_{i1} + \{\beta_{12}(1 - o_i) + \beta_{22} o_i\}\delta_{i2} + \{\beta_{13}(1 - o_i) + \beta_{23} o_i\}\delta_{i3} + b_{1i}. \]

These specifications of $\beta_{1i}$ allow the typical or mean pattern of change, which is taken to have a constant rate of change here (from a subject-specific perspective), to be associated with being overweight or not in a way that depends on treatment, which seems necessary from the statement of question (ii). You may have used a fancier or simpler model and parameterized it as above or differently. Some of you also allowed the individual-specific slopes to have mean depending on hypertensive status, which is fine.

Letting $a_i = (\delta_{i1}, \delta_{i2}, \delta_{i3}, o_i, h_i)^T$, we need to make assumptions on $e_i$ and $b_i$, for which we definitely assume $E(e_i \mid a_i) = 0$ and $E(b_i \mid a_i) = 0$. To complete the specification, in particular of the forms of $\mathrm{var}(e_i \mid a_i) = R_i(\gamma)$ and $\mathrm{var}(b_i \mid a_i)$, it would be great to be able to see fits under different choices for these (and the associated AIC and BIC values) and to see residual plots, which would also give us an informal assessment of whether or not $e_i \mid a_i$ and $b_i \mid a_i$ are approximately normal. The default specification is of course that $\mathrm{var}(e_i \mid a_i) = \sigma^2 I_{n_i}$ and $\mathrm{var}(b_i \mid a_i) = D$ $(2 \times 2)$, but these could be relaxed to allow $\sigma^2$ and $D$ to differ by group and/or to allow $\mathrm{var}(e_{ij} \mid a_i)$ to change over time and for within-individual correlation. You may have written down a different model, which for most of you was fine; the key is that your model allows all the questions of interest above to be addressed.

[5 points] (b) The investigators tell you that the primary reason that subjects dropped out of the study was either (i) because their physicians felt that their LDL levels up to the point of dropout were not lowering fast enough or (ii) because their LDL levels up to the point of dropout were lowering so dramatically that the subject felt it was unnecessary to continue in the study. Would you feel comfortable proceeding with standard analysis using your model under these conditions? If so, explain briefly how you would conduct the analysis. If not, explain why not.

It sounds like the assumption that the dropout mechanism is missing at random (MAR) may be reasonable here; apparently subjects and physicians were making the decision on whether or not a subject should drop out on the basis of his evolving, observed LDL measurements. Given this, I would feel comfortable proceeding with a standard analysis using maximum likelihood methods as long as I was willing to believe that my model is correctly specified and that the distributions of both $e_i \mid a_i$ and $b_i \mid a_i$ are mean-zero normal with correct specifications of the corresponding covariance matrices. Under these conditions, I'd be comfortable that maximum likelihood yields valid inferences (according to the ignorability argument in Section 5.6). I'd also ideally like to be able to use the observed information matrix from this analysis to obtain (model-based) standard errors and tests, to ensure that the uncertainty is taken into appropriate account.
(c) In terms of your model in (a), show how you would address question (i) (is mean baseline LDL level associated with being overweight and/or suffering from hypertension?). If you cannot, state why not.

Mean baseline LDL level is not associated with either being overweight or suffering from hypertension if $\beta_{01} = \beta_{02} = 0$. Thus, I would test this null hypothesis against the alternative that at least one of $\beta_{01}$ or $\beta_{02}$ is different from zero.

(d) In terms of your model in (a), show how you would address the second part of question (ii) (Is the typical/mean pattern of change of LDL level not the same for at least one of the three regimens among normal weight patients?) If you cannot, state why not.

According to the model in (a), the typical/mean pattern of change of LDL level for normal weight patients ($o_i = 0$) is $\beta_{11}\delta_{i1} + \beta_{12}\delta_{i2} + \beta_{13}\delta_{i3}$, depending on which regimen an individual received. If this pattern is the same for all regimens, it must be that $\beta_{11} = \beta_{12} = \beta_{13}$. I would thus test this null hypothesis against the alternative that at least one of $\beta_{11}$, $\beta_{12}$, or $\beta_{13}$ is different from the others.

(e) Show how you would use your model to estimate the variation in subject-specific baseline LDL levels for subjects who are overweight and suffer from hypertension, or explain why you cannot do this.

For the purpose of this problem, I will assume that $\mathrm{var}(b_i \mid a_i) = D$, so that among-individual variation and correlation is the same regardless of weight and hypertensive status. This is a $(2 \times 2)$ matrix. The upper left diagonal element $D_{11}$ reflects the variance in subject-specific baseline LDL levels for subjects of any type in this case. Thus, to estimate variation in subject-specific baseline LDL levels for subjects who are overweight and suffer from hypertension, I would report the estimate of $D_{11}$. If you allowed the $D$ matrix to be different depending on being overweight and/or hypertensive, your answer was graded accordingly.
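Under the default specification in (a), model (2) implies $E(Y_i \mid a_i) = C_i E(\beta_i \mid a_i)$ and $\mathrm{var}(Y_i \mid a_i) = C_i D C_i^T + \sigma^2 I_{n_i}$. The following sketch, with a hypothetical helper `implied_moments` and made-up parameter values, computes these for a subject who dropped out early, and shows where $D_{11}$ from part (e) enters.

```python
import numpy as np

def implied_moments(times, regimen, overweight, hypertensive, beta, D, sigma2):
    """Marginal mean and covariance of Y_i implied by the two-stage model.

    beta is ordered (b00, b01, b02, b11, b12, b13, b21, b22, b23);
    regimen is 1, 2, or 3; overweight and hypertensive are 0/1 indicators.
    """
    times = np.asarray(times, dtype=float)
    C = np.column_stack([np.ones_like(times), times])  # rows (1, t_ij)
    b00, b01, b02 = beta[:3]
    slope_normal = beta[3:6][regimen - 1]  # beta_{1l}
    slope_extra = beta[6:9][regimen - 1]   # beta_{2l}, added when overweight
    mean_intercept = b00 + b01 * overweight + b02 * hypertensive
    mean_slope = slope_normal + slope_extra * overweight
    mean = C @ np.array([mean_intercept, mean_slope])
    cov = C @ D @ C.T + sigma2 * np.eye(len(times))
    return mean, cov

# Hypothetical values, for illustration only (not estimates from the study).
beta = np.array([130.0, 10.0, 5.0, -3.0, -6.0, -12.0, 1.0, 1.5, 2.0])
D = np.array([[100.0, -2.0], [-2.0, 4.0]])
sigma2 = 25.0

# An overweight, hypertensive subject on the injectable regimen who
# dropped out after month 2, so n_i = 4.
mean, cov = implied_moments([0, 0.5, 1, 2], regimen=3, overweight=1,
                            hypertensive=1, beta=beta, D=D, sigma2=sigma2)
print(mean)
print(cov[0, 0] - sigma2)  # equals D[0, 0], the variance of subject-specific baselines
```

Because $C_i$ simply has fewer rows for a subject who drops out, the same $\beta$, $D$, and $\sigma^2$ describe every subject regardless of $n_i$.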

4. (a) Suppose that the outcome $Y$ is continuous, and consider the model

\[ Y_{ij} = \beta_{0i}\{1 - \exp(-\beta_{1i} t_{ij})\} + e_{ij}, \quad i = 1, \ldots, m, \; j = 1, \ldots, n, \]
\[ \beta_{0i} = \beta_0 + b_{0i}, \quad \beta_{1i} = \beta_1 + b_{1i}, \quad \beta_0 > 0, \; \beta_1 > 0, \qquad (3) \]

where $b_i = (b_{0i}, b_{1i})^T$ are independent for all $i$, $e_{ij}$ are independent for all $i, j$, and $b_i$ and $e_{ij}$ are independent of one another for all $i, j$, with $b_i \sim N(0, D)$, $D$ $(2 \times 2)$, and $e_{ij} \sim N(0, \sigma^2)$. Dick refers to $\beta_0$ in (3) as the saturation value characterizing the mean outcome in the population, and Jane refers to $\beta_0$ as the mean saturation value among individuals in the population. Who is correct, Dick or Jane? Explain (briefly) your answer.

It is immediate that Jane is correct from the population model specified; individual-specific $\beta_{0i}$ in this model vary about the mean saturation value $\beta_0$ in the population. Dick is suggesting that $\beta_0$ can be interpreted as the saturation value characterizing (there are no covariates) the population mean outcome. This population mean outcome is

\[ E(Y_{ij}) = E[\beta_{0i}\{1 - \exp(-\beta_{1i} t_{ij})\}] \qquad (4) \]
\[ = \beta_0 E[1 - \exp\{-(\beta_1 + b_{1i}) t_{ij}\}] + E(b_{0i}[1 - \exp\{-(\beta_1 + b_{1i}) t_{ij}\}]). \qquad (5) \]

If you answered that Dick is not correct, you are almost certainly right. For $\beta_0$ to be the saturation value of this population mean, the limit of the above expression as $t_{ij} \to \infty$ must be $\beta_0$. This does not seem likely, and simply saying this would suffice. It turns out (I did not expect you to show this!) that the limit is indeed not $\beta_0$. From Problem 2 of Homework 1, we know that

\[ E[1 - \exp\{-(\beta_1 + b_{1i}) t_{ij}\}] = 1 - \exp(-\beta_1 t_{ij} + D_{22} t_{ij}^2/2). \]

Thus, the first term in the above expression is $\beta_0\{1 - \exp(-\beta_1 t_{ij} + D_{22} t_{ij}^2/2)\}$. At first glance, one might think that the second term in this expression is equal to zero because $E(b_{0i}) = 0$. However, as long as the matrix $D$ is not diagonal, so that $b_{0i}$ and $b_{1i}$ are not independent, the second term is not necessarily equal to zero.
It turns out that one can evaluate the second term (it's a bit involved) and as a result show that in fact

\[ E(Y_{ij}) = \beta_0 + \exp(-\beta_1 t_{ij} + D_{22} t_{ij}^2/2)(D_{12} t_{ij} - \beta_0), \]

which clearly does not approach $\beta_0$ as $t_{ij} \to \infty$.
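The closed-form population mean can be compared with a direct simulation. The sketch below is an illustration under assumed values only: the function names and the choices of $\beta_0$, $\beta_1$, and $D$ are hypothetical. It draws $b_i \sim N(0, D)$, averages $\beta_{0i}\{1 - \exp(-\beta_{1i} t)\}$ over the draws, and prints the result next to $\beta_0 + \exp(-\beta_1 t + D_{22} t^2/2)(D_{12} t - \beta_0)$.

```python
import numpy as np

def closed_form_mean(beta0, beta1, D, t):
    """beta0 + exp(-beta1*t + D22*t^2/2) * (D12*t - beta0)."""
    return beta0 + np.exp(-beta1 * t + D[1, 1] * t**2 / 2.0) * (D[0, 1] * t - beta0)

def monte_carlo_mean(beta0, beta1, D, t, n_draws=400_000, seed=0):
    """Average of beta_0i * {1 - exp(-beta_1i * t)} over simulated individuals."""
    rng = np.random.default_rng(seed)
    b = rng.multivariate_normal(np.zeros(2), D, size=n_draws)
    beta0_i = beta0 + b[:, 0]
    beta1_i = beta1 + b[:, 1]
    return float(np.mean(beta0_i * (1.0 - np.exp(-beta1_i * t))))

# Hypothetical values, for illustration only.
beta0, beta1 = 10.0, 0.5
D = np.array([[1.0, 0.1], [0.1, 0.04]])

for t in (1.0, 2.0, 4.0):
    print(t, closed_form_mean(beta0, beta1, D, t), monte_carlo_mean(beta0, beta1, D, t))
```

For large $t$ the term $D_{22} t^2/2$ dominates the exponent whenever $D_{22} > 0$, so the closed form moves away from $\beta_0$ rather than settling at it, consistent with the conclusion that Dick's interpretation fails.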

(b) Consider the model

\[ Y_{ij} = \beta + b_i + e_{ij}, \quad i = 1, \ldots, m, \; j = 1, \ldots, n, \qquad (6) \]

where $m$ is even, and $\delta_i = 0$ for $i = 1, \ldots, m/2$, $\delta_i = 1$ for $i = m/2 + 1, \ldots, m$. In (6), $b_i$ are mutually independent for all $i$ with

\[ E(b_i \mid \delta_i) = 0, \quad \mathrm{var}(b_i \mid \delta_i) = D_0(1 - \delta_i) + D_1 \delta_i, \quad D_0, D_1 > 0; \qquad (7) \]

and $e_{ij}$ are mutually independent for all $i, j$ with

\[ E(e_{ij} \mid \delta_i) = 0, \quad \mathrm{var}(e_{ij} \mid \delta_i) = \sigma_0^2(1 - \delta_i) + \sigma_1^2 \delta_i; \qquad (8) \]

and $b_i$ and $e_{ij}$ are independent of one another for all $i, j$. Under the assumption that $b_i$ and $e_{ij}$ are all normally distributed, provide the simplest expression you can for the maximum likelihood estimator $\hat{\beta}$ for $\beta$, defining any additional notation you may need.

This is immediate from Homework 3, Problem 2. This is the simplest version of a linear mixed effects model but with an among-individual covariance matrix (which is just a scalar variance here) that differs by group. The data are balanced; there are no time points, even, and all individuals have $n$ observations. Thus, we know immediately that the maximum likelihood estimator for $\beta$ is identical to the OLS estimator. Letting $1$ be an $(n \times 1)$ vector of all ones, we can thus immediately write down that

\[ \hat{\beta} = \left( \sum_{i=1}^m 1^T 1 \right)^{-1} \sum_{i=1}^m 1^T Y_i. \]

This answer would suffice. You may have felt compelled to note that this can be simplified to

\[ \hat{\beta} = (mn)^{-1} \sum_{i=1}^m \sum_{j=1}^n Y_{ij} = N^{-1} \sum_{i=1}^m \sum_{j=1}^n Y_{ij} = m^{-1} \sum_{i=1}^m \bar{Y}_i = \bar{Y}, \]

the overall mean of all outcomes.
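The algebraic simplification of the displayed OLS expression to the overall mean can be confirmed on simulated balanced data. This sketch checks only that arithmetic identity; it does not fit the mixed model or verify any likelihood result, and the sample sizes and data-generating values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 4  # hypothetical: m individuals (even), n observations each
Y = rng.normal(loc=50.0, scale=5.0, size=(m, n))  # row i holds the vector Y_i

ones = np.ones(n)

# (sum_i 1'1)^{-1} * sum_i 1'Y_i, written exactly as in the displayed formula.
numerator = sum(ones @ Y[i] for i in range(m))
denominator = sum(ones @ ones for _ in range(m))  # equals m * n
beta_hat = numerator / denominator

grand_mean = Y.mean()                          # (mn)^{-1} * sum_i sum_j Y_ij
mean_of_subject_means = Y.mean(axis=1).mean()  # m^{-1} * sum_i Ybar_i

print(np.isclose(beta_hat, grand_mean), np.isclose(beta_hat, mean_of_subject_means))
```

The mean of the subject means equals the grand mean here only because every individual contributes the same number $n$ of observations.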
