ST 732, Midterm Solutions Spring 2019

Please sign the following pledge certifying that the work on this test is your own:

I have neither given nor received aid on this test.

Signature: Printed Name:

There are FOUR questions, most with multiple parts. For each part of each question, please write your answers in the space provided. If you need more space, continue on the back of the page and indicate clearly where on the back you have continued your answer. Scratch paper is available from the instructor; just ask.

You are allowed ONE (1) SHEET of NOTES (front and back). Calculators are NOT allowed (you will not need one). NOTHING should be on your desk but this test paper, your one page of notes, and any scratch paper given to you by the instructor.

Points for each part of each problem are given in the left margin. TOTAL POINTS = 100.

If you are asked to provide an expression, you need not carry out the algebra to simplify the expression (unless you want to do so).

In all problems, all symbols and notation are defined exactly as they are in the class notes.

NOTE: My answers are MUCH MORE DETAILED than I expected yours to be.

1. An education researcher has conducted a study in children in the fifth grade to evaluate four self-guided instructional programs. 400 fifth grade children were enrolled in the study, and each child was randomly assigned to one of the four programs, 100 children per program. At baseline (week 0), the time (in minutes) it took the child to complete a reasoning task was recorded. Each child then began his/her instructional program. At weeks 2, 4, 6, 8, and 10 following initiation of his/her program, the child repeated the reasoning task, and the completion time was recorded. The data are shown below, with the sample means at baseline and at all subsequent weeks thereafter superimposed (boldface lines).

[Figure: completion time (min) versus week, one panel for each of Programs 1-4, with sample means superimposed in boldface.]

The researcher hoped to show that there are differences in effectiveness of the programs, with more effective programs showing a greater decrease in completion time over the study period, so that the pattern of change in completion time is possibly different across programs. She also hypothesized that the rate of change of completion time over the study period may not be constant for at least some of the programs in that the rate of decrease in completion time increases over time. The main goals of the study were thus

(i) To determine if the pattern of change of mean completion time is not the same for all programs

(ii) To determine if the rate of change of mean completion time is not constant for at least one of the programs.

The education researcher hopes to address this and other questions based on the following model, which is popular in the literature in her field, and the standard assumptions made for it:

$$Y_{h\ell j} = \mu_{\ell j} + b_{h\ell} + e_{h\ell j} = \mu + \tau_\ell + \gamma_j + (\tau\gamma)_{\ell j} + b_{h\ell} + e_{h\ell j}, \tag{1}$$

where $Y_{h\ell j}$ is the completion time for the $h$th child assigned to the $\ell$th program at the $j$th week, $j = 1, \ldots, 6$; $\ell = 1, \ldots, 4$ indexes Programs 1-4, respectively; and the terms on the right hand side of (1) are as defined in the course notes.

Here are the sample covariance matrices $V_\ell$ and associated correlation matrices $\Gamma_\ell$ based on the data for each program ($\ell = 1, 2, 3, 4$):

[Matrices $V_1, \ldots, V_4$ and $\Gamma_1, \ldots, \Gamma_4$: numerical entries not recoverable from the transcription.]

And here is selected output of an analysis based on (1). Only the p-values shown survive; the DF, Type III SS, Mean Square, and F Value columns are not recoverable.

Source          Pr > F
program         <.0001
Error
week            <.0001
week*program    <.0001
Error(week)

Mauchly's Criterion: Pr > ChiSq <.0001

week_n represents the nth degree polynomial contrast for week

Contrast Variable: week_1
Source          Pr > F
Mean            <.0001
program         <.0001
Error

Contrast Variable: week_2
Source          Pr > F
Mean            <.0001
program         <.0001
Error

Contrast Variable: week_3 (rows Mean, program, Error; p-values not recoverable)

Contrast Variable: week_4 (rows Mean, program, Error; p-values not recoverable)

Contrast Variable: week_5 (rows Mean, program, Error; p-values not recoverable)

Define

$$M = \begin{pmatrix} \mu_{11} & \mu_{12} & \cdots & \mu_{16} \\ \mu_{21} & \mu_{22} & \cdots & \mu_{26} \\ \mu_{31} & \mu_{32} & \cdots & \mu_{36} \\ \mu_{41} & \mu_{42} & \cdots & \mu_{46} \end{pmatrix}.$$

[5 points] (a) Based on the information you have, do you feel it is possible to obtain reliable inference on the researcher's question (i), to determine if the pattern of change of mean completion time is not the same for all programs, using model (1)? If so, describe how in terms of the matrix $M$, and present a formal statement of the result. If not, explain why not.

The best answer is NO, as it seems that the assumptions required to ensure valid inferences are likely to be violated. If all of the assumptions required to ensure validity of tests based on (1) were satisfied, we would address (i) by the test of parallelism, which in the ANOVA table is given in the week*program row. These assumptions include: (i) the covariance matrix of a data vector is the same for all individuals, regardless of group; and (ii) the common covariance matrix is compound symmetric with the same variance at all time points (although this can be relaxed to a common Type H structure).

It is not clear if the assumption of a common covariance structure for all programs is violated or not, but there are features that might cause some worry. The variances along the diagonals of each sample covariance matrix seem reasonably similar over time, but there is some suggestion that the magnitude of variance might be different, as the variances seem larger for programs 2-4 than they do for program 1. Even if a common covariance structure were reasonable, the assumption of compound symmetry (or at least Type H) seems questionable; for all programs, the sample correlations seem to damp out over time. Moreover, under the assumption of a similar covariance structure for all programs, the test of sphericity under that assumption (Mauchly's Criterion) strongly rejects the null hypothesis that the structure is of Type H. In short, the evidence available does not seem to support the key assumptions needed to validate the usual test of parallelism.

[5 points] (b) Give an expression in terms of the elements of $M$ that formalizes the researcher's question (ii), to determine if the rate of change of mean completion time is not constant for at least one of the programs, or explain why this is not possible.

This is a simpler version of Problem 3(b) on Homework 1. The relevant null hypothesis is that the rate of change of mean completion time is constant for all programs, so that the means for each program lie on a straight line. Here, the time points are equally spaced, so if the rate of change is constant for all programs, it must be that for each program $\ell = 1, \ldots, 4$

$$\mu_{\ell 2} - \mu_{\ell 1} = \mu_{\ell 3} - \mu_{\ell 2} = \mu_{\ell 4} - \mu_{\ell 3} = \mu_{\ell 5} - \mu_{\ell 4} = \mu_{\ell 6} - \mu_{\ell 5}.$$

Thus, we can formalize the researcher's question (ii) by the null hypothesis that this holds for all programs. Most of you expressed this in the form $MU = 0$ for a suitable contrast matrix $U$.

(c) Based on the information you have, do you feel you can obtain any reliable insights relevant to question (ii)? If so, explain why and describe how you would do this and state the result(s). If not, explain why not.

This is a more open-ended question, and I gave full credit for any sensible answer. We want to gain some kind of insight on whether or not there is any evidence that the rate of change of mean completion time is not constant for at least one program. Some of you said that this is not possible on grounds that the assumptions of compound symmetry (or at least Type H) are violated.
This is not unreasonable, thinking in terms of the representation in (b). On the other hand, we know that specialized within-unit tests do not require the covariance matrix to be compound symmetric/Type H; these tests are valid as long as the covariance matrix is the same for all programs. We have been provided information for tests corresponding to orthogonal polynomial contrasts, so some of you suggested referring to this.

The evidence on whether or not the covariance structure is the same for all groups is inconclusive. If you were willing to accept that there is not sufficient evidence to say it is different, you may have been willing to view the orthogonal polynomial contrast tests as reliable. In this case, there is strong evidence of quadratic effects. The week_2 test for Mean, which addresses whether or not there is a quadratic component of the relationship over time averaged across groups, strongly rejects the null hypothesis of no quadratic effect averaged across programs. This could be interpreted as suggesting that the rate of change for at least one program is not constant. The week_2 test of program, which addresses whether or not the quadratic component of the relationship over time is different for the 4 programs, also strongly rejects the null hypothesis that it is the same. This could be interpreted as suggesting that there must be a quadratic effect, and thus nonconstant rate of change, for at least one program, making the quadratic component different.

Of course, if you felt the evidence you have casts the assumption of a common covariance structure in sufficient doubt, you may not have been willing to consider these observations reliable.
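The representation $MU = 0$ from part (b) and the polynomial contrasts used in part (c) can be sketched numerically. The following is a minimal illustration, not the analysis behind the output above: the mean matrices `M_linear` and `M_curved` are made-up values, and the contrasts are built by a QR decomposition, so their signs and scaling may differ from those a particular package reports.

```python
import numpy as np

def second_difference_matrix(n_times):
    """Return U (n_times x (n_times - 2)); column k of M @ U holds
    mu[l, k+2] - 2*mu[l, k+1] + mu[l, k] for each program l."""
    U = np.zeros((n_times, n_times - 2))
    for k in range(n_times - 2):
        U[k, k], U[k + 1, k], U[k + 2, k] = 1.0, -2.0, 1.0
    return U

def orthogonal_polynomial_contrasts(times):
    """Orthonormal polynomial contrasts of degree 1, ..., n-1, obtained
    from a QR decomposition of the Vandermonde matrix of the times."""
    t = np.asarray(times, dtype=float)
    t = (t - t.mean()) / (t.max() - t.min())      # rescale for stability
    V = np.vander(t, N=len(t), increasing=True)   # columns 1, t, t^2, ...
    Q, _ = np.linalg.qr(V)
    return Q[:, 1:]                               # drop the constant column

weeks = [0, 2, 4, 6, 8, 10]
U = second_difference_matrix(len(weeks))
P = orthogonal_polynomial_contrasts(weeks)

# Hypothetical 4 x 6 mean matrices, for illustration only.
t = np.array(weeks, dtype=float)
M_linear = np.vstack([300.0 - slope * t for slope in (5, 10, 15, 20)])
M_curved = M_linear - 0.5 * t**2

print(M_linear @ U)          # all zeros: constant rate of change
print(M_curved @ U)          # nonzero: rate of change is not constant
print(M_curved @ P[:, 1])    # quadratic contrast is nonzero for every program
```

Equal second differences are equivalent to equal first differences when the time points are equally spaced, which is why `U` encodes the null hypothesis in (b).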

2. Consider the education study in the previous problem. Here are the data again:

[Figure: completion time (min) versus week, one panel for each of Programs 1-4.]

Again, the goals are

(i) To determine if the pattern of change of mean completion time is not the same for all programs

(ii) To determine if the rate of change of mean completion time is not constant for at least one of the programs.

[12 points] (a) Based on all information you have, propose a statistical model different from that in (1) in which both (i) and (ii) can be addressed. Briefly state any assumptions you incorporate in the model. (There is more space on the next page.)

These are clearly population-averaged questions, so you could have either posited a population-averaged model directly or posited a linear mixed effects model to induce a population-averaged model; either is a reasonable approach. Clearly, a quadratic model seems appropriate given the visual evidence. Here is a population-averaged model in which (i) and (ii) can be addressed. Letting $Y_{ij}$ be completion time at the $j$th week $t_j = 0, 2, 4, 6, 8, 10$ on the $i$th subject, $j = 1, \ldots, 6$, $i = 1, \ldots, 400$, and $\delta_{i\ell} = 1$ if $i$ is in program $\ell$ and $= 0$ otherwise, $\ell = 1, 2, 3, 4$, a plausible model is

$$Y_{ij} = \beta_0 + (\beta_{11}\delta_{i1} + \beta_{12}\delta_{i2} + \beta_{13}\delta_{i3} + \beta_{14}\delta_{i4})\, t_j + (\beta_{21}\delta_{i1} + \beta_{22}\delta_{i2} + \beta_{23}\delta_{i3} + \beta_{24}\delta_{i4})\, t_j^2 + \epsilon_{ij},$$

where, because of randomization, we have assumed a common mean response at baseline (you may have allowed this to vary by program as well); and $a_i = (\delta_{i1}, \delta_{i2}, \delta_{i3}, \delta_{i4})^T$. Based on the sample information above, a general assumption is $E(\epsilon_{ij} \mid a_i) = 0$, and, with $\epsilon_i = (\epsilon_{i1}, \ldots, \epsilon_{i6})^T$,

$$\mathrm{var}(\epsilon_i \mid a_i) = V_\ell = T_\ell^{1/2}\, \Gamma_\ell\, T_\ell^{1/2},$$

where $\ell = 1, 2, 3$ or $4$ depending on $a_i$. Here, $\epsilon_{ij}$ represents the aggregate deviation from the mean completion time at week $j$ for $Y_{ij}$ due to among- and within-individual sources.

Based on the observations in the last problem, we might be willing to assume that the overall correlation matrix $\Gamma_\ell$ is the same for all programs and that the variances $\mathrm{var}(\epsilon_{ij} \mid a_i) = \sigma_\ell^2$ are constant over time but perhaps different by program (at least for program 1). Under these conditions, we might take $T_\ell = \sigma_\ell^2 I_6$, $\ell = 1, \ldots, 4$, and take $\Gamma_\ell$ to be an AR(1) correlation matrix with correlation parameter $\alpha$ the same for each program. We might also be willing to make the assumption that $Y_i \mid a_i$ under these conditions is normal with moments implied by these specifications, although that is not absolutely necessary.

(b) For your model in (a), write down a vector $\beta$ that collects all parameters that characterize mean completion time under the 4 programs. Then provide a matrix $L$ such that you can address question (i), to determine if the pattern of change of mean completion time is not the same for all programs over the study period, through an expression of the form $L\beta$.

Here, $\beta = (\beta_0, \beta_{11}, \beta_{12}, \beta_{13}, \beta_{14}, \beta_{21}, \beta_{22}, \beta_{23}, \beta_{24})^T$. The pattern of change of mean completion time would be the same for all programs over the study period if both linear and quadratic coefficients of week are the same for all programs; i.e., if

$$\beta_{11} = \beta_{12} = \beta_{13} = \beta_{14} \quad \text{and} \quad \beta_{21} = \beta_{22} = \beta_{23} = \beta_{24}.$$

Thus, an appropriate $L$ matrix (one of several equivalent choices) is

$$L = \begin{pmatrix} 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & -1 \end{pmatrix}.$$

The question then can be addressed by testing the null hypothesis that $L\beta = 0$.
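The covariance specification suggested in part (a), $V_\ell = T_\ell^{1/2} \Gamma T_\ell^{1/2}$ with $T_\ell = \sigma_\ell^2 I_6$ and a common AR(1) correlation matrix, can be sketched as follows. The variance values and the correlation parameter below are hypothetical placeholders, not estimates from the study.

```python
import numpy as np

def ar1_correlation(n_times, alpha):
    """AR(1) correlation matrix: entry (j, k) equals alpha ** |j - k|."""
    idx = np.arange(n_times)
    return alpha ** np.abs(idx[:, None] - idx[None, :])

def program_covariance(sigma2, alpha, n_times=6):
    """V = T^{1/2} Gamma T^{1/2} with T = sigma2 * I (variance constant over time)."""
    T_half = np.sqrt(sigma2) * np.eye(n_times)
    return T_half @ ar1_correlation(n_times, alpha) @ T_half

# Hypothetical values: a smaller variance for program 1, common alpha.
sigma2_by_program = {1: 100.0, 2: 150.0, 3: 150.0, 4: 150.0}
alpha = 0.7

V = {l: program_covariance(s2, alpha) for l, s2 in sigma2_by_program.items()}
print(np.round(V[1], 1))
```

The correlations decay geometrically with the lag between measurement occasions, which matches the observation that the sample correlations damp out over time; equal spacing of the weeks is what makes the lag in index a sensible measure of separation.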

[5 points] (c) In terms of $\beta$ you defined in (b), provide a matrix $L$ that allows you to address question (ii), to determine if the rate of change of mean completion time is not constant for at least one of the programs, through an expression of the form $L\beta$.

The rate of change of mean completion time will be constant for all programs if $\beta_{21} = \beta_{22} = \beta_{23} = \beta_{24} = 0$. Thus, an appropriate $L$ matrix to address this question is

$$L = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$

The question then can be addressed by testing the null hypothesis that $L\beta = 0$.

(d) In terms of $\beta$ you defined in (b), provide a matrix $L$ that allows you to represent the rate of change in mean completion time for Program 4 at the midpoint of the study (week 5) in terms of an expression of the form $L\beta$.

The rate of change of mean completion time for any program is the derivative of the mean model with respect to time. For program 4, the rate of change at any time $t$ is thus $\beta_{14} + 2\beta_{24} t$. The rate of change at week $t = 5$ is thus $\beta_{14} + 10\beta_{24}$, which can be represented as $L\beta$ with

$$L = (0, 0, 0, 0, 1, 0, 0, 0, 10).$$
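The $L$ matrices for parts (b)-(d) can be built and checked against a coefficient vector. This is a small sketch under the ordering $\beta = (\beta_0, \beta_{11}, \ldots, \beta_{14}, \beta_{21}, \ldots, \beta_{24})^T$; the function names and the numerical values of `beta` are hypothetical, and the parallelism matrix is only one of several equivalent choices.

```python
import numpy as np

# Index layout of beta: 0 -> b0, 1..4 -> b11..b14, 5..8 -> b21..b24.

def L_parallelism():
    """Rows encode b11 - b1k = 0 and b21 - b2k = 0 for k = 2, 3, 4."""
    L = np.zeros((6, 9))
    for row, k in enumerate((2, 3, 4)):
        L[row, 1], L[row, k] = 1.0, -1.0              # linear coefficients
        L[row + 3, 5], L[row + 3, k + 4] = 1.0, -1.0  # quadratic coefficients
    return L

def L_constant_rate():
    """Rows pick out b21, ..., b24; L @ beta = 0 means no quadratic term."""
    return np.hstack([np.zeros((4, 5)), np.eye(4)])

def L_rate_of_change(program, t):
    """Derivative of the program's mean curve at time t: b1l + 2 * b2l * t."""
    L = np.zeros(9)
    L[program] = 1.0
    L[4 + program] = 2.0 * t
    return L

# Hypothetical coefficients: identical curves for all four programs.
beta = np.array([300.0, -2.0, -2.0, -2.0, -2.0, -0.1, -0.1, -0.1, -0.1])

print(L_parallelism() @ beta)        # zeros: same pattern of change
print(L_constant_rate() @ beta)      # nonzero: rate of change not constant
print(L_rate_of_change(4, 5))        # (0, 0, 0, 0, 1, 0, 0, 0, 10)
print(L_rate_of_change(4, 5) @ beta) # b14 + 10 * b24
```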

3. The data shown below are from a study in 180 male subjects who had coronary artery bypass graft (CABG) surgery in the past year. Recent research suggests that such subjects can benefit from lowering their low-density lipoprotein (LDL) cholesterol levels to no more than 70 mg/dl and ideally to 40 mg/dl or below; LDL levels of more than 100 mg/dl are considered unacceptably high in such patients. All subjects in this study had baseline LDL levels of at least 110 mg/dl.

Subjects were randomly assigned with equal probability to receive one of three treatment regimens: a standard dose of a popular statin drug, torsuvastatin; a very high dose of torsuvastatin; or a very powerful injectable agent, Trupatha. LDL cholesterol levels were to be measured at baseline (month 0), prior to initiation of study treatment, and then at months 0.5, 1, 2, 3, and 6 thereafter. Recorded for each subject was an indicator of whether or not the subject's body mass index (BMI) was 25 or more at baseline (0 = BMI < 25, 1 = BMI ≥ 25). Individuals with BMI ≥ 25 are considered to be overweight. Also recorded for each subject was an indicator of whether or not the subject suffered from hypertension (0 = no, 1 = yes). Here are the data, with a loess smooth superimposed on each plot.

[Figure: LDL cholesterol (mg/dl) versus month, one panel per regimen (Standard Dose, High Dose, Injectable), with a loess smooth superimposed.]

As is often the case, many participants dropped out of the study before completion: although all subjects have the baseline and 0.5 month LDL measurements, only 164, 136, 124, and 101 returned at 1, 2, 3, and 6 months, respectively.

The investigators had the following questions:

(i) Is mean baseline LDL cholesterol level associated with being overweight (BMI ≥ 25) and/or suffering from hypertension?

(ii) Is the typical or mean pattern of change of LDL cholesterol not the same for at least one of the three treatment regimens among overweight subjects (BMI ≥ 25)? Among normal weight subjects (BMI < 25)?

(iii) Is mean LDL cholesterol at 6 months not the same for at least one of the three treatment regimens for overweight subjects (BMI ≥ 25) in this population? For normal weight subjects (BMI < 25)?

[12 points] (a) Can you propose a statistical model in which all of questions (i)-(iii) can be addressed? If so, write down the model and briefly state any assumptions you incorporate in the model. If not, state why not, and write down a model in which at least one of the three questions can be addressed (state which one(s)). Describe (briefly) any assumptions you incorporate in the model.

Some of these questions are clearly subject-specific and some are population-averaged, so a linear mixed effects model, which allows both types of questions to be addressed, is the way to go. Letting $Y_{ij}$ be the LDL cholesterol measurement on subject $i$ at the $j$th time $t_{ij}$, $j = 1, \ldots, n_i$ (different for each $i$ due to possible missing observations/dropout), it is natural from the plot to model individual-specific trajectories as straight lines, i.e., take the individual-level model to be

$$Y_{ij} = \beta_{0i} + \beta_{1i} t_{ij} + e_{ij}, \qquad \beta_i = (\beta_{0i}, \beta_{1i})^T,$$

which we could also write as $Y_i = C_i \beta_i + e_i$ as in the notes. Define $\delta_{i\ell} = 1$ if $i$ was randomized to regimen $\ell$ and $= 0$ otherwise, $\ell = 1, 2, 3$, where standard dose = 1, high dose = 2, injectable = 3. Let $o_i = 0$ if $i$ is not overweight (BMI < 25) and $o_i = 1$ if $i$ is overweight (BMI ≥ 25), and let $h_i = 0$ if $i$ does not suffer from hypertension and $h_i = 1$ if he does.
A population model that allows the above questions to be addressed is then

$$\beta_{0i} = \beta_{00} + \beta_{01} o_i + \beta_{02} h_i + b_{0i},$$
$$\beta_{1i} = (\beta_{11} + \beta_{21} o_i)\delta_{i1} + (\beta_{12} + \beta_{22} o_i)\delta_{i2} + (\beta_{13} + \beta_{23} o_i)\delta_{i3} + b_{1i}, \tag{2}$$

where $b_i = (b_{0i}, b_{1i})^T$ is the vector of individual-specific random effects. We could equally well have parameterized $\beta_{1i}$ as

$$\beta_{1i} = \{\beta_{11}(1 - o_i) + \beta_{21} o_i\}\delta_{i1} + \{\beta_{12}(1 - o_i) + \beta_{22} o_i\}\delta_{i2} + \{\beta_{13}(1 - o_i) + \beta_{23} o_i\}\delta_{i3} + b_{1i}.$$

These specifications of $\beta_{1i}$ allow the typical or mean pattern of change, which is taken to have a constant rate of change here (from a subject-specific perspective), to be associated with being overweight or not in a way that depends on treatment, which seems necessary from the statement of question (ii). You may have used a fancier or simpler model and parameterized it as above or differently. Some of you also allowed the individual-specific slopes to have mean depending on hypertensive status, which is fine.

Letting $a_i = (\delta_{i1}, \delta_{i2}, \delta_{i3}, o_i, h_i)^T$, we need to make assumptions on $e_i$ and $b_i$, for which we definitely assume $E(e_i \mid a_i) = 0$ and $E(b_i \mid a_i) = 0$. To complete the specification, in particular of the forms of $\mathrm{var}(e_i \mid a_i) = R_i(\gamma)$ and $\mathrm{var}(b_i \mid a_i)$, it would be great to be able to see fits under different choices for these (and the associated AIC and BIC values) and to see residual plots, which would also give us an informal assessment of whether or not $e_i \mid a_i$ and $b_i \mid a_i$ are approximately normal. The default specification is of course that $\mathrm{var}(e_i \mid a_i) = \sigma^2 I_{n_i}$ and $\mathrm{var}(b_i \mid a_i) = D$ $(2 \times 2)$, but these could be relaxed to allow $\sigma^2$ and $D$ to differ by group and/or to allow $\mathrm{var}(e_{ij} \mid a_i)$ to change over time and for within-individual correlation. You may have written down a different model, which for most of you was fine; the key is that your model allows all the questions of interest above to be addressed.

[5 points] (b) The investigators tell you that the primary reason that subjects dropped out of the study was either (i) because their physicians felt that their LDL levels up to the point of dropout were not lowering fast enough or (ii) because their LDL levels up to the point of dropout were lowering so dramatically that the subject felt it was unnecessary to continue in the study. Would you feel comfortable proceeding with standard analysis using your model under these conditions? If so, explain briefly how you would conduct the analysis. If not, explain why not.

It sounds like the assumption that the dropout mechanism is missing at random (MAR) may be reasonable here; apparently subjects and physicians were making the decision on whether or not a subject should drop out on the basis of his evolving, observed LDL measurements. Given this, I would feel comfortable proceeding with a standard analysis using maximum likelihood methods as long as I was willing to believe that my model is correctly specified and that the distributions of both $e_i \mid a_i$ and $b_i \mid a_i$ are mean-zero normal with correct specifications of the corresponding covariance matrices. Under these conditions, I'd be comfortable that maximum likelihood yields valid inferences (according to the ignorability argument in Section 5.6). I'd also ideally like to be able to use the observed information matrix from this analysis to obtain (model-based) standard errors and tests, to ensure that the uncertainty is taken into appropriate account.
(c) In terms of your model in (a), show how you would address question (i) (is mean baseline LDL level associated with being overweight and/or suffering from hypertension?). If you cannot, state why not.

Mean baseline LDL level is not associated with either being overweight or suffering from hypertension if $\beta_{01} = \beta_{02} = 0$. Thus, I would test this null hypothesis against the alternative that at least one of $\beta_{01}$ or $\beta_{02}$ is different from zero.

(d) In terms of your model in (a), show how you would address the second part of question (ii) (is the typical/mean pattern of change of LDL level not the same for at least one of the three regimens among normal weight patients?). If you cannot, state why not.

According to the model in (a), the typical/mean pattern of change of LDL level for normal weight patients ($o_i = 0$) is $\beta_{11}\delta_{i1} + \beta_{12}\delta_{i2} + \beta_{13}\delta_{i3}$, depending on which regimen an individual received. If this pattern is the same for all regimens, it must be that $\beta_{11} = \beta_{12} = \beta_{13}$. I would thus test this null hypothesis against the alternative that at least one of $\beta_{11}$, $\beta_{12}$, or $\beta_{13}$ is different from the others.

(e) Show how you would use your model to estimate the variation in subject-specific baseline LDL levels for subjects who are overweight and suffer from hypertension, or explain why you cannot do this.

For the purpose of this problem, I will assume that $\mathrm{var}(b_i \mid a_i) = D$, so that among-individual variation and correlation is the same regardless of weight and hypertensive status. This is a $(2 \times 2)$ matrix. The upper left diagonal element $D_{11}$ reflects the variance in subject-specific baseline LDL levels for subjects of any type in this case. Thus, to estimate variation in subject-specific baseline LDL levels for subjects who are overweight and suffer from hypertension, I would report the estimate of $D_{11}$. If you allowed the $D$ matrix to be different depending on being overweight and/or hypertensive, your answer was graded accordingly.
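The null hypotheses in (c) and (d) can also be written in the form $L\beta = 0$, as in Problem 2. The sketch below assumes the ordering $\beta = (\beta_{00}, \beta_{01}, \beta_{02}, \beta_{11}, \beta_{12}, \beta_{13}, \beta_{21}, \beta_{22}, \beta_{23})^T$ for the fixed effects of model (2); that ordering, the matrix names, and the coefficient values are assumptions made for illustration. The third matrix, for overweight subjects, follows from (2) because their mean slope under regimen $\ell$ is $\beta_{1\ell} + \beta_{2\ell}$.

```python
import numpy as np

# Assumed ordering: b00, b01, b02, b11, b12, b13, b21, b22, b23.

# (c): baseline mean not associated with overweight or hypertension.
L_baseline = np.array([
    [0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0],
], dtype=float)

# (d): equal mean slopes across regimens among normal weight subjects.
L_normal_weight = np.array([
    [0, 0, 0, 1, -1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, -1, 0, 0, 0],
], dtype=float)

# First part of question (ii): equal mean slopes among overweight subjects,
# whose slope under regimen l is b1l + b2l in model (2).
L_overweight = np.array([
    [0, 0, 0, 1, -1, 0, 1, -1, 0],
    [0, 0, 0, 1, 0, -1, 1, 0, -1],
], dtype=float)

# Hypothetical coefficients: slopes agree for normal weight subjects only.
beta = np.array([130.0, 5.0, 8.0, -4.0, -4.0, -4.0, 0.0, -1.0, -3.0])

print(L_baseline @ beta)       # nonzero: baseline depends on o_i and h_i
print(L_normal_weight @ beta)  # zeros: same slope for all regimens
print(L_overweight @ beta)     # nonzero: slopes differ among overweight subjects
```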

4. (a) Suppose that the outcome $Y$ is continuous, and consider the model

$$Y_{ij} = \beta_{0i}\{1 - \exp(-\beta_{1i} t_{ij})\} + e_{ij}, \quad i = 1, \ldots, m, \; j = 1, \ldots, n,$$
$$\beta_{0i} = \beta_0 + b_{0i}, \quad \beta_{1i} = \beta_1 + b_{1i}, \quad \beta_0 > 0, \; \beta_1 > 0, \tag{3}$$

where $b_i = (b_{0i}, b_{1i})^T$ are independent for all $i$, $e_{ij}$ are independent for all $i, j$, and $b_i$ and $e_{ij}$ are independent of one another for all $i, j$, with $b_i \sim N(0, D)$, $D$ $(2 \times 2)$, $e_{ij} \sim N(0, \sigma^2)$. Dick refers to $\beta_0$ in (3) as the saturation value characterizing the mean outcome in the population, and Jane refers to $\beta_0$ as the mean saturation value among individuals in the population. Who is correct, Dick or Jane? Explain (briefly) your answer.

It is immediate that Jane is correct from the population model specified; individual-specific $\beta_{0i}$ in this model vary about the mean saturation value $\beta_0$ in the population. Dick is suggesting that $\beta_0$ can be interpreted as the saturation value characterizing (there are no covariates) the population mean outcome. This population mean outcome is

$$E(Y_{ij}) = E[\beta_{0i}\{1 - \exp(-\beta_{1i} t_{ij})\}] \tag{4}$$
$$= \beta_0 E[1 - \exp\{-(\beta_1 + b_{1i}) t_{ij}\}] + E(b_{0i}[1 - \exp\{-(\beta_1 + b_{1i}) t_{ij}\}]). \tag{5}$$

If you answered that Dick is not correct, you are almost certainly right. For $\beta_0$ to be the saturation value of this population mean, the limit of the above expression as $t_{ij} \to \infty$ must be $\beta_0$. This does not seem likely, and simply saying this would suffice.

It turns out (I did not expect you to show this!) that the limit is indeed not $\beta_0$. From Problem 2 of Homework 1, we know that

$$E[1 - \exp\{-(\beta_1 + b_{1i}) t_{ij}\}] = 1 - \exp(-\beta_1 t_{ij} + D_{22} t_{ij}^2 / 2).$$

Thus, the first term in the above expression is $\beta_0\{1 - \exp(-\beta_1 t_{ij} + D_{22} t_{ij}^2 / 2)\}$. At first glance, one might think that the second term in this expression is equal to zero because $E(b_{0i}) = 0$. However, as long as the matrix $D$ is not diagonal, so that $b_{0i}$ and $b_{1i}$ are not independent, the second term is not necessarily equal to zero.
It turns out that one can evaluate the second term (it's a bit involved) and as a result show that in fact

$$E(Y_{ij}) = \beta_0 + \exp(-\beta_1 t_{ij} + D_{22} t_{ij}^2 / 2)(D_{12} t_{ij} - \beta_0),$$

which clearly does not approach $\beta_0$ as $t_{ij} \to \infty$.
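One possible route to the second term in (5) is sketched below. It uses only the assumption in (3) that $(b_{0i}, b_{1i})^T$ is bivariate normal with mean zero and covariance matrix $D$, together with the normal moment generating function; the auxiliary variables $r$ and $s$ are introduced here for the derivation and are not part of the original solution.

```latex
% Joint moment generating function of b_i ~ N(0, D):
%   E exp(r b_{0i} + s b_{1i}) = exp{ (r^2 D_{11} + 2 r s D_{12} + s^2 D_{22}) / 2 }.
% Differentiating with respect to r and setting r = 0 gives
\begin{align*}
E\{ b_{0i} \exp(s\, b_{1i}) \}
  &= \left. \frac{\partial}{\partial r}
     \exp\!\left\{ \tfrac{1}{2}\left( r^2 D_{11} + 2 r s D_{12} + s^2 D_{22} \right) \right\}
     \right|_{r = 0}
   = s\, D_{12} \exp\!\left( s^2 D_{22} / 2 \right). \\
\intertext{Taking $s = -t_{ij}$ and using $E(b_{0i}) = 0$, the second term in (5) is}
E\!\left( b_{0i} \left[ 1 - \exp\{ -(\beta_1 + b_{1i}) t_{ij} \} \right] \right)
  &= -\exp(-\beta_1 t_{ij})\, E\{ b_{0i} \exp(-b_{1i} t_{ij}) \}
   = D_{12}\, t_{ij} \exp\!\left( -\beta_1 t_{ij} + D_{22} t_{ij}^2 / 2 \right). \\
\intertext{Adding the first term $\beta_0 \{ 1 - \exp( -\beta_1 t_{ij} + D_{22} t_{ij}^2 / 2 ) \}$ gives}
E(Y_{ij})
  &= \beta_0 + \exp\!\left( -\beta_1 t_{ij} + D_{22} t_{ij}^2 / 2 \right)
     \left( D_{12}\, t_{ij} - \beta_0 \right).
\end{align*}
```

Because $D_{22} > 0$, the exponential factor grows without bound as $t_{ij} \to \infty$, so the population mean does not level off at $\beta_0$.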

(b) Consider the model

$$Y_{ij} = \beta + b_i + e_{ij}, \quad i = 1, \ldots, m, \; j = 1, \ldots, n, \tag{6}$$

where $m$ is even, and $\delta_i = 0$ for $i = 1, \ldots, m/2$, $\delta_i = 1$ for $i = m/2 + 1, \ldots, m$. In (6), $b_i$ are mutually independent for all $i$ with

$$E(b_i \mid \delta_i) = 0, \quad \mathrm{var}(b_i \mid \delta_i) = D_0(1 - \delta_i) + D_1 \delta_i, \quad D_0, D_1 > 0; \tag{7}$$

and $e_{ij}$ are mutually independent for all $i, j$ with

$$E(e_{ij} \mid \delta_i) = 0, \quad \mathrm{var}(e_{ij} \mid \delta_i) = \sigma_0^2(1 - \delta_i) + \sigma_1^2 \delta_i; \tag{8}$$

and $b_i$ and $e_{ij}$ are independent of one another for all $i, j$. Under the assumption that $b_i$ and $e_{ij}$ are all normally distributed, provide the simplest expression you can for the maximum likelihood estimator $\hat{\beta}$ for $\beta$, defining any additional notation you may need.

This is immediate from Homework 3, Problem 2. This is the simplest version of a linear mixed effects model but with a different among-individual covariance matrix (which is just a scalar variance here) in each group. The data are balanced; there are not even any time points, and all individuals have $n$ observations. Thus, we know immediately that the maximum likelihood estimator for $\beta$ is identical to the OLS estimator. Letting $1$ be an $(n \times 1)$ vector of all ones, we can thus immediately write down that

$$\hat{\beta} = \left( \sum_{i=1}^m 1^T 1 \right)^{-1} \sum_{i=1}^m 1^T Y_i.$$

This answer would suffice. You may have felt compelled to note that this can be simplified to

$$\hat{\beta} = (mn)^{-1} \sum_{i=1}^m \sum_{j=1}^n Y_{ij} = N^{-1} \sum_{i=1}^m \sum_{j=1}^n Y_{ij} = m^{-1} \sum_{i=1}^m \bar{Y}_i = \bar{Y},$$

the overall mean of all outcomes.
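The algebraic simplification of the OLS expression to the overall mean can be checked numerically. The sketch below simulates data from (6)-(8) with made-up parameter values and a fixed seed; it verifies only that the matrix expression, the grand mean, and the mean of the individual means coincide for balanced data, and it does not attempt to compute a likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical settings, for illustration only.
m, n = 10, 4
beta = 5.0
D = (1.0, 4.0)        # var(b_i) for delta_i = 0 and delta_i = 1
sigma2 = (0.5, 2.0)   # var(e_ij) for delta_i = 0 and delta_i = 1

delta = np.repeat([0, 1], m // 2)
b_sd = np.sqrt(np.where(delta == 1, D[1], D[0]))
e_sd = np.sqrt(np.where(delta == 1, sigma2[1], sigma2[0]))

b = rng.normal(0.0, b_sd)                          # one random effect per individual
e = rng.normal(0.0, 1.0, size=(m, n)) * e_sd[:, None]
Y = beta + b[:, None] + e                          # row i is Y_i, of length n

ones = np.ones(n)
numerator = sum(ones @ Y[i] for i in range(m))     # sum_i 1^T Y_i
denominator = sum(ones @ ones for _ in range(m))   # sum_i 1^T 1 = m * n
beta_hat = numerator / denominator

print(beta_hat, Y.mean(), Y.mean(axis=1).mean())   # three equal values
```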


More information

BINF 702 SPRING Chapter 8 Hypothesis Testing: Two-Sample Inference. BINF702 SPRING 2014 Chapter 8 Hypothesis Testing: Two- Sample Inference 1

BINF 702 SPRING Chapter 8 Hypothesis Testing: Two-Sample Inference. BINF702 SPRING 2014 Chapter 8 Hypothesis Testing: Two- Sample Inference 1 BINF 702 SPRING 2014 Chapter 8 Hypothesis Testing: Two-Sample Inference Two- Sample Inference 1 A Poster Child for two-sample hypothesis testing Ex 8.1 Obstetrics In the birthweight data in Example 7.2,

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information. STA441: Spring 2018 Multiple Regression This slide show is a free open source document. See the last slide for copyright information. 1 Least Squares Plane 2 Statistical MODEL There are p-1 explanatory

More information

3. (a) (8 points) There is more than one way to correctly express the null hypothesis in matrix form. One way to state the null hypothesis is

3. (a) (8 points) There is more than one way to correctly express the null hypothesis in matrix form. One way to state the null hypothesis is Stat 501 Solutions and Comments on Exam 1 Spring 005-4 0-4 1. (a) (5 points) Y ~ N, -1-4 34 (b) (5 points) X (X,X ) = (5,8) ~ N ( 11.5, 0.9375 ) 3 1 (c) (10 points, for each part) (i), (ii), and (v) are

More information

Hypothesis testing: Steps

Hypothesis testing: Steps Review for Exam 2 Hypothesis testing: Steps Exam 2 Review 1. Determine appropriate test and hypotheses 2. Use distribution table to find critical statistic value(s) representing rejection region 3. Compute

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,

More information

Do not copy, post, or distribute

Do not copy, post, or distribute 14 CORRELATION ANALYSIS AND LINEAR REGRESSION Assessing the Covariability of Two Quantitative Properties 14.0 LEARNING OBJECTIVES In this chapter, we discuss two related techniques for assessing a possible

More information

Modeling the Mean: Response Profiles v. Parametric Curves

Modeling the Mean: Response Profiles v. Parametric Curves Modeling the Mean: Response Profiles v. Parametric Curves Jamie Monogan University of Georgia Escuela de Invierno en Métodos y Análisis de Datos Universidad Católica del Uruguay Jamie Monogan (UGA) Modeling

More information

BIOS 2083: Linear Models

BIOS 2083: Linear Models BIOS 2083: Linear Models Abdus S Wahed September 2, 2009 Chapter 0 2 Chapter 1 Introduction to linear models 1.1 Linear Models: Definition and Examples Example 1.1.1. Estimating the mean of a N(μ, σ 2

More information

Applied Statistics Preliminary Examination Theory of Linear Models August 2017

Applied Statistics Preliminary Examination Theory of Linear Models August 2017 Applied Statistics Preliminary Examination Theory of Linear Models August 2017 Instructions: Do all 3 Problems. Neither calculators nor electronic devices of any kind are allowed. Show all your work, clearly

More information

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE Biostatistics Workshop 2008 Longitudinal Data Analysis Session 4 GARRETT FITZMAURICE Harvard University 1 LINEAR MIXED EFFECTS MODELS Motivating Example: Influence of Menarche on Changes in Body Fat Prospective

More information

Business 320, Fall 1999, Final

Business 320, Fall 1999, Final Business 320, Fall 1999, Final name You may use a calculator and two cheat sheets. You have 3 hours. I pledge my honor that I have not violated the Honor Code during this examination. Obvioiusly, you may

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS 1a) The model is cw i = β 0 + β 1 el i + ɛ i, where cw i is the weight of the ith chick, el i the length of the egg from which it hatched, and ɛ i

More information

Repeated Measures ANOVA Multivariate ANOVA and Their Relationship to Linear Mixed Models

Repeated Measures ANOVA Multivariate ANOVA and Their Relationship to Linear Mixed Models Repeated Measures ANOVA Multivariate ANOVA and Their Relationship to Linear Mixed Models EPSY 905: Multivariate Analysis Spring 2016 Lecture #12 April 20, 2016 EPSY 905: RM ANOVA, MANOVA, and Mixed Models

More information

Stat 500 Midterm 2 8 November 2007 page 0 of 4

Stat 500 Midterm 2 8 November 2007 page 0 of 4 Stat 500 Midterm 2 8 November 2007 page 0 of 4 Please put your name on the back of your answer book. Do NOT put it on the front. Thanks. DO NOT START until I tell you to. You are welcome to read this front

More information

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs) 36-309/749 Experimental Design for Behavioral and Social Sciences Dec 1, 2015 Lecture 11: Mixed Models (HLMs) Independent Errors Assumption An error is the deviation of an individual observed outcome (DV)

More information

Ron Heck, Fall Week 3: Notes Building a Two-Level Model

Ron Heck, Fall Week 3: Notes Building a Two-Level Model Ron Heck, Fall 2011 1 EDEP 768E: Seminar on Multilevel Modeling rev. 9/6/2011@11:27pm Week 3: Notes Building a Two-Level Model We will build a model to explain student math achievement using student-level

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

More information

Fitting a Straight Line to Data

Fitting a Straight Line to Data Fitting a Straight Line to Data Thanks for your patience. Finally we ll take a shot at real data! The data set in question is baryonic Tully-Fisher data from http://astroweb.cwru.edu/sparc/btfr Lelli2016a.mrt,

More information

Comprehensive Examination Quantitative Methods Spring, 2018

Comprehensive Examination Quantitative Methods Spring, 2018 Comprehensive Examination Quantitative Methods Spring, 2018 Instruction: This exam consists of three parts. You are required to answer all the questions in all the parts. 1 Grading policy: 1. Each part

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

2. TRUE or FALSE: Converting the units of one measured variable alters the correlation of between it and a second variable.

2. TRUE or FALSE: Converting the units of one measured variable alters the correlation of between it and a second variable. 1. The diagnostic plots shown below are from a linear regression that models a patient s score from the SUG-HIGH diabetes risk model as function of their normalized LDL level. a. Based on these plots,

More information

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;

More information

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10) Name Economics 170 Spring 2004 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment for the

More information

Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures

Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures 12.1 Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures 12.9 Repeated measures analysis Sometimes researchers make multiple measurements on the same experimental unit. We have

More information

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington Analsis of Longitudinal Data Patrick J. Heagert PhD Department of Biostatistics Universit of Washington 1 Auckland 2008 Session Three Outline Role of correlation Impact proper standard errors Used to weight

More information

4 Bias-Variance for Ridge Regression (24 points)

4 Bias-Variance for Ridge Regression (24 points) Implement Ridge Regression with λ = 0.00001. Plot the Squared Euclidean test error for the following values of k (the dimensions you reduce to): k = {0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500,

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Mathematics: applications and interpretation SL

Mathematics: applications and interpretation SL Mathematics: applications and interpretation SL Chapter 1: Approximations and error A Rounding numbers B Approximations C Errors in measurement D Absolute and percentage error The first two sections of

More information

AAEC/ECON 5126 FINAL EXAM: SOLUTIONS

AAEC/ECON 5126 FINAL EXAM: SOLUTIONS AAEC/ECON 5126 FINAL EXAM: SOLUTIONS SPRING 2013 / INSTRUCTOR: KLAUS MOELTNER This exam is open-book, open-notes, but please work strictly on your own. Please make sure your name is on every sheet you

More information

Math 1040 Final Exam Form A Introduction to Statistics Fall Semester 2010

Math 1040 Final Exam Form A Introduction to Statistics Fall Semester 2010 Math 1040 Final Exam Form A Introduction to Statistics Fall Semester 2010 Instructor Name Time Limit: 120 minutes Any calculator is okay. Necessary tables and formulas are attached to the back of the exam.

More information

TA: Sheng Zhgang (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan (W 1:20) / 346 (Th 12:05) FINAL EXAM

TA: Sheng Zhgang (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan (W 1:20) / 346 (Th 12:05) FINAL EXAM STAT 301, Fall 2011 Name Lec 4: Ismor Fischer Discussion Section: Please circle one! TA: Sheng Zhgang... 341 (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan... 345 (W 1:20) / 346 (Th

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical

More information

ST 790, Homework 1 Spring 2017

ST 790, Homework 1 Spring 2017 ST 790, Homework 1 Spring 2017 1. In EXAMPLE 1 of Chapter 1 of the notes, it is shown at the bottom of page 22 that the complete case estimator for the mean µ of an outcome Y given in (1.18) under MNAR

More information

Lecture 7 Time-dependent Covariates in Cox Regression

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the

More information

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test

More information

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in

More information

Swarthmore Honors Exam 2012: Statistics

Swarthmore Honors Exam 2012: Statistics Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may

More information

ECON 497 Midterm Spring

ECON 497 Midterm Spring ECON 497 Midterm Spring 2009 1 ECON 497: Economic Research and Forecasting Name: Spring 2009 Bellas Midterm You have three hours and twenty minutes to complete this exam. Answer all questions and explain

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

Ordinary Least Squares Regression

Ordinary Least Squares Regression Ordinary Least Squares Regression Goals for this unit More on notation and terminology OLS scalar versus matrix derivation Some Preliminaries In this class we will be learning to analyze Cross Section

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

1. What does the alternate hypothesis ask for a one-way between-subjects analysis of variance?

1. What does the alternate hypothesis ask for a one-way between-subjects analysis of variance? 1. What does the alternate hypothesis ask for a one-way between-subjects analysis of variance? 2. What is the difference between between-group variability and within-group variability? 3. What does between-group

More information

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Regression Models. Chapter 4. Introduction. Introduction. Introduction Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds Chapter 6 Logistic Regression In logistic regression, there is a categorical response variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

Chapter 7: Variances. October 14, In this chapter we consider a variety of extensions to the linear model that allow for more gen-

Chapter 7: Variances. October 14, In this chapter we consider a variety of extensions to the linear model that allow for more gen- Chapter 7: Variances October 14, 2018 In this chapter we consider a variety of extensions to the linear model that allow for more gen- eral variance structures than the independent, identically distributed

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 TIME: 3 hours. Total marks: 80. (Marks are indicated in margin.) Remember that estimate means to give an interval estimate.

More information

Analysis of Variance. Read Chapter 14 and Sections to review one-way ANOVA.

Analysis of Variance. Read Chapter 14 and Sections to review one-way ANOVA. Analysis of Variance Read Chapter 14 and Sections 15.1-15.2 to review one-way ANOVA. Design of an experiment the process of planning an experiment to insure that an appropriate analysis is possible. Some

More information

One-way ANOVA (Single-Factor CRD)

One-way ANOVA (Single-Factor CRD) One-way ANOVA (Single-Factor CRD) STAT:5201 Week 3: Lecture 3 1 / 23 One-way ANOVA We have already described a completed randomized design (CRD) where treatments are randomly assigned to EUs. There is

More information

Psy 420 Final Exam Fall 06 Ainsworth. Key Name

Psy 420 Final Exam Fall 06 Ainsworth. Key Name Psy 40 Final Exam Fall 06 Ainsworth Key Name Psy 40 Final A researcher is studying the effect of Yoga, Meditation, Anti-Anxiety Drugs and taking Psy 40 and the anxiety levels of the participants. Twenty

More information

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis Rebecca Barter April 6, 2015 Multiple Testing Multiple Testing Recall that when we were doing two sample t-tests, we were testing the equality

More information

STAT 540: Data Analysis and Regression

STAT 540: Data Analysis and Regression STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State

More information

Masters Comprehensive Examination Department of Statistics, University of Florida

Masters Comprehensive Examination Department of Statistics, University of Florida Masters Comprehensive Examination Department of Statistics, University of Florida May 6, 003, 8:00 am - :00 noon Instructions: You have four hours to answer questions in this examination You must show

More information

STAT 526 Spring Midterm 1. Wednesday February 2, 2011

STAT 526 Spring Midterm 1. Wednesday February 2, 2011 STAT 526 Spring 2011 Midterm 1 Wednesday February 2, 2011 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points

More information

Review. One-way ANOVA, I. What s coming up. Multiple comparisons

Review. One-way ANOVA, I. What s coming up. Multiple comparisons Review One-way ANOVA, I 9.07 /15/00 Earlier in this class, we talked about twosample z- and t-tests for the difference between two conditions of an independent variable Does a trial drug work better than

More information

BIOSTATISTICAL METHODS

BIOSTATISTICAL METHODS BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH Cross-over Designs #: DESIGNING CLINICAL RESEARCH The subtraction of measurements from the same subject will mostly cancel or minimize effects

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Difference in two or more average scores in different groups

Difference in two or more average scores in different groups ANOVAs Analysis of Variance (ANOVA) Difference in two or more average scores in different groups Each participant tested once Same outcome tested in each group Simplest is one-way ANOVA (one variable as

More information

3. Diagnostics and Remedial Measures

3. Diagnostics and Remedial Measures 3. Diagnostics and Remedial Measures So far, we took data (X i, Y i ) and we assumed where ɛ i iid N(0, σ 2 ), Y i = β 0 + β 1 X i + ɛ i i = 1, 2,..., n, β 0, β 1 and σ 2 are unknown parameters, X i s

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

STA 431s17 Assignment Eight 1

STA 431s17 Assignment Eight 1 STA 43s7 Assignment Eight The first three questions of this assignment are about how instrumental variables can help with measurement error and omitted variables at the same time; see Lecture slide set

More information

E509A: Principle of Biostatistics. GY Zou

E509A: Principle of Biostatistics. GY Zou E509A: Principle of Biostatistics (Week 4: Inference for a single mean ) GY Zou gzou@srobarts.ca Example 5.4. (p. 183). A random sample of n =16, Mean I.Q is 106 with standard deviation S =12.4. What

More information

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as ST 51, Summer, Dr. Jason A. Osborne Homework assignment # - Solutions 1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available

More information

Math 51 Midterm 1 July 6, 2016

Math 51 Midterm 1 July 6, 2016 Math 51 Midterm 1 July 6, 2016 Name: SUID#: Circle your section: Section 01 Section 02 (1:30-2:50PM) (3:00-4:20PM) Complete the following problems. In order to receive full credit, please show all of your

More information

Describing Change over Time: Adding Linear Trends

Describing Change over Time: Adding Linear Trends Describing Change over Time: Adding Linear Trends Longitudinal Data Analysis Workshop Section 7 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

Regression #8: Loose Ends

Regression #8: Loose Ends Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch

More information

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered)

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered) Test 3 Practice Test A NOTE: Ignore Q10 (not covered) MA 180/418 Midterm Test 3, Version A Fall 2010 Student Name (PRINT):............................................. Student Signature:...................................................

More information

Power and sample size calculations

Power and sample size calculations Patrick Breheny October 20 Patrick Breheny University of Iowa Biostatistical Methods I (BIOS 5710) 1 / 26 Planning a study Introduction What is power? Why is it important? Setup One of the most important

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016 Work all problems. 60 points are needed to pass at the Masters Level and 75 to pass at the

More information