Repeated measures, part 1, simple methods


eNote 11

Contents

11 Repeated measures, part 1, simple methods
  11.1 Intro
    11.1.1 Main example: Activity of rats
  11.2 Separate analyses for each time point
    11.2.1 Example: Activity of rats analyzed separately for each month
  11.3 Analysis of summary statistics
    11.3.1 Example: Activity of rats analyzed via summary measure
  11.4 Random effects approach
    11.4.1 Example: Activity of rats analyzed via random effects model
  11.5 Pros and cons of simple approaches
  11.6 The R-package nlme and the function lme
  11.7 Exercises

11.1 Intro

This module describes various simple approaches for analyzing repeated measurements and shows how these analyses can be carried out in R.

Data referred to as repeated measurements (or sometimes as longitudinal data) are characterized by having several measurements on the same individuals or experimental units. These measurements are typically taken at different times, or at different positions within the individuals. Consider for instance the following experimental design to compare two drugs (A and B) for reducing blood pressure:

1. Twenty individuals were selected randomly from the relevant population.
2. Half of these were given drug A and half were given drug B (randomly selected).
3. For a period of two months these individuals had their blood pressure measured every week, which resulted in eight measurements on each individual.

The problem is that data collected this way may violate the standard assumption of independent measurements. It seems fair to expect two measurements from the same individual to be positively correlated, i.e. to be more alike than two measurements from different individuals. Furthermore, two measurements taken on the same individual might be highly correlated if they are taken at two time points close to each other, but less correlated (or maybe independent) if they are taken far apart.

This module describes some fairly simple (and maybe crude) methods for analyzing these data types. These methods include:

- Separate analyses for each time point
- Analysis of summary statistic
- Random effects approach

The analysis of repeated measurements will continue in the next module, where some more explicit covariance models will be shown.

The simplest approach to analyzing repeated measurements would be to include time as a factor and ignore the dependence between two observations on the same individual. Such an approach may lead to completely wrong conclusions. The essence of the problem is that this is the same as pretending to have more observations than are actually available. Two correlated observations contain less information than two independent observations, because one is partly explained by the other. This approach is unacceptable.
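The loss of information caused by correlation can be made concrete with a small calculation. The sketch below is not part of the original analysis: it assumes that all n measurements on one individual have the same variance and the same pairwise correlation rho (an illustrative value is used), in which case the variance of their mean is sigma^2/n * (1 + (n - 1) * rho). The function name var_of_mean and the chosen numbers are hypothetical.

```r
# Variance of the mean of n equally correlated measurements
# (common variance sigma2, common pairwise correlation rho).
var_of_mean <- function(n, rho, sigma2 = 1) {
  sigma2 / n * (1 + (n - 1) * rho)
}

n   <- 8     # weekly measurements per individual, as in the drug example
rho <- 0.6   # assumed within-individual correlation (illustrative value)

v_indep <- var_of_mean(n, rho = 0)    # if the measurements were independent
v_corr  <- var_of_mean(n, rho = rho)  # with positive correlation

# Cross-check against the covariance matrix: Var(mean) = sum(V) / n^2
V <- matrix(rho, n, n)
diag(V) <- 1
v_check <- sum(V) / n^2

# Number of independent measurements carrying the same information
n_eff <- n / (1 + (n - 1) * rho)

c(var_independent = v_indep, var_correlated = v_corr, n_effective = n_eff)
```

Under these assumptions the eight correlated measurements are worth fewer than two independent ones, which is why treating them as eight independent observations overstates the available information.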

11.1.1 Main example: Activity of rats

To investigate the effect of a certain type of exposure on the activity of rats, the following experiment was carried out. The experimental unit was a cage with two rats. During the entire experimental period the rats were exposed daily to the matter under investigation, in a concentration of 1, 2 or 3 units (treatment 1, 2 and 3, respectively). Once per month during 10 months the activity of the rats was measured by placing the rats from one cage in a chamber in which each intersection of a light beam was counted. The total count through a period of 57 hours was used as the result for that cage. Notice that in this setting the "individual" is the cage.

Summary of experiment:

- 3 treatments: 1, 2, 3 (concentration)
- 10 cages per treatment
- 10 contiguous months
- The response is activity (count of intersections of the light beam during 57 hours)

Here y = log(counts) is used for the analysis, because a residual plot showed that this was the most reasonable scale. The observations are listed in Table 11.1 and plotted in Figure 11.1. From the figure it seems that the activity is decreasing from month 1 to month 10 (maybe as a linear function?), and maybe that there is a small difference between the different doses.

Plotting the individual curves is a very useful tool in the analysis of repeated measures. This should always be the first step. Quite often the main conclusions from the analysis can already be seen from a good plot of the data.

Here we use interaction.plot to plot the individual profiles; the resulting figure is shown in Figure 11.1:

    rats <- read.table("rats.txt", header = TRUE, sep = ",", dec = ".")
    # Time, treatment and cage are all used as factors in the analyses
    rats$month <- factor(rats$month)
    rats$treatm <- factor(rats$treatm)
    rats$cage <- factor(rats$cage)
    # One line per cage; line type and colour change for every 10 cages (one treatment)
    with(rats, interaction.plot(month, cage, lnc, legend = FALSE, las = 1,
                                lty = rep(1:3, each = 10), col = rep(2:4, each = 10)))

[Figure 11.1: The log(counts) for each cage plotted against month (x-axis: month, y-axis: mean of lnc). The solid red lines are cages receiving dose=1, dashed green lines are dose=2, and dotted blue lines are dose=3.]

[Table 11.1: The rats data set; the raw activity counts are listed by month, dose and cage.]

Using the interaction.plot function here only produces the right plot because the time points are equidistant in these data. More generally one would need a quantitative copy of the time variable and use this as the x-axis in the plotting, e.g. using the ggplot2 package:

    library(ggplot2)
    rats$monthq <- as.numeric(rats$month)  # quantitative copy of month
    ggplot(rats, aes(x = monthq, y = lnc, group = cage, colour = treatm)) +
      geom_line()
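A small caution on the conversion as.numeric(rats$month) used above: applied to a factor, as.numeric returns the internal level codes, not the original values. For the rats data the months are 1 to 10, so codes and values coincide. The sketch below uses a made-up factor of unequally spaced time points (the name time_f is hypothetical) to show the difference and the usual way to recover the original values.

```r
# Hypothetical unequally spaced measurement times, stored as a factor
time_f <- factor(c(1, 2, 4, 8, 1, 2, 4, 8))

# Level codes: positions of the levels, here 1, 2, 3, 4
codes <- as.numeric(time_f)

# Original values: go via the level labels, here 1, 2, 4, 8
values <- as.numeric(as.character(time_f))

rbind(codes = codes, values = values)
```

With unequally spaced times, only the second version gives a correct x-axis.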

[Figure: individual profiles of lnc against monthq, one line per cage, coloured by treatm.]

Or the treatment group average time profiles:

    library(plyr)
    # Mean lnc for each combination of treatment and month
    mns <- ddply(rats, ~ treatm + month + monthq, summarize, lnc = mean(lnc))
    ggplot(mns, aes(x = monthq, y = lnc, group = treatm, colour = treatm)) +
      geom_point() + geom_line()

[Figure: treatment group means of lnc against monthq, coloured by treatm.]

Here we used the ddply function from the plyr package to compute the mean lnc for each level or unique value of treatm, month and monthq. We will not describe the details of this function or package, but only note that it is very efficient for data manipulation.

11.2 Separate analyses for each time point

One way to avoid the problem of correlated measurements is to do a separate analysis for each point in time. This way only one observation from each individual is used in each analysis, and hence the observations are independent. This way of analyzing repeated measurements is not wrong, but it is very inefficient, as all the remaining observations are wasted. This approach avoids the problem instead of dealing with it.

Separate analyses can be carried out for all the observed time points, but it will likely be very difficult to reach a coherent conclusion from all these sub tests. These sub tests will be correlated, and because the correlation structure is not part of the model, it is not possible to tell how strong this correlation is.
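The correlation between sub tests can be illustrated by simulation. The following is only a sketch under assumed values, not an analysis of the rats data: it simulates experiments with three groups of ten cages, no treatment effect, and a cage effect shared across four time points, runs a one-way ANOVA at each time point, and then looks at the rank correlation between the resulting F statistics. All names and standard deviations are hypothetical.

```r
set.seed(1)

n_sim   <- 300                             # number of simulated experiments
treat   <- factor(rep(1:3, each = 10))     # 3 groups of 10 cages
n_cage  <- length(treat)
n_time  <- 4                               # number of time points analysed
sd_cage <- 1.0                             # assumed SD of the shared cage effect
sd_res  <- 0.5                             # assumed residual SD

one_experiment <- function() {
  # The same cage effect enters the measurement at every time point
  cage_effect <- rnorm(n_cage, sd = sd_cage)
  sapply(seq_len(n_time), function(t) {
    y <- cage_effect + rnorm(n_cage, sd = sd_res)
    # F statistic of the separate one-way ANOVA at this time point
    anova(lm(y ~ treat))[1, "F value"]
  })
}

F_stats <- t(replicate(n_sim, one_experiment()))
colnames(F_stats) <- paste0("time", seq_len(n_time))

# Rank correlation between the F statistics from different time points
rank_cor <- cor(F_stats, method = "spearman")
round(rank_cor, 2)
```

In this setup the off-diagonal correlations come out positive, because every time point reuses the same cages. In a real analysis the strength of this correlation is unknown, which is exactly the difficulty described above.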

Separate analyses can be carried out for selected time points far apart. This will (hopefully) make the separate sub tests uncorrelated, or at least less correlated. Even with uncorrelated tests it will be difficult to reach a coherent conclusion, because of a problem known as mass significance (or multiplicity). For instance, if 20 tests are carried out at a 5% significance level and there is no true effect, one falsely significant p value is expected on average. This problem is partly solved by using the Bonferroni correction when performing n tests (one for each time point). The Bonferroni correction simply states that the significance level 0.05/n should be used instead of the usual 0.05. Equivalently, each calculated p value can be multiplied by n and compared with 0.05; in R, p.adjust(p, method = "bonferroni") does this and caps the result at 1.

When selecting time points far apart, it is important that the selection is done independently of the actual observations. In particular, the time points must not be selected systematically where there is a large (or small) difference between treatments. Ideally the time points should be selected before the data are collected.

11.2.1 Example: Activity of rats analyzed separately for each month

Consider the rats data available in the rats.txt file.

    head(rats)
      treatm cage month lnc monthq

To analyze the rats data set separately for each month, a simple one-way analysis of variance model with treatment (treatm) as the only factor is used. The information about cage cannot be included, as we only have one observation from each cage in each monthly analysis. The model for each month is:

$$\mathrm{lnc}_i = \mu + \alpha(\mathrm{treatm}_i) + \varepsilon_i, \quad \varepsilon_i \sim \text{i.i.d. } N(0, \sigma^2), \quad i = 1, \dots, 30$$

To do this in R, we split the rats data frame into a list of data frames, one for each month. We then apply (using sapply) the function fn, which fits the linear model and extracts the F and p values, to each of the data frames in the list:

    ratsl <- split(rats, f = rats$month)  # a list of data.frames
    # Function to fit model, get F and p from anova table:
    fn <- function(df)
      unlist(anova(lm(lnc ~ treatm, data = df))[1, c("F value", "Pr(>F)")])
    # Alternative using plyr-functions:
    # round(t(daply(rats, ~ month, .fun = fn)), 2)
    round(sapply(ratsl, fn), 3)
    F value
    Pr(>F)

These F values should be compared with $F_{95\%,2,27} = 3.35$, or with $F_{99.5\%,2,27} = 6.49$ if the Bonferroni correction is used (ten monthly tests give the level 0.05/10 = 0.005). A few significant values are found, and one remains significant when the Bonferroni correction is used, so the conclusion should be that weak evidence of a group difference has been seen.

It is possible to make a correct analysis time by time, but it is weak and often confusing, because it does not combine all information into one test.

11.3 Analysis of summary statistics

Another way to avoid the problem of correlated measurements is to choose a single measure to summarize the individual curves, and then base the analysis on this measure. This again reduces the data set to independent observations, one for each individual. To analyze the summary data set, standard methods for independent observations, for instance analysis of variance, can be used.

The key is to choose a good summary measure. One possibility is to choose the value at a given time point, which reduces this summary method to the separate time point analysis described in the previous section. This choice is poor in most cases, because all other measurements are wasted.

It is difficult to give general advice about the choice of summary measure. Ideally, the summary measure should capture the most important feature of the curve. In some situations the most important feature is the net growth (last minus first), the average growth (slope), or the time to reach the maximum point. It depends on the problem at hand.
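Summary measures of this kind are easy to compute once the data are in long format with one row per individual and time point. The sketch below uses simulated curves, not the rats data; the helper summarise_curve and the data frame dat are hypothetical names. It computes the mean, the regression slope, the total increase, the area under the curve (trapezoidal rule) and the maximum for each individual.

```r
# Summary measures for one individual's curve
summarise_curve <- function(time, y) {
  o <- order(time)
  time <- time[o]
  y <- y[o]
  n <- length(y)
  c(mean     = mean(y),
    slope    = unname(coef(lm(y ~ time))[2]),           # slope in regression on time
    increase = y[n] - y[1],                             # last minus first
    auc      = sum(diff(time) * (y[-1] + y[-n]) / 2),   # trapezoidal rule
    max      = max(y))
}

# Simulated example: 4 individuals measured at 6 time points
set.seed(2)
dat <- expand.grid(time = 1:6, id = factor(1:4))
dat$y <- 10 - 0.3 * dat$time + rnorm(nrow(dat), sd = 0.2)

# One row of summary measures per individual
summ <- t(sapply(split(dat, dat$id),
                 function(d) summarise_curve(d$time, d$y)))
summ
```

Each column of summ contains one independent value per individual and can be passed to a standard analysis such as a one-way ANOVA.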

Some common choices of summary measures are:

- Average over time
- Slope in regression with time (or higher order polynomial coefficients)
- Total increase (last point minus first point)
- Area under curve (AUC)
- Maximum or minimum point

With the right choice of summary measure this type of analysis can be very useful, at least as a first step. These models have relatively few assumptions, and they can be checked via standard residual methods. Of course the downside of this method is that information may be lost by reducing each curve to one single measure.

11.3.1 Example: Activity of rats analyzed via summary measure

The choice of summary measure for the rats data set is partly inspired by Figure 11.1. It seems that the average slope is similar for the three treatments, but that the curves from dose=3 tend to be slightly higher than the rest of the curves. To see if this is a significant difference, the logarithm of the total count during all ten months, lntot = log(total count), is used as summary measure.

To calculate this summary measure from the previously described data set, the variable containing the log counts from each month (lnc) must be transformed back to the original counts, then the sum must be calculated, and finally the logarithm must be applied to the sum.

This summary data set consists of independent measurements, as each cage is only used to generate one summary observation. Because the observations are now independent, they can be analyzed with a simple one-way ANOVA model:

$$\mathrm{lntot}_i = \mu + \alpha(\mathrm{treatm}_i) + \varepsilon_i, \quad \varepsilon_i \sim \text{i.i.d. } N(0, \sigma^2), \quad i = 1, \dots, 30$$

These operations can be done in R by writing (the variable containing the logarithm of the total counts is called logsum_count in the code):

    # One row per cage: back-transform, sum over months, take the log
    rats_sum <- ddply(rats, .(cage, treatm), summarize,
                      logsum_count = log(sum(exp(lnc))))
    anova(lm(logsum_count ~ treatm, data = rats_sum))
    Analysis of Variance Table
    Response: logsum_count
              Df Sum Sq Mean Sq F value Pr(>F)
    treatm
    Residuals
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p value for no treatment effect in this summary model is 5.22%. This is above the standard 5% significance level, but only slightly. In this analysis the entire curve has been summarized into a single measure, so a lot of information has been lost. A p value this low for the crude summary analysis could indicate that a significant treatment effect might be found with a more sophisticated analysis.

11.4 Random effects approach

The two approaches described above both illustrated ways to reduce the data set to independent measures. This section explains the first step in modeling the actual covariance.

As seen in previous modules, for instance the module about hierarchical random effects, the effect of adding a random effect is that two observations from the same level are allowed to be positively correlated. Adding the individual factor to the model as a random effect will therefore allow two observations from the same individual to be positively correlated.

11.4.1 Example: Activity of rats analyzed via random effects model

It is reasonable to assume that two observations from the same cage could be correlated, so the model with cage as random effect is used. The factor month and the interaction between month and treatment are included. This was not possible in the previous models, because each curve was reduced to one number. In this analysis all observations are included in one coherent analysis. The model is:

$$\mathrm{lnc}_i = \mu + \alpha(\mathrm{treatm}_i) + \beta(\mathrm{month}_i) + \gamma(\mathrm{treatm}_i, \mathrm{month}_i) + d(\mathrm{cage}_i) + \varepsilon_i,$$

where $i = 1, \dots, 300$, $d(\mathrm{cage}_i) \sim N(0, \sigma_d^2)$, $\varepsilon_i \sim N(0, \sigma^2)$, and all are independent. Recall from previous modules that the covariance structure for this model is:

$$
\operatorname{cov}(y_{i_1}, y_{i_2}) =
\begin{cases}
0, & \text{if } \mathrm{cage}_{i_1} \neq \mathrm{cage}_{i_2} \text{ and } i_1 \neq i_2 \\
\sigma_d^2, & \text{if } \mathrm{cage}_{i_1} = \mathrm{cage}_{i_2} \text{ and } i_1 \neq i_2 \\
\sigma_d^2 + \sigma^2, & \text{if } i_1 = i_2
\end{cases}
$$

In other words, this is the variance structure where two observations from different cages are uncorrelated, and two observations from the same cage are positively correlated with correlation coefficient $\sigma_d^2 / (\sigma_d^2 + \sigma^2)$.

The following lines implement this model in R:

    library(lmerTest)
    # Random intercept for each cage
    model1 <- lmer(lnc ~ month + treatm + month:treatm + (1 | cage), data = rats)
    anova(model1)
    Analysis of Variance Table of type III with Satterthwaite
    approximation for degrees of freedom
                 Sum Sq Mean Sq NumDF DenDF F.value    Pr(>F)
    month                                            < 2.2e-16 ***
    treatm
    month:treatm                                               **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    VarCorr(model1)
     Groups   Name        Std.Dev.
     cage     (Intercept)
     Residual
    c(-2 * logLik(model1, REML = TRUE))  # REML = TRUE is default
    [1]

This output gives estimates of the variance parameters ($\sigma_d^2$ and $\sigma^2$), twice the negative restricted/residual log likelihood ($-2\ell_{re}$ = 8.61), and an ANOVA table for the fixed effects of the model. From this ANOVA table it is seen that the interaction between treatment and month is significant (marked ** in the table, i.e. a p value between 0.001 and 0.01). The conclusion from this model is that treatment does have an effect on the activity, but the effect is not the same in all ten months.

The main problem with this random effects approach is that all measurements on the same individual are assumed equally correlated, whereas some measurements are taken far apart and some are taken close to each other, so this assumption is not always valid. The next module will suggest a few ways to deal with this problem. However, this random effects approach may give reasonable results for short series (with 2, 3, or 4 measurements on each individual), since the assumption of equal correlation may be acceptable in those cases.

This random effects approach is also known as the split-plot approach, or the split-plot model. It is possible to view repeated measurements data as resulting from a kind of split-plot experiment, with individuals as the main plots to which the treatments are applied. The sub plots are then the single measurements on each individual. This interpretation is a bit weak, as the single measurements on each individual (typically at different times) cannot be randomized within the individual.

11.5 Pros and cons of simple approaches

In this module a few simple approaches to the analysis of repeated measurements have been described. In many practical cases these simple approaches, especially the summary method, will give a sufficient and useful analysis of the data. Even in those cases where more sophisticated models are needed, it is often helpful to run a few simple models first.
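The equal-correlation assumption of the random effects approach, which is the reason it suits short series better than long ones, can be seen directly from the covariance matrix the model implies for one individual. The sketch below builds that matrix for ten time points; the two variance values are made up for illustration and are not the estimates from the rats analysis.

```r
# Hypothetical variance components (illustrative values only)
sigma2_d <- 0.03   # between-individual (cage) variance
sigma2   <- 0.04   # residual variance
n_time   <- 10     # measurements per individual

# Implied covariance matrix for the measurements on one individual:
# sigma2_d everywhere, plus sigma2 on the diagonal
V <- sigma2_d * matrix(1, n_time, n_time) + sigma2 * diag(n_time)

# Corresponding correlation matrix
R <- cov2cor(V)

# Correlation implied by the model for any two time points
icc <- sigma2_d / (sigma2_d + sigma2)

# Neighbouring months and months nine steps apart get the same correlation
c(lag1 = R[1, 2], lag9 = R[1, 10], formula = icc)
```

All off-diagonal entries of R are identical, so measurements one month apart are modeled as no more alike than measurements nine months apart.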
Here follow a few pros and cons of the different methods:

Separate analysis for each time point
  + Not wrong
  - Can be confusing
  - Difficult to reach coherent conclusion
  - In general not very informative

Analysis of summary statistic
  + Good method with few and easily checked assumptions
  - Important to choose good summary measure(s)

Random effects approach
  + Good method for short series
  + Uses all observations
  - Usually not good for long series

11.6 The R-package nlme and the function lme

For what comes in the next module, the correlated residuals models, we will have to turn to the lme function of the nlme package. These model structures are not yet available in the lme4 package, although they may be implemented in the future. The lme function has a somewhat different syntax and also a somewhat different structure in the results. To run the simple split-plot version of the repeated measures model also given above:

    library(nlme)
    # Random intercept for each cage, specified via the random argument
    model2 <- lme(lnc ~ month + treatm + month:treatm, random = ~ 1 | cage,
                  data = rats)
    anova(model2)
                 numDF denDF F-value p-value
    (Intercept)                       <.0001
    month                             <.0001
    treatm
    month:treatm
    VarCorr(model2)
    cage = pdLogChol(1)
                Variance StdDev
    (Intercept)
    Residual
    c(-2 * logLik(model2))

    [1]
    intervals(model2, which = "var-cov")
    Approximate 95% confidence intervals
     Random Effects:
      Level: cage
                    lower est. upper
    sd((Intercept))
     Within-group standard error:
     lower est. upper

The confidence intervals for the variance structure parameters produced here are not the same as the profile likelihood intervals that the confint function produces for lmer fits (confint cannot produce profile intervals for lme results). Instead they are Wald intervals constructed for the log-standard deviations. This matches the fact that the intervals are symmetric on the log scale:

    ins <- intervals(model2, which = "var-cov")
    lins <- log(ins$sigma)
    # Distances from the estimate to the lower and upper limits on the log scale
    unname(c(lins[2] - lins[1], lins[3] - lins[2]))
    [1]

11.7 Exercises

Exercise 1 Histamine concentration in dogs

In an experiment with 16 dogs the blood histamine concentration was measured 0, 1, 3, and 5 minutes after injection of morphine or trimethaphane. Before injection the dogs were classified into two groups according to their level of histamine (intact or depleted). The data are available in the file histamin.txt and partly listed below.

    treatm    level   dog  min  hist
    morphine  intact
    morphine  intact
    morphine  intact
    morphine  intact
    morphine  deplet
    morphine  deplet
    morphine  deplet
    morphine  deplet
    trimetha  intact
    trimetha  intact
    trimetha  intact
    trimetha  intact
    trimetha  deplet
    trimetha  deplet
    trimetha  deplet
    trimetha  deplet
    morphine  intact
    ..... (64 lines total) .....
    trimetha  deplet

The main focus of this experiment is to compare the effect of trimethaphane to the effect of morphine.

a) Make a plot of the data, for instance one line for each dog (maybe colored differently in each treatment group).
b) Analyze these data using one or more of the simple methods.
c) Formulate a conclusion about the treatment.

Exercise 2 Growth of guinea pigs

In an investigation of the effect of vitamin E on the growth of guinea pigs, 15 animals were observed for 7 weeks. In week one they were given a growth inhibiting substance. In the beginning of week five they received different amounts of vitamin E (dosage 0, 1, or 2). There were five animals in each treatment group, and each animal was weighed at the end of week 1, 3, 4, 5, 6, and 7. The data are available in the file guinea.txt and are partly listed below.

    animal  week  weight  dose
    ..... (90 lines total) .....

The focus of this experiment is the effect of vitamin E on the growth of guinea pigs.

a) Make a plot of the data.
b) What is the conclusion about vitamin E?

Repeated measures, part 2, advanced methods

Repeated measures, part 2, advanced methods enote 12 1 enote 12 Repeated measures, part 2, advanced methods enote 12 INDHOLD 2 Indhold 12 Repeated measures, part 2, advanced methods 1 12.1 Intro......................................... 3 12.2 A

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Mixed effects models - II Henrik Madsen, Jan Kloppenborg Møller, Anders Nielsen April 16, 2012 H. Madsen, JK. Møller, A. Nielsen () Chapman & Hall

More information

Hierarchical Random Effects

Hierarchical Random Effects enote 5 1 enote 5 Hierarchical Random Effects enote 5 INDHOLD 2 Indhold 5 Hierarchical Random Effects 1 5.1 Introduction.................................... 2 5.2 Main example: Lactase measurements in

More information

Workshop 9.3a: Randomized block designs

Workshop 9.3a: Randomized block designs -1- Workshop 93a: Randomized block designs Murray Logan November 23, 16 Table of contents 1 Randomized Block (RCB) designs 1 2 Worked Examples 12 1 Randomized Block (RCB) designs 11 RCB design Simple Randomized

More information

Solution pigs exercise

Solution pigs exercise Solution pigs exercise Course repeated measurements - R exercise class 2 November 24, 2017 Contents 1 Question 1: Import data 3 1.1 Data management..................................... 3 1.2 Inspection

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University

More information

STK4900/ Lecture 10. Program

STK4900/ Lecture 10. Program STK4900/9900 - Lecture 10 Program 1. Repeated measures and longitudinal data 2. Simple analysis approaches 3. Random effects models 4. Generalized estimating equations (GEE) 5. GEE for binary data (and

More information

Mixed Model Theory, Part I

Mixed Model Theory, Part I enote 4 1 enote 4 Mixed Model Theory, Part I enote 4 INDHOLD 2 Indhold 4 Mixed Model Theory, Part I 1 4.1 Design matrix for a systematic linear model.................. 2 4.2 The mixed model.................................

More information

Workshop 9.1: Mixed effects models

Workshop 9.1: Mixed effects models -1- Workshop 91: Mixed effects models Murray Logan October 10, 2016 Table of contents 1 Non-independence - part 2 1 1 Non-independence - part 2 11 Linear models Homogeneity of variance σ 2 0 0 y i = β

More information

The Analysis of Split-Plot Experiments

The Analysis of Split-Plot Experiments enote 7 1 enote 7 The Analysis of Split-Plot Experiments enote 7 INDHOLD 2 Indhold 7 The Analysis of Split-Plot Experiments 1 7.1 Introduction.................................... 2 7.2 The Split-Plot Model...............................

More information

SMA 6304 / MIT / MIT Manufacturing Systems. Lecture 10: Data and Regression Analysis. Lecturer: Prof. Duane S. Boning

SMA 6304 / MIT / MIT Manufacturing Systems. Lecture 10: Data and Regression Analysis. Lecturer: Prof. Duane S. Boning SMA 6304 / MIT 2.853 / MIT 2.854 Manufacturing Systems Lecture 10: Data and Regression Analysis Lecturer: Prof. Duane S. Boning 1 Agenda 1. Comparison of Treatments (One Variable) Analysis of Variance

More information

Mixed Model: Split plot with two whole-plot factors, one split-plot factor, and CRD at the whole-plot level (e.g. fancier split-plot p.

Mixed Model: Split plot with two whole-plot factors, one split-plot factor, and CRD at the whole-plot level (e.g. fancier split-plot p. STAT:5201 Applied Statistic II Mixed Model: Split plot with two whole-plot factors, one split-plot factor, and CRD at the whole-plot level (e.g. fancier split-plot p.422 OLRT) Hamster example with three

More information

Correlated Data: Linear Mixed Models with Random Intercepts

Correlated Data: Linear Mixed Models with Random Intercepts 1 Correlated Data: Linear Mixed Models with Random Intercepts Mixed Effects Models This lecture introduces linear mixed effects models. Linear mixed models are a type of regression model, which generalise

More information

Mixed models with correlated measurement errors

Mixed models with correlated measurement errors Mixed models with correlated measurement errors Rasmus Waagepetersen October 9, 2018 Example from Department of Health Technology 25 subjects where exposed to electric pulses of 11 different durations

More information

Longitudinal data: simple univariate methods of analysis

Longitudinal data: simple univariate methods of analysis Longitudinal data: simple univariate methods of analysis Danish version by Henrik Stryhn, June 1996 Department of Mathematcis and Physics, KVL Translation (and rewriting) by Ib Skovgaard, March 1998 (the

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent

More information

BIOSTATISTICAL METHODS

BIOSTATISTICAL METHODS BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH Cross-over Designs #: DESIGNING CLINICAL RESEARCH The subtraction of measurements from the same subject will mostly cancel or minimize effects

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

Do not copy, post, or distribute

Do not copy, post, or distribute 14 CORRELATION ANALYSIS AND LINEAR REGRESSION Assessing the Covariability of Two Quantitative Properties 14.0 LEARNING OBJECTIVES In this chapter, we discuss two related techniques for assessing a possible

More information

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form Outline Statistical inference for linear mixed models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark general form of linear mixed models examples of analyses using linear mixed

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 14 1 / 64 Data structure and Model t1 t2 tn i 1st subject y 11 y 12 y 1n1 2nd subject

More information

STK4900/ Lecture 3. Program

STK4900/ Lecture 3. Program STK4900/9900 - Lecture 3 Program 1. Multiple regression: Data structure and basic questions 2. The multiple linear regression model 3. Categorical predictors 4. Planned experiments and observational studies

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

The First Thing You Ever Do When Receive a Set of Data Is

The First Thing You Ever Do When Receive a Set of Data Is The First Thing You Ever Do When Receive a Set of Data Is Understand the goal of the study What are the objectives of the study? What would the person like to see from the data? Understand the methodology

More information

Extensions of One-Way ANOVA.

Extensions of One-Way ANOVA. Extensions of One-Way ANOVA http://www.pelagicos.net/classes_biometry_fa18.htm What do I want You to Know What are two main limitations of ANOVA? What two approaches can follow a significant ANOVA? How

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Randomized Block Designs with Replicates

Randomized Block Designs with Replicates LMM 021 Randomized Block ANOVA with Replicates 1 ORIGIN := 0 Randomized Block Designs with Replicates prepared by Wm Stein Randomized Block Designs with Replicates extends the use of one or more random

More information

Analysis of Variance

Analysis of Variance Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also

More information

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph.

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph. Regression, Part I I. Difference from correlation. II. Basic idea: A) Correlation describes the relationship between two variables, where neither is independent or a predictor. - In correlation, it would

More information
