Repeated measures, part 1, simple methods
- Allyson Walton
- 5 years ago
eNote 11
Contents

11.1 Intro
11.1.1 Main example: Activity of rats
11.2 Separate analyses for each time point
11.2.1 Example: Activity of rats analyzed separately for each month
11.3 Analysis of summary statistics
11.3.1 Example: Activity of rats analyzed via summary measure
11.4 Random effects approach
11.4.1 Example: Activity of rats analyzed via random effects model
11.5 Pros and cons of simple approaches
11.6 The R-package nlme and the function lme
11.7 Exercises

11.1 Intro

This module describes various simple approaches for analyzing repeated measurements and shows how these analyses can be carried out in R. Data referred to as repeated measurements (or sometimes as longitudinal data) are characterized by having several measurements on the same individuals or experimental units. These measurements
are typically taken at different times, or at different positions within the individuals. Consider for instance the following experimental design, to compare two drugs (A and B) for reducing blood pressure:

1. Twenty individuals were selected randomly from the relevant population.
2. Half of these were given drug A and half were given drug B (randomly selected).
3. For a period of two months these individuals had their blood pressure measured every week, which resulted in eight measurements on each individual.

The problem is that data collected this way may violate the standard assumption of independent measurements. It seems fair to expect two measurements from the same individual to be positively correlated, that is, to be more similar than two measurements from different individuals. Furthermore, two measurements taken on the same individual might be highly correlated if they are taken at two time points close to each other, but less correlated (or maybe independent) if they are taken far apart.

This module describes some fairly simple (and maybe crude) methods for analyzing these data types. These methods include:

- Separate analyses for each time point
- Analysis of summary statistics
- Random effects approach

The analysis of repeated measurements will continue in the next module, where some more explicit covariance models will be shown.

The simplest approach to analyzing repeated measurements would be to include time as a factor and ignore the dependence between two observations on the same individual. Such an approach may lead to completely wrong conclusions. The essence of the problem is that this is the same as pretending to have more observations than are actually available. Two correlated observations contain less information than two independent observations, because one is partly explained by the other. This approach is unacceptable.
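The loss of information can be made concrete with a short worked calculation. This is a sketch added for illustration, not part of the original note: the symbols $y_1$, $y_2$, $\sigma^2$ and $\rho$ are assumptions introduced here, standing for two observations on the same individual with common variance $\sigma^2$ and correlation $\rho$.

```latex
\[
\operatorname{Var}\!\left(\frac{y_1 + y_2}{2}\right)
  = \frac{1}{4}\left(\sigma^2 + \sigma^2 + 2\rho\,\sigma^2\right)
  = \frac{\sigma^2\,(1 + \rho)}{2}.
\]
```

For $\rho = 0$ this is the usual $\sigma^2/2$, but for $\rho = 0.8$ it is $0.9\,\sigma^2$, so the two observations are worth roughly $2/(1+\rho) \approx 1.1$ independent observations. An analysis that treats them as two independent observations therefore overstates the precision.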
11.1.1 Main example: Activity of rats

To investigate the effect of a certain type of exposure on the activity of rats, the following experiment was carried out. The experimental unit was a cage with two rats. During the entire experimental period the rats were exposed daily to the matter under investigation, in a concentration of 1, 2 or 3 units (treatment 1, 2 and 3, respectively). Once per month for 10 months the activity of the rats was measured by placing the rats from one cage in a chamber in which each intersection of a light beam was counted. The total count over a period of 57 hours was used as the result for that cage. Notice that in this setting the individual variable is cage.

Summary of experiment:

- 3 treatments: 1, 2, 3 (concentration)
- 10 cages per treatment
- 10 contiguous months

The response is activity (count of intersections of the light beam during 57 hours). Here y = log(counts) is used for the analysis, because a residual plot showed that this was the most reasonable scale. The observations are listed in Table 11.1 and plotted in Figure 11.1.

From the figure it seems that the activity is decreasing from month 1 to month 10 (maybe as a linear function?), and maybe that there is a small difference between the different doses.

Plotting the individual curves is a very useful tool in the analysis of repeated measures. This should always be the first step. Quite often the main conclusions from the analysis can already be seen from a good plot of the data.
Here we use interaction.plot to plot the individual profiles; the resulting plot is shown in Figure 11.1:

    rats <- read.table("rats.txt", header = TRUE, sep = ",", dec = ".")
    # Treat month, treatment and cage as factors
    rats$month <- factor(rats$month)
    rats$treatm <- factor(rats$treatm)
    rats$cage <- factor(rats$cage)
    # One line per cage; line type and colour identify the treatment group
    with(rats, interaction.plot(month, cage, lnc, legend = FALSE, las = 1,
                                lty = rep(1:3, each = 10),
                                col = rep(2:4, each = 10)))

[Figure 11.1: The log(counts) for each cage plotted against month (x-axis: month; y-axis: mean of lnc). The solid red lines are cages receiving dose 1, dashed green lines are dose 2, and dotted blue lines are dose 3.]

[Table 11.1: The rats data set; the raw activity counts are listed by month, dose and cage.]

Using the interaction.plot function here only produces the right plot because the time points are equidistant in these data. More generally one would need a quantitative copy of the time variable and use this as the x-axis in the plotting, e.g. using the ggplot2 package:

    library(ggplot2)
    # Numeric copy of month; going via as.character keeps the actual month values
    rats$monthq <- as.numeric(as.character(rats$month))
    ggplot(rats, aes(x = monthq, y = lnc, group = cage, colour = treatm)) +
      geom_line()
[Figure: lnc plotted against monthq, one line per cage, coloured by treatm.]

Or the treatment group average time profiles:

    library(plyr)
    # Mean lnc for each combination of treatment and month
    mns <- ddply(rats, ~ treatm + month + monthq, summarize, lnc = mean(lnc))
    ggplot(mns, aes(x = monthq, y = lnc, group = treatm, colour = treatm)) +
      geom_point() + geom_line()
[Figure: mean lnc plotted against monthq, one line per treatment group.]

Here we used the ddply function from the plyr package to compute the mean lnc for each level or unique value of treatm, month and monthq. We will not describe the details of this function or package, but only note that it is very efficient for data manipulation.

11.2 Separate analyses for each time point

One way to avoid the problem of correlated measurements is to do a separate analysis for each point in time. This way only one observation from each individual is used in each analysis, and hence the observations are independent. This way of analyzing repeated measurements is not wrong, but it is very inefficient, as all the remaining observations are wasted. This approach avoids the problem instead of dealing with it.

Separate analyses can be carried out for all the observed time points, but it will likely be very difficult to reach a coherent conclusion from all these sub tests. The sub tests will be correlated, and because the correlation structure is not part of the model, it is not possible to tell how strong this correlation is.
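The correlation between sub tests can be illustrated with a small simulation. The sketch below is not part of the rats analysis: it assumes a hypothetical design with 30 cages in three groups, two time points, a random cage effect with standard deviation 1, a residual standard deviation of 0.5 and no true treatment effect. The names sim_one, f_stats and rho_f are made up for this illustration.

```r
# Simulate two time points that share a random cage effect and compare
# the separate one-way ANOVA F statistics (assumed, illustrative settings).
set.seed(1)
n_sim  <- 500
treatm <- factor(rep(1:3, each = 10))   # 3 groups of 10 cages

sim_one <- function() {
  cage_effect <- rnorm(30, sd = 1)          # shared by both time points
  y1 <- cage_effect + rnorm(30, sd = 0.5)   # time point 1, no treatment effect
  y2 <- cage_effect + rnorm(30, sd = 0.5)   # time point 2, no treatment effect
  c(t1 = anova(lm(y1 ~ treatm))[1, "F value"],
    t2 = anova(lm(y2 ~ treatm))[1, "F value"])
}

# One row per simulated data set, one column per time point
f_stats <- t(replicate(n_sim, sim_one()))

# Rank correlation between the two separate F statistics
rho_f <- cor(f_stats[, "t1"], f_stats[, "t2"], method = "spearman")
rho_f
```

Under these assumed settings the two F statistics should come out clearly positively correlated, even though each separate analysis is valid on its own. In real data the size of this correlation is unknown, which is exactly why the sub tests are hard to combine.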
Separate analyses can be carried out for selected time points far apart. This will (hopefully) cause the separate sub tests to be uncorrelated, or at least less correlated. Even with uncorrelated tests it will be difficult to reach a coherent conclusion, because of a problem known as mass significance (or multiplicity). For instance, if 20 independent tests are carried out at a 5% significance level and all null hypotheses are true, one false positive is expected on average, and the probability of at least one falsely significant p-value is 1 - 0.95^20, about 64%. This problem is partly solved by using the Bonferroni correction for performing n tests (one for each time point). The Bonferroni correction simply states that the significance level 0.05/n should be used instead of the usual 0.05 (which is sometimes expressed by multiplying the calculated p-value by n).

When selecting time points far apart, it is important that the selection is done independently of the actual observations. Naturally, the time points must not be selected systematically where there is a large (or small) difference between treatments. Ideally the time points should be selected before the data are collected.

11.2.1 Example: Activity of rats analyzed separately for each month

Consider the rats data available in the rats.txt file.

    head(rats)

[Output: the first rows of the data frame, with columns treatm, cage, month, lnc and monthq; values not recovered.]

To analyze the rats data set separately for each month, a simple one-way analysis of variance model with treatment (treatm) as the only factor is used. The information about cage cannot be included, as we only have one observation from each cage in each monthly analysis. The model for each month is:

$$\mathrm{lnc}_i = \mu + \alpha(\mathrm{treatm}_i) + \varepsilon_i, \quad \varepsilon_i \ \text{i.i.d.}\ N(0, \sigma^2), \quad i = 1, \ldots, 30$$

To do this in R, we split the rats data frame into a list of data frames, one for each month. We then apply (using sapply) the function fn, which fits the linear model and extracts the F and p values, to each of the data frames in the list:
    ratsl <- split(rats, f = rats$month)  # a list of data.frames
    # Function to fit model, get F and p from anova table:
    fn <- function(df)
      unlist(anova(lm(lnc ~ treatm, data = df))[1, c("F value", "Pr(>F)")])
    # Alternative using plyr-functions:
    # round(t(daply(rats, ~ month, .fun = fn)), 2)
    round(sapply(ratsl, fn), 3)

[Output: one F value and one p-value, Pr(>F), for each month; values not recovered.]

These F values should be compared with $F_{95\%,2,27} = 3.35$, or with $F_{99.5\%,2,27} = 6.49$ if the Bonferroni correction is used (in R: qf(0.95, 2, 27) and qf(0.995, 2, 27)). A few significant values are found, and even one when the Bonferroni correction is used, so the conclusion should be that weak evidence of a group difference has been seen.

It is possible to make a correct analysis time by time, but it is weak and often confusing, because it does not combine all information into one test.

11.3 Analysis of summary statistics

Another way to avoid the problem of correlated measurements is to choose a single measure to summarize the individual curves, and then base the analysis on this measure. This again reduces the data set to independent observations, one for each individual. To analyze the summary data set, standard methods for independent observations, for instance analysis of variance, can be used.

The key is to choose a good summary measure. One possibility is to choose the value at a given time point, which reduces this summary method to the separate time point analysis described in the previous section. This choice is poor in most cases, because all other measurements are wasted.

It is difficult to give general advice about the choice of summary measure. Ideally, the summary measure should capture the most important feature of the curve. In some situations the most important feature is the net growth (last minus first), the average growth (slope), or the time to reach the maximum point. It depends on the problem at hand.
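As a hedged illustration of how such per-individual summary measures can be computed, the sketch below uses base R on a small made-up data set. The data frame toy, its columns id, time and y, and the helper summarise_curve are assumptions introduced here and are not part of the rats example.

```r
# Toy data: two individuals measured at four time points (made-up values)
toy <- data.frame(
  id   = factor(rep(c("a", "b"), each = 4)),
  time = rep(1:4, times = 2),
  y    = c(1, 2, 3, 4,   4, 4, 6, 6)
)

# Reduce one individual's curve to a single row of summary measures
summarise_curve <- function(d) {
  d <- d[order(d$time), ]
  n <- nrow(d)
  data.frame(
    id          = d$id[1],
    mean_y      = mean(d$y),                                  # average over time
    slope       = unname(coef(lm(y ~ time, data = d))["time"]), # linear trend
    net_change  = d$y[n] - d$y[1],                            # last minus first
    # area under the curve by the trapezoidal rule
    auc         = sum(diff(d$time) * (head(d$y, -1) + tail(d$y, -1)) / 2),
    time_of_max = d$time[which.max(d$y)]                      # first time the maximum is reached
  )
}

# One row per individual: these rows are independent observations
summaries <- do.call(rbind, lapply(split(toy, toy$id), summarise_curve))
summaries
```

Each column of summaries is a candidate response for a standard analysis of independent observations. Which column is appropriate depends on the scientific question, not on the data.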
Some common choices of summary measures are:

- Average over time
- Slope in regression with time (or higher order polynomial coefficients)
- Total increase (last point minus first point)
- Area under curve (AUC)
- Maximum or minimum point

With the right choice of summary measure this type of analysis can be very useful, at least as a first step. These models have relatively few assumptions, and they can be checked via standard residual methods. Of course the downside of this method is that information may be lost by reducing each curve to one single measure.

11.3.1 Example: Activity of rats analyzed via summary measure

The choice of summary measure for the rats data set is partly inspired by Figure 11.1. It seems that the average slope is similar for the three treatments, but that the curves from dose 3 tend to be slightly higher than the rest of the curves. To see if this is a significant difference, the logarithm of the total count during all ten months, lntot = log(total count), is used as summary measure.

To calculate this summary measure from the previously described data set, the variable containing the log counts from each month (lnc) must be transformed back to the original counts, then the sum must be calculated, and finally the logarithm must be applied to the sum.

This summary data set consists of independent measurements, as each cage is only used to generate one summary observation. Because the observations are now independent, they can be analyzed with a simple one-way ANOVA model:

$$\mathrm{lntot}_i = \mu + \alpha(\mathrm{treatm}_i) + \varepsilon_i, \quad \varepsilon_i \ \text{i.i.d.}\ N(0, \sigma^2), \quad i = 1, \ldots, 30$$

These operations can be done in R as follows (in the code, the variable containing the logarithm of the total counts is called logsum_count):
    library(plyr)  # provides ddply
    # One row per cage: back-transform, sum over months, take the log again
    rats_sum <- ddply(rats, .(cage, treatm), summarize,
                      logsum_count = log(sum(exp(lnc))))
    anova(lm(logsum_count ~ treatm, data = rats_sum))

    Analysis of Variance Table
    Response: logsum_count
              Df Sum Sq Mean Sq F value Pr(>F)
    treatm    [values not recovered]
    Residuals [values not recovered]

The p-value for no treatment effect in this summary model is 5.22%. This is above the standard 5% significance level, but only slightly. In this analysis the entire curve has been summarized into a single measure, so a lot of information has been lost. A p-value this low for the crude summary analysis could indicate that a significant treatment effect might be found with a more sophisticated analysis.

11.4 Random effects approach

The two approaches described above both illustrated ways to reduce the data set to independent measurements. This section explains the first step in modeling the actual covariance.

As seen in previous modules, for instance the module about hierarchical random effects, the effect of adding a random effect is that two observations from the same level are allowed to be positively correlated. Adding the individual factor to the model as a random effect will thus allow two observations from the same individual to be positively correlated.

11.4.1 Example: Activity of rats analyzed via random effects model

It is reasonable to assume that two observations from the same cage could be correlated, so the model with cage as random effect is used. The factor month and the interaction between month and treatment are included. This was not possible in the summary measure model, because each curve was reduced to one number. In this analysis all observations
are included in one coherent analysis. The model is:

$$\mathrm{lnc}_i = \mu + \alpha(\mathrm{treatm}_i) + \beta(\mathrm{month}_i) + \gamma(\mathrm{treatm}_i, \mathrm{month}_i) + d(\mathrm{cage}_i) + \varepsilon_i,$$

where $i = 1, \ldots, 300$, $d(\mathrm{cage}_i) \sim N(0, \sigma_d^2)$, $\varepsilon_i \sim N(0, \sigma^2)$, and all are independent. Recall from previous modules that the covariance structure for this model is:

$$\operatorname{cov}(y_{i_1}, y_{i_2}) =
\begin{cases}
0, & \text{if } \mathrm{cage}_{i_1} \neq \mathrm{cage}_{i_2} \text{ and } i_1 \neq i_2 \\
\sigma_d^2, & \text{if } \mathrm{cage}_{i_1} = \mathrm{cage}_{i_2} \text{ and } i_1 \neq i_2 \\
\sigma_d^2 + \sigma^2, & \text{if } i_1 = i_2
\end{cases}$$

In other words, this is the variance structure where two observations from different cages are uncorrelated, and two observations from the same cage are positively correlated with correlation coefficient $\sigma_d^2/(\sigma_d^2 + \sigma^2)$.

The following lines implement this model in R:

    library(lmerTest)  # loads lme4 and adds F tests with Satterthwaite df
    model1 <- lmer(lnc ~ month + treatm + month:treatm + (1 | cage), data = rats)
    anova(model1)

    Analysis of Variance Table of type III with Satterthwaite
    approximation for degrees of freedom
                 Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
    month        [values not recovered]  < 2.2e-16 ***
    treatm       [values not recovered]
    month:treatm [values not recovered]  **

    VarCorr(model1)

    Groups   Name        Std.Dev.
    cage     (Intercept) [value not recovered]
    Residual             [value not recovered]

    c(-2 * logLik(model1, REML = TRUE))  # REML=TRUE is default

    [1] [value not recovered]
This output gives estimates of the variance parameters ($\sigma_d^2$ and $\sigma^2$), twice the negative restricted/residual log likelihood ($-2\ell_{re} = 8.61$), and an ANOVA table for the fixed effects of the model. From this ANOVA table it is seen that the interaction between treatment and month is significant (flagged ** in the output, corresponding to a p-value between 0.001 and 0.01). The conclusion from this model is that treatment does have an effect on the activity, but the effect is not the same in all ten months.

The main problem with this random effects approach is that all measurements on the same individual are assumed equally correlated, whereas some measurements are taken far apart and some are taken close to each other, so this assumption is not always valid. The next module will suggest a few ways to deal with this problem. However, the random effects approach may give reasonable results for short series (with 2, 3, or 4 measurements on each individual), since the assumption of equal correlation may be acceptable in those cases.

This random effects approach is also known as the split-plot approach, or the split-plot model. It is possible to view repeated measurements data as resulting from a kind of split-plot experiment, with individuals as the main plots to which the treatments are applied. The sub plots are then the single measurements on each individual. This interpretation is a bit weak, as the single measurements on each individual (typically at different times) cannot be randomized within the individual.

11.5 Pros and cons of simple approaches

In this module a few simple approaches to the analysis of repeated measurements have been described. In many practical cases these simple approaches, especially the summary method, will give a sufficient and useful analysis of the data. Even in those cases where more sophisticated models are needed, it is often helpful to run a few simple models first.
Here follow a few pros and cons of the different methods:

Separate analysis for each time point
+ Not wrong
- Can be confusing
- Difficult to reach a coherent conclusion
- In general not very informative

Analysis of summary statistic
+ Good method with few and easily checked assumptions
- Important to choose good summary measure(s)

Random effects approach
+ Good method for short series
+ Uses all observations
- Usually not good for long series

11.6 The R-package nlme and the function lme

For what comes in the next module, the correlated residuals models, we will have to turn to the lme function of the nlme package. These model structures are not yet available in the lme4 package, although they may be implemented in the future. The lme function has a somewhat different syntax and also a somewhat different structure in the results. To run the simple split-plot version of the repeated measures model also given above:

    library(nlme)
    # Random intercept for each cage
    model2 <- lme(lnc ~ month + treatm + month:treatm,
                  random = ~ 1 | cage, data = rats)
    anova(model2)

                 numDF denDF F-value p-value
    (Intercept)  [values not recovered]  <.0001
    month        [values not recovered]  <.0001
    treatm       [values not recovered]
    month:treatm [values not recovered]

    VarCorr(model2)

    cage = pdLogChol(1)
                Variance StdDev
    (Intercept) [values not recovered]
    Residual    [values not recovered]

    c(-2 * logLik(model2))
    [1] [value not recovered]

    intervals(model2, which = "var-cov")

    Approximate 95% confidence intervals
     Random Effects:
      Level: cage
                      lower est. upper
    sd((Intercept))   [values not recovered]
     Within-group standard error:
                      lower est. upper
                      [values not recovered]

The confidence intervals for the variance structure parameters produced here are not the same as those produced by the confint function (which cannot produce profile intervals for lme results). Instead they are Wald intervals constructed for the log standard deviations. This matches the fact that the intervals are symmetric on the log scale:

    ins <- intervals(model2, which = "var-cov")
    lins <- log(ins$sigma)
    # Distance from lower limit to estimate and from estimate to upper limit
    unname(c(lins[2] - lins[1], lins[3] - lins[2]))

    [1] [values not recovered]

11.7 Exercises

Exercise 1: Histamine concentration in dogs

In an experiment with 16 dogs the blood histamine concentration was measured 0, 1, 3, and 5 minutes after injection of morphine or trimethaphane. Before injection the dogs were classified into two groups according to their level of histamine (intact or depleted). The data are available in the file histamin.txt and partly listed below.
[Data listing: columns treatm (morphine or trimetha), level (intact or deplet), dog, min and hist; 64 lines in total; the numeric values are not recovered.]

The main focus of this experiment is to compare the effect of trimethaphane to the effect of morphine.

a) Make a plot of the data, for instance one line for each dog (maybe colored differently in each treatment group).
b) Analyze these data using one or more of the simple methods.
c) Formulate a conclusion about the treatment.
Exercise 2: Growth of guinea pigs

In an investigation of the effect of vitamin E on the growth of guinea pigs, 15 animals were observed for 7 weeks. In week one they were given a growth-inhibiting substance. At the beginning of week five they received different amounts of vitamin E (dosage 0, 1, or 2). There were five animals in each treatment group, and each animal was weighed at the end of weeks 1, 3, 4, 5, 6, and 7. The data are available in the file guinea.txt and are partly listed below.

[Data listing: columns animal, week, weight and dose; 90 lines in total; the values are not recovered.]

The focus of this experiment is the effect of vitamin E on the growth of guinea pigs.

a) Make a plot of the data.
b) What is the conclusion about vitamin E?
More informationSTAT 510 Final Exam Spring 2015
STAT 510 Final Exam Spring 2015 Instructions: The is a closed-notes, closed-book exam No calculator or electronic device of any kind may be used Use nothing but a pen or pencil Please write your name and