Multi-factor analysis of variance


Faculty of Health Sciences

Multi-factor analysis of variance
Basic statistics for experimental researchers 2016
Julie Forman and Claus Thorn Ekstrøm
Department of Biostatistics, University of Copenhagen

Outline
- Two-way ANOVA and interaction
- Matched samples ANOVA
- Random vs systematic variation
- Mixed models
- Repeatability and reproducibility

Two-way analysis of variance
What is the effect of two treatments in combination? Treatment A has levels 1, 2, ..., r and treatment B has levels 1, 2, ..., c.
- Effect of treatment A?
- Effect of treatment B?
- Do the two treatments interact?
Quantification and test of interaction is
- possible with multiple observations for each combination,
- not possible with only one observation for each combination.

Example: Tumor growth
Randomized experiment with 28 animals.
Two treatments, nicely balanced:
- days (0/1): time of treatment (day zero or day one).
- radiation (Control/10 Gy).
Outcome: tumor volume.
Note: there is one measurement on each animal, so we have independent data!

Data from combined treatment groups

  Radiation   Day   Mean (SD)
  Control                (36.1)
  Control                (46.1)
  10 Gy                  (73.3)
  10 Gy                  (47.3)

  > load("tumgrow.rda")
  > tumgrow$days <- factor(tumgrow$days)
  > plot(TumVol ~ interaction(radiation, days),
  +      data=tumgrow,
  +      col="lightblue", cex.lab=1.4, cex.axis=1.4)

[Figure: box plots of TumVol for the groups Control.0, 10 Gy.0, Control.1 and 10 Gy.1; x-axis: interaction(radiation, days)]

Interaction plot

  > with(tumgrow, interaction.plot(days, radiation, TumVol))

[Figure: mean of TumVol against days, one line for each radiation group (Control, 10 Gy)]

Sample means for combined treatments: do we see the same effect of days with and without radiation?

Interaction
What is interaction?
Interaction between two treatments (or factors) means that the effects of the two treatments depend on one another.
When interaction is present:
- Quantify differences between treatment combinations as in a one-way ANOVA.
- Estimated effects of one treatment must be presented for each value of the other treatment in turn (and vice versa).
Interaction is also called effect modification, because the effect of one treatment is modified by the other.

Model with or without interaction
Model for the k-th animal with the combination of the i-th and j-th treatment:
$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$$
- $\mu$: mean of the reference group (no radiation, day 0).
- $\alpha$: effect of radiation (at day 0).
- $\beta$: effect of time (without radiation).
- $\gamma$: possible interaction (i.e. increase/decrease in the anticipated effect when the two treatments are combined).
- The error terms $\varepsilon_{ijk}$ are assumed independent $N(0, \sigma^2)$.
In case $\gamma = 0$ we say that there is no interaction, or that the treatment effects are additive. We can test this as a hypothesis.

Parameter estimates

  Effect        With interaction      Without interaction
                Estimate (95% CI)     Estimate (95% CI)
  Intercept     (117.9; 204.8)        (133.3; 205.9)
  Radiation     (-67.2; 47.8)         (-64.3; 16.1)
  Days          (-15.7; 107.3)        (-10.6; 69.0)
  Interaction   (-110.3; 52.4)        assumed = 0!

Hence the estimated effect of radiation is lower (95% CI -110.3 to 52.4) for an animal examined at day 1 compared to an animal examined at day 0.
The interaction is not significant, but the confidence interval is wide, so we cannot completely rule out a possible effect modification.

Interpretation of parameter estimates
Expected tumor volume for each treatment combination (radiation: Control or 10 Gy; day: 0 or 1) is obtained by adding the estimated treatment effects to the mean of the reference group.

Testing interaction in R

  lm(TumVol ~ radiation*days, data=tumgrow)

or similarly

  lm(TumVol ~ radiation + days + radiation:days, data=tumgrow)

Note that:
- Use lm for two-way ANOVA.
- Both factors must be factors, either in the call or in the data frame.
- If reference groups are not chosen, R uses the first level in alphabetical order.
- Use relevel to change the reference level for a factor.
- Both the main effects and the interaction must be included in the model formula (either manually or through *).
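The point estimates are not shown in the table above, but a t-based 95% confidence interval from a linear model is symmetric around its estimate, so the midpoint of each interval recovers the estimate up to rounding. The sketch below does this for the model with interaction and then adds the effects to the reference mean to obtain the expected tumor volume for each treatment combination. The object names (`ci`, `est`, `expected`) are illustrative, and the resulting numbers are approximations derived from the rounded interval limits, not values quoted from the slides.

```r
# 95% confidence limits for the model with interaction (from the table).
ci <- rbind(
  intercept   = c(lower = 117.9,  upper = 204.8),
  radiation   = c(lower = -67.2,  upper = 47.8),
  days        = c(lower = -15.7,  upper = 107.3),
  interaction = c(lower = -110.3, upper = 52.4)
)

# A t-based interval is symmetric, so its midpoint is the point estimate.
est <- rowMeans(ci)

# Expected tumor volume: reference mean plus the relevant effects.
# The matrix is filled column by column (day 0 first, then day 1).
expected <- matrix(
  c(est[["intercept"]],
    est[["intercept"]] + est[["radiation"]],
    est[["intercept"]] + est[["days"]],
    sum(est)),
  nrow = 2,
  dimnames = list(radiation = c("Control", "10 Gy"), day = c("0", "1"))
)

print(round(est, 2))
print(round(expected, 1))
```

Only the combined group (10 Gy, day 1) involves the interaction term; the other three cells use the main effects alone.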

Output: test for interaction

  > result <- lm(TumVol ~ radiation*days, data=tumgrow)
  > drop1(result, test="Chisq")
  Single term deletions
  Model: TumVol ~ radiation * days
                 Df Sum of Sq RSS AIC Pr(>Chi)
  <none>
  radiation:days

- Type III-type tests: drop one factor while keeping the others.
- Hierarchical: do not test main effects if interaction is present.

Output: parameter estimates

  > result <- lm(TumVol ~ radiation*days, data=tumgrow)
  > summary(result)

The coefficient table (Estimate, Std. Error, t value, Pr(>|t|)) has rows (Intercept), radiation10 Gy, days1 and radiation10 Gy:days1; the intercept has a p-value of order e-08 (***). The residual standard error is on 24 degrees of freedom, and the F-statistic is 1.43 on 3 and 24 DF.
Lots of useful information!

Post hoc testing: Interaction
If interaction is present, compare all treatment combinations with:

  > library(lsmeans)
  > all <- lsmeans(result, ~ radiation*days)
  > pairs(all)
   contrast                estimate SE df t.ratio p.value
   Control,0 - 10 Gy,0
   Control,0 - Control,1
   Control,0 - 10 Gy,1
   10 Gy,0 - Control,1
   10 Gy,0 - 10 Gy,1
   Control,1 - 10 Gy,1
  P value adjustment: tukey method for comparing a family of 4 estimates

Or assess the effect of each treatment given the other with:

  > all <- lsmeans(result, ~ radiation | days)
  > pairs(all)
  days = 0:
   contrast        estimate SE df t.ratio p.value
   Control - 10 Gy
  days = 1:
   contrast        estimate SE df t.ratio p.value
   Control - 10 Gy

Post hoc testing: No interaction
If there is no interaction, the treatment effects are additive. Hence, assess each treatment in turn with:

  > result2 <- lm(TumVol ~ radiation + days, data=tumgrow)
  > lsmeans(result2, ~ radiation)
   radiation lsmean SE df lower.CL upper.CL
   Control
   10 Gy
  Results are averaged over the levels of: days
  Confidence level used: 0.95
  > lsmeans(result2, ~ days)
   days lsmean SE df lower.CL upper.CL
  Results are averaged over the levels of: radiation
  Confidence level used: 0.95

or use summary() since there are only two levels of each factor.

Model checking
The error terms $\varepsilon_{ijk}$ are assumed to be
- independent (this we know to be true),
- normally distributed with zero mean and equal variances.
Use the residuals for model checking:
- Probability plot or QQ-plot of residuals.
- Plot of residuals vs expected values and/or factors.
- Any outliers in the data?

Expected values and residuals
Expected value, e.g. for radiation = 10 Gy, day = 1:
$$\hat{y}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_{ij}$$
Residual, e.g. for the last animal:
$$r_{ijk} = \text{observed} - \text{expected} = y_{ijk} - \hat{y}_{ij}$$
The residual $r_{ijk}$ is our estimate of the error term $\varepsilon_{ijk}$.

Diagnostic plots

  > library(MESS)
  > residualplot(result)
  > qqnorm(residuals(result))
  > qqline(residuals(result))

[Figures: studentized residuals against fitted values; normal Q-Q plot of sample quantiles against theoretical quantiles]

Overview: comparison of treatment groups

  Number of groups   Independent samples            Paired samples
  2                  unpaired t-test                paired t-test
  > 2                one-way analysis of variance   two-way analysis of variance or mixed model

Analysis of variance:
- Last week: t-tests and one-way ANOVA.
- Today: two-way ANOVA and mixed models.

Example + exercise: Gene expressions
Four treatments (Ctrl, A, B, C) applied to five cell lines (from lecture 2).
Do we see:
- differences among treatments?
- differences among cell lines? (Is this interesting?)
- interaction? (Not possible to test and not that interesting.)

Spaghetti-o-gram

  > library(lattice)
  > load("geneexp.rda")
  > xyplot(ge ~ treatment,
  +        groups=cellline,
  +        data=geneexp, type="l")
  > xyplot(log(ge) ~ treatment,
  +        groups=cellline,
  +        data=geneexp, type="l")

[Figures: ge and log(ge) against treatment (Ctrl, A, B, C), one line for each cell line]

The cell lines should be roughly parallel and equally variable.
Log-transformed data seem better than raw data.

Two-way ANOVA model
Measurement for subject s with treatment t:
$$Y_{st} = \mu + \alpha_s + \beta_t + \varepsilon_{st}$$
- $\mu$ is the intercept (mean of the reference group).
- The $\alpha_s$ describe expected differences between cell lines.
- The $\beta_t$ describe expected differences between treatments.
- The error terms $\varepsilon_{st}$ are assumed to be independent and normally distributed with equal variances.
The model assumptions should be checked.

Test of treatment effect

  > result <- lm(log(ge) ~ treatment + cellline, data=geneexp)
  > drop1(result, test="Chisq")
  Single term deletions
  Model: log(ge) ~ treatment + cellline
            Df Sum of Sq RSS AIC Pr(>Chi)
  <none>
  treatment                      e-05 ***
  cellline                            **

We find significant differences among treatments (interesting) and among cell lines (not that interesting...).

Parameter estimates

  > summary(result)

The coefficient table (Estimate, Std. Error, t value, Pr(>|t|)) has rows (Intercept) (***), treatmentA (***), treatmentB, treatmentC and four cellline coefficients, one of them marked **. The residual standard error is on 12 degrees of freedom, and the F-statistic is on 7 and 12 DF.
Note:
- The control treatment has been chosen as reference (the way the factor was created).
- The parameter estimates for treatments A, B, and C are expected differences with respect to the control ... on the log-scale!

Estimates of treatment effect
As compared to the control group:

  Treatment   log-scale             back-transformed
  A           1.16 (0.58; 1.74)     +218% (+79%; +467%)
  B           0.17 (-0.41; 0.75)    +19% (-33%; +111%)
  C           0.13 (-0.45; 0.70)    +13% (-36%; +102%)

i.e. treatment A approximately triples the gene expression level: with $\hat{\beta}_A \approx 1.16$ we get $100 \cdot \{\exp(\hat{\beta}_A) - 1\} \approx 218$.
Multiple comparisons could be performed, e.g. using lsmeans or the multiple comparison approaches from last week!
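The back-transformation in the table of treatment effects can be reproduced as follows. This is a minimal sketch that starts from the rounded log-scale values printed in the table, so a few percentages differ by a unit or two from the slide, which was presumably computed from unrounded estimates. The names `log_est` and `percent_change` are illustrative.

```r
# Log-scale estimates and 95% confidence limits versus control (from the table).
log_est <- rbind(
  A = c(estimate = 1.16, lower = 0.58,  upper = 1.74),
  B = c(estimate = 0.17, lower = -0.41, upper = 0.75),
  C = c(estimate = 0.13, lower = -0.45, upper = 0.70)
)

# A difference d on the log scale is a ratio exp(d) on the original scale,
# i.e. a relative change of 100 * (exp(d) - 1) percent.
percent_change <- 100 * (exp(log_est) - 1)

print(round(percent_change))
```

Because the confidence limits are transformed with the same monotone function as the estimate, the back-transformed interval keeps its 95% coverage but is no longer symmetric around the estimate.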

Gene expressions again
From before: the coefficient table from summary(result) has rows for (Intercept) (***) and four cellline coefficients, one of them marked **. The residual standard error is on 12 degrees of freedom, and the F-statistic is on 7 and 12 DF.
Overall significant differences in gene expression levels were found among the cell lines (P=0.0012), and the estimates show that some cell lines differ.
Should this be reported as an interesting finding?

Fixed and random effects
Fixed effects such as treatment, dose, and time:
- Typically a limited number of carefully selected groups.
- Group names are specific and cannot be shuffled.
- Each group must have a decent size in order to reach interesting conclusions (statistical power).
Random effects such as rat, cell line, experiment or operator:
- Possibly a large number of different groups.
- Group names are non-informative (number of rat or cell line) and could be shuffled without consequence.
- Allow inference to be extended beyond the subjects in the experiment, to the population they were sampled from.
- The number of groups matters, not the size of the groups.

Example: experiment with rabbits
- R = 6 rabbits vaccinated,
- on S = 6 spots on the back of each.
- Response: swelling in cm$^2$.
Model: which one? A one-way ANOVA with a random effect: the rabbit factor.

  > load("rabbit.rda")
  > plot(swelling ~ rabbit,
  +      data=rabbit,
  +      cex.axis=1.4,
  +      cex.lab=1.4)

[Figure: swelling against rabbit]

One-way ANOVA with random variation
Comparison of k groups/clusters, satisfying:
- The groups are of no individual interest, and it is of no relevance to test whether they have identical means.
- The groups may be thought of as a random sample from a population that we want to describe.
Example: Swelling was measured 6 times consecutively on a sample of 6 rabbits.

What response can we expect in the population?
- The test for identical rabbit means (one-way ANOVA) is not very helpful in this regard, and neither are the estimates of differences between specific rabbits.
- Mean swelling with 95% confidence limits (and normal range) is better.

Random effects ANOVA model
Model for the response of the s-th spot on the r-th rabbit:
$$Y_{rs} = \mu + a_r + \varepsilon_{rs}$$
- $\mu$ is the grand mean (i.e. of the rabbit population).
- $a_r$ is the between-rabbit deviation (i.e. how does rabbit r deviate from the grand mean).
- $\varepsilon_{rs}$ is the within-rabbit deviation (i.e. how does spot s deviate from its rabbit's mean).
It is assumed that all error terms (the $a_r$ and the $\varepsilon_{rs}$) are independent and normally distributed:
$$a_r \sim N(0, \omega_B^2), \qquad \varepsilon_{rs} \sim N(0, \sigma_W^2)$$
The deviations between rabbits are considered random, and their variance $\omega_B^2$ is called the between-rabbit variance component.

Implications of random effects ANOVA
Each single observation is sampled from the same population, assumed to follow the normal distribution:
$$Y_{rs} \sim N(\mu, \omega_B^2 + \sigma_W^2)$$
- Population mean $\mu$ (the grand mean).
- Population variance $\omega_B^2 + \sigma_W^2$ (the total variation).
But: measurements made on the same rabbit are correlated, with the so-called intra-class correlation
$$\mathrm{Corr}(y_{r1}, y_{r2}) = \rho = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2}$$
I.e. measurements made on the same rabbit tend to look more alike than measurements made on different rabbits.
More about correlation next lecture.

Parameter estimates
Grand mean ($\mu$): 7.37 (6.68; 8.05).
Variance components:

  Variation   Variance component          Estimate (95% CI)    % of variation
  Between     $\omega_B^2$                0.33 (0.03; 1.04)    36%
  Within      $\sigma_W^2$                0.58 (0.06; 2.48)    64%
  Total       $\omega_B^2 + \sigma_W^2$                        100%

Warning: Confidence intervals for the variance components may be invalid due to the tiny sample size (only six rabbits).
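The percentages in the variance component table and the intra-class correlation follow directly from the two estimates. A minimal sketch, using the rounded estimates from the table (variable names are illustrative):

```r
# Estimated variance components for the rabbit data (from the table).
var_between <- 0.33  # between-rabbit variance
var_within  <- 0.58  # within-rabbit variance

# Total variation is the sum of the two components.
var_total <- var_between + var_within

# Share of the total variation found at each level.
share <- c(between = var_between, within = var_within) / var_total

# The intra-class correlation equals the between-rabbit share.
icc <- var_between / var_total

print(round(100 * share))
print(round(icc, 2))
```

The intra-class correlation and the "% of variation" for the between-rabbit component are the same quantity, expressed as a proportion and as a percentage respectively.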

Interpretation of variance components
Typical difference between spots on the same rabbit:
$$y_{rs_1} - y_{rs_2} = (\mu + a_r + \varepsilon_{rs_1}) - (\mu + a_r + \varepsilon_{rs_2}) = \varepsilon_{rs_1} - \varepsilon_{rs_2} \sim N(0, 2\sigma_W^2)$$
Normal region: $\pm 2\sqrt{2\hat{\sigma}_W^2} = \pm 2.16$ cm$^2$
Typical difference between spots on different rabbits:
$$y_{r_1 s_1} - y_{r_2 s_2} = a_{r_1} - a_{r_2} + \varepsilon_{r_1 s_1} - \varepsilon_{r_2 s_2} \sim N(0, 2(\omega_B^2 + \sigma_W^2))$$
Normal region: $\pm 2\sqrt{2(\hat{\omega}_B^2 + \hat{\sigma}_W^2)} = \pm 2.70$ cm$^2$
Note: If $Y_1 \sim N(\mu_1, \sigma_1^2)$ and $Y_2 \sim N(\mu_2, \sigma_2^2)$ are independent normal variables, then their difference is normal: $Y_1 - Y_2 \sim N(\mu_1 - \mu_2, \sigma_1^2 + \sigma_2^2)$.

Mixed models in R

  > library(lme4)
  > rabbit$rabbit <- factor(rabbit$rabbit)
  > result <- lmer(swelling ~ 1 + (1 | rabbit), data=rabbit)

- Syntax is similar to lm, with a model formula specifying the relationship between outcome and covariates.
- Categorical variables must be set to be factors.
- Random effects are specified by (1 | group) in the model formula.

R: Mixed model output

  > summary(result)
  Linear mixed model fit by REML ['lmerMod']
  Formula: swelling ~ 1 + (1 | rabbit)
     Data: rabbit
  REML criterion at convergence: 91.5

The output continues with the scaled residuals (Min, 1Q, Median, 3Q, Max), the random effects (Variance and Std.Dev. for the rabbit intercept and for Residual), "Number of obs: 36, groups: rabbit, 6", and the fixed effects (Estimate, Std. Error and t value for the intercept).
- Always check that the numerical optimisation has converged.
- Finally: parameter estimates, and something like tests.

Negative variance components
Warning: It may happen that some programs report a zero estimate for the variation between, $\omega_B^2$. This can occur
- by coincidence. Then the model is OK.
- as a result of competition within clusters. Example: yield of plants grown in the same pot. Then the model is wrong, as the clustering leads to dissimilarities (negative correlation) rather than similarities (positive correlation) in outcome.
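The two normal regions on the "Interpretation of variance components" slide can be checked numerically. The sketch below plugs in the rounded variance component estimates 0.33 (between) and 0.58 (within) reported earlier for the rabbit data, so the results agree with the slide only up to rounding. Variable names are illustrative.

```r
# Rounded variance component estimates for the rabbit data.
var_between <- 0.33
var_within  <- 0.58

# Two spots on the same rabbit: the rabbit effect cancels,
# so the difference has variance 2 * var_within.
range_same_rabbit <- 2 * sqrt(2 * var_within)

# Two spots on different rabbits: both components contribute,
# so the difference has variance 2 * (var_between + var_within).
range_different_rabbits <- 2 * sqrt(2 * (var_between + var_within))

print(round(c(same = range_same_rabbit,
              different = range_different_rabbits), 2))
```

The region for different rabbits is wider because the between-rabbit variation no longer cancels in the difference.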

Comparison of modeling strategies
Quantifying overall swelling: four strategies for estimating the grand mean of the rabbit population.

  Method               Estimate (s.e.)
  1: forget rabbit     (0.155)
  2: fixed rabbit      (0.127)
  3: rabbit averages   (0.267)
  4: random rabbit     (0.267)

1. We (wrongly) assume independence of all 36 measurements.
2. We estimate the mean swelling by classical one-way ANOVA.
3. We reduce the data to six averages from the individual rabbits and then compute mean and SE.
4. We estimate the mean swelling in the random effects ANOVA model.

Comments on the strategies:
1. Ignoring the clustering is wrong! It leads to systematic underestimation of the standard error.
2. In the fixed effect one-way ANOVA the grand mean has a different interpretation, namely as the mean swelling of these six particular rabbits. This also leads to systematic underestimation of the standard error.
3. Looking at the sample of averages may be OK, at least in balanced designs (otherwise the individual averages have unequal variances and the standard error may be affected). But we lose information on within-subject variation.

Unbalanced data
We delete the 3 smallest measurements from rabbit 2 (largest level), so that the data become unbalanced and the results change:

  Method               Estimate (s.e.)
  1: forget rabbit     (0.163)
  2: fixed rabbit      (0.136)
  3: rabbit averages   (0.333)
  4: random rabbit     (0.298)
  Full sample          (0.267)

1. We have omitted some of the largest observations.
2. Rabbit 2 has a lower weight in the average (only 3 observations).
3. The average for rabbit 2 has increased.
4. Rabbit 2 has a lower weight in the average due to a larger standard error.

Design considerations
Plan an experiment with:
- R rabbits (independent or true replicates).
- S spots for each rabbit (repeated measurements or pseudo replicates).
- R · S measurements in total.
Then the variance of the mean estimate,
$$\mathrm{var}(\bar{y}) = \frac{\omega_B^2}{R} + \frac{\sigma_W^2}{RS},$$
decreases with R and S.

[Figure: standard error of the mean against the number of rabbits; the different curves correspond to different numbers of spots S, starting at S = 1]
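The variance formula for the mean can be turned into a small helper for comparing designs. This is a sketch under the assumption that the rounded rabbit estimates (0.33 between, 0.58 within) are used as planning values; the function name `se_grand_mean` and the three candidate designs are illustrative. For the actual design (R = 6, S = 6) it gives a standard error of about 0.267, in line with the random-rabbit strategy in the table above.

```r
# Standard error of the estimated grand mean with R rabbits and S spots each:
# var(ybar) = var_between / R + var_within / (R * S).
se_grand_mean <- function(R, S, var_between = 0.33, var_within = 0.58) {
  sqrt(var_between / R + var_within / (R * S))
}

# The design actually used: 6 rabbits with 6 spots each.
se_actual <- se_grand_mean(R = 6, S = 6)

# Three hypothetical ways of spending 36 measurements.
designs <- data.frame(R = c(6, 12, 36), S = c(6, 3, 1))
designs$se <- with(designs, se_grand_mean(R, S))

print(round(se_actual, 3))
print(designs)
```

With a fixed total number of measurements, the standard error shrinks as measurements are spread over more rabbits, because only the within-rabbit term benefits from extra spots.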

Effective sample size
How many rabbits would we need to obtain the same precision in estimating the grand mean if we had only one measurement on each of $R_1$ rabbits?
Solve the equation for $\mathrm{var}(\bar{y})$:
$$R_1 = \frac{R \cdot S}{1 + \rho(S - 1)}$$
where $\rho$ is the within-rabbit correlation.
Estimate: $\hat{\rho} = \hat{\omega}_B^2 / (\hat{\omega}_B^2 + \hat{\sigma}_W^2) \approx 0.36$, which gives $R_1 = 12.8$.
I.e. one measurement on each of thirteen rabbits gives the same precision as six measurements on each of six rabbits!

Linear mixed models
Generalisations of ANOVA and GLM models involving both fixed effects (covariates) and several sources of random variation, the so-called variance components:
- Environmental variation. Between clinics, regions or countries.
- Biological variation. Between patients, animals, or cell lines.
- Within-individual variation. Between injection sites, tumors, slices.
- Variation due to uncontrollable circumstances. E.g. day to day, assay, observer.
- Measurement error. E.g. duplicates, triplicates.
Mixed models are also called variance component models.

Multi-level models
Often we have a multi-level model with hierarchical ordering of the levels.
- We have variation (i.e. a variance component) on each level.
- And possibly fixed effects (covariates) on each level.

  Level 1 (individual)   Level 2 (context/cluster)   Level 3 (context/cluster)
  spots                  rabbits
  slices                 tumors                      mice
  duplicates             experiments                 operators

Moving from one level to the next indicates simplification or grouping.
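The effective sample size formula can be evaluated directly. A minimal sketch, using the rounded rabbit estimates (0.33 between, 0.58 within) to obtain the correlation; the function name `effective_sample_size` is illustrative.

```r
# Number of rabbits with a single measurement each that gives the same
# precision for the grand mean as R rabbits with S measurements each.
effective_sample_size <- function(R, S, rho) {
  R * S / (1 + rho * (S - 1))
}

# Within-rabbit (intra-class) correlation from the variance components.
rho <- 0.33 / (0.33 + 0.58)

r1 <- effective_sample_size(R = 6, S = 6, rho = rho)
print(round(r1, 1))

# Limiting cases: uncorrelated spots count fully, perfectly correlated
# spots add nothing beyond the first.
print(effective_sample_size(6, 6, rho = 0))
print(effective_sample_size(6, 6, rho = 1))
```

The two limiting cases bracket the answer: with 36 measurements on 6 rabbits, the effective number of independent observations always lies between 6 and 36.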

Merits of mixed models
- We get a better understanding of the various sources of variation.
- Certain effects may be estimated more precisely (higher power), since some sources of variation are eliminated, e.g. by making comparisons within the same subject. This is analogous to the paired comparison situation.
- When planning subsequent investigations, the knowledge of the relative sizes of the variance components will (in principle) be of help in deciding the number of repetitions needed at each level:
  - independent (sometimes called true) replicates,
  - repeated measurements (called pseudo replicates).

Drawbacks of mixed models
- Their statistical analysis is more difficult.
- When making inference (estimation and testing), it is important to take all sources of variation into account. Results may be biased if one or more sources of variation are disregarded!
- Only a few statistical software packages can do the correct analyses.

Testing fixed effects
Imagine that the rabbits are grouped in two (e.g. treatments): rabbits 1-3 are group 1, rabbits 4-6 are group 2.

  Level   Variation         Covariates
  1       within rabbit     spot
  2       between rabbits   group

- Part of the variation between rabbits could be explained by systematic differences between groups.
- Part of the variation within rabbits could be explained by systematic differences between spots.

  > rabbit$group <- factor(rabbit$rabbit %in% c("4", "5", "6"),
  +                        labels=c("Grp1", "Grp2"))
  > result <- lmer(swelling ~ spot + group + (1 | rabbit),
  +                data=rabbit)
  > result
  Linear mixed model fit by REML ['lmerMod']
  Formula: swelling ~ spot + group + (1 | rabbit)
     Data: rabbit

The output continues with the REML criterion at convergence, the random effects (Std.Dev. for the rabbit intercept and for Residual), "Number of obs: 36, groups: rabbit, 6", and the fixed effects for (Intercept), the five spot coefficients and the group coefficient.
Output:
- rabbit Std.Dev.: larger than before.
- Residual Std.Dev.: smaller than before.

Testing fixed effects with lmer
(May need to restart the R session.)

  > library(lmerTest)
  > result <- lmer(swelling ~ spot + group + (1 | rabbit),
  +                data=rabbit)
  > summary(result)
  Linear mixed model fit by REML
  t-tests use Satterthwaite approximations to degrees of freedom ['lmerMod']
  Formula: swelling ~ spot + group + (1 | rabbit)
     Data: rabbit
  REML criterion at convergence: 84.3

The output continues with the scaled residuals, the random effects (Variance and Std.Dev. for the rabbit intercept and for Residual), "Number of obs: 36, groups: rabbit, 6", the fixed effects table (Estimate, Std. Error, df, t value, Pr(>|t|)) for (Intercept) (p-value of order e-08, ***), the five spot coefficients and the group coefficient, and finally the correlation matrix of the fixed effects.

Disregarding repeated measurements
When the random rabbit variation is ignored, we get too small standard errors for estimates of differences between groups and too large standard errors for estimates of differences between spots!

  > result <- lm(swelling ~ spot + group,
  +              data=rabbit)
  > summary(result)
  Call:
  lm(formula = swelling ~ spot + group, data = rabbit)

The output continues with the residual quantiles and the coefficient table (Estimate, Std. Error, t value, Pr(>|t|)) for (Intercept) (<2e-16, ***), the five spot coefficients and the group coefficient. The residual standard error is on 29 degrees of freedom, and the F-statistic is on 6 and 29 DF.

Summary
Measurements belonging together in the same cluster tend to look alike (they are correlated).
If we fail to take this into account, we will experience:
- Possible bias in estimates (in unbalanced data).
- Too small standard errors (type 1 error) for estimates of level 2 effects (between-cluster effects).
- Too low efficiency (type 2 error) for evaluation of level 1 covariates (within-cluster effects).

Comparing measurement devices
Example: Peak expiratory flow rate, l/min: 17 subjects, 2 measurement devices, two replicates with each method.
The data table has one row for each subject id, with columns for the two Wright replicates ($Y_{1p1}$, $Y_{1p2}$) and the two mini Wright replicates ($Y_{2p1}$, $Y_{2p2}$), followed by rows for Average and SD.
Reference: Bland and Altman, Lancet (1986).

Illustration of all data
[Figure: illustration of all data]

Aim of investigation
Quantify the precision of each measuring device:
- Repeatability (variability = measurement error).
Quantify the agreement between the two devices:
- Bias of one method compared to the other.
- Variance of one method compared to the other.
Can the devices be used interchangeably?

Simple approaches
For reliability of each method separately we could:
- make Bland-Altman plots of differences vs averages,
- compute limits of agreement, i.e. the 95% normal range of the differences.
For reproducibility (method comparison) we might:
- compare the averages in a Bland-Altman plot? Not good, unless you also do averages in the clinic!
For both at the same time:
- Mixed model for variance between and within methods.
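As an illustration of the simple approach, limits of agreement can be computed as the mean difference plus or minus two standard deviations of the differences, consistent with the "95% normal range" described above. The differences below are hypothetical numbers made up for the sketch, not the peak flow data, and `limits_of_agreement` is an illustrative helper name.

```r
# Hypothetical differences between two replicate measurements (l/min).
# These values are invented for illustration only.
differences <- c(4, -10, 12, 0, -6, 8, 2, -2)

# Limits of agreement: mean difference +/- 2 standard deviations.
limits_of_agreement <- function(d) {
  c(lower = mean(d) - 2 * sd(d),
    mean  = mean(d),
    upper = mean(d) + 2 * sd(d))
}

loa <- limits_of_agreement(differences)
print(round(loa, 1))
```

The mean of the differences estimates the bias, while the width of the interval reflects how far apart two single measurements can typically be.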

Variance component models
For each method (i = 1, 2) we have a variance component model
$$Y_{ijk} = \mu_i + a_{ij} + \varepsilon_{ijk}$$
- $\mu_i$: population mean as anticipated by method i.
- $a_{ij}$: deviation of subject j from the population mean, assumed normally distributed $N(0, \sigma_i^2)$.
- $\varepsilon_{ijk}$: deviation for replicate k (measurement error), assumed normally distributed $N(0, \omega_i^2)$.

Stratified analyses

  > load("wright.rda")
  > lmer(flow ~ 1 + (1 | id), data=wright, subset=(method=="mini"))
  Linear mixed model fit by REML ['merModLmerTest']
  Formula: flow ~ 1 + (1 | id)
     Data: wright
   Subset: (method == "mini")

  > lmer(flow ~ 1 + (1 | id), data=wright, subset=(method=="wright"))
  Linear mixed model fit by REML ['merModLmerTest']
  Formula: flow ~ 1 + (1 | id)
     Data: wright
   Subset: (method == "wright")

Each output continues with the REML criterion at convergence, the random effects (Std.Dev. for the id intercept and for Residual), "Number of obs: 34, groups: id, 17", and the fixed effect (Intercept).

Joint model for both methods
For methods (i = 1, 2):
$$Y_{ijk} = \mu_i + a_{ij} + \varepsilon_{ijk}$$
- $\varepsilon_{ijk}$ assumed normally distributed $N(0, \omega_i^2)$ and independent across methods.
- $a_{ij}$ assumed normally distributed $N(0, \sigma_i^2)$ and correlated with $\rho = \mathrm{Cor}(a_{1j}, a_{2j})$.
Anticipated means for the same subject ought to look a lot like each other, so the $a_{ij}$ are likely to be correlated across methods.

Advanced analysis

  > library(MethComp)
  > mydata <- Meth(wright, meth=3, item=4, repl=6, y=5)
  The following variables from the dataframe
  "wright" are used as the Meth variables:
  meth: method
  item: id
  repl: repl
     y: flow

The summary that follows reports 2 replicates for each method (mini, wright), 68 observations in total, and the minimum, median and maximum values for each method.

Advanced analysis

  > BA.est(mydata, linked=FALSE)

The output has two parts: the conversion between methods (alpha, beta, sd.pred, LoA-lo and LoA-up for each To/From pair of mini and wright) and the variance components (sd) IxR, MxI and res for mini and wright.

Repeatability
Typical differences (approximate 95% normal range) between two measurements with the same method:
- Wright: $\pm 2\sqrt{2\hat{\omega}_1^2} \approx \pm 43.3$
- Mini: $\pm 2\sqrt{2\hat{\omega}_2^2} \approx \pm 56.3$
Seemingly Wright is more precise, but is the difference significant?
$$F = \frac{\hat{\omega}_2^2}{\hat{\omega}_1^2} = 1.69 \sim F(17, 17), \qquad P = 0.14$$
Don't form too firm a conclusion from too small a data set.

Reproducibility
No evidence of systematic differences between the two methods. Estimated bias: +6.0 for mini vs wright.
Typical differences between the two methods:
$$\mathrm{var}(Y_{1jk} - Y_{2jk}) = \mathrm{var}(a_{1j} - a_{2j} + \varepsilon_{1jk} - \varepsilon_{2jk}) = \sigma_1^2 + \sigma_2^2 - 2\sigma_{12} + \omega_1^2 + \omega_2^2$$
Limits of agreement: $6.03 \pm 2\sqrt{\widehat{\mathrm{var}}(Y_{1jk} - Y_{2jk})} = (-69.3, 81.3)$.
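The variance ratio test on the repeatability slide can be retraced from the two reported normal ranges. This is a sketch under the assumption that each range equals 2 * sqrt(2 * omega^2), as stated on the slide, and that the test compares the ratio with an F distribution on 17 and 17 degrees of freedom. Variable names are illustrative. The upper-tail probability computed at the end should be close to the reported P = 0.14, which suggests that the slide quotes a one-sided P-value.

```r
# Reported approximate 95% normal ranges for within-method differences.
half_width_wright <- 43.3
half_width_mini   <- 56.3

# Invert half_width = 2 * sqrt(2 * omega^2) to get the error variances.
omega2_wright <- (half_width_wright / 2)^2 / 2
omega2_mini   <- (half_width_mini / 2)^2 / 2

# Ratio of the larger to the smaller measurement error variance.
f_ratio <- omega2_mini / omega2_wright

# Upper-tail probability in the F(17, 17) distribution.
p_upper <- pf(f_ratio, df1 = 17, df2 = 17, lower.tail = FALSE)

print(round(f_ratio, 2))
print(round(p_upper, 2))
```

A two-sided test would double this tail probability, which strengthens the slide's advice not to draw firm conclusions from a data set of this size.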


More information

Mixed Model Theory, Part I

Mixed Model Theory, Part I enote 4 1 enote 4 Mixed Model Theory, Part I enote 4 INDHOLD 2 Indhold 4 Mixed Model Theory, Part I 1 4.1 Design matrix for a systematic linear model.................. 2 4.2 The mixed model.................................

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the

More information

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model 1 Linear Regression 2 Linear Regression In this lecture we will study a particular type of regression model: the linear regression model We will first consider the case of the model with one predictor

More information

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36 20. REML Estimation of Variance Components Copyright c 2018 (Iowa State University) 20. Statistics 510 1 / 36 Consider the General Linear Model y = Xβ + ɛ, where ɛ N(0, Σ) and Σ is an n n positive definite

More information

Analysis of variance. April 16, Contents Comparison of several groups

Analysis of variance. April 16, Contents Comparison of several groups Contents Comparison of several groups Analysis of variance April 16, 2009 One-way ANOVA Two-way ANOVA Interaction Model checking Acknowledgement for use of presentation Julie Lyng Forman, Dept. of Biostatistics

More information

Analysis of variance. April 16, 2009

Analysis of variance. April 16, 2009 Analysis of variance April 16, 2009 Contents Comparison of several groups One-way ANOVA Two-way ANOVA Interaction Model checking Acknowledgement for use of presentation Julie Lyng Forman, Dept. of Biostatistics

More information

Statistics for exp. medical researchers Regression and Correlation

Statistics for exp. medical researchers Regression and Correlation Faculty of Health Sciences Regression analysis Statistics for exp. medical researchers Regression and Correlation Lene Theil Skovgaard Sept. 28, 2015 Linear regression, Estimation and Testing Confidence

More information

Answer to exercise: Blood pressure lowering drugs

Answer to exercise: Blood pressure lowering drugs Answer to exercise: Blood pressure lowering drugs The data set bloodpressure.txt contains data from a cross-over trial, involving three different formulations of a drug for lowering of blood pressure:

More information

Correlated data. Repeated measurements over time. Typical set-up for repeated measurements. Traditional presentation of data

Correlated data. Repeated measurements over time. Typical set-up for repeated measurements. Traditional presentation of data Faculty of Health Sciences Repeated measurements over time Correlated data NFA, May 22, 2014 Longitudinal measurements Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics University of

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen Outline Data in wide and long format

More information

Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA

Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA Faculty of Health Sciences Outline Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA Lene Theil Skovgaard Sept. 14, 2015 Paired comparisons: tests and confidence intervals

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen 2 / 28 Preparing data for analysis The

More information

Workshop 7.4a: Single factor ANOVA

Workshop 7.4a: Single factor ANOVA -1- Workshop 7.4a: Single factor ANOVA Murray Logan November 23, 2016 Table of contents 1 Revision 1 2 Anova Parameterization 2 3 Partitioning of variance (ANOVA) 10 4 Worked Examples 13 1. Revision 1.1.

More information

SPH 247 Statistical Analysis of Laboratory Data

SPH 247 Statistical Analysis of Laboratory Data SPH 247 Statistical Analysis of Laboratory Data March 31, 2015 SPH 247 Statistical Analysis of Laboratory Data 1 ANOVA Fixed and Random Effects We will review the analysis of variance (ANOVA) and then

More information

Stat 5303 (Oehlert): Randomized Complete Blocks 1

Stat 5303 (Oehlert): Randomized Complete Blocks 1 Stat 5303 (Oehlert): Randomized Complete Blocks 1 > library(stat5303libs);library(cfcdae);library(lme4) > immer Loc Var Y1 Y2 1 UF M 81.0 80.7 2 UF S 105.4 82.3 3 UF V 119.7 80.4 4 UF T 109.7 87.2 5 UF

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

Outline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups

Outline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression10_2/index.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

STAT 525 Fall Final exam. Tuesday December 14, 2010

STAT 525 Fall Final exam. Tuesday December 14, 2010 STAT 525 Fall 2010 Final exam Tuesday December 14, 2010 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points will

More information

Correlated Data: Linear Mixed Models with Random Intercepts

Correlated Data: Linear Mixed Models with Random Intercepts 1 Correlated Data: Linear Mixed Models with Random Intercepts Mixed Effects Models This lecture introduces linear mixed effects models. Linear mixed models are a type of regression model, which generalise

More information

A brief introduction to mixed models

A brief introduction to mixed models A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.

More information

Analysis of variance and regression. May 13, 2008

Analysis of variance and regression. May 13, 2008 Analysis of variance and regression May 13, 2008 Repeated measurements over time Presentation of data Traditional ways of analysis Variance component model (the dogs revisited) Random regression Baseline

More information

Multiple Linear Regression. Chapter 12

Multiple Linear Regression. Chapter 12 13 Multiple Linear Regression Chapter 12 Multiple Regression Analysis Definition The multiple regression model equation is Y = b 0 + b 1 x 1 + b 2 x 2 +... + b p x p + ε where E(ε) = 0 and Var(ε) = s 2.

More information

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Introduction and Background to Multilevel Analysis

Introduction and Background to Multilevel Analysis Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and

More information

Chap The McGraw-Hill Companies, Inc. All rights reserved.

Chap The McGraw-Hill Companies, Inc. All rights reserved. 11 pter11 Chap Analysis of Variance Overview of ANOVA Multiple Comparisons Tests for Homogeneity of Variances Two-Factor ANOVA Without Replication General Linear Model Experimental Design: An Overview

More information

Statistical Analysis of Method Comparison studies. Comparing two methods with one measurement on each Morning

Statistical Analysis of Method Comparison studies. Comparing two methods with one measurement on each Morning Statistical Analysis of Method Comparison studies Bendix Carstensen Claus Thorn Ekstrøm Steno Diabetes Center, Gentofte, Denmark & Dept. Biostatistics, Medical Faculty, University of Copenhagen http://bendixcarstensen.com

More information

Example. Multiple Regression. Review of ANOVA & Simple Regression /749 Experimental Design for Behavioral and Social Sciences

Example. Multiple Regression. Review of ANOVA & Simple Regression /749 Experimental Design for Behavioral and Social Sciences 36-309/749 Experimental Design for Behavioral and Social Sciences Sep. 29, 2015 Lecture 5: Multiple Regression Review of ANOVA & Simple Regression Both Quantitative outcome Independent, Gaussian errors

More information

Models for longitudinal data

Models for longitudinal data Faculty of Health Sciences Contents Models for longitudinal data Analysis of repeated measurements, NFA 016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical

More information

Homework 3 - Solution

Homework 3 - Solution STAT 526 - Spring 2011 Homework 3 - Solution Olga Vitek Each part of the problems 5 points 1. KNNL 25.17 (Note: you can choose either the restricted or the unrestricted version of the model. Please state

More information

Two-Way Analysis of Variance - no interaction

Two-Way Analysis of Variance - no interaction 1 Two-Way Analysis of Variance - no interaction Example: Tests were conducted to assess the effects of two factors, engine type, and propellant type, on propellant burn rate in fired missiles. Three engine

More information

Introduction to the Analysis of Hierarchical and Longitudinal Data

Introduction to the Analysis of Hierarchical and Longitudinal Data Introduction to the Analysis of Hierarchical and Longitudinal Data Georges Monette, York University with Ye Sun SPIDA June 7, 2004 1 Graphical overview of selected concepts Nature of hierarchical models

More information

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model Topic 23 - Unequal Replication Data Model Outline - Fall 2013 Parameter Estimates Inference Topic 23 2 Example Page 954 Data for Two Factor ANOVA Y is the response variable Factor A has levels i = 1, 2,...,

More information

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013 Topic 19 - Inference - Fall 2013 Outline Inference for Means Differences in cell means Contrasts Multiplicity Topic 19 2 The Cell Means Model Expressed numerically Y ij = µ i + ε ij where µ i is the theoretical

More information

Outline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups

Outline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~jufo/varianceregressionf2011.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk

More information

Coping with Additional Sources of Variation: ANCOVA and Random Effects

Coping with Additional Sources of Variation: ANCOVA and Random Effects Coping with Additional Sources of Variation: ANCOVA and Random Effects 1/49 More Noise in Experiments & Observations Your fixed coefficients are not always so fixed Continuous variation between samples

More information

Correlated data. Overview. Cross-over study. Repetition. Faculty of Health Sciences. Variance component models, II. More on variance component models

Correlated data. Overview. Cross-over study. Repetition. Faculty of Health Sciences. Variance component models, II. More on variance component models Faculty of Health Sciences Overview Correlated data More on variance component models Variance component models, II Cross-over studies Non-normal data Comparing measurement devices Lene Theil Skovgaard

More information

Simple, Marginal, and Interaction Effects in General Linear Models

Simple, Marginal, and Interaction Effects in General Linear Models Simple, Marginal, and Interaction Effects in General Linear Models PRE 905: Multivariate Analysis Lecture 3 Today s Class Centering and Coding Predictors Interpreting Parameters in the Model for the Means

More information

Hierarchical Random Effects

Hierarchical Random Effects enote 5 1 enote 5 Hierarchical Random Effects enote 5 INDHOLD 2 Indhold 5 Hierarchical Random Effects 1 5.1 Introduction.................................... 2 5.2 Main example: Lactase measurements in

More information

Figure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim

Figure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim 0.0 1.0 1.5 2.0 2.5 3.0 8 10 12 14 16 18 20 22 y x Figure 1: The fitted line using the shipment route-number of ampules data STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim Problem#

More information

De-mystifying random effects models

De-mystifying random effects models De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

Outline for today. Two-way analysis of variance with random effects

Outline for today. Two-way analysis of variance with random effects Outline for today Two-way analysis of variance with random effects Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark Two-way ANOVA using orthogonal projections March 4, 2018 1 /

More information

HW 2 due March 6 random effects and mixed effects models ELM Ch. 8 R Studio Cheatsheets In the News: homeopathic vaccines

HW 2 due March 6 random effects and mixed effects models ELM Ch. 8 R Studio Cheatsheets In the News: homeopathic vaccines Today HW 2 due March 6 random effects and mixed effects models ELM Ch. 8 R Studio Cheatsheets In the News: homeopathic vaccines STA 2201: Applied Statistics II March 4, 2015 1/35 A general framework y

More information

Chapter 16: Understanding Relationships Numerical Data

Chapter 16: Understanding Relationships Numerical Data Chapter 16: Understanding Relationships Numerical Data These notes reflect material from our text, Statistics, Learning from Data, First Edition, by Roxy Peck, published by CENGAGE Learning, 2015. Linear

More information

Lecture 2. The Simple Linear Regression Model: Matrix Approach

Lecture 2. The Simple Linear Regression Model: Matrix Approach Lecture 2 The Simple Linear Regression Model: Matrix Approach Matrix algebra Matrix representation of simple linear regression model 1 Vectors and Matrices Where it is necessary to consider a distribution

More information

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 An Introduction to Multilevel Models PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 Today s Class Concepts in Longitudinal Modeling Between-Person vs. +Within-Person

More information

Stat 5303 (Oehlert): Balanced Incomplete Block Designs 1

Stat 5303 (Oehlert): Balanced Incomplete Block Designs 1 Stat 5303 (Oehlert): Balanced Incomplete Block Designs 1 > library(stat5303libs);library(cfcdae);library(lme4) > weardata

More information

Analysis of Variance

Analysis of Variance 1 / 70 Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression11_2 Marc Andersen, mja@statgroup.dk Analysis of variance and regression for health researchers,

More information

ACOVA and Interactions

ACOVA and Interactions Chapter 15 ACOVA and Interactions Analysis of covariance (ACOVA) incorporates one or more regression variables into an analysis of variance. As such, we can think of it as analogous to the two-way ANOVA

More information

R Output for Linear Models using functions lm(), gls() & glm()

R Output for Linear Models using functions lm(), gls() & glm() LM 04 lm(), gls() &glm() 1 R Output for Linear Models using functions lm(), gls() & glm() Different kinds of output related to linear models can be obtained in R using function lm() {stats} in the base

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

Correlated data. Longitudinal data. Typical set-up for repeated measurements. Examples from literature, I. Faculty of Health Sciences

Correlated data. Longitudinal data. Typical set-up for repeated measurements. Examples from literature, I. Faculty of Health Sciences Faculty of Health Sciences Longitudinal data Correlated data Longitudinal measurements Outline Designs Models for the mean Covariance patterns Lene Theil Skovgaard November 27, 2015 Random regression Baseline

More information

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method

More information

Regression and Models with Multiple Factors. Ch. 17, 18

Regression and Models with Multiple Factors. Ch. 17, 18 Regression and Models with Multiple Factors Ch. 17, 18 Mass 15 20 25 Scatter Plot 70 75 80 Snout-Vent Length Mass 15 20 25 Linear Regression 70 75 80 Snout-Vent Length Least-squares The method of least

More information

R in Linguistic Analysis. Wassink 2012 University of Washington Week 6

R in Linguistic Analysis. Wassink 2012 University of Washington Week 6 R in Linguistic Analysis Wassink 2012 University of Washington Week 6 Overview R for phoneticians and lab phonologists Johnson 3 Reading Qs Equivalence of means (t-tests) Multiple Regression Principal

More information

Analysis of variance. Gilles Guillot. September 30, Gilles Guillot September 30, / 29

Analysis of variance. Gilles Guillot. September 30, Gilles Guillot September 30, / 29 Analysis of variance Gilles Guillot gigu@dtu.dk September 30, 2013 Gilles Guillot (gigu@dtu.dk) September 30, 2013 1 / 29 1 Introductory example 2 One-way ANOVA 3 Two-way ANOVA 4 Two-way ANOVA with interactions

More information

Example: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA

Example: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA s:5 Applied Linear Regression Chapter 8: ANOVA Two-way ANOVA Used to compare populations means when the populations are classified by two factors (or categorical variables) For example sex and occupation

More information

Biostatistics for physicists fall Correlation Linear regression Analysis of variance

Biostatistics for physicists fall Correlation Linear regression Analysis of variance Biostatistics for physicists fall 2015 Correlation Linear regression Analysis of variance Correlation Example: Antibody level on 38 newborns and their mothers There is a positive correlation in antibody

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum T-test: means of Spock's judge versus all other judges 1 The TTEST Procedure Variable: pcwomen judge1 N Mean Std Dev Std Err Minimum Maximum OTHER 37 29.4919 7.4308 1.2216 16.5000 48.9000 SPOCKS 9 14.6222

More information

Simple, Marginal, and Interaction Effects in General Linear Models: Part 1

Simple, Marginal, and Interaction Effects in General Linear Models: Part 1 Simple, Marginal, and Interaction Effects in General Linear Models: Part 1 PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 2: August 24, 2012 PSYC 943: Lecture 2 Today s Class Centering and

More information

STAT 705 Chapter 19: Two-way ANOVA

STAT 705 Chapter 19: Two-way ANOVA STAT 705 Chapter 19: Two-way ANOVA Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 38 Two-way ANOVA Material covered in Sections 19.2 19.4, but a bit

More information

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data Today s Class: Review of concepts in multivariate data Introduction to random intercepts Crossed random effects models

More information

3. Design Experiments and Variance Analysis

3. Design Experiments and Variance Analysis 3. Design Experiments and Variance Analysis Isabel M. Rodrigues 1 / 46 3.1. Completely randomized experiment. Experimentation allows an investigator to find out what happens to the output variables when

More information

Outline. Example and Model ANOVA table F tests Pairwise treatment comparisons with LSD Sample and subsample size determination

Outline. Example and Model ANOVA table F tests Pairwise treatment comparisons with LSD Sample and subsample size determination Outline 1 The traditional approach 2 The Mean Squares approach for the Completely randomized design (CRD) CRD and one-way ANOVA Variance components and the F test Inference about the intercept Sample vs.

More information

STAT 572 Assignment 5 - Answers Due: March 2, 2007

STAT 572 Assignment 5 - Answers Due: March 2, 2007 1. The file glue.txt contains a data set with the results of an experiment on the dry sheer strength (in pounds per square inch) of birch plywood, bonded with 5 different resin glues A, B, C, D, and E.

More information

FACTORIAL DESIGNS and NESTED DESIGNS

FACTORIAL DESIGNS and NESTED DESIGNS Experimental Design and Statistical Methods Workshop FACTORIAL DESIGNS and NESTED DESIGNS Jesús Piedrafita Arilla jesus.piedrafita@uab.cat Departament de Ciència Animal i dels Aliments Items Factorial

More information

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

More information

WU Weiterbildung. Linear Mixed Models

WU Weiterbildung. Linear Mixed Models Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

I r j Binom(m j, p j ) I L(, ; y) / exp{ y j + (x j y j ) m j log(1 + e + x j. I (, y) / L(, ; y) (, )

I r j Binom(m j, p j ) I L(, ; y) / exp{ y j + (x j y j ) m j log(1 + e + x j. I (, y) / L(, ; y) (, ) Today I Bayesian analysis of logistic regression I Generalized linear mixed models I CD on fixed and random effects I HW 2 due February 28 I Case Studies SSC 2014 Toronto I March/April: Semi-parametric

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

STAT3401: Advanced data analysis Week 10: Models for Clustered Longitudinal Data

STAT3401: Advanced data analysis Week 10: Models for Clustered Longitudinal Data STAT3401: Advanced data analysis Week 10: Models for Clustered Longitudinal Data Berwin Turlach School of Mathematics and Statistics Berwin.Turlach@gmail.com The University of Western Australia Models

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

STAT 705 Chapter 19: Two-way ANOVA

STAT 705 Chapter 19: Two-way ANOVA STAT 705 Chapter 19: Two-way ANOVA Adapted from Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 41 Two-way ANOVA This material is covered in Sections

More information

Lecture 15 Topic 11: Unbalanced Designs (missing data)

Lecture 15 Topic 11: Unbalanced Designs (missing data) Lecture 15 Topic 11: Unbalanced Designs (missing data) In the real world, things fall apart: plants are destroyed/trampled/eaten animals get sick volunteers quit assistants are sloppy accidents happen

More information