Correlated data. Variance component models. Example: Evaluate vaccine. Traditional assumption so far. Faculty of Health Sciences


Faculty of Health Sciences
Correlated data: Variance component models, I
Lene Theil Skovgaard
November 29, 2013
Contact: ltsk@sund.ku.dk

Variance component models
- Definitions and motivation
- One-way anova with random variation
- The rabbit example
- Hierarchical models with several levels
- Comparing measurement devices
- Crossed random effects (interactions)
- The visual acuity example

Traditional assumption so far
Independence:
- One observation per individual (unit)
- No twins/siblings...
Why is this important? Otherwise:
- Some observations (typically on the same individual) will look alike (be correlated)
- Variation between these will not reflect the population variation (the variation between individuals)
- The number of observations will seem misleadingly high

Example: Evaluate vaccine
More precisely: We seek an estimate of the swelling due to the vaccine.
Experiment: 6 rabbits, each vaccinated in 6 spots on the back.
Outcome $y_{rs}$: swelling in cm$^2$, where
- $r = 1, \ldots, R = 6$ denotes the rabbit
- $s = 1, \ldots, S = 6$ denotes the spot
We have observed a total of 36 swelling areas, but we must expect swelling to be specific to the individual rabbit.

Scatter plot
[Figure: swelling for each rabbit. x-axis: arbitrary numbering of rabbits.]

Naive quantification of swelling
The MEANS Procedure, variable swelling: N, Mean, Std Dev, Std Error [numeric values not preserved]
What is wrong here? Imagine that all measurements on a rabbit resulted in the same value...

Example: Number of infections
[Figure: number of positive swabs in 5 family members from each of 18 families.]

Neglecting correlation will lead to errors
Typical errors:
- Wrong standard errors (too small or too big)
- Wrong confidence intervals (too narrow or too wide)
- Wrong conclusions (type I or type II errors)
The type of error depends upon the kind of question asked. This will be explained further.

Terminology for correlated measurements
- Cluster/multilevel design: Same outcome (response) measured on all individuals in a number of families/villages/school classes.
- Repeated measurements: Same outcome (response) measured in different situations (or at different spots) for the same individual.
- Longitudinal measurements: Same outcome (response) measured consecutively over time for each individual.
- Multivariate outcome: Several outcomes (responses) for each individual, e.g. a number of hormone measurements that we want to study simultaneously.

Mixed models
Two basic types of generalizations:
- Categorical covariates (family, school): Anova -> Variance component models
- Quantitative covariates (time, age): Regression -> Random regression
SAS: Proc Mixed (a mix of systematic and random effects)

Variance component models
Generalizations of ANOVA-type models, involving several sources of random variation (variance components):
- geographical/environmental variation: between regions, hospitals, schools or countries
- biological variation: variation between individuals, families or animals
- within-individual variation: variation between arms, teeth, injection sites, days
- variation due to uncontrollable circumstances: time of day, temperature, observer
- measurement error

Hierarchical designs, cluster designs
e.g. School, School Class and Pupil:
[I] = [S*C*P] -> [S*C] -> S

Examples of hierarchies

level 1    | level 2    | level 3
subjects   | twin pairs | countries
subjects   | families   | regions
students   | classes    | schools
spots      | rabbits    |
fields     | sections   | rats
visits     | subjects   | centres

Measurements belonging together in the same cluster look alike (are correlated).

Hierarchical designs, with covariates
On all levels, we may have random variation (variance components), as well as covariates:

level               | covariate
[S*C*P] (pupil)     | Gender
[S*C] (class)       | Class grade
[S] (school)        | School type

Merits of cluster designs
- Certain effects may be estimated more precisely, since some sources of variation are eliminated, e.g. by making comparisons within a family. This is analogous to the paired comparison situation.
- When planning subsequent investigations, knowledge of the relative sizes of the variance components helps in deciding the number of repetitions needed at each level.

Drawbacks of cluster designs
Bias may result if one or more sources of variation are disregarded:
- possible bias in the mean value structure
- low efficiency (type 2 error) for evaluation of level 1 covariates (within-cluster effects)
- too small standard errors (type 1 error) for estimates of level 2 effects (between-cluster effects)
When making inference (estimation and testing), it is important to take all sources of variation into account, and effects have to be evaluated using the relevant variation!

The vaccine in rabbits example
Traditional model: swelling = rabbit level + variation

$y_{rs} = \alpha_r + \varepsilon_{rs}, \quad \varepsilon_{rs} \sim N(0, \sigma^2)$

The variation ($\sigma$) can be regarded either as within-rabbit variation or as measurement error (probably a combination of the two).
In computer terms: each rabbit has its own level, i.e. rabbit is a factor:

    proc glm data=rabbit;
      class rabbit;
      model swelling=rabbit / solution;
    run;

Output from anova model
The GLM Procedure, dependent variable: swelling
- Overall ANOVA table: Source (Model, Error, Corrected Total), DF, Sum of Squares, Mean Square, F Value, Pr > F
- Fit summary: R-Square, Coeff Var, Root MSE, swelling Mean
- Type III test for rabbit: DF, Type III SS, Mean Square, F Value, Pr > F
- $MS_B$ and $MS_W$ are the mean squares for rabbit and for error
[numeric values not preserved]

Output from anova model, II
Parameter estimates (Estimate, Standard Error, t Value, Pr > |t|) for Intercept (Pr > |t| < .0001) and the six rabbit levels [numeric values not preserved]
But: Do we get any useful information from this?
We are not interested in these particular 6 rabbits, only in rabbits in general, as a species!
We assume these 6 rabbits to have been randomly selected from the species.

We choose to model rabbit variation instead of rabbit levels:
swelling = grand mean + between-rabbit variation + within-rabbit variation

$y_{rs} = \mu + a_r + \varepsilon_{rs}$

where the $a_r$'s and the $\varepsilon_{rs}$'s are assumed independent and normally distributed with

$\mathrm{Var}(a_r) = \omega_B^2, \quad \mathrm{Var}(\varepsilon_{rs}) = \sigma_W^2$

Rabbit is now a random factor. $\omega_B^2$ and $\sigma_W^2$ are variance components, and the model is also called a two-level model.

Fixed vs. random effects?
Fixed:
- all values of the factor are present (typically only a few, e.g. treatment)
- allows inference for these particular factor values only
- must include a reasonable number of observations for each factor value
Random:
- a representative sample of values of the factor is present
- allows inference to be extended beyond the values in the experiment (e.g. geographical areas, classes, rabbits)
- is necessary when we have a covariate for this level, e.g. class grade, or treatment

Formulation in terms of correlation
All swelling observations have common mean and variance:

$y_{rs} \sim N(\mu, \omega_B^2 + \sigma_W^2)$

But: Measurements made on the same rabbit are correlated, with the intra-class correlation

$\mathrm{Corr}(y_{r1}, y_{r2}) = \rho = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2}$

Measurements made on the same rabbit tend to look more alike than measurements made on different rabbits.
All measurements on the same rabbit look equally much alike.
This correlation structure is called compound symmetry (CS) or exchangeability.

Estimation of variance components
In balanced situations:
- Within-rabbit variation = residual variation, as usual: $\hat\sigma_W^2 = MS_W$
- Between-rabbit variation, not quite the variation between averages:

$\hat\omega_B^2 = \frac{MS_B - MS_W}{S} = 0.33$

where $S$ denotes the number of spots, here 6.

Variances are positive!
But note: It may happen that $\hat\omega_B^2$ becomes negative!
- by a coincidence
- as a result of competition between units belonging together, e.g. when measuring yield for plants grown in the same pot
In such a case, it will be reported as a zero.
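The moment estimators on the "Estimation of variance components" slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `variance_components` and the tiny data set are hypothetical and not part of the lecture, the design must be balanced, and a negative between-cluster estimate is truncated at zero as described above. PROC MIXED uses REML by default; in a balanced design with a positive estimate it should agree with these moment estimates.

```python
from statistics import mean


def variance_components(groups):
    """Moment estimates for a balanced one-way random-effects layout.

    groups: list of equally long lists, one list per cluster (e.g. per rabbit).
    Returns (omega2_B, sigma2_W, rho), where rho is the intra-class correlation.
    """
    R = len(groups)            # number of clusters
    S = len(groups[0])         # observations per cluster
    if any(len(g) != S for g in groups):
        raise ValueError("balanced design required")

    grand = mean(x for g in groups for x in g)
    cluster_means = [mean(g) for g in groups]

    # Mean squares from the one-way ANOVA table
    ms_b = S * sum((m - grand) ** 2 for m in cluster_means) / (R - 1)
    ms_w = sum((x - m) ** 2
               for g, m in zip(groups, cluster_means)
               for x in g) / (R * (S - 1))

    sigma2_w = ms_w
    omega2_b = max((ms_b - ms_w) / S, 0.0)   # negative estimate reported as zero
    total = omega2_b + sigma2_w
    rho = omega2_b / total if total > 0 else 0.0
    return omega2_b, sigma2_w, rho


# Hypothetical toy data: 2 clusters with 2 observations each
print(variance_components([[1.0, 3.0], [5.0, 7.0]]))
```

With real data, each inner list would hold the 6 swelling measurements of one rabbit.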

Reading in data in SAS
The data are read with one line per spot (a to f) and one column per rabbit (y1 to y6), and then restructured to one line per measurement [data values not preserved]:

    data a0;
      input spot $ y1-y6;
      datalines;
    a
    b
    c
    d
    e
    f
    ;
    run;

    data rabbit;
      set a0;
      rabbit=1; swelling=y1; output;
      rabbit=2; swelling=y2; output;
      rabbit=3; swelling=y3; output;
      rabbit=4; swelling=y4; output;
      rabbit=5; swelling=y5; output;
      rabbit=6; swelling=y6; output;
    run;

Estimation in SAS

    proc mixed data=rabbit;
      class rabbit;
      model swelling = / s;
      random rabbit;
    run;

Output:
- Covariance Parameter Estimates: rabbit, Residual [numeric values not preserved]
- Solution for Fixed Effects: Intercept (Estimate, Standard Error, DF, t Value), Pr > |t| < .0001

Interpretation of variance components

Variation | Variance component        | Estimate | Proportion of variation
Between   | $\omega_B^2$              |          | %
Within    | $\sigma_W^2$              |          | %
Total     | $\omega_B^2 + \sigma_W^2$ |          | %
[numeric values not preserved]

Typical differences (95% prediction intervals):
- for spots on the same rabbit: $\pm 2\sqrt{2\sigma_W^2} = \pm 2.16$ cm$^2$
- for spots on different rabbits: $\pm 2\sqrt{2(\omega_B^2 + \sigma_W^2)} = \pm 2.70$ cm$^2$

Interpretation of variance components, cont'd
Approx. 2/3 of the variation in the measurements comes from the variation within rabbits, i.e. between injection sites on the same rabbit. Why?
Could there be a systematic difference between the injection sites?
Two-way anova: Type III tests for rabbit and spot (DF, Type III SS, Mean Square, F Value, Pr > F) [numeric values not preserved]
This does not seem to be the case (P=0.26).
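The "typical differences" above follow from the variance of a difference between two measurements, which is twice the sum of the variance components the two measurements do not share. A minimal Python sketch is given below. The helper name `typical_difference` is hypothetical. The value 0.33 for the between-rabbit component is quoted on the slides, whereas 0.583 for the within-rabbit component is an assumption back-calculated from the quoted ±2.16, since the estimate itself is not preserved in this transcript.

```python
import math


def typical_difference(*unshared_components):
    """Half-width of an approximate 95% prediction interval for the
    difference between two measurements: 2 * sqrt(2 * sum of the
    variance components that the two measurements do not share)."""
    return 2 * math.sqrt(2 * sum(unshared_components))


omega2_b = 0.33    # between-rabbit variance, as quoted on the slides
sigma2_w = 0.583   # within-rabbit variance: ASSUMED, back-calculated from +-2.16

same_rabbit = typical_difference(sigma2_w)             # the rabbit effect cancels
diff_rabbits = typical_difference(omega2_b, sigma2_w)  # nothing cancels
share_within = sigma2_w / (omega2_b + sigma2_w)        # proportion of variation within rabbits

print(round(same_rabbit, 2), round(diff_rabbits, 2), round(share_within, 2))
```

Under these assumed inputs the two half-widths come out close to the ±2.16 and ±2.70 cm² quoted above, and the within-rabbit share is roughly two thirds.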

Design considerations, precision of overall mean

$\mathrm{Var}(\bar{y}) = \frac{\omega_B^2}{R} + \frac{\sigma_W^2}{RS}$

[Figure: precision of the overall mean for R = no. of rabbits, varying from 3 to 20, and S = no. of spots, varying from 1 to 10.]

Effective sample size
If we had only one observation for each of $k$ rabbits, how many rabbits would we then need to obtain the same precision?

$k = \frac{R \cdot S}{1 + \rho(S - 1)}$

We have here $\rho = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2}$, which gives $k = 12.8$.
Effectively, we have only approximately two independent observations from each rabbit!

Quantification of overall swelling
- Forget rabbit: Pool all 36 measurements, wrongly assuming independence. $\hat\mu = 7.367\,(0.155)$
- Rabbit averages: Start out by taking averages for each rabbit. $\hat\mu = 7.367\,(0.267)$
- Random rabbit: Estimate the mean swelling of rabbits as a species (in general the correct approach, using mixed models). $\hat\mu = 7.367\,(0.267)$

Estimation of individual rabbit means...?
Two different approaches:
- Traditional averages $\bar{y}_{r\cdot}$
- BLUPs (best linear unbiased predictors), which rely on the assumption that individuals come from the same population, and become weighted averages:

$\frac{\omega_B^2}{\omega_B^2 + \sigma_W^2/S}\,\bar{y}_{r\cdot} + \frac{\sigma_W^2/S}{\omega_B^2 + \sigma_W^2/S}\,\bar{y}_{\cdot\cdot}$

which have been shrunk towards the overall mean, $\bar{y}_{\cdot\cdot}$.
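The three formulas on this page (variance of the overall mean, effective sample size, and BLUP shrinkage) can be checked numerically with a short Python sketch. The function names are hypothetical. The between-rabbit variance 0.33 is quoted on the slides, while the within-rabbit variance 0.583 is an assumed value back-calculated from other quoted results, not an estimate preserved in this transcript.

```python
import math


def effective_sample_size(R, S, rho):
    """Number of independent single observations giving the same precision
    as R clusters of S correlated observations each."""
    return R * S / (1 + rho * (S - 1))


def se_overall_mean(omega2_b, sigma2_w, R, S):
    """Standard error of the overall mean in a balanced two-level design."""
    return math.sqrt(omega2_b / R + sigma2_w / (R * S))


def blup(cluster_mean, grand_mean, omega2_b, sigma2_w, S):
    """Cluster average shrunk towards the grand mean (balanced design)."""
    weight = omega2_b / (omega2_b + sigma2_w / S)
    return weight * cluster_mean + (1 - weight) * grand_mean


omega2_b, sigma2_w = 0.33, 0.583   # 0.583 is an ASSUMED, back-calculated value
R = S = 6
rho = omega2_b / (omega2_b + sigma2_w)

k = effective_sample_size(R, S, rho)
se = se_overall_mean(omega2_b, sigma2_w, R, S)
print(round(k, 1), round(se, 3))

# Hypothetical rabbit average of 8.0 with the overall mean 7.367 from the slides
print(round(blup(8.0, 7.367, omega2_b, sigma2_w, S), 3))
```

With these inputs, `k` and `se` land close to the 12.8 and 0.267 quoted on the slides. The two quantities are linked: the variance of the overall mean equals the total variance divided by `k`.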

BLUPs vs. averages, shrinkage
[Figure: BLUPs compared with rabbit averages.]
BLUPs are used for ranking, e.g. of schools.

Quantification for reduced dataset
When the 3 smallest measurements from rabbit 2 (largest level) are omitted, the results become:
- Forget rabbit: We have omitted some of the largest observations.
  full data: $\hat\mu = 7.367\,(0.155)$; reduced data: $\hat\mu = 7.291\,(0.163)$
- Random rabbit: Rabbit 2 has a lower weight in the average due to a larger standard error.
  full data: $\hat\mu = 7.367\,(0.267)$; reduced data: $\hat\mu = 7.390\,(0.298)$

Quantification for reduced dataset, cont'd
- Unweighted rabbit averages: The average for rabbit 2 has increased.
  full data: $\hat\mu = 7.367\,(0.267)$; reduced data: $\hat\mu = 7.436\,(0.333)$
- Weighted rabbit averages: Rabbit 2 has a lower weight in the average due to only 3 observations.
  full data: $\hat\mu = 7.367\,(0.267)$; reduced data: $\hat\mu = 7.291\,(0.265)$

BLUPs for the reduced data set
[Figure: BLUPs for the reduced data set.]
Larger shrinkage than before, for rabbit no. 2.

Confidence limits for the variance components
...just as a warning:
- Intra-individual variation $\sigma_W^2$: [limits not preserved]
- Inter-individual variation $\omega_B^2$: upper limit 2.48 [lower limit not preserved]
So, we should take care not to over-interpret for small datasets...

Now imagine that the rabbits are grouped in two (grp=1,2):

    proc mixed data=rabbit;
      class grp rabbit;
      model swelling = grp / s;
      random rabbit(grp);
    run;

Output:
- Covariance Parameter Estimates: rabbit(grp) <-- this changes; Residual <-- this stays the same
- Solution for Fixed Effects: Intercept (Pr > |t| < .0001), grp 1, grp 2
[numeric values not preserved]

Such a comparison cannot be performed in the usual way (ignoring the rabbits), since we would then compare groups as if we had much more information than we actually have. Type I errors will occur!

    proc glm data=rabbit;
      class grp;
      model swelling=grp / solution;
    run;

Output: Parameter, Estimate, T for H0: Parameter=0, Pr > |T|, Std Error of Estimate, for Intercept and grp [numeric values not preserved]

Two-level model

Level | Unit                    | Variation                    | Covariates
1     | Individual observations | within rabbit (rabbit*spot)  | spot
2     | Individuals/Clusters    | between rabbits (rabbit)     | group, overall mean

Errors if the random rabbit variation is ignored:
- low efficiency (type 2 error) for evaluation of level 1 covariates (systematic spot?)
- too small standard errors (type 1 error) for estimates of level 2 effects (group, overall mean)

Factor diagrams
In the traditional one-way anova:
[I] = [R*S] -> [R] -> 0
In case of grouping:
[I] = [R*S] -> [R] -> G -> 0
We have here used the notation:
- arrows indicating simplifications / groupings
- [ ] for the random effects, corresponding to variance components on the various levels

Example of 3-level model
Number of nuclei per cell in the rat pancreas, used for the evaluation of cytostatics (data: Henrik Winther Nielsen, Inst. Med. Anat.)
- 4 rats (R)
- 3 sections for each rat (S)
- 5 randomly chosen fields from each section (F)

Hierarchy:

Level    | Variance component
fields   | $\sigma^2$
sections | $\tau^2$
rats     | $\omega^2$

Factor diagram:
[I] = [R*S*F] -> [R*S] -> [R] -> 0

Scatter plot, with jitter
[Figure: number of nuclei for each rat. Symbols indicate sections.]

3-level model in SAS

    proc mixed data=nuclei;
      class rat section;
      model nuclei= / s;
      random rat section(rat);
    run;

Output:
- Covariance Parameter Estimates: rat, section(rat), Residual
- Solution for Fixed Effects: Intercept (Estimate, Standard Error, DF, t Value, Pr > |t|)
[numeric values not preserved]

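Estimates of variance components

Variation | Variance component              | Estimate | Proportion of variation
Rats      | $\omega^2$                      |          | %
Sections  | $\tau^2$                        |          | %
Fields    | $\sigma^2$                      |          | %
Total     | $\omega^2 + \tau^2 + \sigma^2$  |          | %
[numeric values not preserved]

Almost all variation is on the lowest level!

Typical differences
between two measurements:
- for different fields on the same section: $\pm 2\sqrt{2\sigma^2} = \pm 1.255$
- for different sections on the same rat: $\pm 2\sqrt{2(\tau^2 + \sigma^2)} = \pm 1.264$
- for sections on different rats: $\pm 2\sqrt{2(\omega^2 + \tau^2 + \sigma^2)}$ [value not preserved]

Correlations
vary, depending on the level:
- Measurements on the same section:
  $\mathrm{Corr}(y_{rs1}, y_{rs2}) = \frac{\omega^2 + \tau^2}{\omega^2 + \tau^2 + \sigma^2}$
- Measurements on different sections of the same rat:
  $\mathrm{Corr}(y_{r11}, y_{r22}) = \frac{\omega^2}{\omega^2 + \tau^2 + \sigma^2}$
- Measurements from different rats are independent
[numeric values not preserved]

Comparing measurement devices
Example: Peak expiratory flow rate, l/min: 17 subjects, 2 measurement devices, each measured twice (Bland and Altman, 1986).
Data table: one row per subject (id), with columns Wright ($Y_{1p1}$, $Y_{1p2}$) and mini Wright ($Y_{2p1}$, $Y_{2p2}$), followed by rows for Average and SD [numeric values not preserved]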
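The two correlation formulas for the nested rat/section/field design can be written as one small Python helper. This is a sketch only: the function name `nested_correlations` and the input values are hypothetical, chosen so the arithmetic is easy to follow, and are not the estimates from the nuclei data.

```python
def nested_correlations(omega2, tau2, sigma2):
    """Correlations between two measurements in a three-level nested design
    (fields within sections within rats)."""
    total = omega2 + tau2 + sigma2
    return {
        # same rat and same section: the rat and section effects are shared
        "same_section": (omega2 + tau2) / total,
        # same rat, different sections: only the rat effect is shared
        "same_rat": omega2 / total,
        # different rats: nothing is shared
        "different_rats": 0.0,
    }


# Hypothetical variance components (rats, sections, fields)
corr = nested_correlations(omega2=1.0, tau2=1.0, sigma2=2.0)
print(corr)
```

When almost all variation sits at the lowest level, as in the nuclei example, both correlations are close to zero.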

Illustration of all data
[Figure: all measurements for the two devices.]

Aim of investigation
- Quantify the precision of each measuring device: compare the two repetitions
- Quantify the agreement between the two devices: compare individual measurements, or averages
- Give practical advice for clinical use: can we trust the devices, and can we use them interchangeably?

Variance component model
- Subjects, $p = 1, \ldots, 17$
- Methods, $m = 1, 2$
- Repetitions, $j = 1, 2$

$Y_{pmj} = \beta_m + A_p + C_{pm} + \varepsilon_{pmj}$

where $A_p \sim N(0, \omega^2)$, $C_{pm} \sim N(0, \tau^2)$, $\varepsilon_{pmj} \sim N(0, \sigma^2)$
Note: Patients need not be random here..., why??

Correlation structure
in the above model, for each subject, with ordering (Wright1, Wright2, MiniWright1, MiniWright2):

$$\begin{pmatrix}
\omega^2 + \tau^2 + \sigma^2 & \omega^2 + \tau^2 & \omega^2 & \omega^2 \\
\omega^2 + \tau^2 & \omega^2 + \tau^2 + \sigma^2 & \omega^2 & \omega^2 \\
\omega^2 & \omega^2 & \omega^2 + \tau^2 + \sigma^2 & \omega^2 + \tau^2 \\
\omega^2 & \omega^2 & \omega^2 + \tau^2 & \omega^2 + \tau^2 + \sigma^2
\end{pmatrix}$$

Correlation structure
if subjects are considered systematic:
For each subject*method combination, i.e. for two repetitions:

$$\begin{pmatrix}
\tau^2 + \sigma^2 & \tau^2 \\
\tau^2 & \tau^2 + \sigma^2
\end{pmatrix}$$

SAS-programming

    proc mixed data=wright;
      class method id;
      model wr=method / s;
      random intercept method / subject=id;
    run;

or maybe

    proc mixed data=wright;
      class method id;
      model wr=id method / s;
      random id*method;
    run;

Output
- Class Level Information: method (2 levels: mini, wright), id
- Covariance Parameter Estimates: Intercept (subject id), method (subject id), Residual
- Fit Statistics: -2 Res Log Likelihood, AIC (smaller is better)
- Solution for Fixed Effects: Intercept (Pr > |t| < .0001), method mini, method wright
[numeric values not preserved]

Estimates
Variance components: $\omega^2$, $\tau^2$, $\sigma^2$ [numeric values not preserved]
Systematic difference between measuring devices:
$\hat\beta_1 - \hat\beta_2 = 6.03\,(8.05)$, $P = 0.46$
How can we use these??

Precision of the methods
The precisions are assumed identical for the two devices.
Difference between double measurements (identical repetitions):

$D_{pm} = Y_{pmj_1} - Y_{pmj_2} = \varepsilon_{pmj_1} - \varepsilon_{pmj_2} \sim N(0, 2\sigma^2)$

Limits-of-agreement:

$\pm 2\sqrt{2\sigma^2} = \pm 50.23$

Agreement between the two methods
Difference between single measurements by the two methods:

$D_p = Y_{p1j_1} - Y_{p2j_2} = \beta_1 - \beta_2 + C_{p1} - C_{p2} + \varepsilon_{p1j_1} - \varepsilon_{p2j_2} \sim N(\beta_1 - \beta_2,\ 2\tau^2 + 2\sigma^2)$

Limits-of-agreement:

$\pm 2\sqrt{2(\tau^2 + \sigma^2)} = \pm 75.31$

(where we have ignored the nonsignificant systematic difference between the two; otherwise add 6.03)

Agreement between averages
can be obtained by direct calculation of the SD for the difference between averages, or from

$D_p = \bar{Y}_{p1\cdot} - \bar{Y}_{p2\cdot} = \beta_1 - \beta_2 + C_{p1} - C_{p2} + \bar\varepsilon_{p1\cdot} - \bar\varepsilon_{p2\cdot} \sim N(\beta_1 - \beta_2,\ 2\tau^2 + \sigma^2)$

Limits-of-agreement:

$\pm 2\sqrt{2\tau^2 + \sigma^2} = \pm 66.41$

Only reasonable if averages are the standard for clinical use!

Difference in precision??
New model (with systematic subject effects):

$Y_{pmj} = \mu + \beta_m + \alpha_p + C_{pm} + \varepsilon_{pmj}, \quad C_{pm} \sim N(0, \tau^2), \quad \varepsilon_{pmj} \sim N(0, \sigma_m^2)$

    proc mixed data=wright;
      class method id;
      model wr=id method / ddfm=satterth s;
      random id*method;
      repeated / group=method type=simple subject=id*method;
    run;
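The three limits-of-agreement on this page differ only in which variance components enter the difference. A short Python sketch makes that explicit. The function names are hypothetical, and the variance components used below (about 315.4 for the residual and about 393.6 for the subject-by-method component) are assumptions back-calculated from the quoted limits, because the estimates themselves are not preserved in this transcript.

```python
import math


def loa_repeatability(sigma2):
    """Limits for the difference between two repetitions with the same device."""
    return 2 * math.sqrt(2 * sigma2)


def loa_between_methods(tau2, sigma2, k=1):
    """Limits for the difference between the two devices, when each device
    is represented by the average of k repetitions (k=1: single measurements)."""
    return 2 * math.sqrt(2 * tau2 + 2 * sigma2 / k)


sigma2 = 315.4   # ASSUMED: back-calculated from the quoted +-50.23
tau2 = 393.6     # ASSUMED: back-calculated from the quoted +-75.31

print(round(loa_repeatability(sigma2), 2))            # precision of one device
print(round(loa_between_methods(tau2, sigma2), 2))    # single measurements
print(round(loa_between_methods(tau2, sigma2, 2), 2)) # averages of two repetitions
```

Averaging over repetitions reduces only the residual part of the variance; the subject-by-method component is untouched, so the limits for averages are narrower than those that apply to single measurements in clinical use.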

Output, systematic subject effect
- Covariance Parameter Estimates: method*id; Residual (subject method*id, group method mini); Residual (subject method*id, group method wright)
- Fit Statistics: -2 Res Log Likelihood, AIC (smaller is better)
- Solution for Fixed Effects: Intercept (Pr > |t| < .0001), id levels, method mini, method wright
[numeric values not preserved]

Results
Type 3 Tests of Fixed Effects: id (Pr > F < .0001), method [numeric values not preserved]
Precisions:
- Wright: $\hat\sigma_1^2$ [value not preserved]
- mini Wright: $\hat\sigma_2^2$ [value not preserved]
Conclusion: Wright is better than mini Wright, but is it significantly better?

$F = \frac{\hat\sigma_2^2}{\hat\sigma_1^2} = 1.69 \sim F(17, 17), \quad P = 0.14$

No...
Alternative test: $-2 \log Q = 1.2 \sim \chi^2(1)$ [P-value not preserved]

Dubious/Incorrect Bland-Altman approaches
- Calculate agreement between averages: We have seen that these estimate $2\tau^2 + \sigma^2$ instead of $2\tau^2 + 2\sigma^2$. In general, with $k$ repetitions: $2\tau^2 + \frac{2}{k}\sigma^2$
- Calculate all possible differences between pairs:

$D_{pj} = Y_{p1j_1} - Y_{p2j_2} = \beta_1 - \beta_2 + C_{p1} - C_{p2} + \varepsilon_{p1j_1} - \varepsilon_{p2j_2}$

These have the correct variance, but they are correlated due to the $C$'s (the same individual has several differences), and can give erroneous results if the measurement devices react to subject characteristics.

Measurements taken in pairs
e.g. over time...

$Y_{pmt} = \beta_m + A_p + C_{pm} + E_{pt} + \varepsilon_{pmt}$

- Precision becomes impossible to assess, since we have no true replications
- Agreement:

$D_{pt} = Y_{p1t} - Y_{p2t} = \beta_1 - \beta_2 + C_{p1} - C_{p2} + \varepsilon_{p1t} - \varepsilon_{p2t}$

but these differences will again be correlated due to the $C$'s

Example: Visual acuity
7 individuals are looking at a screen, where a light flash appears.
They are looking
- through 4 lenses, with powers 6/6, 6/18, 6/36 and 6/60, i.e. 4 magnifications: 1, 3, 6 and 10
- with 2 eyes
Outcome: Visual acuity, the time lag (milliseconds) between the stimulus and the electrical response at the back of the cortex
Crowder & Hand (1990)

Data
[Figure: visual acuity measurements.]

Factors to take into account
Main effects:
- 7 individuals (person)
- 2 eyes for each individual (eye)
- 4 lens magnifications (power)
Interactions?
- person*eye
- person*power
- eye*power
Second-order interaction: person*eye*power = Residual

Model ingredients
Outcome: Visual acuity
Effects:
- Systematic: mean value $\mu_{em}$, composed of eye $\alpha_e$, power $\beta_m$, eye*power $\gamma_{em}$
- Random effects: patient $A_p$, patient*eye $B_{pe}$, patient*power $C_{pm}$
- Residual: patient*eye*power $\varepsilon_{pem}$

Model formulation
$p = 1, \ldots, 7$, $e = 1, 2$, $m = 1, 2, 3, 4$

$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$

where
$A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$

Factor diagram
[Diagram with nodes [I] = [Pa*Ey*Po], [Pa*Ey], [Pa*Po], [Pa], Ey*Po, Ey, Po and 0; the arrows are not preserved.]

Not quite a multilevel model, but still a variance component model

Level | Unit                | Covariates
1     | single measurements | Ey*Po
2     | interactions        |
2e    | [Pa*Ey]             | Ey
2m    | [Pa*Po]             | Po
3     | individuals, [Pa]   | overall level

SAS code, and output

    ods graphics on;
    /* the value of plots= is lost in the transcript; ALL is assumed here */
    proc mixed plots=all data=visual covtest;
      class patient eye power;
      model acuity=eye power eye*power
            / outpm=udpm outp=udp s residual influence;
      random intercept eye power / subject=patient;
    run;
    ods graphics off;

Output:
- Covariance Parameter Estimates (Estimate, Standard Error, Z Value, Pr > Z): Intercept, eye and power (each with subject patient), Residual
- Solution for Fixed Effects: Intercept (Pr > |t| < .0001), eye (left, right), the four power levels, and the eight eye*power combinations
- Type 3 Tests of Fixed Effects (Num DF, Den DF, F Value, Pr > F): eye, power, eye*power
[numeric values not preserved]

Predicted mean profiles
[Figure: predicted mean profiles.]

Individual predictions
[Figure: individual predictions.]

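Model checks
The model was:

$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$

where $A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$
Two types of residuals:
- Ordinary residuals: $Y_{pem} - \hat\mu_{em}$
- Conditional residuals: $\hat\varepsilon_{pem}$

Ordinary residual plot
[Figure: ordinary residual plot.]

Conditional residual plot
[Figure: conditional residual plot.]

Influence diagnostics
[Figure: influence diagnostics.]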

Omit the interaction eye*power
- Covariance Parameter Estimates (Estimate, Standard Error, Z Value, Pr > Z): Intercept, eye and power (each with subject patient), Residual
- Solution for Fixed Effects: Intercept (Pr > |t| < .0001), eye (left, right), the four power levels
- Type 3 Tests of Fixed Effects (Num DF, Den DF, F Value, Pr > F): eye, power
[numeric values not preserved]

Eye comparisons
Difference between eye averages:

$\bar{Y}_{\cdot e_1 \cdot} - \bar{Y}_{\cdot e_2 \cdot} = \mu\text{-stuff} + \bar{B}_{\cdot e_1} - \bar{B}_{\cdot e_2} + \bar\varepsilon_{\cdot e_1 \cdot} - \bar\varepsilon_{\cdot e_2 \cdot}$

$\mathrm{Var}(\bar{Y}_{\cdot e_1 \cdot} - \bar{Y}_{\cdot e_2 \cdot}) = \frac{2}{7}\tau_e^2 + \frac{2}{28}\sigma^2$

- $\tau_e^2$ [value not preserved] is rather large (people have different eye preferences),
- but we have four measurements on each eye
Still, we need a larger difference in order to detect it.

Magnification comparisons
Difference between magnification averages:

$\bar{Y}_{\cdot\cdot m_1} - \bar{Y}_{\cdot\cdot m_2} = \mu\text{-stuff} + \bar{C}_{\cdot m_1} - \bar{C}_{\cdot m_2} + \bar\varepsilon_{\cdot\cdot m_1} - \bar\varepsilon_{\cdot\cdot m_2}$

$\mathrm{Var}(\bar{Y}_{\cdot\cdot m_1} - \bar{Y}_{\cdot\cdot m_2}) = \frac{2}{7}\tau_m^2 + \frac{2}{14}\sigma^2$

- $\tau_m^2 = 3.97$ is not that large (people react more or less identically to the different magnifications),
- and we have only two measurements for each magnification
So, we can detect smaller differences than for eye comparisons.

If we ignore correlations
and use a model with no random effects, we get:
- Covariance Parameter Estimates: Residual
- Solution for Fixed Effects: Intercept (Pr > |t| < .0001), eye (left, right), the four power levels
- Type 3 Tests of Fixed Effects (Num DF, Den DF, Chi-Square, F Value, Pr > ChiSq, Pr > F): eye, power
[numeric values not preserved]
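The variances of the two kinds of comparisons can be sketched in Python to show why the same residual variance leads to different precision for eye and magnification effects. The function names are hypothetical, the design sizes (7 persons, 2 eyes, 4 powers) come from the example, and the variance components in the call are made-up round numbers rather than the estimates from the visual acuity data.

```python
def var_eye_difference(tau2_e, sigma2, persons=7, powers=4):
    """Variance of the difference between two eye averages.
    Person and person*power effects are shared by both eyes and cancel."""
    return 2 * tau2_e / persons + 2 * sigma2 / (persons * powers)


def var_power_difference(tau2_m, sigma2, persons=7, eyes=2):
    """Variance of the difference between two magnification averages.
    Person and person*eye effects are shared by both powers and cancel."""
    return 2 * tau2_m / persons + 2 * sigma2 / (persons * eyes)


# Hypothetical variance components, chosen only to make the arithmetic visible
print(var_eye_difference(tau2_e=7.0, sigma2=28.0))
print(var_power_difference(tau2_m=7.0, sigma2=14.0))

# Ignoring the correlation corresponds to setting the interaction components to 0
print(var_eye_difference(tau2_e=0.0, sigma2=28.0))
```

Setting the interaction components to zero, as in the model without random effects, leaves only the residual term. The residual variance in such a model is a different (re-estimated) quantity, so the last line shows the structure of the wrong variance, not its size.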

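Comparisons in wrong model
New $\sigma^2$ [value not preserved], and $\tau_e^2 = \tau_m^2 = 0$
- Eye differences: $\mathrm{Var}(\bar{Y}_{\cdot e_1 \cdot} - \bar{Y}_{\cdot e_2 \cdot}) = \frac{2}{28}\sigma^2$
- Magnification differences: $\mathrm{Var}(\bar{Y}_{\cdot\cdot m_1} - \bar{Y}_{\cdot\cdot m_2}) = \frac{2}{14}\sigma^2$
[numeric values not preserved]

Systematic vs. random effects
Could the patients be treated as systematic here? Yes:
- Covariance Parameter Estimates: eye (subject patient), power (subject patient), Residual
- Type 3 Tests of Fixed Effects (Num DF, Den DF, F Value, Pr > F): patient, eye, power, eye*power
[numeric values not preserved]
Can you think why?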


More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Soc 589 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline Notation NELS88 data Fixed Effects ANOVA

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Stat 587 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Notation NELS88 data Fixed Effects ANOVA

More information

Review of CLDP 944: Multilevel Models for Longitudinal Data

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

More information

Models for longitudinal data

Models for longitudinal data Faculty of Health Sciences Contents Models for longitudinal data Analysis of repeated measurements, NFA 016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

Regression models. Categorical covariate, Quantitative outcome. Examples of categorical covariates. Group characteristics. Faculty of Health Sciences

Regression models. Categorical covariate, Quantitative outcome. Examples of categorical covariates. Group characteristics. Faculty of Health Sciences Faculty of Health Sciences Categorical covariate, Quantitative outcome Regression models Categorical covariate, Quantitative outcome Lene Theil Skovgaard April 29, 2013 PKA & LTS, Sect. 3.2, 3.2.1 ANOVA

More information

Analysis of variance. April 16, Contents Comparison of several groups

Analysis of variance. April 16, Contents Comparison of several groups Contents Comparison of several groups Analysis of variance April 16, 2009 One-way ANOVA Two-way ANOVA Interaction Model checking Acknowledgement for use of presentation Julie Lyng Forman, Dept. of Biostatistics

More information

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1 CLDP 944 Example 3a page 1 From Between-Person to Within-Person Models for Longitudinal Data The models for this example come from Hoffman (2015) chapter 3 example 3a. We will be examining the extent to

More information

Analysis of variance. April 16, 2009

Analysis of variance. April 16, 2009 Analysis of variance April 16, 2009 Contents Comparison of several groups One-way ANOVA Two-way ANOVA Interaction Model checking Acknowledgement for use of presentation Julie Lyng Forman, Dept. of Biostatistics

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

Analysis of variance and regression. April 17, Contents Comparison of several groups One-way ANOVA. Two-way ANOVA Interaction Model checking

Analysis of variance and regression. April 17, Contents Comparison of several groups One-way ANOVA. Two-way ANOVA Interaction Model checking Analysis of variance and regression Contents Comparison of several groups One-way ANOVA April 7, 008 Two-way ANOVA Interaction Model checking ANOVA, April 008 Comparison of or more groups Julie Lyng Forman,

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

ANOVA Longitudinal Models for the Practice Effects Data: via GLM

ANOVA Longitudinal Models for the Practice Effects Data: via GLM Psyc 943 Lecture 25 page 1 ANOVA Longitudinal Models for the Practice Effects Data: via GLM Model 1. Saturated Means Model for Session, E-only Variances Model (BP) Variances Model: NO correlation, EQUAL

More information

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data Today s Class: Review of concepts in multivariate data Introduction to random intercepts Crossed random effects models

More information

Outline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups

Outline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression10_2/index.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk

More information

Describing Change over Time: Adding Linear Trends

Describing Change over Time: Adding Linear Trends Describing Change over Time: Adding Linear Trends Longitudinal Data Analysis Workshop Section 7 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

Introduction to the Analysis of Hierarchical and Longitudinal Data

Introduction to the Analysis of Hierarchical and Longitudinal Data Introduction to the Analysis of Hierarchical and Longitudinal Data Georges Monette, York University with Ye Sun SPIDA June 7, 2004 1 Graphical overview of selected concepts Nature of hierarchical models

More information

STAT 5200 Handout #23. Repeated Measures Example (Ch. 16)

STAT 5200 Handout #23. Repeated Measures Example (Ch. 16) Motivating Example: Glucose STAT 500 Handout #3 Repeated Measures Example (Ch. 16) An experiment is conducted to evaluate the effects of three diets on the serum glucose levels of human subjects. Twelve

More information

Section 9c. Propensity scores. Controlling for bias & confounding in observational studies

Section 9c. Propensity scores. Controlling for bias & confounding in observational studies Section 9c Propensity scores Controlling for bias & confounding in observational studies 1 Logistic regression and propensity scores Consider comparing an outcome in two treatment groups: A vs B. In a

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */ CLP 944 Example 4 page 1 Within-Personn Fluctuation in Symptom Severity over Time These data come from a study of weekly fluctuation in psoriasis severity. There was no intervention and no real reason

More information

Introduction to Within-Person Analysis and RM ANOVA

Introduction to Within-Person Analysis and RM ANOVA Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides

More information

Outline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups

Outline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~jufo/varianceregressionf2011.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010 MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010 Part 1 of this document can be found at http://www.uvm.edu/~dhowell/methods/supplements/mixed Models for Repeated Measures1.pdf

More information

Analysis of Variance

Analysis of Variance 1 / 70 Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression11_2 Marc Andersen, mja@statgroup.dk Analysis of variance and regression for health researchers,

More information

with the usual assumptions about the error term. The two values of X 1 X 2 0 1

with the usual assumptions about the error term. The two values of X 1 X 2 0 1 Sample questions 1. A researcher is investigating the effects of two factors, X 1 and X 2, each at 2 levels, on a response variable Y. A balanced two-factor factorial design is used with 1 replicate. The

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Introduction to Crossover Trials

Introduction to Crossover Trials Introduction to Crossover Trials Stat 6500 Tutorial Project Isaac Blackhurst A crossover trial is a type of randomized control trial. It has advantages over other designed experiments because, under certain

More information

Introduction to Random Effects of Time and Model Estimation

Introduction to Random Effects of Time and Model Estimation Introduction to Random Effects of Time and Model Estimation Today s Class: The Big Picture Multilevel model notation Fixed vs. random effects of time Random intercept vs. random slope models How MLM =

More information

Lab 11. Multilevel Models. Description of Data

Lab 11. Multilevel Models. Description of Data Lab 11 Multilevel Models Henian Chen, M.D., Ph.D. Description of Data MULTILEVEL.TXT is clustered data for 386 women distributed across 40 groups. ID: 386 women, id from 1 to 386, individual level (level

More information

Simple linear regression

Simple linear regression Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016 Work all problems. 60 points are needed to pass at the Masters Level and 75 to pass at the

More information

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as ST 51, Summer, Dr. Jason A. Osborne Homework assignment # - Solutions 1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available

More information

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE Biostatistics Workshop 2008 Longitudinal Data Analysis Session 4 GARRETT FITZMAURICE Harvard University 1 LINEAR MIXED EFFECTS MODELS Motivating Example: Influence of Menarche on Changes in Body Fat Prospective

More information

Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars)

Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars) STAT:5201 Applied Statistic II Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars) School math achievement scores The data file consists of 7185 students

More information

Topic 20: Single Factor Analysis of Variance

Topic 20: Single Factor Analysis of Variance Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory

More information

SAS Syntax and Output for Data Manipulation:

SAS Syntax and Output for Data Manipulation: CLP 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (2015) chapter 5. We will be examining the extent

More information

Lecture 1 Introduction to Multi-level Models

Lecture 1 Introduction to Multi-level Models Lecture 1 Introduction to Multi-level Models Course Website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm All lecture materials extracted and further developed from the Multilevel Model course

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects

Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects Topic 5 - One-Way Random Effects Models One-way Random effects Outline Model Variance component estimation - Fall 013 Confidence intervals Topic 5 Random Effects vs Fixed Effects Consider factor with numerous

More information

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs) 36-309/749 Experimental Design for Behavioral and Social Sciences Dec 1, 2015 Lecture 11: Mixed Models (HLMs) Independent Errors Assumption An error is the deviation of an individual observed outcome (DV)

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Possibly useful formulas for this exam: b1 = Corr(X,Y) SDY / SDX. confidence interval: Estimate ± (Critical Value) (Standard Error of Estimate)

Possibly useful formulas for this exam: b1 = Corr(X,Y) SDY / SDX. confidence interval: Estimate ± (Critical Value) (Standard Error of Estimate) Statistics 5100 Exam 2 (Practice) Directions: Be sure to answer every question, and do not spend too much time on any part of any question. Be concise with all your responses. Partial SAS output and statistical

More information

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1 Psyc 945 Example page Example : Unconditional Models for Change in Number Match 3 Response Time (complete data, syntax, and output available for SAS, SPSS, and STATA electronically) These data come from

More information

Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA

Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED Maribeth Johnson Medical College of Georgia Augusta, GA Overview Introduction to longitudinal data Describe the data for examples

More information

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study 1.4 0.0-6 7 8 9 10 11 12 13 14 15 16 17 18 19 age Model 1: A simple broken stick model with knot at 14 fit with

More information

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical

More information

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum T-test: means of Spock's judge versus all other judges 1 The TTEST Procedure Variable: pcwomen judge1 N Mean Std Dev Std Err Minimum Maximum OTHER 37 29.4919 7.4308 1.2216 16.5000 48.9000 SPOCKS 9 14.6222

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen Outline Data in wide and long format

More information

Overview Scatter Plot Example

Overview Scatter Plot Example Overview Topic 22 - Linear Regression and Correlation STAT 5 Professor Bruce Craig Consider one population but two variables For each sampling unit observe X and Y Assume linear relationship between variables

More information

Measuring the fit of the model - SSR

Measuring the fit of the model - SSR Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do

More information

Longitudinal Data Analysis of Health Outcomes

Longitudinal Data Analysis of Health Outcomes Longitudinal Data Analysis of Health Outcomes Longitudinal Data Analysis Workshop Running Example: Days 2 and 3 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development

More information

Faculty of Health Sciences. Correlated data. More about LMMs. Lene Theil Skovgaard. December 4, / 104

Faculty of Health Sciences. Correlated data. More about LMMs. Lene Theil Skovgaard. December 4, / 104 Faculty of Health Sciences Correlated data More about LMMs Lene Theil Skovgaard December 4, 2015 1 / 104 Further topics Model check and diagnostics Cross-over studies Paired T-tests with missing values

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing

More information

Weighted Least Squares

Weighted Least Squares Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w

More information

Analysis of Covariance

Analysis of Covariance Analysis of Covariance (ANCOVA) Bruce A Craig Department of Statistics Purdue University STAT 514 Topic 10 1 When to Use ANCOVA In experiment, there is a nuisance factor x that is 1 Correlated with y 2

More information

6. Multiple regression - PROC GLM

6. Multiple regression - PROC GLM Use of SAS - November 2016 6. Multiple regression - PROC GLM Karl Bang Christensen Department of Biostatistics, University of Copenhagen. http://biostat.ku.dk/~kach/sas2016/ kach@biostat.ku.dk, tel: 35327491

More information

171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th

171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th Name 171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th Use the selected SAS output to help you answer the questions. The SAS output is all at the back of the exam on pages

More information

Random Coefficients Model Examples

Random Coefficients Model Examples Random Coefficients Model Examples STAT:5201 Week 15 - Lecture 2 1 / 26 Each subject (or experimental unit) has multiple measurements (this could be over time, or it could be multiple measurements on a

More information

More about linear mixed models

More about linear mixed models Faculty of Health Sciences Contents More about linear mixed models Analysis of repeated measurements, NFA 2016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont. TCELL 9/4/205 36-309/749 Experimental Design for Behavioral and Social Sciences Simple Regression Example Male black wheatear birds carry stones to the nest as a form of sexual display. Soler et al. wanted

More information

High-dimensional regression

High-dimensional regression High-dimensional regression Advanced Methods for Data Analysis 36-402/36-608) Spring 2014 1 Back to linear regression 1.1 Shortcomings Suppose that we are given outcome measurements y 1,... y n R, and

More information

STAT 705 Generalized linear mixed models

STAT 705 Generalized linear mixed models STAT 705 Generalized linear mixed models Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 24 Generalized Linear Mixed Models We have considered random

More information

Introduction and Background to Multilevel Analysis

Introduction and Background to Multilevel Analysis Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and

More information

WORKSHOP 3 Measuring Association

WORKSHOP 3 Measuring Association WORKSHOP 3 Measuring Association Concepts Analysing Categorical Data o Testing of Proportions o Contingency Tables & Tests o Odds Ratios Linear Association Measures o Correlation o Simple Linear Regression

More information

VIII. ANCOVA. A. Introduction

VIII. ANCOVA. A. Introduction VIII. ANCOVA A. Introduction In most experiments and observational studies, additional information on each experimental unit is available, information besides the factors under direct control or of interest.

More information

General Principles Within-Cases Factors Only Within and Between. Within Cases ANOVA. Part One

General Principles Within-Cases Factors Only Within and Between. Within Cases ANOVA. Part One Within Cases ANOVA Part One 1 / 25 Within Cases A case contributes a DV value for every value of a categorical IV It is natural to expect data from the same case to be correlated - NOT independent For

More information