Correlated data. Overview. Variance component models. Terminology for correlated measurements. Faculty of Health Sciences. Variance component models


Faculty of Health Sciences
Correlated data: Variance component models
Lene Theil Skovgaard & Julie Lyng Forman
November 29, 2016

Overview (slide 2)
- One-way anova with random variation
- The rabbit example
- Hierarchical models with several levels
- Crossed random effects (interactions)
- The visual acuity example
- The ecological fallacy
- Comparing measurement devices
Home pages: RepeatedMeasuresE2016.html

Terminology for correlated measurements (slide 3)
- Longitudinal measurements: Same outcome (response) measured consecutively over time for each individual.
- Repeated measurements: Same outcome (response) measured in different situations (or at different spots) for the same individual.
- Cluster/multilevel design: Same outcome (response) measured on all individuals in a number of families/villages/school classes.
- Multivariate outcome: Several outcomes (responses) for each individual, e.g. a number of hormone measurements that we want to study simultaneously.

Variance component models (slide 4)
Generalizations of ANOVA-type models, involving several sources of random variation (variance components):
- geographical/environmental variation: between regions, hospitals, schools or countries
- biological variation: between individuals, families or animals
- within-individual variation: between arms, teeth, injection sites, days
- variation due to uncontrollable circumstances: time of day, temperature, observer
- measurement error

Example: Swelling due to vaccine (slide 5)
Research question: How much swelling can be expected in relation to a vaccination?
Experiment: 6 rabbits, each vaccinated in 6 spots on the back.
Outcome $y_{rs}$: swelling in cm$^2$, where
- $r = 1, \ldots, R = 6$ denotes the rabbit,
- $s = 1, \ldots, S = 6$ denotes the spot.
We have observed a total of 36 swelling areas, but we must expect swelling to be specific to the individual rabbit.

Scatter plot (slide 6)
[Figure: swelling for each rabbit; x-axis: arbitrary numbering of rabbits]

Naive quantification of swelling (slide 7)
[SAS output, The MEANS Procedure, analysis variable swelling: N, Mean, Std Error, Lower 95% CL for Mean, Upper 95% CL for Mean; numeric values not preserved]
What is wrong here?

Correlated observations (slide 8)
- Imagine all measurements on a rabbit resulted in the same value...
- Then we would actually only have 6 measurements...
- So what happens when they are only somewhat identical?
Observations on the same individual look alike: they are correlated.
Why is this important?
- Variation between these will not reflect the population variation (the variation between individuals)
- The number of observations will seem misleadingly high
So we have to take the correlation into account.

Neglecting the correlation (slide 9)
Neglecting the correlation will lead to errors. Typical errors:
- Wrong standard errors (too small or too big)
- Wrong confidence intervals (too narrow or too wide)
- Wrong conclusions (type I or type II errors)
The type of error depends upon the kind of question asked (to be further explained).

Naive analysis of swelling (slide 10)
- Each rabbit has a mean level
- There is some variation between the six injection sites for the same rabbit
In computer language: the rabbit is a factor, and the analysis is a one-way ANOVA:

    proc glm data=rabbit;
      class rabbit;
      model swelling=rabbit / solution;
    run;

Output from naive model (slide 11)
[SAS output, The GLM Procedure, dependent variable swelling: ANOVA table (Model, Error, Corrected Total), R-Square, Coeff Var, Root MSE, swelling Mean, and the Type III test for rabbit; numeric values not preserved]
The rabbits have different levels (P=0.0040), but this was NOT the question.

Output from the naive model, II (slide 12)
[SAS output: parameter estimates for Intercept (Pr > |t| <.0001) and the six rabbit levels; numeric values not preserved]
But: do we get any useful information from this?
- We are not interested in these particular 6 rabbits, only in rabbits in general, as a species!
- We assume these 6 rabbits to have been randomly selected from the species.

Variance component model (slide 13)
Instead of fixed level parameters for each rabbit, we model the differences between rabbits as an extra source of variation:
$$y_{rs} = \mu + a_r + \varepsilon_{rs}$$
where the $a_r$'s and the $\varepsilon_{rs}$'s are assumed to be independent and Normally distributed, with variances $\mathrm{Var}(a_r) = \omega_B^2$ and $\mathrm{Var}(\varepsilon_{rs}) = \sigma_W^2$.
- The variation between rabbits is now a random effect, or random factor
- $\omega_B^2$ and $\sigma_W^2$ are called variance components
- The model is also called a two-level model

Formulation in terms of correlation (slide 14)
All swelling observations have common mean and variance:
$$y_{rs} \sim N(\mu,\ \omega_B^2 + \sigma_W^2)$$
But measurements made on the same rabbit are correlated, with the intra-class correlation
$$\mathrm{Corr}(y_{r1}, y_{r2}) = \rho = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2}$$
- Measurements made on the same rabbit tend to look more alike than measurements made on different rabbits.
- All measurements on the same rabbit look equally much alike.
This correlation structure is called compound symmetry (CS) or exchangeability.

Covariance and correlation (slide 15)
For the six injection sites, the covariance matrix for each rabbit is
$$\begin{pmatrix} \omega_B^2 + \sigma_W^2 & \omega_B^2 & \cdots & \omega_B^2 \\ \omega_B^2 & \omega_B^2 + \sigma_W^2 & \cdots & \omega_B^2 \\ \vdots & \vdots & \ddots & \vdots \\ \omega_B^2 & \omega_B^2 & \cdots & \omega_B^2 + \sigma_W^2 \end{pmatrix} \quad (6 \times 6)$$
and the corresponding compound symmetry correlation structure is
$$\begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix} \quad (6 \times 6)$$

Exchangeability = Compound Symmetry (slide 16)
This covariance/correlation structure implies:
- All variances are equal: there should be the same variation between rabbits for all injection sites
- Any pair of measurements are equally correlated: all injection sites should be equally related to each other
How could this assumption be violated? Are the injection sites randomly selected? If not, an unstructured covariance may be more appropriate: some injection sites are more related than others (e.g. due to proximity).
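As a small illustration of the compound symmetry structure described above, the following sketch builds the covariance and correlation matrices for one cluster from the two variance components. The function name and the numeric inputs (1.0 and 3.0) are hypothetical, chosen only to make the arithmetic easy to follow; they are not the rabbit estimates.

```python
def compound_symmetry(omega2_b, sigma2_w, n_obs):
    """Covariance and correlation matrices for one cluster under
    y_rs = mu + a_r + eps_rs, plus the intra-class correlation rho."""
    total = omega2_b + sigma2_w          # common variance of every observation
    rho = omega2_b / total               # intra-class correlation
    cov = [[total if i == j else omega2_b for j in range(n_obs)]
           for i in range(n_obs)]
    corr = [[1.0 if i == j else rho for j in range(n_obs)]
            for i in range(n_obs)]
    return cov, corr, rho


# Hypothetical variance components, six spots per rabbit
cov, corr, rho = compound_symmetry(omega2_b=1.0, sigma2_w=3.0, n_obs=6)
print(rho)        # share of the total variance that lies between clusters
print(cov[0])     # first row of the 6 x 6 covariance matrix
```

Every off-diagonal covariance equals the between-cluster variance, which is why all pairs of spots on the same rabbit are equally correlated under this model.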

Illustration of variance component model (slide 17)
[Figure: illustration of the variance component model]
- The 6 blue line segments illustrate the individual rabbit means, and thereby the variation between rabbits
- For one of the rabbits, the variation within rabbit is illustrated as a green Normal curve
- The rosy filled curve illustrates the total distribution

Estimation in SAS (slide 18)

    proc mixed data=rabbit;
      class rabbit;
      model swelling = / ddfm=kr s cl;
      random rabbit;
    run;

[SAS output: Covariance Parameter Estimates (rabbit, Residual) and Solution for Fixed Effects (Intercept: Estimate, Standard Error, DF, t Value, Lower, Upper); numeric values not preserved]
Comparing to p. 7 reveals that correctly taking the correlation into account yields the same estimate, but a substantially wider confidence interval. Ignoring the correlation leads to a type I error.

Interpretation of variance components (slide 19)
Variance component | Estimate | Proportion of variation
Between, $\omega_B^2$ | (not preserved) | (not preserved)
Within, $\sigma_W^2$ | (not preserved) | (not preserved)
Total, $\omega_B^2 + \sigma_W^2$ | (not preserved) | (not preserved)
Typical differences (95% prediction intervals):
- for spots on the same rabbit: $\pm 2\sqrt{2\sigma_W^2} = \pm 2.16$ cm$^2$
- for spots on different rabbits: $\pm 2\sqrt{2(\omega_B^2 + \sigma_W^2)} = \pm 2.70$ cm$^2$

Interpretation of variance components, cont'd (slide 20)
Approx. 2/3 of the variation in the measurements comes from the variation within rabbits, i.e. between injection sites on the same rabbit. Why? Could there be a systematic difference between the injection sites?
Two-way anova:
[SAS output: Type III tests for rabbit and spot (DF, Type III SS, Mean Square, F Value, Pr > F); numeric values not preserved]
This does not seem to be the case (P=0.26).

Design considerations, precision of overall mean (slide 21)
The standard error of the overall mean is the square root of
$$\mathrm{Var}(\bar{y}) = \frac{\omega_B^2}{R} + \frac{\sigma_W^2}{RS}$$
[Figures: standard error of the mean for R = no. of rabbits, varying from 3 to 20, and for S = no. of spots, varying from 1 to 10]

Effective sample size (slide 22)
If we had only one observation for each of k rabbits, how many rabbits would we then need to obtain the same precision?
$$k = \frac{R \cdot S}{1 + \rho(S - 1)}$$
Inserting R = 6, S = 6 and $\rho = \omega_B^2 / (\omega_B^2 + \sigma_W^2)$ (numeric value not preserved) yields k = 12.8.
Effectively, we have only approximately two independent observations from each rabbit!
Take care: this is a pilot study, do not rely too heavily on the results.

Reduced data set: omit 3 observations (slide 23)
What kind of effects would we expect?
[Figure: the reduced data set]

Quantification of overall swelling (slide 24)
The right column corresponds to the reduced data set, where the 3 smallest measurements from rabbit 2 (with the highest level) are omitted. Only the standard errors survive in this transcription; the estimates are not preserved.
Method | All 36 data, Estimate (SE) | Omitting 3 observations, Estimate (SE)
Simple averages of all (p. 7) | (0.155) | (0.163)
Average of averages | (0.267) | (0.333)
Weighted average of averages | (0.265), only one entry preserved |
Variance component model (p. 18) | (0.267) | (0.298)
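The two design formulas above can be tried out with a short sketch. Since the variance component estimates are not preserved here, the code backs out approximate values from the rounded half-widths quoted on slide 19 (2.16 and 2.70), assuming those half-widths are $2\sqrt{2 \cdot \text{variance}}$. The function names are illustrative, and the results are approximations limited by that rounding.

```python
import math


def effective_sample_size(n_clusters, n_per_cluster, rho):
    """k = R*S / (1 + rho*(S - 1)): number of independent observations
    carrying the same information about the overall mean."""
    return n_clusters * n_per_cluster / (1.0 + rho * (n_per_cluster - 1))


def var_of_mean(omega2_b, sigma2_w, n_clusters, n_per_cluster):
    """Var(ybar) = omega_B^2 / R + sigma_W^2 / (R*S) for a balanced design."""
    return omega2_b / n_clusters + sigma2_w / (n_clusters * n_per_cluster)


# Assumption: the quoted half-widths equal 2*sqrt(2*variance), so invert them.
sigma2_w = (2.16 / 2.0) ** 2 / 2.0        # within-rabbit variance (approx.)
total = (2.70 / 2.0) ** 2 / 2.0           # total variance (approx.)
omega2_b = total - sigma2_w               # between-rabbit variance (approx.)

rho = omega2_b / total
k = effective_sample_size(6, 6, rho)
se = math.sqrt(var_of_mean(omega2_b, sigma2_w, 6, 6))
print(round(rho, 2), round(k, 1), round(se, 3))
```

Under these assumptions the intra-class correlation comes out near 0.36, the effective sample size near 12.9 (the slide reports 12.8; the gap is consistent with rounding in the half-widths), and the standard error near 0.266, close to the 0.267 listed for the variance component model.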

Comments to quantifications in Table on p. 24 (slide 25)
- Simple averages: Pool all 36 measurements, wrongly assuming independence. This will result in too small standard errors. In the reduced data set, the estimate is biased downwards, since we have omitted some of the largest observations.
- Average of averages: Start out by taking averages for each rabbit. This will be OK for balanced designs, but when we omit the three lowest observations for rabbit 2, this rabbit appears to have a higher level, which gives an upwards bias.

Comments, cont'd (slide 26)
- Weighted average of averages: As above, but weighted according to the number of observations. For balanced designs, all weights are equal, but when we omit three observations, rabbit 2 has a lower weight in the average because it has only 3 observations. This will result in a downwards bias.
- Random rabbit: The variance component model will yield the correct result, provided that observations are missing at random. In the reduced data set, rabbit 2 has a lower weight in the average due to a larger standard error.

Estimation of individual rabbit means...? (slide 27)
Two different approaches:
- Traditional averages $\bar{y}_{r\cdot}$
- BLUPs (best linear unbiased predictors) rely on the assumption that individuals come from the same population, and become weighted averages which have been shrunk towards the overall mean:
$$k\,\bar{y}_{r\cdot} + (1 - k)\,\bar{y}_{\cdot\cdot}, \quad \text{where } k = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2 / S}$$
(k is close to 0 when $\sigma_W^2$ is large, otherwise close to 1)
- More shrinkage if rabbits look alike
- BLUPs are used for ranking, e.g. schools

BLUPs vs. averages, shrinkage (slide 28)
[Figure: BLUPs vs. averages. Left panel: the full dataset. Right panel: the reduced data set]
Larger shrinkage for rabbit no. 2 in the reduced dataset.
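The shrinkage formula for BLUPs can be sketched as follows. All numbers are hypothetical, and the sketch treats the variance components and the overall mean as known, which is a simplification: in practice they are estimated, and in unbalanced data the overall mean is itself a weighted estimate.

```python
def shrinkage_factor(omega2_b, sigma2_w, n_obs):
    """k = omega_B^2 / (omega_B^2 + sigma_W^2 / S) for a cluster with S observations."""
    return omega2_b / (omega2_b + sigma2_w / n_obs)


def blup(cluster_mean, overall_mean, omega2_b, sigma2_w, n_obs):
    """Cluster mean shrunk towards the overall mean: k*ybar_r + (1 - k)*ybar."""
    k = shrinkage_factor(omega2_b, sigma2_w, n_obs)
    return k * cluster_mean + (1.0 - k) * overall_mean


# Hypothetical values: a cluster with a high raw mean, observed 6 or only 3 times
full = blup(cluster_mean=10.0, overall_mean=6.0, omega2_b=1.0, sigma2_w=2.0, n_obs=6)
reduced = blup(cluster_mean=10.0, overall_mean=6.0, omega2_b=1.0, sigma2_w=2.0, n_obs=3)
print(full, reduced)
```

With fewer observations the factor k drops, so the same raw mean is pulled further towards the overall mean. This matches the larger shrinkage noted for rabbit no. 2 in the reduced dataset.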

Hierarchical designs, cluster designs (slide 29)
e.g. School, School Class and Pupil
Factor diagram: [I] = [S*C*P] → [S*C] → [S]

Hierarchical designs, with covariates (slide 30)
Factor diagram with covariates at each level:
- [S*C*P] (pupils): Gender
- [S*C] (classes): Class grade
- [S] (schools): School type

Examples of hierarchies (slide 31)
level 1 | level 2 | level 3
subjects | twin pairs | countries
subjects | families | regions
students | classes | schools
spots | rabbits |
fields | sections | rats
visits | subjects | centres

Merits of cluster designs (slide 32)
- Measurements belonging together in the same cluster look alike (are correlated)
- On all levels, we may have random variation (variance components), as well as covariates
- Certain effects may be estimated more precisely, since some sources of variation are eliminated, e.g. by making comparisons within a family or a school class. This is analogous to the paired comparison situation.
- When planning subsequent investigations, knowledge of the relative sizes of the variance components will help in deciding the number of repetitions needed at each level

Drawbacks of cluster designs (slide 33)
Bias may result if one or more sources of variation are disregarded:
- low efficiency (type II error) for evaluation of level 1 covariates (within-cluster effects)
- too small standard errors (type I error) for estimates of level 2 effects (between-cluster effects)
- possible bias in the mean value structure, in case of missing values
For longitudinal data, we saw that ignoring correlations implies that:
- Time will appear less important
- Groups will appear more different

The school example (slide 34)
Models for such data include 3 sources of variation:
1. Variation between schools ([S])
2. Variation between classes in each school ([S*C])
3. Variation between pupils in each class ([S*C*P], residual variation)
What may happen if we forget the variation between classes in the same school, [S*C]?
- Pupils in the same class will be assumed no more correlated than pupils from different classes (in the same school)
- Covariates on class level (e.g. class grade) will appear too important
- Covariates on pupil level (e.g. gender) will appear less important
We will return to this example when we discuss binary data.

Another example of a 3-level model (slide 35)
Research problem: In order to evaluate the effect of cytostatics on pancreas islet β-cells, we need to quantify the number of nuclei per cell. (Henrik Winther Nielsen, Inst. Med. Anat.)
How should data be collected in order to maximize precision with low expense and work load?
- How many animals (rats)?
- How many slices of the pancreas?
- How many sections of each slice should be counted?

Pilot study (slide 36)
- 4 rats (R)
- 3 sections for each rat (S)
- 5 randomly chosen fields from each section (F)
[Figure: scatter plot, with jitter (symbols indicate sections)]
Hierarchy: fields ($\sigma^2$) within sections ($\tau^2$) within rats ($\omega^2$)
Factor diagram: [I] = [R*S*F] → [R*S] → [R] → 0

3-level model in SAS (slide 37)

    proc mixed data=nuclei;
      class rat section;
      model nuclei= / ddfm=kr s;
      random intercept section / subject=rat;
    run;

[SAS output: Covariance Parameter Estimates (Intercept, subject rat; section, subject rat; Residual) and Solution for Fixed Effects (Intercept); numeric values not preserved]

Variances are positive! (slide 38)
Variances are positive, and therefore these models describe all correlations to be positive.
But note: it may happen that correlations are in reality negative,
- by a coincidence
- as a result of competition between units belonging together, e.g. when measuring yield for plants grown in the same pot
In such a case, the corresponding variance component will be reported as zero.
Here, the variation between sections is close to 0.

Interpretation of variance components (slide 39)
Variance component | Estimate | Proportion of variation
Rats, $\omega^2$ | (not preserved) | (not preserved)
Sections, $\tau^2$ | (not preserved) | (not preserved)
Fields, $\sigma^2$ | (not preserved) | (not preserved)
Total, $\omega^2 + \tau^2 + \sigma^2$ | (not preserved) | (not preserved)
Almost all variation is on the lowest level:
- Rats appear quite identical, perhaps they are from the same litter?
- Sections appear extremely identical, is the pancreas homogeneous?

Typical differences (slide 40)
Typical differences between two measurements:
- for different fields on the same section: $\pm 2\sqrt{2\sigma^2} = \pm 1.255$
- for different sections on the same rat: $\pm 2\sqrt{2(\tau^2 + \sigma^2)} = \pm 1.264$
- for sections on different rats: $\pm 2\sqrt{2(\omega^2 + \tau^2 + \sigma^2)}$ (numeric value not preserved)
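The typical differences above follow one pattern: twice the standard deviation of a difference between two measurements that share some, but not all, random effects. The sketch below assumes the half-widths are $2\sqrt{2 \cdot \text{variance sum}}$ and inverts the two half-widths that survive in the transcription to see what they imply. The helper names are illustrative, and the results are approximations limited by rounding of the quoted figures.

```python
import math


def typical_difference(variance_sum):
    """Half-width of an approximate 95% interval for the difference between
    two measurements whose non-shared variance components sum to variance_sum."""
    return 2.0 * math.sqrt(2.0 * variance_sum)


def implied_variance(half_width):
    """Inverse of typical_difference: the variance sum implied by a half-width."""
    return (half_width / 2.0) ** 2 / 2.0


sigma2 = implied_variance(1.255)             # fields within section (approx.)
tau2_plus_sigma2 = implied_variance(1.264)   # sections within rat, plus fields
tau2 = tau2_plus_sigma2 - sigma2             # between-section component (approx.)
print(round(sigma2, 3), round(tau2, 4))
```

The implied between-section component is tiny compared with the field-level variance, in line with the remark that the variation between sections is close to 0.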

Correlations (slide 41)
Correlations vary, depending on which measurements are compared:
- Measurements on the same section:
$$\mathrm{Corr}(y_{rs1}, y_{rs2}) = \frac{\omega^2 + \tau^2}{\omega^2 + \tau^2 + \sigma^2}$$ (numeric value not preserved)
- Measurements on different sections of the same rat:
$$\mathrm{Corr}(y_{r11}, y_{r22}) = \frac{\omega^2}{\omega^2 + \tau^2 + \sigma^2}$$ (numeric value not preserved)
- Measurements from different rats are independent

Example: Visual acuity (slide 42)
Research question: When assessing the visual acuity, does it matter which eye you use, or the vicinity of the object?
- 7 individuals are looking at a screen, where a light flash appears.
- They are looking through 4 lenses, with powers 6/6, 6/18, 6/36 and 6/60, i.e. 4 magnifications: 1, 3, 6 and 10
- with 2 eyes
Outcome: Visual acuity, the time lag (milliseconds) between the stimulus and the electrical response at the back of the cortex.

Data (slides 43 and 44)
[Data listing and figure; source: Crowder & Hand (1990)]

Factors to take into account (systematic and random) (slide 45)
Main effects:
- 7 individuals (person): random
- 2 eyes for each individual (eye)
- 4 lens magnifications (power)
Interactions?
- person*eye: random
- person*power: random
- eye*power
- 2-order interaction person*eye*power = Residual: random

Model formulation (slide 46)
For $p = 1, \ldots, 7$, $e = 1, 2$, $m = 1, 2, 3, 4$:
$$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$$
where
$$A_p \sim N(0, \omega^2), \quad B_{pe} \sim N(0, \tau_e^2), \quad C_{pm} \sim N(0, \tau_m^2), \quad \varepsilon_{pem} \sim N(0, \sigma^2)$$

Factor diagram (slide 47)
[Factor diagram: [I] = [Pa*Ey*Po] leads to [Pa*Ey] and [Pa*Po] and to Ey*Po; these lead on to [Pa], Ey and Po, and finally to 0]

Not quite a multilevel model... (slide 48)
since [Pa*Ey] and [Pa*Po] are not nested. It is, however, still a variance component model.
Level | Unit | Covariates
1 | single measurements | Ey*Po
2 | interactions |
2e | [Pa*Ey] | Ey
2m | [Pa*Po] | Po
3 | individuals, [Pa] | overall level
Note the random interactions with patient.

SAS code, and output (slides 49 and 50)

    proc mixed data=visual;
      class patient eye power;
      model acuity=eye power eye*power / outpred=udp outpredm=udpm ddfm=kr s cl;
      random intercept eye power / subject=patient;
    run;

[SAS output: Covariance Parameter Estimates (Intercept, eye and power, each with subject patient; Residual); Solution for Fixed Effects (Intercept, Pr > |t| <.0001; eye left and right; the four power levels; the eight eye*power combinations); Type 3 Tests of Fixed Effects (eye, power, eye*power); numeric values not preserved]

Predicted mean profiles (slide 51)
[Figure: predicted mean profiles, from the dataset udpm]

Individual predictions (slide 52)
[Figure: individual predictions, from the dataset udp]

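Ordinary residual plots (slide 53)
[Figure: ordinary residual plots]

Conditional residual plots (slide 54)
[Figure: conditional residual plots]

Omit the random effect patient*power? (slide 55)
[Figure: individual predictions]

Omit the interaction eye*power (slide 56)
[Figure: predictions from the additive model]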

Systematic vs. random effects (slide 57)
Could the patients be treated as systematic here? Yes:
[SAS output: Covariance Parameter Estimates (eye and power, each with subject patient; Residual) and Type 3 Tests of Fixed Effects (patient, eye, power, eye*power); numeric values not preserved]
Can you think why?

Fixed vs. random effects? (slide 58)
Fixed:
- all values of the factor present (typically only a few, e.g. treatment)
- allows inference for these particular factor values only
- must include a reasonable number of observations for each factor value
Random:
- only a representative sample of values of the factor is present
- allows inference to be extended beyond the values in the experiment (e.g. geographical areas, classes, rabbits)
- is necessary when we have a covariate for this level, e.g. class grade, or treatment

Example: Blood pressure and social inequity (slide 59)
Scientific question: Is blood pressure affected by social inequity in the area where you live?
Data:
- women from 17 regions of Malmö
- Individual measurements of blood pressure and education
- Social inequity defined as rate of people with low educational achievement (less than 9 years)
From: J. Merlo et al., J. Epidemiol Community Health 55.

Structure of example (slide 60)
Outcome: Blood pressure (diastolic)
Covariates:
- Individual level (level 1): indicator of low educational achievement (x) (less than 9 years of school); age group (10-year groups from 20 to 79)
- Regional level (level 2): rate of people with low educational achievement (z), from the Skåne Council Statistics Office

Possible analysis strategies (slide 61)
- Ecological analysis = analysis on level 2 (could be called the easy choice).
  Outcome: Average blood pressure in region. Covariate: Rate of low educational achievement (z).
- Two-level mixed model.
  Outcome: Individual blood pressure. Covariates: A combination of the covariates from p. 60: x, z and age.

Ecological analysis = analysis on level 2 (slide 62)
[Figure: average blood pressure in region vs. rate of people with low educational achievement; size of circle indicates size of investigation]
Estimate of regression coefficient: 4.655 with SE 1.420, P=0.005.
Seems an important explanatory variable?!?

Interpretation of level 2 analysis (slide 63)
A quite risky interpretation of the impact of low education rate, $\hat{\beta} = 4.655$:
If you move from an area with a rate of 100% to a region with a rate of 0%, you would expect your blood pressure to decrease by 4.655 mmHg, so if you move to an area with a 10% lower rate of uneducated people, the decrease will be only about 1/2 mmHg.
Not so impressive..., and quite possibly exaggerated, since it includes the effect of individual education!

Pitfall of ecological analysis (slide 64)
What is going on?
There are hardly any differences between average blood pressure in the 17 regions (range not preserved, see figure p. 62).
If we fit a two-level model without covariates (just like the rabbits), we find the variance components:
- Between regions: (not preserved)
- Within regions: (not preserved)
This means that region can only account for 0.36% of the variation in blood pressures.
Thus, any covariate on region level will have very little impact on blood pressure!

Individual and regional blood pressure (slide 65)
Effect of individual educational achievement (x) vs. regional educational achievement (z).

Estimates from two-level model (slide 66)
Only the entries shown survive in this transcription; the variance components and R² values are not preserved.
Included covariates | x (individual), Estimate (SE) | z (regional), Estimate (SE) | Between regions, $\omega_B^2$ | Within regions, $\sigma_W^2$ | R² (of $\omega_B^2$)
none | | | | | (ref)
age | | | | |
x, age | 1.15 (0.17) | | | |
z, age | | (1.35) | | |
x, z, age | 1.09 (0.17) | 2.97 (1.25) | | |

Conclusion (slide 67)
- Part of the high blood pressure in regions with a high percentage of low education is due to people with low education having a high blood pressure (effect of x)
- The ecological analysis adds up the individual effect (x) and the regional effect (z), but is not able to distinguish between the two.
- It overestimates the level 2 effect: moving will not help you as much as estimated
- It cannot be interpreted as a level 1 effect: it cannot compensate for your own low education
- Education does seem to be able to decrease your blood pressure, by an estimated 1.09 mmHg...

R² for variance component models (slide 68)
We have two (or more) different variances to explain!
- Residual variation (variation within regions, $\sigma_W^2$):
  decreases when an important x-covariate is included (level 1);
  may decrease when an important z-covariate is included (level 2)
- Variation between regions, $\omega_B^2$:
  decreases when an important z-covariate is included (level 2);
  may increase when an important x-covariate is included (level 1), e.g. education, if this is unevenly distributed between the regions
This has to do with confounding.

Hypothetical example A (slide 69)
The x's (education) vary between (three) regions, and the average outcomes ($\bar{y}$, blue squares) are therefore quite different:
[Figure: hypothetical example A]
The levels of outcome y, controlled for x, are more equal, so $\omega^2$ decreases when x is included in the model.

Hypothetical example B (slide 70)
The x's (education) vary between (three) regions, but the average outcomes ($\bar{y}$) are almost equal:
[Figure: hypothetical example B]
The levels of outcome y, controlled for x, are very different! $\omega^2$ increases when x is included in the model.

Example: suicide and religion (slide 71)
Scientific question: Are protestants more likely to commit suicide, as compared to catholics? If so, regions with a high rate of protestants should also have a high rate of suicides.
Data: In a number of regions, we count:
- Number of suicides (in a given period). Outcome: % suicides (among all citizens)
- Number of protestants and catholics. Covariate: % protestants

Ecological analysis (slide 72)
Ecological analysis = analysis on level 2, the regions: rate of suicide vs. percentage protestants in region.
[Figure: percent suicides vs. percent protestants, by region]
Percent of suicides is seen to increase with percent of protestants. So... are protestants more likely to commit suicide?

Two-level model (slide 73)
Outcome: Suicide (in a well defined period), yes or no
Covariates: Individual religion (x), percentage protestants (z)
Interaction between x and z: More suicides among catholics in regions with many protestants, but they do not count as much, since they are a minor group.

Comparing measurement devices (slide 74)
Example: Peak expiratory flow rate, l/min: 17 subjects, 2 measurement devices, each measured twice.
[Data table: subject id; Wright ($Y_{1p1}$, $Y_{1p2}$); mini Wright ($Y_{2p1}$, $Y_{2p2}$); with Average and SD rows; numeric values not preserved]

Illustration of all data (slide 75)
[Figure: illustration of all data]

Aim of investigation (slide 76)
1. Quantify the precision of each measuring device: compare the two repetitions
2. Quantify the agreement between the two devices: compare individual measurements, or averages
3. Give practical advice for clinical use: can we trust the devices, and can we use them interchangeably?

Analysis approaches (slide 77)
1. Make Bland-Altman plots and quantify limits-of-agreement, for each method separately
2. Make Bland-Altman plots and quantify limits-of-agreement for the difference between the two methods
   - use averages of the two measurements for each method: only good if this is clinical practice
   - use single measurements for each method: but how do we then get pairs...
   Better: fit a mixed model for all measurements simultaneously
3. This is a practical (clinical, not statistical) question, based on results from the above

Repeatability: Differences no.2 - no.1 (slide 78)
[Figure: differences between repetition no. 2 and no. 1, for each method]

Limits-of-agreement (slide 79)
Differences between repetitions, with reference interval:
- Wright: mean difference not preserved, interval (-53.51, 43.82)
- Mini Wright: 2.88 (-61.93, 67.69)
If there is no learning effect, the intervals should be symmetrical around zero.
We look at it as a variance component model, for each method separately.

Separate two-level models (slide 80)

    proc mixed data=wright;
      by method;
      class id;
      model flow= / ddfm=kr s cl;
      random intercept / subject=id;
    run;

The output (next page) results in the reference intervals:
Method | Residual standard deviation | Reference interval
Wright | (not preserved) | ±45.68
Mini Wright | (not preserved) | (not preserved)

Separate two-level models, results (slide 81)
[SAS output for method=mini and for method=wright: Covariance Parameter Estimates (Intercept, subject id; Residual) and Solution for Fixed Effects (Intercept, Pr > |t| <.0001); numeric values not preserved]

Joint variance component model, for both methods (slide 82)
Subject, $p = 1, \ldots, 17$; methods, $m = 1, 2$; repetitions, $j = 1, 2$:
$$Y_{p1j} = \beta_1 + A_{p1} + \varepsilon_{p1j}, \quad j = 1, 2$$
$$Y_{p2j} = \beta_2 + A_{p2} + \varepsilon_{p2j}, \quad j = 1, 2$$
where $(A_{p1}, A_{p2}) \sim N_2(0, \Sigma)$ and $\varepsilon_{pmj} \sim N(0, \sigma_m^2)$.

    proc mixed data=wright;
      class method id;
      model flow=method / ddfm=kr s cl;
      random method / type=un subject=id gcorr vcorr;
      repeated / type=simple group=method subject=id*method r rcorr;
    run;

Correlation structure (slide 83)
The covariance in the joint model, for each subject, with ordering (Wright1, Wright2, MiniWright1, MiniWright2), is
$$\begin{pmatrix} \omega_1^2 + \sigma_1^2 & \omega_1^2 & \rho\omega_1\omega_2 & \rho\omega_1\omega_2 \\ \omega_1^2 & \omega_1^2 + \sigma_1^2 & \rho\omega_1\omega_2 & \rho\omega_1\omega_2 \\ \rho\omega_1\omega_2 & \rho\omega_1\omega_2 & \omega_2^2 + \sigma_2^2 & \omega_2^2 \\ \rho\omega_1\omega_2 & \rho\omega_1\omega_2 & \omega_2^2 & \omega_2^2 + \sigma_2^2 \end{pmatrix}$$
and
$$\Sigma = \begin{pmatrix} \omega_1^2 & \rho\omega_1\omega_2 \\ \rho\omega_1\omega_2 & \omega_2^2 \end{pmatrix}$$

Results from joint model (slide 84)
[SAS output: Class Level Information (method: 2 levels, mini and wright; id); Covariance Parameter Estimates (UN(1,1), UN(2,1), UN(2,2), subject id; Residual for method*id, group method mini and group method wright); Fit Statistics (-2 Res Log Likelihood, AIC); Solution for Fixed Effects (Intercept, Pr > |t| <.0001; method mini; method wright); numeric values not preserved]

Estimates (slides 85 and 86)
Systematic difference between measuring devices: $\hat{\beta}_1 - \hat{\beta}_2 = 6.03$ (SE 8.05), P = 0.46
Correlation estimates:
- Correlation between levels for the two types of devices:
  [SAS output: Estimated G Correlation Matrix (method mini, method wright); numeric values not preserved]
- Correlation between all 4 measurements on the same individual:
  [SAS output: Estimated V Correlation Matrix for id 1; numeric values not preserved]
Precision of the methods is calculated from identical repetitions:
$$D_{pm} = Y_{pmj_1} - Y_{pmj_2} = \varepsilon_{pmj_1} - \varepsilon_{pmj_2} \sim N(0, 2\sigma_m^2)$$
Limits-of-agreement: $\pm$ approx. $2\sqrt{2\sigma_m^2}$, as shown on an earlier page (page number not preserved).

Agreement between the two methods (slide 87)
Difference between single measurements by the two methods:
$$D_p = Y_{p1j_1} - Y_{p2j_2} = \beta_1 - \beta_2 + A_{p1} - A_{p2} + \varepsilon_{p1j_1} - \varepsilon_{p2j_2} \sim N(\beta_1 - \beta_2, v_1^2)$$
where
$$v_1^2 = \omega_1^2 + \omega_2^2 - 2\rho\omega_1\omega_2 + \sigma_1^2 + \sigma_2^2$$
Limits-of-agreement, Mini-Wright: $6.03 \pm$ approx. $2\sqrt{v_1^2} = (-73.41, 85.47)$

Agreement between averages (slide 88)
Can be obtained by direct calculation of the SD for the difference between averages, or from
$$\bar{D}_p = \bar{Y}_{p1\cdot} - \bar{Y}_{p2\cdot} = \beta_1 - \beta_2 + A_{p1} - A_{p2} + \bar{\varepsilon}_{p1\cdot} - \bar{\varepsilon}_{p2\cdot} \sim N(\beta_1 - \beta_2, v_2^2)$$
where
$$v_2^2 = \omega_1^2 + \omega_2^2 - 2\rho\omega_1\omega_2 + \frac{\sigma_1^2 + \sigma_2^2}{2}$$
Limits-of-agreement, Mini-Wright: $6.03 \pm$ approx. $2\sqrt{v_2^2} = (-64.02, 76.08)$
Only reasonable if averages are standard for clinical use!
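The two limits-of-agreement formulas differ only in how much residual variance enters the difference. A minimal sketch, using hypothetical variance components (the fitted values are not preserved here) and an illustrative function name, shows the effect of averaging over repetitions:

```python
import math


def limits_of_agreement(bias, omega1_sq, omega2_sq, rho, sigma1_sq, sigma2_sq, n_rep=1):
    """Approximate 95% limits for the difference between two methods.

    n_rep=1 compares single measurements (v_1^2 in the text); n_rep=2 compares
    averages of two repetitions per method (v_2^2 in the text)."""
    between = omega1_sq + omega2_sq - 2.0 * rho * math.sqrt(omega1_sq * omega2_sq)
    residual = (sigma1_sq + sigma2_sq) / n_rep
    half_width = 2.0 * math.sqrt(between + residual)
    return bias - half_width, bias + half_width


# Hypothetical inputs: the two devices rank subjects identically (rho = 1),
# so only measurement error separates them.
single = limits_of_agreement(0.0, 100.0, 100.0, 1.0, 9.0, 16.0, n_rep=1)
averaged = limits_of_agreement(0.0, 100.0, 100.0, 1.0, 9.0, 16.0, n_rep=2)
print(single, averaged)
```

The limits for averages are always narrower than those for single measurements, which is why they are only appropriate when averages are what is actually used in the clinic.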

Difference in precision? (slide 89)
We can test this by comparing with a simpler model with $\sigma_1 = \sigma_2$, simply by deleting the repeated statement:

    proc mixed data=wright;
      class method id;
      model flow=method / ddfm=kr s cl;
      random method / type=un subject=id gcorr vcorr;
    run;

Results when assuming equal precisions (slide 90)
[SAS output: Estimated G Correlation Matrix (method mini, method wright); Covariance Parameter Estimates (UN(1,1), UN(2,1), UN(2,2), subject id; Residual); Fit Statistics (-2 Res Log Likelihood, AIC); Solution for Fixed Effects (Intercept, Pr > |t| <.0001; method mini); numeric values not preserved]

Test for difference in precision (slide 91)
We had the estimated precisions (numeric values not preserved):
- Wright: $\hat{\sigma}_1^2$
- mini Wright: $\hat{\sigma}_2^2$
Comparing the two models (p. 83 and p. 90), we find
$$-2 \log Q = 1.2 \sim \chi^2(1), \quad P = 0.27$$
Alternative test:
$$F = \frac{\hat{\sigma}_2^2}{\hat{\sigma}_1^2} = 1.69 \sim F(17, 17), \quad P = 0.14$$
Conclusion: Wright is not significantly better than mini Wright.

Dubious/Incorrect Bland-Altman approaches (slide 92)
- Calculate agreement between averages: this is only reasonable if it is clinical practice. Otherwise, the limits will be too optimistic.
- Calculate all possible differences between pairs (4 different ones when we have two repetitions): but these are correlated and will give rise to too optimistic (narrow) limits.
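The likelihood ratio P-value quoted above can be checked without a statistics package, because the upper tail of a chi-square distribution with 1 degree of freedom has a closed form in terms of the complementary error function. The helper name below is illustrative; only the test statistic 1.2 comes from the slides.

```python
import math


def chi2_1df_pvalue(x):
    """Upper tail probability P(X > x) for X ~ chi-square with 1 degree of freedom.
    Uses the identity P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


p = chi2_1df_pvalue(1.2)   # -2 log Q from comparing the two models
print(round(p, 2))
```

This reproduces the reported P of about 0.27. The F test on the ratio of residual variances would need the F distribution, which is not in the Python standard library, so it is left out of this sketch.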


More information

Linear mixed models. Program. What are repeated measurements? Outline. Faculty of Health Sciences. Analysis of repeated measurements, 10th March 2015

Linear mixed models. Program. What are repeated measurements? Outline. Faculty of Health Sciences. Analysis of repeated measurements, 10th March 2015 university of copenhagen d e pa rt m e n t o f b i o s tat i s t i c s university of copenhagen d e pa rt m e n t o f b i o s tat i s t i c s Program Faculty of Health Sciences Topics: Linear mixed models

More information

Answer to exercise: Blood pressure lowering drugs

Answer to exercise: Blood pressure lowering drugs Answer to exercise: Blood pressure lowering drugs The data set bloodpressure.txt contains data from a cross-over trial, involving three different formulations of a drug for lowering of blood pressure:

More information

Correlated data. Repeated measurements over time. Typical set-up for repeated measurements. Traditional presentation of data

Correlated data. Repeated measurements over time. Typical set-up for repeated measurements. Traditional presentation of data Faculty of Health Sciences Repeated measurements over time Correlated data NFA, May 22, 2014 Longitudinal measurements Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics University of

More information

Analysis of variance and regression. May 13, 2008

Analysis of variance and regression. May 13, 2008 Analysis of variance and regression May 13, 2008 Repeated measurements over time Presentation of data Traditional ways of analysis Variance component model (the dogs revisited) Random regression Baseline

More information

Statistics for exp. medical researchers Regression and Correlation

Statistics for exp. medical researchers Regression and Correlation Faculty of Health Sciences Regression analysis Statistics for exp. medical researchers Regression and Correlation Lene Theil Skovgaard Sept. 28, 2015 Linear regression, Estimation and Testing Confidence

More information

Models for longitudinal data

Models for longitudinal data Faculty of Health Sciences Contents Models for longitudinal data Analysis of repeated measurements, NFA 016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

Multi-factor analysis of variance

Multi-factor analysis of variance Faculty of Health Sciences Outline Multi-factor analysis of variance Basic statistics for experimental researchers 2016 Two-way ANOVA and interaction Matched samples ANOVA Random vs systematic variation

More information

Correlated data. Longitudinal data. Typical set-up for repeated measurements. Examples from literature, I. Faculty of Health Sciences

Correlated data. Longitudinal data. Typical set-up for repeated measurements. Examples from literature, I. Faculty of Health Sciences Faculty of Health Sciences Longitudinal data Correlated data Longitudinal measurements Outline Designs Models for the mean Covariance patterns Lene Theil Skovgaard November 27, 2015 Random regression Baseline

More information

SAS Syntax and Output for Data Manipulation:

SAS Syntax and Output for Data Manipulation: CLP 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (2015) chapter 5. We will be examining the extent

More information

Correlated data. Further topics. Sources of random variation. Specification of mixed models. Faculty of Health Sciences.

Correlated data. Further topics. Sources of random variation. Specification of mixed models. Faculty of Health Sciences. Faculty of Health Sciences Further topics Correlated data Further topics Lene Theil Skovgaard December 11, 2012 Specification of mixed models Model check and diagnostics Explained variation, R 2 Missing

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 An Introduction to Multilevel Models PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 Today s Class Concepts in Longitudinal Modeling Between-Person vs. +Within-Person

More information

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */ CLP 944 Example 4 page 1 Within-Personn Fluctuation in Symptom Severity over Time These data come from a study of weekly fluctuation in psoriasis severity. There was no intervention and no real reason

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Introduction to Within-Person Analysis and RM ANOVA

Introduction to Within-Person Analysis and RM ANOVA Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides

More information

Review of CLDP 944: Multilevel Models for Longitudinal Data

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

More information

Introduction to Random Effects of Time and Model Estimation

Introduction to Random Effects of Time and Model Estimation Introduction to Random Effects of Time and Model Estimation Today s Class: The Big Picture Multilevel model notation Fixed vs. random effects of time Random intercept vs. random slope models How MLM =

More information

Describing Change over Time: Adding Linear Trends

Describing Change over Time: Adding Linear Trends Describing Change over Time: Adding Linear Trends Longitudinal Data Analysis Workshop Section 7 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA

Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA Faculty of Health Sciences Outline Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA Lene Theil Skovgaard Sept. 14, 2015 Paired comparisons: tests and confidence intervals

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen Outline Data in wide and long format

More information

ANOVA Longitudinal Models for the Practice Effects Data: via GLM

ANOVA Longitudinal Models for the Practice Effects Data: via GLM Psyc 943 Lecture 25 page 1 ANOVA Longitudinal Models for the Practice Effects Data: via GLM Model 1. Saturated Means Model for Session, E-only Variances Model (BP) Variances Model: NO correlation, EQUAL

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

More information

Faculty of Health Sciences. Correlated data. More about LMMs. Lene Theil Skovgaard. December 4, / 104

Faculty of Health Sciences. Correlated data. More about LMMs. Lene Theil Skovgaard. December 4, / 104 Faculty of Health Sciences Correlated data More about LMMs Lene Theil Skovgaard December 4, 2015 1 / 104 Further topics Model check and diagnostics Cross-over studies Paired T-tests with missing values

More information


More information

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial

More information