Correlated data. Overview. Cross-over study. Repetition. Faculty of Health Sciences. Variance component models, II. More on variance component models


Faculty of Health Sciences
Correlated data
Variance component models, II
Lene Theil Skovgaard
December 4

Overview
More on variance component models:
- Cross-over studies
- Non-normal data
- Comparing measurement devices
- Non-hierarchical models

Repetition
Several measurements on the same unit:
- The traditional assumption of independence is violated
- Disregarding this may lead to erroneous conclusions
- This is often quite simple to handle by introducing random effects
- Sometimes it is more complicated

Cross-over study
Patients with chronic headache are randomized into two groups:
- Both groups receive LNMMA and placebo, on two different days, with a suitable wash-out period in between
- Group A was treated first with placebo (period 1), and then with LNMMA (period 2)
- Group B was treated first with LNMMA (period 1), and then with placebo (period 2)
Pain was measured subjectively on a VAS scale (small is good), at baseline and at 30, 60, 90 and 120 minutes after treatment.
Ashina, Lassen, Bendtsen, Jensen and Olesen (1999), Lancet.

Data
We have repeated measurements over time (the topic of the next lecture).
Here, we reduce the complexity by simply looking at the difference between baseline and follow-up (remember that for this to be a good idea, the correlation must be strong).
Outcome: difference between follow-up measurements and baseline, i.e.
$Y_{30} + Y_{60} + Y_{120} - 3Y_0$

[Figure: Observations]
[Figure: Average over patients]

Ignoring periods: Paired T-test for treatment effect
[SAS output, The TTEST Procedure, Difference: lnmma - placebo; numeric values not preserved]
LNMMA is more effective than placebo: 42.9 (14.6, 71.2),
but we have ignored a possible effect of period...

What about simple ANOVA?
Two-way anova in treat and period
[SAS output, parameter estimates for Intercept, treat (lnmma, placebo) and period; numeric values not preserved]
Here, we have ignored the pairing.

Model for cross-over study
t = active, placebo; p = 1, 2 (periods); i = 1, 2, ... (individuals)
With no carry-over effect:
$Y_{tpi} = \alpha_t + \beta_p + A_i + \varepsilon_{tpi}$, where $A_i \sim N(0, \omega^2)$, $\varepsilon_{tpi} \sim N(0, \sigma^2)$
With carry-over effect:
$Y_{tpi} = \alpha_t + \beta_p + \gamma_{tp} + A_i + \varepsilon_{tpi}$

Traditional approaches
Assuming no carry-over effect:

group        Period 1                      Period 2                      period 2 - period 1
A            $\beta_1 + \alpha_p + A_i$    $\beta_2 + \alpha_a + A_i$    $\beta_2 - \beta_1 + \alpha_a - \alpha_p$
B            $\beta_1 + \alpha_a + A_j$    $\beta_2 + \alpha_p + A_j$    $\beta_2 - \beta_1 + \alpha_p - \alpha_a$
group A - B  $\alpha_p - \alpha_a + A$     $\alpha_a - \alpha_p + A$     $2(\alpha_a - \alpha_p)$

$A_i$, $A_j$ and $A$ refer to something with a subject effect.
Three possible comparisons of treatments; two of these (the comparisons within period 1 and within period 2) include biological variation and are therefore less powerful.
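The table of expectations above can be checked by simulation. The following is a minimal sketch, not the analysis from the slides: it simulates the no-carry-over model with hypothetical parameter values (none of them are taken from the headache data) and computes the three treatment comparisons. The function names `simulate_crossover` and `treatment_estimates` are illustrative.

```python
import random
import statistics

def simulate_crossover(n_per_group=2000, alpha_a=40.0, alpha_p=0.0,
                       beta=(0.0, 10.0), omega=30.0, sigma=20.0, seed=1):
    """Simulate Y_tpi = alpha_t + beta_p + A_i + eps_tpi (no carry-over).

    All parameter values are hypothetical. Returns, for each group,
    a list of (period 1, period 2) pairs, one pair per patient.
    """
    rng = random.Random(seed)
    data = {"A": [], "B": []}
    for group in data:
        # Group A: placebo then active; group B: active then placebo.
        first, second = (alpha_p, alpha_a) if group == "A" else (alpha_a, alpha_p)
        for _ in range(n_per_group):
            a_i = rng.gauss(0.0, omega)          # subject effect A_i
            y1 = first + beta[0] + a_i + rng.gauss(0.0, sigma)
            y2 = second + beta[1] + a_i + rng.gauss(0.0, sigma)
            data[group].append((y1, y2))
    return data

def treatment_estimates(data):
    """Three estimates of alpha_a - alpha_p, following the table of expectations."""
    mean = statistics.fmean
    p1 = {g: mean(y1 for y1, _ in data[g]) for g in data}
    p2 = {g: mean(y2 for _, y2 in data[g]) for g in data}
    diff = {g: mean(y2 - y1 for y1, y2 in data[g]) for g in data}
    return {
        "period 1 (B - A)": p1["B"] - p1["A"],                  # contains A_i
        "period 2 (A - B)": p2["A"] - p2["B"],                  # contains A_i
        "half period difference": (diff["A"] - diff["B"]) / 2,  # A_i cancels
    }

data = simulate_crossover()
estimates = treatment_estimates(data)
for name, value in estimates.items():
    print(f"{name}: {value:.1f}")

# Spread of what each patient contributes: single observations carry
# omega^2 + sigma^2, half period differences only sigma^2 / 2.
sd_single = statistics.stdev(y1 for y1, _ in data["A"])
sd_half = statistics.stdev((y2 - y1) / 2 for y1, y2 in data["A"])
print(f"SD single observation: {sd_single:.1f}, SD half difference: {sd_half:.1f}")
```

All three estimators target the same quantity, $\alpha_a - \alpha_p$, but only the within-patient one is free of the subject effect, which is why its per-patient spread is so much smaller.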

Comparison in each period separately
[SAS output, The TTEST Procedure, period=1, variable effect by treat (lnmma, placebo): summary statistics, pooled and Satterthwaite tests, folded F test for equality of variances; numeric values not preserved]
Period 1: LNMMA is more effective than placebo: 8.2 (-37.6, 54.0)

[SAS output, The TTEST Procedure, period=2, same layout; numeric values not preserved]
Period 2: LNMMA is more effective than placebo: 70.5 (22.5, 118.4)

Effect of treatment, adjusted for period
Using 1/2 period difference
[SAS output, The TTEST Procedure, variable half_period by group (A, B): summary statistics, pooled and Satterthwaite tests, folded F test; numeric values not preserved]

Conclusion on treatment effect

Method               Confidence interval   P-value   Comment
Period 1             (-37.59, 53.99)       0.71
Period 2             (11.55, )
No adjustment        (14.55, 71.19)                  T-test
Period adjustment    (-21.92, 50.25)                 No correlation
Period adjustment    (9.97, 68.70)                   With correlation

(The effect estimates and the remaining P-values of this table are not preserved.)

Not much effect of period adjustment
because the period effect (on average) is not large: 14.2 (-11.8, 40.1)
[SAS output, The TTEST Procedure, variable half_treat by group (A, B); numeric values not preserved]
but the picture suggests...?

Assuming a carry-over effect (C)
i.e. an interaction treat*period ($\gamma_{p2}$)

group        LNMMA                         placebo                                     sum
A            $\beta_2 + \alpha_a + A_i$    $\beta_1 + \alpha_p + A_i$                  $\beta_1 + \beta_2 + \alpha_p + \alpha_a + A_i$
B            $\beta_1 + \alpha_a + A_j$    $\beta_2 + \alpha_p + \gamma_{p2} + A_j$    $\beta_1 + \beta_2 + \alpha_p + \alpha_a + \gamma_{p2} + A_j$
group B - A  $\beta_1 - \beta_2 + A$       $\beta_2 - \beta_1 + \gamma_{p2} + A$       $\gamma_{p2} + A$

Test for carry-over effect: T-test for the sums

Test of carry-over effect, T-test
[SAS output, The TTEST Procedure, variable treat_sum by group (A, B); numeric values not preserved]
Not significant, but...
The carry-over effect is estimated to be an extra effect of placebo in period 2 of 62.3, with confidence interval (-25.4, 149.9)

Coded as a variance component model
Without carry-over effect:

    proc mixed data=a1 /*covtest*/;
      class patient group treat period;
      model effect=treat period / outpred=udp outpredm=udpm residual influence s cl;
      random intercept / subject=patient(group);
    run;

Include a carry-over effect as the interaction treat*period

Mixed, no carry-over effect
[SAS output, The Mixed Procedure: covariance parameters Intercept (subject patient(group)) and Residual; Solution for Fixed Effects for Intercept, treat (lnmma, placebo) and period (1, 2); Type 3 Tests of Fixed Effects for treat and period; numeric values not preserved]

Predictions in contrast to observations
assuming no carry-over effect:
[Figure: predictions and observations]

Test of carry-over effect, Mixed

    proc mixed data=a1 /*covtest*/;
      class patient group treat period;
      model effect=treat period treat*period / s cl;
      random intercept / subject=patient(group);
    run;

or

    data a1;
      set a1;
      active=(treat="lnmma");
      period2=(period=2);
      carry_over=(treat="placebo")*(period=2);
    run;

    proc mixed data=a1;
      class patient group active period2 carry_over;
      model effect=active period2 carry_over / s cl;
      random intercept / subject=patient(group);
    run;

Output, traditional parametrization
[SAS output, The Mixed Procedure: covariance parameters Intercept (subject patient(group)) and Residual; Solution for Fixed Effects for Intercept, treat, period and the four treat*period combinations; Type 3 Tests of Fixed Effects for treat, period and treat*period; numeric values not preserved]

Output, intuitive parametrization
[SAS output, The Mixed Procedure: Class Level Information for patient, group (2 levels: A B), active, period2 and carry_over; covariance parameters Intercept (subject patient(group)) and Residual; Solution for Fixed Effects for Intercept, active, period2 and carry_over (Estimate, Error, DF, t Value); numeric values not preserved]

Output, intuitive parametrization II
[SAS output, Solution for Fixed Effects continued (Pr > |t|, Alpha, Lower, Upper); Type 3 Tests of Fixed Effects for active, period2 and carry_over; numeric values not preserved]

Predictions, without carry-over effect
[Figure: predictions without carry-over effect]

Paired T-test, revisited
Tryptase level before and after operation
Garvey et al. (2010a,b)
[Figure: tryptase before and after operation]

Paired T-tests
[SAS output, The TTEST Procedure, Difference: logbefore - logafter; Pr > |t| <.0001; other numeric values not preserved]
Tryptase levels decrease by a factor of 0.909, i.e. by approximately 9%

written as two-way anova
[SAS output, The GLM Procedure: Class Level Information for patient and time (2 levels: 1:before, 2:after); Number of Observations Read 240, Used 240; Dependent Variable logtryptase; Type III tests for patient (<.0001) and time (<.0001); parameter estimates for Intercept, patient and time; other numeric values not preserved]

or mixed

    proc mixed data=b1;
      class patient time;
      model logtryptase=time / s cl;
      random patient;
    run;

[SAS output, The Mixed Procedure: covariance parameters patient and Residual; Solution for Fixed Effects for Intercept and time (1:before, 2:after) with confidence limits; numeric values not preserved]

Missing values
Now excluding a random number of observations:
[SAS output, Analysis Variable: tryptase, N Obs, N and N Miss by time (1:before, 2:after); numeric values not preserved]

Paired T-test
[SAS output, The TTEST Procedure, Difference: logbefore - logafter; numeric values not preserved]
Only 79 patients with both observations before and after operation

Unpaired T-test
[SAS output, Dependent Variable logtryptase: analysis of variance table; Class Level Information for time (2 levels: 1:before, 2:after); Number of Observations Read 240, Used 191; parameter estimates for Intercept and time with 95% confidence limits; other numeric values not preserved]

Random effects model in order to use all available observations
To account for correlations/pairing and use all available observations:

    proc mixed data=b1;
      class patient time;
      model logtryptase=time / s cl;
      random patient;
    run;

Output from mixed model
[SAS output, The Mixed Procedure, Dependent Variable logtryptase: Number of Observations Read 240, Used 191, Not Used 49; covariance parameters patient and Residual; Solution for Fixed Effects for Intercept and time (1:before, 2:after) with confidence limits; other numeric values not preserved]

Conclusion on tryptase effect

Method            Confidence interval   P-value
Paired T-test     (0.0112, )
Unpaired T-test   ( , )                 0.17
Mixed model       (0.0131, )

(The Effect and N columns and the remaining limits and P-values of this table are not preserved.)

Back-transformed: the ratio before/after is estimated to be 1.087, or the ratio after/before is estimated to be 0.92, i.e. an 8% decrease, with confidence interval (0.87, 0.97), i.e. a decrease of 3% to 13%.

When should we use what approach?
- Paired T-test: when the correlation is strong, and only few observations are missing
- Unpaired T-test: when the correlation is weak, and many observations are missing
- Random effects model: always possible
But note: the missingness has to be random! More on missing values later on...

Non-normal data
Typical data from e.g. epidemiology are often not normally distributed (binary, ordinal, counts, survival...)
Generalised linear models in exponential families: multiple regression models, on a scale that corresponds to the data:
- Normal (link=identity)
- Binomial (link=logit)
- Poisson (link=log)
Mean value: $\mu$
Link function: $g(\mu)$ linear in covariates, i.e.
$g(\mu_i) = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + A_i$
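The back-transformation in the tryptase conclusion can be sketched as follows. This assumes that logtryptase is a base-10 logarithm; the slides do not state the base, but the reported lower limit 0.0131 and the back-transformed limit 0.97 agree under that assumption. The helper name `back_transform` is illustrative.

```python
def back_transform(log10_diff):
    """Convert a difference of log10 values (before - after) into ratios.

    Assumes base-10 logarithms, which is an inference from the reported
    numbers, not something the slides state.
    """
    ratio_before_after = 10 ** log10_diff
    ratio_after_before = 1 / ratio_before_after
    percent_decrease = 100 * (1 - ratio_after_before)
    return ratio_before_after, ratio_after_before, percent_decrease

# Lower confidence limit reported for the mixed model.
_, after_before, decrease = back_transform(0.0131)
print(f"after/before = {after_before:.2f}, decrease = {decrease:.0f}%")

# The point estimate is reported directly as a ratio before/after of 1.087.
print(f"after/before at the estimate = {1 / 1.087:.2f}")
```

A confidence interval is back-transformed limit by limit; taking the reciprocal swaps which limit is the lower one.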

The Binomial distribution
N independent binary observations $U_i$, all with $P(U_i = 1) = p$ (e.g. p = 0.51 for a baby boy)
$X = U_1 + \cdots + U_N = \sum U_i$ (number of ones, e.g. boys in a family)
The distribution of X is called a Binomial distribution and is written $X \sim \mathrm{Bin}(N, p)$:
$P(X = x) = \binom{N}{x} p^x (1 - p)^{N - x}$

Poisson distribution
Counts with no well-defined upper limit:
- the number of cancer cases in a specific community during a specific year
- the number of metastases following an experimentally induced cancer in laboratory rats
Law of rare events: as the count parameter N in a Binomial distribution gets larger and the parameter p gets close to 0 (or, by symmetry, close to 1 when the zeros are counted instead), the Binomial probabilities are approximately
$P(u) = P(y = u) = \frac{m^u \exp(-m)}{u!}$   (1)
where m = Np is the mean value.

Smoking among school children
Mette Rasmussen
Hierarchical (multilevel) design:
- 1498 children (i)
- 90 classes (c)
- 46 schools (s)
Outcome: individual smoking behaviour (0/1), $y_{sci}$
$p_{sci}$: the probability that child i in class c in school s is a smoker

Possible covariates, at various levels
- Individual (i): sex, age, parental smoking behaviour (c2ab), parental smoking attitude (c1112a), parental labour market attachment (fsoc), best friend smoking (c2cr)
- Class (c): sex ratio, number of pupils, grade
- School (s): type, school connectedness (con3ny)
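The law of rare events stated above can be illustrated numerically. This is a small sketch with hypothetical values of N and p (chosen only so that m = Np = 3); it compares the Binomial probabilities with the Poisson formula (1).

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(u, m):
    """P(y = u) = m^u exp(-m) / u!, formula (1)."""
    return m**u * exp(-m) / factorial(u)

n, p = 10_000, 0.0003      # hypothetical: large N, small p
m = n * p                  # mean value m = Np

for u in range(6):
    print(u, f"{binomial_pmf(u, n, p):.5f}", f"{poisson_pmf(u, m):.5f}")

# Largest pointwise gap between the two distributions over u = 0..15.
max_gap = max(abs(binomial_pmf(u, n, p) - poisson_pmf(u, m)) for u in range(16))
print(f"largest gap: {max_gap:.2e}")
```

With a larger p at the same mean (for instance N = 10 and p = 0.3) the gap becomes clearly visible, which is why the approximation is described as a law for rare events.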

Multilevel model for binary outcomes
$y_{sci} \sim \mathrm{Bernoulli}(p_{sci})$
$p_{sci} = P(y_{sci} = 1 \mid \eta_{sci}) = \frac{\exp(\eta_{sci})}{1 + \exp(\eta_{sci})}$
where
$\eta_{sci} = \beta^\top x_{sci} + a_s + b_{sc}$   (i-covariates + school + class)
$a_s = \gamma_1^\top z_s + A_s$   (s-covariates + random school)
$b_{sc} = \gamma_2^\top z_{sc} + B_{sc}$   (c-covariates + random class)
so that
$\eta_{sci} = \beta^\top x_{sci} + \gamma_1^\top z_s + \gamma_2^\top z_{sc} + A_s + B_{sc}$
$A_s \sim N(0, \omega^2)$: between school variation
$B_{sc} \sim N(0, \tau^2)$: between classes (within school) variation

Initial model
Two-level model: no covariates, only random school

    proc glimmix data=a1;
      class school sclass;
      model dglryg(descending) = / dist=binary link=logit ddfm=satterth s;
      random school;
    run;

Interesting part of output
[SAS output, The GLIMMIX Procedure: Fit Statistics (-2 Res Log Pseudo-Likelihood, Generalized Chi-Square, Gener. Chi-Square / DF 0.96); covariance parameter SCHOOL; Solutions for Fixed Effects for Intercept; other numeric values not preserved]

Interpretation of estimates
Fixed effects: only intercept, i.e. overall level. Inverse logit transformation:
    > exp(a)/(1+exp(a))
Overall, approx. 18.6% of the pupils smoke.
Random effects (MOR): for two individuals from different schools (and with identical covariates) we calculate the median OR for a randomly chosen high risk individual compared to a randomly chosen low risk individual.
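The inverse logit transformation used to interpret the intercept can be sketched as below. The intercept estimate itself is not preserved here, so the value of `a` is back-calculated from the reported 18.6% and is an illustration, not the SAS estimate.

```python
from math import exp, log

def logit(p):
    """Log odds of a probability p."""
    return log(p / (1 - p))

def inv_logit(a):
    """Inverse logit: exp(a) / (1 + exp(a))."""
    return exp(a) / (1 + exp(a))

# Intercept implied by the reported overall smoking prevalence
# (back-calculated for illustration, not taken from the output).
a = logit(0.186)
print(f"implied intercept on the logit scale: {a:.3f}")
print(f"back-transformed: {inv_logit(a):.3f}")
```

An intercept of 0 on the logit scale corresponds to a probability of 0.5; negative intercepts, as here, correspond to prevalences below 50%.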

Median Odds Ratio, MOR
Variation between schools: variance component $\omega^2$ (estimate not preserved)
Typical difference D on logit scale: $D \sim N(0, 2\omega^2)$, i.e. within $\pm 2\sqrt{2\omega^2}$
Median numerical difference on logit scale:
$D^2 \sim 2\omega^2 \chi^2_1$, so $\mathrm{median}(|D|) = \sqrt{2\omega^2 \, \mathrm{median}(\chi^2_1)}$
Median odds ratio (MOR): $\exp\left(\sqrt{2\omega^2 \, \mathrm{median}(\chi^2_1)}\right)$
and since $\mathrm{median}(\chi^2_1) = 0.455$, we get
$\mathrm{MOR} = \exp(0.954\,\omega) = 1.46$

Inclusion of variation between school classes

    proc glimmix data=a1;
      class school sclass;
      model dglryg(descending) = / dist=binary link=logit ddfm=satterth s;
      random school sclass;
    run;

Output:
[SAS output, Fit Statistics (-2 Res Log Pseudo-Likelihood, Generalized Chi-Square, Gener. Chi-Square / DF); numeric values not preserved]

Output, continued
[SAS output: covariance parameters SCHOOL (estimate 0) and sclass; Solutions for Fixed Effects for Intercept (<.0001); other numeric values not preserved]
The variation between schools can be totally explained by the variation between school classes.

A possible third level...
Imagine an extra grouping: gender group within class, i.e. a subgrouping into boys and girls.
Note: this is not the same as a gender effect
- it need not be a systematic difference
- the group definition is a substitute for cliques of which we know nothing
Modify the Random-statement to:
    random school sclass ggroup;
and remember ggroup in the Class-statement.
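The MOR formula above can be written as a small function. This is a sketch: the variance estimate itself is not preserved in the output, so the example back-calculates the standard deviation implied by the reported MOR of 1.46. It uses the fact that the median of a $\chi^2_1$ variable is the square of the 75% quantile of the standard normal distribution.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# 75% quantile of N(0, 1); its square is the median of chi-square(1).
Z75 = NormalDist().inv_cdf(0.75)

def median_odds_ratio(variance):
    """MOR = exp(sqrt(2 * variance * median(chi2_1)))."""
    return exp(sqrt(2 * variance) * Z75)

print(f"median(chi2_1) = {Z75**2:.4f}")
print(f"coefficient sqrt(2 * median) = {sqrt(2) * Z75:.3f}")

# Standard deviation implied by the reported MOR of 1.46
# (back-calculated for illustration, not the SAS estimate).
implied_omega = log(1.46) / (sqrt(2) * Z75)
print(f"implied omega = {implied_omega:.3f}")
print(f"MOR check = {median_odds_ratio(implied_omega**2):.2f}")
```

A variance component of 0 gives MOR = 1, i.e. no unexplained variation between clusters at that level.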

The GLIMMIX Procedure
[SAS output for the model with random school, sclass and ggroup: Fit Statistics (-2 Res Log Pseudo-Likelihood, Generalized Chi-Square, Gener. Chi-Square / DF 0.83); covariance parameters SCHOOL (estimate 0), sclass and GGROUP; Solutions for Fixed Effects for Intercept; other numeric values not preserved]

Systematic sex effect

    proc glimmix data=a1;
      class school sclass ggroup sex c1112a c2ab c2cr;
      model dglryg(descending) = sex / dist=binary link=logit ddfm=satterth s;
      random school sclass ggroup;
    run;

The GLIMMIX Procedure
[SAS output: Fit Statistics (Gener. Chi-Square / DF 0.84); covariance parameters SCHOOL (estimate 0), sclass and GGROUP; Solutions for Fixed Effects for Intercept (<.0001) and sex (levels dreng = boy, pige = girl); other numeric values not preserved]

Interpretation of results
Systematic effect of sex: OR = exp(0.4188) = 1.52 for girls vs. boys
Random effects: MOR for two children of opposite sex in the same class: 1.52 × exp( ) = 2.90
Random effects: MOR for two children of opposite sex in different classes (at the same or different schools): 1.52 × exp( ) = 3.10
How much of the random components does the systematic sex effect explain?

Variance component estimates
[Table: rows "school alone", "school and school class", "school, class and gender group", "as above, with sex"; columns school, school class, gender group; numeric values not preserved]

Odds ratios (OR) and MOR
[Table: same rows; columns school, school class, gender group, sex; numeric values not preserved]

Comparing measurement devices
Example: peak expiratory flow rate, l/min:
- 17 subjects
- 2 measurement devices (Wright, mini Wright)
- each measured twice
[Data table: subject id, Wright ($Y_{1p1}$, $Y_{1p2}$), mini Wright ($Y_{2p1}$, $Y_{2p2}$), with Average and SD; numeric values not preserved]
(Bland and Altman, 1986).

Illustration of all data
[Figure: all data]

Aim of investigation
- Precision of each measuring device: compare the two repetitions
- Agreement between the two devices: compare individual measurements, or averages
- Practical advice for clinical use: can we trust the devices, and use them interchangeably?

Variance component model
Subjects, p = 1, ..., 17; methods, m = 1, 2; repetitions, j = 1, 2
$Y_{pmj} = \beta_m + A_p + C_{pm} + \varepsilon_{pmj}$
where $A_p \sim N(0, \omega^2)$, $C_{pm} \sim N(0, \tau^2)$, $\varepsilon_{pmj} \sim N(0, \sigma^2)$
Note: patients need not be random here... Why?

Correlation structure
Covariance in the above model, for each subject (ordered as method 1 repetitions 1 and 2, then method 2 repetitions 1 and 2):
$$\begin{pmatrix}
\omega^2 + \tau^2 + \sigma^2 & \omega^2 + \tau^2 & \omega^2 & \omega^2 \\
\omega^2 + \tau^2 & \omega^2 + \tau^2 + \sigma^2 & \omega^2 & \omega^2 \\
\omega^2 & \omega^2 & \omega^2 + \tau^2 + \sigma^2 & \omega^2 + \tau^2 \\
\omega^2 & \omega^2 & \omega^2 + \tau^2 & \omega^2 + \tau^2 + \sigma^2
\end{pmatrix}$$

Correlation structure if subjects are considered systematic:
For each subject*method combination, i.e. for two repetitions:
$$\begin{pmatrix}
\tau^2 + \sigma^2 & \tau^2 \\
\tau^2 & \tau^2 + \sigma^2
\end{pmatrix}$$
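The 4 × 4 structure above follows from which random terms two measurements share. A minimal sketch of that rule, with hypothetical variance components (the function name `subject_covariance` and the input values are illustrative only):

```python
def subject_covariance(omega2, tau2, sigma2):
    """Covariance matrix of (Y_p11, Y_p12, Y_p21, Y_p22) for one subject.

    Two measurements on the same subject always share A_p (omega2);
    they share C_pm (tau2) if taken with the same method; the residual
    variance (sigma2) only enters on the diagonal.
    """
    labels = [(m, j) for m in (1, 2) for j in (1, 2)]   # (method, repetition)
    cov = []
    for m1, j1 in labels:
        row = []
        for m2, j2 in labels:
            value = omega2
            if m1 == m2:
                value += tau2
                if j1 == j2:
                    value += sigma2
            row.append(value)
        cov.append(row)
    return cov

# Hypothetical variance components, chosen only to make the pattern visible.
for row in subject_covariance(omega2=1.0, tau2=2.0, sigma2=4.0):
    print(row)
```

Setting `omega2=0` and reading off one diagonal block gives the 2 × 2 structure for the case where subjects are treated as systematic.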

SAS-programming

    proc mixed data=wright;
      class method id;
      model wr=method / ddfm=satterth s;
      random intercept method / subject=id;
    run;

or

    proc mixed data=wright;
      class method id;
      model wr=id method / s;
      random id*method;
    run;

Output
[SAS output, The Mixed Procedure: Class Level Information for method (2 levels: mini, wright) and id; covariance parameters Intercept (subject id), method (subject id) and Residual; Fit Statistics (-2 Res Log Likelihood, AIC); Solution for Fixed Effects for Intercept (<.0001) and method (mini, wright); other numeric values not preserved]

Estimates
Variance components: $\omega^2$, $\tau^2$, $\sigma^2$ (estimates not preserved)
Systematic difference between measuring devices:
$\hat\beta_1 - \hat\beta_2 = 6.03\ (8.05)$, P = 0.46

Precision of the methods
The precisions are assumed identical.
Difference between double measurements (identical repetitions):
$D_{pm} = Y_{pmj_1} - Y_{pmj_2} = \varepsilon_{pmj_1} - \varepsilon_{pmj_2} \sim N(0, 2\sigma^2)$
Limits-of-agreement: $\pm 2\sqrt{2\sigma^2}$ (value not preserved)
How can we use these?

Agreement between the two methods
Difference between single measurements by the two methods:
$D_p = Y_{p1j_1} - Y_{p2j_2} = \beta_1 - \beta_2 + C_{p1} - C_{p2} + \varepsilon_{p1j_1} - \varepsilon_{p2j_2} \sim N(\beta_1 - \beta_2,\ 2\tau^2 + 2\sigma^2)$
Limits-of-agreement: $\pm 2\sqrt{2(\tau^2 + \sigma^2)} = \pm 75.31$
(where we have ignored the nonsignificant systematic difference between the two, otherwise add 6.03)

Agreement between averages
$\bar D_p = \bar Y_{p1\cdot} - \bar Y_{p2\cdot} = \beta_1 - \beta_2 + C_{p1} - C_{p2} + \bar\varepsilon_{p1\cdot} - \bar\varepsilon_{p2\cdot} \sim N(\beta_1 - \beta_2,\ 2\tau^2 + \sigma^2)$
Limits-of-agreement: $\pm 2\sqrt{2\tau^2 + \sigma^2} = \pm 66.41$
Only reasonable if averages are the standard for clinical use!

Difference in precision?
New model (with systematic subject effects):
$Y_{pmj} = \mu + \beta_m + \alpha_p + C_{pm} + \varepsilon_{pmj}$
$C_{pm} \sim N(0, \tau^2)$, $\varepsilon_{pmj} \sim N(0, \sigma_m^2)$

    proc mixed data=wright;
      class method id;
      model wr=id method / ddfm=satterth s;
      random id*method;
      repeated / group=method type=simple subject=id*method;
    run;

Output, systematic subject effect
[SAS output, The Mixed Procedure: covariance parameters method*id, Residual (group method mini) and Residual (group method wright); Fit Statistics (-2 Res Log Likelihood, AIC); Solution for Fixed Effects for Intercept (<.0001), id and method; Type 3 Tests of Fixed Effects for id (<.0001) and method; other numeric values not preserved]
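The two limits-of-agreement formulas above can be sketched as functions of the variance components. The estimates themselves are not preserved here, so the example solves the two reported half-widths (75.31 and 66.41) for the implied $\tau^2$ and $\sigma^2$. These back-solved values are derived from rounded numbers and are an illustration, not the SAS estimates; all function names are illustrative.

```python
from math import sqrt

def loa_single(tau2, sigma2):
    """Half-width of limits of agreement for single measurements."""
    return 2 * sqrt(2 * tau2 + 2 * sigma2)

def loa_average(tau2, sigma2, k=2):
    """Half-width for averages of k repetitions per method."""
    return 2 * sqrt(2 * tau2 + 2 * sigma2 / k)

def loa_repeat(sigma2):
    """Half-width for the difference between two repetitions, same method."""
    return 2 * sqrt(2 * sigma2)

def implied_components(single, average):
    """Solve the two half-widths (k = 2) for tau2 and sigma2."""
    s = (single / 2) ** 2      # equals 2*tau2 + 2*sigma2
    a = (average / 2) ** 2     # equals 2*tau2 + sigma2
    sigma2 = s - a
    tau2 = (a - sigma2) / 2
    return tau2, sigma2

tau2, sigma2 = implied_components(75.31, 66.41)
print(f"implied tau2 = {tau2:.1f}, implied sigma2 = {sigma2:.1f}")
print(f"single: {loa_single(tau2, sigma2):.2f}, averages: {loa_average(tau2, sigma2):.2f}")
print(f"repeatability half-width from implied sigma2: {loa_repeat(sigma2):.1f}")
```

The gap between the two half-widths is driven entirely by $\sigma^2$: averaging repetitions reduces the measurement error part but leaves the subject*method part $2\tau^2$ untouched.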

Results
Precisions (estimates not preserved):
- Wright: $\sigma_1^2$
- mini Wright: $\sigma_2^2$
Conclusion: Wright is better than mini Wright, but is it significantly better?
$F = \hat\sigma_2^2 / \hat\sigma_1^2 = 1.69 \sim F(17, 17)$, P = 0.14. No...
Alternative test: $-2 \log Q = 1.2 \sim \chi^2(1)$, P = 0.27

Incorrect Bland-Altman approaches
- Calculate agreement between averages: we have seen that these estimate $2\tau^2 + \sigma^2$ instead of $2\tau^2 + 2\sigma^2$. In general, with k repetitions: $2\tau^2 + \frac{2}{k}\sigma^2$
- Calculate all possible pairs
  $D_{pj} = Y_{p1j_1} - Y_{p2j_2} = \beta_1 - \beta_2 + C_{p1} - C_{p2} + \varepsilon_{p1j_1} - \varepsilon_{p2j_2}$
  but these differences will be correlated due to the C's and can give erroneous results if the measurement devices react to subject characteristics

Measurements taken in pairs
e.g. over time...
$Y_{pmt} = \beta_m + A_p + C_{pm} + E_{pt} + \varepsilon_{pmt}$
Precision becomes impossible to assess, since we have no true replications.
Agreement:
$D_{pt} = Y_{p1t} - Y_{p2t} = \beta_1 - \beta_2 + C_{p1} - C_{p2} + \varepsilon_{p1t} - \varepsilon_{p2t}$
but these differences will again be correlated due to the C's.

Effect of the lens strength on visual acuity
7 individuals are looking at a screen, where a light flash appears.
They are looking through 4 lenses, with powers 6/6, 6/18, 6/36 and 6/60, i.e. 4 magnifications: 1, 3, 6 and 10, with 2 eyes.
Outcome: visual acuity, the time lag (milliseconds) between the stimulus and the electrical response at the back of the cortex.
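The variance quoted for agreement between averages with k repetitions follows from the variance component model $Y_{pmj} = \beta_m + A_p + C_{pm} + \varepsilon_{pmj}$. A short derivation, assuming k independent repetitions per subject and method:

```latex
\begin{align*}
\bar Y_{pm\cdot} &= \frac{1}{k}\sum_{j=1}^{k} Y_{pmj}
   = \beta_m + A_p + C_{pm} + \bar\varepsilon_{pm\cdot},
   \qquad \operatorname{Var}(\bar\varepsilon_{pm\cdot}) = \frac{\sigma^2}{k} \\
\bar Y_{p1\cdot} - \bar Y_{p2\cdot}
   &= \beta_1 - \beta_2 + C_{p1} - C_{p2}
      + \bar\varepsilon_{p1\cdot} - \bar\varepsilon_{p2\cdot} \\
\operatorname{Var}\left(\bar Y_{p1\cdot} - \bar Y_{p2\cdot}\right)
   &= 2\tau^2 + \frac{2}{k}\sigma^2
\end{align*}
```

The subject effect $A_p$ cancels in the difference. For k = 1 this gives $2\tau^2 + 2\sigma^2$ (single measurements) and for k = 2 it gives $2\tau^2 + \sigma^2$, so limits based on averages understate the disagreement between single measurements.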

Data
[Figure: visual acuity data]
Crowder & Hand (1990)

Factors to take into account
Main effects:
- 7 individuals (person)
- 2 eyes for each individual (eye)
- 4 lens magnifications (power)
Interactions?
- person*eye
- person*power
- eye*power
Second-order interaction: person*eye*power = Residual

Model ingredients
Outcome: visual acuity
Systematic: mean value $\mu_{em}$
- eye $\alpha_e$
- power $\beta_m$
- eye*power $\gamma_{em}$
Random effects:
- patient $A_p$
- patient*eye $B_{pe}$
- patient*power $C_{pm}$
Residual: patient*eye*power $\varepsilon_{pem}$

Model formulation
$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$
where p = 1, ..., 7, e = 1, 2, m = 1, 2, 3, 4 and
$A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$

Factor diagram
[Figure: factor diagram with nodes [I] = [Pa×Ey×Po], [Pa×Ey], Ey×Po, [Pa×Po], Ey, [Pa], Po, 0; random factors in brackets]

Not quite a multilevel model, but..

Level   Unit                        Covariates
1       single measurements         Ey*Po
2       interactions
2e        [Pa*Ey]                   Ey
2m        [Pa*Po]                   Po
3       individuals, [Pa]           overall level

    proc mixed data=visual covtest;
      class patient eye power;
      model acuity=eye power eye*power / s;
      * random patient patient*eye patient*power;
      random intercept eye power / subject=patient;
    run;

[SAS output: covariance parameters Intercept, eye and power (subject patient) and Residual, with Estimate, Error, Z Value, Pr > Z; Type 3 Tests of Fixed Effects for eye, power and eye*power; numeric values not preserved]

Solution for Fixed Effects
[SAS output: estimates for Intercept (<.0001), eye (left, right), the four levels of power and the eight eye*power combinations; other numeric values not preserved]

Predicted mean profiles
[Figure: predicted mean profiles]

Individual predictions
[Figure: individual predictions]

Residual plot
[Figure: residual plot]

Omit the interaction eye*power
[SAS output: covariance parameters Intercept, eye and power (subject patient) and Residual, with Estimate, Error, Z Value, Pr > Z; Solution for Fixed Effects for Intercept (<.0001), eye (left, right) and the four levels of power; Type 3 Tests of Fixed Effects for eye and power; other numeric values not preserved]

Eye comparisons
$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$
where $A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$
Difference between eye averages:
$\bar Y_{\cdot e_1 \cdot} - \bar Y_{\cdot e_2 \cdot} = (\mu\ \text{terms}) + \bar B_{\cdot e_1} - \bar B_{\cdot e_2} + \bar\varepsilon_{\cdot e_1 \cdot} - \bar\varepsilon_{\cdot e_2 \cdot}$

Consequence for eye comparisons
$\operatorname{Var}(\bar Y_{\cdot e_1 \cdot} - \bar Y_{\cdot e_2 \cdot}) = \frac{2}{7}\tau_e^2 + \frac{2}{28}\sigma^2$
$\tau_e^2$ is rather large (people have different eye preferences).
We have to demand a larger difference in order to detect it.

Magnification comparisons
$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$
where $A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$
Difference between magnification averages:
$\bar Y_{\cdot\cdot m_1} - \bar Y_{\cdot\cdot m_2} = (\mu\ \text{terms}) + \bar C_{\cdot m_1} - \bar C_{\cdot m_2} + \bar\varepsilon_{\cdot\cdot m_1} - \bar\varepsilon_{\cdot\cdot m_2}$
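The variances of these differences can be worked out from the model $Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$. The derivation below assumes the balanced design described earlier (7 individuals, 2 eyes, 4 magnifications, one measurement per combination); the coefficients are derived from that design rather than read from the slides.

```latex
\begin{align*}
\bar Y_{\cdot e \cdot} &= \frac{1}{28}\sum_{p=1}^{7}\sum_{m=1}^{4} Y_{pem}
   = \bar\mu_{e\cdot} + \bar A_{\cdot} + \bar B_{\cdot e} + \bar C_{\cdot\cdot}
     + \bar\varepsilon_{\cdot e \cdot} \\
\operatorname{Var}(\bar B_{\cdot e}) &= \frac{\tau_e^2}{7}, \qquad
\operatorname{Var}(\bar\varepsilon_{\cdot e \cdot}) = \frac{\sigma^2}{28} \\
\operatorname{Var}\left(\bar Y_{\cdot e_1 \cdot} - \bar Y_{\cdot e_2 \cdot}\right)
   &= \frac{2}{7}\tau_e^2 + \frac{2}{28}\sigma^2 \\[1ex]
\bar Y_{\cdot\cdot m} &= \frac{1}{14}\sum_{p=1}^{7}\sum_{e=1}^{2} Y_{pem}
   = \bar\mu_{\cdot m} + \bar A_{\cdot} + \bar B_{\cdot\cdot} + \bar C_{\cdot m}
     + \bar\varepsilon_{\cdot\cdot m} \\
\operatorname{Var}\left(\bar Y_{\cdot\cdot m_1} - \bar Y_{\cdot\cdot m_2}\right)
   &= \frac{2}{7}\tau_m^2 + \frac{2}{14}\sigma^2
\end{align*}
```

In the eye comparison the terms $\bar A_{\cdot}$ and $\bar C_{\cdot\cdot}$ are common to both averages and cancel; in the magnification comparison $\bar A_{\cdot}$ and $\bar B_{\cdot\cdot}$ cancel. Only the interaction of the person with the factor being compared survives, together with the residual.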

Consequence for magnification comparisons
$\operatorname{Var}(\bar Y_{\cdot\cdot m_1} - \bar Y_{\cdot\cdot m_2}) = \frac{2}{7}\tau_m^2 + \frac{2}{14}\sigma^2$
$\tau_m^2$ is not that large (people react more or less identically to the different magnifications).
We can detect smaller differences.

If we ignore correlations
i.e. a model with no random effects:
Eye differences: $\operatorname{Var}(\bar Y_{\cdot e_1 \cdot} - \bar Y_{\cdot e_2 \cdot}) = \frac{2}{28}\sigma^2$
Magnification differences: $\operatorname{Var}(\bar Y_{\cdot\cdot m_1} - \bar Y_{\cdot\cdot m_2}) = \frac{2}{14}\sigma^2$
but with another $\sigma^2$

Incorrect analysis, ignoring random effects
[SAS output: Covariance Parameter Estimates (Residual only); Solution for Fixed Effects for Intercept (<.0001), eye (left, right) and the four levels of power; Type 3 Tests of Fixed Effects for eye and power (Chi-Square, F Value, Pr > ChiSq, Pr > F); other numeric values not preserved]

Systematic vs. random effects
Could the patients be treated as systematic here? Yes:
[SAS output: covariance parameters eye and power (subject patient) and Residual; Type 3 Tests of Fixed Effects for patient, eye, power and eye*power; numeric values not preserved]
Can you think why?


UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum T-test: means of Spock's judge versus all other judges 1 The TTEST Procedure Variable: pcwomen judge1 N Mean Std Dev Std Err Minimum Maximum OTHER 37 29.4919 7.4308 1.2216 16.5000 48.9000 SPOCKS 9 14.6222

More information

Swabs, revisited. The families were subdivided into 3 groups according to the factor crowding, which describes the space available for the household.

Swabs, revisited. The families were subdivided into 3 groups according to the factor crowding, which describes the space available for the household. Swabs, revisited 18 families with 3 children each (in well defined age intervals) were followed over a certain period of time, during which repeated swabs were taken. The variable swabs indicates how many

More information

STAT 5200 Handout #26. Generalized Linear Mixed Models

STAT 5200 Handout #26. Generalized Linear Mixed Models STAT 5200 Handout #26 Generalized Linear Mixed Models Up until now, we have assumed our error terms are normally distributed. What if normality is not realistic due to the nature of the data? (For example,

More information

Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman.

Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman. Faculty of Health Sciences Correlated data Variance component models Lene Theil Skovgaard & Julie Lyng Forman November 28, 2017 1 / 96 Overview One-way anova with random variation The rabbit example Hierarchical

More information

Correlated data. Overview. Variance component models. Terminology for correlated measurements. Faculty of Health Sciences. Variance component models

Correlated data. Overview. Variance component models. Terminology for correlated measurements. Faculty of Health Sciences. Variance component models Faculty of Health Sciences Overview Correlated data Variance component models Lene Theil Skovgaard & Julie Lyng Forman November 28, 2017 One-way anova with random variation The rabbit example Hierarchical

More information

Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman.

Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman. Faculty of Health Sciences Correlated data Variance component models Lene Theil Skovgaard & Julie Lyng Forman November 27, 2018 1 / 84 Overview One-way anova with random variation The rabbit example Hierarchical

More information

Correlated data. Overview. Example: Swelling due to vaccine. Variance component models. Faculty of Health Sciences. Variance component models

Correlated data. Overview. Example: Swelling due to vaccine. Variance component models. Faculty of Health Sciences. Variance component models Faculty of Health Sciences Overview Correlated data Variance component models One-way anova with random variation The rabbit example Hierarchical models with several levels Random regression Lene Theil

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information
