Correlated data. Further topics. Sources of random variation. Specification of mixed models. Faculty of Health Sciences.


Faculty of Health Sciences

Correlated data: Further topics
Lene Theil Skovgaard
December 11, 2012

Further topics
- Specification of mixed models
- Model check and diagnostics
- Explained variation, $R^2$
- Missing values
- Several series for each individual
Additional examples:
- Visual acuity example (from Variance components II)
- Baseline revisited: Simple before and after study
- Reading ability

Specification of mixed models
Systematic variation:
- Between-individual covariates: treatment, sex, age, baseline value, ...
- Within-individual covariates: time, cumulative dose, temperature, ...
is specified as usual, including possible interactions.
Random variation:
- Interactions between systematic and random effects are always random.

Sources of random variation
1. Random effects
2. Serial correlation
3. Measurement error

SAS, PROC MIXED
- model describes the systematic part (fixed effects, mean value structure)
- random describes the random effects
- repeated describes the serial correlation
- local adds an additional measurement error

General model specification
Sorry..., matrix notation for brevity. Let $Y_i = (y_{i1}, \ldots, y_{ik})$ denote all outcome values for subject $i$, where
$$Y_i = X_i \beta + Z_i b_i + \varepsilon_i$$
- $\beta$ denotes systematic effects, $\beta = (\beta_1, \ldots, \beta_p)$
- $b_i$ denotes random effects, $b_i = (b_{i1}, \ldots, b_{iq})$
- $\varepsilon_i$ is the serially dependent residual variation, $\varepsilon_i = (\varepsilon_{i1}, \ldots, \varepsilon_{ik})$
We assume all $b_i$'s and $\varepsilon_i$'s to be independent, with mean zero and $\mathrm{Var}(b_i) = G$, $\mathrm{Var}(\varepsilon_i) = R_i$.

For non-normal data
$Y_i$ follows some distribution from the exponential family, with mean value
$$E(Y_i) = g^{-1}(\eta_i), \quad \text{where } \eta_i = X_i \beta + Z_i b_i,$$
and the variance $V(Y_i) = V_i$ is determined by the distribution.

Residual variance
In general, there is no free variance parameter, since the variance is determined from the mean value:
- Normal (link=identity), free variance parameter $\sigma^2$
- Binomial (link=logit), variance $np(1-p)$
- Poisson (link=log), variance $\lambda = E(Y)$
Overdispersion: The variance is seen to be larger than determined by the distribution.
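The general specification $Y_i = X_i\beta + Z_i b_i + \varepsilon_i$ with $\mathrm{Var}(b_i) = G$ and $\mathrm{Var}(\varepsilon_i) = R_i$ implies the marginal covariance $\mathrm{Var}(Y_i) = Z_i G Z_i^T + R_i$. The following is a minimal sketch of that computation with plain Python lists; the function name and the numerical values are hypothetical and chosen only for illustration.

```python
def marginal_covariance(Z, G, R):
    """Return V = Z G Z^T + R for one subject, using nested lists."""
    k, q = len(Z), len(G)
    # ZG has shape k x q
    ZG = [[sum(Z[r][a] * G[a][c] for a in range(q)) for c in range(q)]
          for r in range(k)]
    # V = ZG Z^T + R has shape k x k
    return [[sum(ZG[r][a] * Z[c][a] for a in range(q)) + R[r][c]
             for c in range(k)]
            for r in range(k)]


# Hypothetical example: random intercept only (q = 1), k = 3 measurements,
# independent residuals with common variance.
omega2, sigma2 = 4.0, 1.0
Z = [[1.0], [1.0], [1.0]]
G = [[omega2]]
R = [[sigma2 if r == c else 0.0 for c in range(3)] for r in range(3)]

V = marginal_covariance(Z, G, R)
for row in V:
    print(row)
```

With a random intercept as the only random effect, every pair of measurements on the same subject shares the covariance $\omega^2$, which is the compound symmetry pattern.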

Overdispersion
can be caused by
- omitted covariates (isn't that always the case?)
- unrecognized clusters
- heterogeneity, e.g. a "zero"-group (non-susceptibles)
Traditional solution: An overdispersion parameter $\phi$ is estimated and multiplied onto the variance.
This is cheating... we would like instead
$$\eta_i = X_i \beta + Z_i b_i + \varepsilon_i$$
with a free parameter, $\mathrm{Var}(\varepsilon_i) = \Sigma_i$.

Swabs example from last week
    proc glimmix data=swab;
      class crowding family name;
      model swab=crowding name / dist=poisson link=log s cl;
      random family(crowding);
    run;
gave as part of the output
    Fit Statistics
    Gener. Chi-Square / DF        1.96
When this number is greater than 1, it indicates overdispersion, and the P-values in the Poisson analysis might be too small.

Model with overdispersion
    proc glimmix data=swab;
      class crowding family name;
      model swab=crowding name / dist=poisson link=log s cl;
      random intercept / subject=family(crowding);
      random _residual_;
    run;
The added statement is random _residual_. This provides the output

Output, continued
    The GLIMMIX Procedure
    Fit Statistics
    Gener. Chi-Square / DF        2.12
[SAS output: Covariance Parameter Estimates for Intercept, subject family(crowding), and Residual (VC); numeric values not preserved in the transcription]
    Type III Tests of Fixed Effects
    Effect      Pr > F
    crowding
    name        <.0001
with P-values more or less identical to the normality-based results.
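The ratio "Gener. Chi-Square / DF" plays the role of the classical Pearson dispersion estimate. As a rough illustration of the idea (not of the pseudo-likelihood computation that GLIMMIX performs for a mixed model), the sketch below computes Pearson chi-square divided by the residual degrees of freedom for a plain Poisson model with a common mean. The counts are hypothetical and the function name is an assumption made for this example.

```python
def pearson_dispersion(counts, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom.

    For a Poisson model the variance equals the mean, so each squared
    residual is scaled by the fitted mean.
    """
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted))
    return chi2 / (len(counts) - n_params)


# Hypothetical counts, not the swab data.
counts = [0, 1, 1, 2, 6, 8]
mean = sum(counts) / len(counts)      # intercept-only Poisson fit
fitted = [mean] * len(counts)

phi = pearson_dispersion(counts, fitted, n_params=1)
print(round(phi, 3))
```

A value clearly above 1, as here, points towards overdispersion relative to the Poisson assumption.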

But...
The model is not a real model, since no such distribution exists...

Real model for overdispersion
If we should build a real model with overdispersion, we would write:
$$y_{cfn} \sim \mathrm{Poisson}(\lambda_{cfn})$$
$$\log(\lambda_{cfn}) = \mu + \beta_c + \gamma_n + A_{cf} + \varepsilon_{cfn}$$
$$A_{cf} \sim N(0, \omega^2), \quad \varepsilon_{cfn} \sim N(0, \sigma^2)$$
This will require two random-statements, like this:
    proc glimmix data=swab;
      class crowding family name;
      model swab=crowding name / dist=poisson link=log s cl;
      random intercept / subject=family(crowding);
      random intercept / subject=name*family(crowding);
    run;
but this model unfortunately cannot be estimated by glimmix:

Convergence problems
    The GLIMMIX Procedure
    Iteration History
    Did not converge.
[SAS output: iteration history and Covariance Parameter Estimates for Intercept with subjects family(crowding) and family*name(crowdin); numeric values not preserved in the transcription]
It probably requires true replicates...

Model check and diagnostics (normal models)
- Ordinary residuals, from the systematic effect, $Y_i - X_i \hat{\beta}$. These will be correlated.
- Conditional residuals, $Y_i - X_i \hat{\beta} - Z_i \hat{b}_i$
- Normality of random effects, estimated BLUPs, $\hat{b}_i$
- Investigation of covariance structure (variogram)
- Detection of influential observations

Model checks

Normality of random effects
A histogram of BLUPs from the model is not worth much. Instead, build a model with a mixture of normal distributions and make a test...
If normality does not apply:
- no large effect on estimates for $\beta$, $G$ and $R$
- standard errors become biased, but may be corrected in various ways
- BLUPs $\hat{b}_i$ become invalid, especially when the residual variance $\sigma^2$ is large

Variogram
Variance of difference between time points:
$$\gamma(u) = \tfrac{1}{2} E(\varepsilon_t - \varepsilon_{t-u})^2 = \tau^2 (1 - \rho(u)) + \sigma^2$$
- Nugget: $\sigma^2$
- Sill: $\tau^2 + \sigma^2$
- Variance: $\omega^2 + \tau^2 + \sigma^2$

Local influence
Idea: Put some infinitesimal extra weight on a single observation ($i$).
Weight vector: $\omega_i = (1, \ldots, 1 + \Delta, \ldots, 1)$, where $\Delta$ denotes the extra weight.
Look at the change in likelihood:
$$LD(\omega_i) = 2\,(l(\hat{\theta}) - l(\hat{\theta}_{\omega_i}))$$
Make a local influence plot of $(i, C_i)$, where $C_i$ is a scaled version of $LD(\omega_i)$ [the scaling factor is garbled in the source: "C i = 1 LD(ω i)"].

Reasons for a large influence
- unusual combination of covariates $X_i$
- large ordinary residuals (from $X_i \beta$)
- unusual combination of covariates $Z_i$
- bad choice of covariance pattern $V_i(\alpha)$
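The variogram $\gamma(u) = \tau^2(1 - \rho(u)) + \sigma^2$ can be evaluated directly once a serial correlation function is chosen. The sketch below assumes the power form $\rho(u) = \rho^u$ for time lags $u > 0$; that choice and all parameter values are assumptions made for illustration only.

```python
def variogram(u, tau2, sigma2, rho):
    """gamma(u) = tau2 * (1 - rho**u) + sigma2 for a time lag u > 0.

    Assumes serial correlation of power form, rho(u) = rho**u.
    """
    return tau2 * (1.0 - rho ** u) + sigma2


# Hypothetical parameter values.
tau2, sigma2, rho = 2.0, 0.5, 0.5

nugget = sigma2            # limit of gamma(u) as u -> 0 from above
sill = tau2 + sigma2       # limit of gamma(u) as u -> infinity

for lag in (1, 2, 4, 8):
    print(lag, variogram(lag, tau2, sigma2, rho))
```

The printed values rise from just above the nugget towards the sill, which is the shape one looks for in an empirical variogram when judging whether both serial correlation and measurement error are needed.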

Explained variation in percent, $R^2$
We have two (or more) different variances to explain!
Residual variation (variation within individuals, $\sigma_W^2$)
- decreases (as usual) when we include an important x-covariate (level 1)
- may decrease when we include an important z-covariate (level 2)
Variation between individuals, $\omega_B^2$
- decreases when we include an important z-covariate (level 2)
- may increase or decrease when we include an important x-covariate (level 1)

Hypothetical example
The x's vary between individuals, and the average outcomes ($\bar{y}$) are mostly due to this variation: Levels of y, for fixed x, are quite alike!
$\omega^2$ decreases.

Another hypothetical example
The x's vary between individuals, but the average outcomes ($\bar{y}$) are almost identical: Levels of y, for fixed x, are very different!
$\omega^2$ increases.

Missing values
Most investigations are planned to be balanced, but almost inevitably turn out to have missing values or drop-out patients:
- just by coincidence (blood sample lost or ruined)
- because of exclusion (the patient has recovered)
- we lost track of the patient (may be worrisome)
- the patient is too ill to show up (very serious, i.e. carrying information)

Types of missing data
- Single missing values
- Drop-outs

Possible missing mechanism, I
Low values are good: When the patient is well treated, he drops out.

Possible missing mechanism, II
Low values are bad: Below some threshold, the patient is too ill to show up (informative missing).

Notation
- Outcome: $Y_{git}$
- Parameters: $\theta$ (level, slope etc.)
- Covariates: $x_{git}$
- Indicator of missing: $c_{git}$
- Missing outcomes: $Y_{git}$ not observed, i.e. corresponding to $c_{git} = 1$

Types of missingness
- MCAR, missing completely at random: $P(c_{git} = 1)$ depends only on the parameter $\theta$
- MAR, missing at random:
  - CDEP: $P(c_{git} = 1)$ depends upon $\theta$ and covariates $x_{git}$
  - YDEP: $P(c_{git} = 1)$ depends upon $\theta$, covariates $x_{git}$ and observed outcomes $Y_{git}$
- NI, non-ignorable (informative missing): $P(c_{git} = 1)$ depends upon the unobserved (missing) outcome $Y_{git}$

Hypothetical TLC-observations
Lung capacity measured at regular time intervals for two groups that we want to compare.
[Figure: individual profiles for the two groups]

Average for the two groups
[Figure: group averages over time]

Hypothetical example: Informative missing
Patients who get below 3.5 drop out, and the averages change.
[Figure: group averages after drop-out]
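The effect of informative drop-out on the averages can be reproduced on a toy scale. The sketch below uses three hypothetical patient series (not the TLC data) and assumes that a value below the threshold 3.5, and every later value for that patient, is unobserved. The function names are made up for this illustration.

```python
def apply_dropout(series, threshold=3.5):
    """Set a value below the threshold, and all later values, to None."""
    out, dropped = [], False
    for y in series:
        dropped = dropped or y < threshold
        out.append(None if dropped else y)
    return out


def mean_observed(values):
    """Average of the non-missing values (None if all are missing)."""
    obs = [v for v in values if v is not None]
    return sum(obs) / len(obs) if obs else None


# Hypothetical declining lung-capacity series for three patients.
patients = [
    [4.5, 4.3, 4.1],
    [4.0, 3.6, 3.2],
    [3.8, 3.4, 3.0],
]
observed = [apply_dropout(p) for p in patients]

n_times = len(patients[0])
full_means = [mean_observed([p[t] for p in patients]) for t in range(n_times)]
obs_means = [mean_observed([o[t] for o in observed]) for t in range(n_times)]

print("all patients :", [round(m, 2) for m in full_means])
print("observed only:", [round(m, 2) for m in obs_means])
```

The complete data decline steadily, whereas the observed averages stay flat because only the healthiest patients remain, which is the distortion the hypothetical example describes.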

Traditional handling of missing data
- Complete case analysis
- LOCF: Last observation carried forward
- Time average imputation
- Model prediction imputation
- Likelihood methods

Complete case analysis
Make an analysis including only those individuals who are observed at all available time points.
- Information loss
- Potential bias, if there is a specific reason for the missingness

LOCF: Last observation carried forward
If an individual has no observed value at time $t_k$, replace the missing value by the previous observation, at $t_{k-1}$. For drop-outs, all subsequent values will equal this value from $t_{k-1}$.
- The time effect will be less pronounced
- Large residuals, i.e. overestimation of residual variation

Time average imputation
- Subject effect will be underestimated
- Large residuals, i.e. overestimation of residual variation
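Last observation carried forward is simple enough to state in a few lines. The sketch below is one possible implementation, with missing values coded as None; the function name and the example series are hypothetical.

```python
def locf(series):
    """Replace each missing value (None) by the most recent observed one.

    Missing values before the first observation stay missing.
    """
    filled, last = [], None
    for y in series:
        if y is not None:
            last = y
        filled.append(last)
    return filled


# Hypothetical series: one intermittent gap, then drop-out after time 3.
series = [5.0, None, 4.0, None, None]
print(locf(series))
```

After drop-out the series is constant, which illustrates why the time effect becomes less pronounced under LOCF.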

Model prediction imputation
- Two-step procedure
- Too small residuals, i.e. downwards bias of SD

Likelihood methods
Mixed models for all available observations.

MCAR: Missing completely at random
- Complete case analysis OK, but inefficient
- If only few observations are missing, imputations could work, but the variations will be affected
- Likelihood approaches (mixed models) OK; they use all available information

MAR: Missing at random
- Complete case analysis is biased: we disregard subjects with special characteristics
- If only few observations are missing, imputations could work, but the variations will be affected
- Likelihood approaches (mixed models) OK; they use all available information

Mixed models for MAR
[Figure]

Non-ignorable
Nothing works! Many attempts have been made to model the missing mechanisms, but they all rely on assumptions that cannot be checked.

Example: Effect of exercise on appetite
53 subjects, in three groups:
- Control
- Moderate exercise: 1/2 hour a day
- Extensive exercise: 1 hour a day
Before and after exercise/placebo: Exercise test, with blood samples taken every half hour from baseline until 3 hours. Several hormones are measured, e.g. ghrelin.
Data: Mads Rosenkilde

HIGH, week=pre
[Figure: individual profiles, group HIGH, week=pre]

HIGH, week=post
[Figure: individual profiles, group HIGH, week=post]

Baseline differences, before exercise?
[Figure]
No, not really. Can we then disregard these?

Aim of investigation
Does the exercise change something?
- Does the time course change from the first to the second week (pre/post exercise)?
- If so, does it change more than for the control group?
- And does it apply equally to the two exercise groups?

Model for each group separately
Fixed effects
- week: Pre or Post
- time: 0, 30, 60, 90, 120, 150, 180 minutes
- Interaction week*time: A change in the pattern from before to after exercise
Random effects
- Patients, Sub: 18 (HIGH, MOD), 17 (XCON)
- Interaction Sub*week
Serial covariance structure
- Autoregressive?
- Local error term?

Analysis for each group separately
    proc mixed data=a0 covtest;
      by grp;
      class Sub week time;
      model log_ghrelin=time week*time / ddfm=satterth s cl;
      random intercept / subject=sub vcorr v;
      repeated time / subject=sub*week type=sp(pow)(numtime) local rcorr r;
      lsmeans week*time / slice=time;
    run;

Output, exercise HIGH
    grp=high
    The Mixed Procedure
    Model Information
    Data Set                     WORK.A0
    Dependent Variable           log_ghrelin
    Covariance Structures        Variance Components, Spatial Power
    Subject Effects              Sub, Sub*week
    Estimation Method            REML
    Residual Variance Method     Profile
    Fixed Effects SE Method      Model-Based
    Degrees of Freedom Method    Satterthwaite

    Class Level Information
    Class    Levels    Values
    Sub      18        ALMA ANAP ANLO ANTF BRFR CASC CHBE DAHA DERJ GRPE HEJE JAKU MIFH MIMR MIMÂİ MINI NIHA THES
    week     2         Post Pre
    time
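The random and repeated statements above combine three sources of variation: a subject level (random intercept), serial correlation within each Sub*week series (type=sp(pow)), and measurement error (local). The sketch below writes out the resulting covariance between two measurements on the same subject. The function name, the time unit (hours) and all parameter values are hypothetical.

```python
def ghrelin_covariance(t_j, t_k, same_week, omega2, tau2, rho, sigma2):
    """Covariance between two measurements on the same subject.

    omega2: variance of the random subject level (random intercept)
    tau2, rho: variance and correlation parameter of the spatial power
               structure, which applies within a Sub*week series only
    sigma2: local measurement error, added on the diagonal
    """
    cov = omega2
    if same_week:
        cov += tau2 * rho ** abs(t_j - t_k)
        if t_j == t_k:
            cov += sigma2
    return cov


# Hypothetical parameter values; times in hours.
params = dict(omega2=1.0, tau2=2.0, rho=0.25, sigma2=0.5)
times = [0.0, 0.5, 1.0]

within_week = [[ghrelin_covariance(a, b, True, **params) for b in times]
               for a in times]
across_weeks = ghrelin_covariance(0.0, 0.0, False, **params)

for row in within_week:
    print(row)
print("pre vs. post, same subject:", across_weeks)
```

Measurements from different weeks on the same subject are correlated only through the common subject level, whereas measurements within a week also share the serially correlated component.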

Output, exercise HIGH, II
    Dimensions
    Covariance Parameters        4
    Columns in X                 24
    Columns in Z Per Subject     1
    Subjects                     18
    Max Obs Per Subject          14
[SAS output: Covariance Parameter Estimates for Intercept (Sub), Variance (Sub*week), SP(POW) (Sub*week) and Residual; numeric values not preserved in the transcription, apart from Pr Z <.0001 for Residual]

Output, exercise HIGH, III
[SAS output: Type 3 Tests of Fixed Effects, first for time and week*time, then for week, time and week*time; only Pr > F <.0001 for time is preserved]

Output, exercise HIGH, IV
[SAS output: Tests of Effect Slices for week*time at each of the seven time points, and Solution for Fixed Effects for week*time, week=Post; numeric values not preserved in the transcription]

Results from all three groups
            $\omega^2$   $\tau^2$   $\rho$   $\sigma^2$   P: week*time (id/int)
    HIGH                                                  /0.056
    MOD                                                   /0.23
    XCON                                                  /0.092
    All                                                   *
*: Test for second-order interaction: grp*week*time

All groups simultaneously
    proc mixed data=a0 covtest;
      class Sub grp week time;
      model log_ghrelin=grp week time grp*week grp*time week*time
                        grp*week*time / ddfm=satterth s cl;
      random intercept / subject=sub(grp) vcorr v;
      repeated time / subject=week*sub(grp) type=sp(pow)(numtime) local rcorr r;
      lsmeans grp*week*time / slice=week;
      lsmeans grp*week*time / slice=grp;
    run;

    Class Level Information
    Class    Levels    Values
    Sub      53        ALJE ALMA ANAP ANBR ANKR ANLI ANLO ANMO ANTF ASOL ASSA BRFR THBR THES THLU TUMY ULRA
    grp      3         HIGH MOD XCON
    week     2         Post Pre
    time

    Dimensions
    Covariance Parameters        4
    Columns in X                 96
    Columns in Z Per Subject     1
    Subjects                     53
    Max Obs Per Subject          14

    Number of Observations
    Number of Observations Read        736
    Number of Observations Used        736
    Number of Observations Not Used    0

    The Mixed Procedure
    Covariance Parameter Estimates
    Cov Parm     Subject     Pr Z
    Intercept    Sub         <.0001
    Variance     Sub*week    <.0001
    SP(POW)      Sub*week    <.0001
    Residual                 <.0001
[SAS output: Type 3 Tests of Fixed Effects for grp, week, hours, grp*week, grp*hours, week*hours and grp*week*hours; only Pr > F <.0001 for hours is preserved]

Baseline differences
Even though these are not significantly different, we know that we ought to include the baseline as a covariate, but which baseline?
- The observation at time 0 in the Pre-exercise week?
- The observation at time 0 in each of the two weeks (i.e. separate values for the variable baseline, depending upon the week)?
Think about which question you want to answer!

Including first baseline as covariate
    proc sort data=a0;
      by grp Sub;
    run;

    data baseline;
      set a0;
      if week='Pre' and time=0;
      ghrelin0=ghrelin;
      log_ghrelin0=log_ghrelin;
      * keep only the merge keys and the baseline values, so that the
        merge below cannot overwrite week, time or the outcome in a0;
      keep grp Sub ghrelin0 log_ghrelin0;
    run;

    data test;
      merge a0 baseline;
      by grp Sub;
    run;

    proc mixed data=test covtest;
      where week='Post' or time>0;
      class Sub grp week time;
      model log_ghrelin=log_ghrelin0 grp week time grp*week grp*time week*time
                        grp*week*time / ddfm=satterth s cl;
      random intercept / subject=sub vcorr v;
      repeated time / subject=sub*week type=sp(pow)(numtime) local rcorr r;
    run;

Output, with baseline as covariate
    Dimensions
    Covariance Parameters        4
    Columns in X                 85
    Columns in Z Per Subject     1
    Subjects                     52
    Max Obs Per Subject          12

    Number of Observations
    Number of Observations Read        632
    Number of Observations Used        621
    Number of Observations Not Used    11
[SAS output: Covariance Parameter Estimates for Intercept (Sub), Variance (Sub*week), SP(POW) (Sub*week) and Residual; Solution for Fixed Effects for Intercept and log_ghrelin0; Type 3 Tests of Fixed Effects for log_ghrelin0, grp, week, time, grp*week, grp*time, week*time and grp*week*time. Numeric values are not preserved in the transcription, apart from Pr > F <.0001 for log_ghrelin0 and time]
Proceed with some model reduction and model checks.

Model reduction
[SAS output: Type 3 Tests of Fixed Effects for two reduced models, the first with log_ghrelin0, grp, week, time and week*time, the second with log_ghrelin0, week, time and week*time; only Pr > F <.0001 for log_ghrelin0 and time is preserved]

Could this have been done simpler?
- Analysis of differences, because we have only two occasions and exactly the same time points
- Reducing the time courses to averages and/or slopes, to give a simpler model for derived quantities

Effect of the lens strength on visual acuity
7 individuals are looking at a screen, where a light flash appears. They are looking through 4 lenses, with powers 6/6, 6/18, 6/36 and 6/60, i.e. 4 magnifications: 1, 3, 6 and 10, with 2 eyes.
Outcome: Visual acuity, the time lag (milliseconds) between the stimulus and the electrical response at the back of the cortex.

Data
[Table of data from Crowder & Hand (1990)]

Factors to take into account
Main effects:
- 7 individuals (person), $A_p$
- 2 eyes for each individual (eye), $\alpha_e$
- 4 lens magnifications (power), $\beta_m$
Interactions?
- person*eye, $B_{pe}$
- person*power, $C_{pm}$
- eye*power, $\gamma_{em}$
- 2-order interaction person*eye*power = Residual, $\varepsilon_{pem}$

Model formulation
$$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$$
where $p = 1, \ldots, 7$, $e = 1, 2$, $m = 1, 2, 3, 4$, and
$$A_p \sim N(0, \omega^2), \quad B_{pe} \sim N(0, \tau_e^2), \quad C_{pm} \sim N(0, \tau_m^2), \quad \varepsilon_{pem} \sim N(0, \sigma^2)$$

Factor diagram
[Figure: factor diagram with the factors [I] = [Pa*Ey*Po], [Pa*Ey], Ey*Po, [Pa*Po], Ey, [Pa], Po and 0]

Not quite a multilevel model, but..
    Level    Unit                      Covariates
    1        single measurements       Ey*Po
    2        interactions
    2e         [Pa*Ey]                 Ey
    2m         [Pa*Po]                 Po
    3        individuals, [Pa]         overall level

    proc mixed data=visual covtest;
      class patient eye power;
      model acuity=eye power eye*power / s ddfm=satterth;
      * random patient patient*eye patient*power;
      random intercept eye power / subject=patient;
    run;
[SAS output: Covariance Parameter Estimates for Intercept, eye and power (subject patient) and Residual; Type 3 Tests of Fixed Effects for eye, power and eye*power; Solution for Fixed Effects for Intercept, eye (left, right), power and eye*power. Numeric values are not preserved in the transcription, apart from Pr > |t| <.0001 for the Intercept]

Predicted mean profiles
[Figure]

Individual predictions
[Figure]

Residual plot
[Figure]

Omit the interaction eye*power
[SAS output: Covariance Parameter Estimates for Intercept, eye and power (subject patient) and Residual; Solution for Fixed Effects for Intercept, eye (left, right) and power; Type 3 Tests of Fixed Effects for eye and power. Numeric values are not preserved in the transcription, apart from Pr > |t| <.0001 for the Intercept]

Eye comparisons
Model:
$$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$$
where $A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$.
Difference between eye averages:
$$\bar{Y}_{\cdot e_1 \cdot} - \bar{Y}_{\cdot e_2 \cdot} = \mu\text{ stuff} + \bar{B}_{\cdot e_1} - \bar{B}_{\cdot e_2} + \bar{\varepsilon}_{\cdot e_1 \cdot} - \bar{\varepsilon}_{\cdot e_2 \cdot}$$
$$\mathrm{Var}(\bar{Y}_{\cdot e_1 \cdot} - \bar{Y}_{\cdot e_2 \cdot}) = \frac{2}{7} \tau_e^2 + \frac{2}{28} \sigma^2$$

Consequence for eye comparisons
- $\tau_e^2$ is rather large (people have different eye preferences)
- We have to demand a larger difference in order to detect it
- P-values rather large

Magnification comparisons
Model:
$$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$$
where $A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$.
Difference between magnification averages:
$$\bar{Y}_{\cdot \cdot m_1} - \bar{Y}_{\cdot \cdot m_2} = \mu\text{ stuff} + \bar{C}_{\cdot m_1} - \bar{C}_{\cdot m_2} + \bar{\varepsilon}_{\cdot \cdot m_1} - \bar{\varepsilon}_{\cdot \cdot m_2}$$
$$\mathrm{Var}(\bar{Y}_{\cdot \cdot m_1} - \bar{Y}_{\cdot \cdot m_2}) = \frac{2}{7} \tau_m^2 + \frac{2}{14} \sigma^2$$

Consequence for magnification comparisons
- $\tau_m^2$ is not that large (people react more or less identically to the different magnifications)
- We can detect smaller differences
- P-values rather small

If we ignore correlations
i.e. a model with no random effects:
- Eye differences: $\mathrm{Var}(\bar{Y}_{\cdot e_1 \cdot} - \bar{Y}_{\cdot e_2 \cdot}) = \frac{2}{28} \sigma^2$
- Magnification differences: $\mathrm{Var}(\bar{Y}_{\cdot \cdot m_1} - \bar{Y}_{\cdot \cdot m_2}) = \frac{2}{14} \sigma^2$
but with another $\sigma^2$.

Incorrect analysis, ignoring random effects
[SAS output: Covariance Parameter Estimates (Residual only) and Solution for Fixed Effects for Intercept, eye (left, right) and power; numeric values not preserved in the transcription, apart from Pr > |t| <.0001 for the Intercept]
    Type 3 Tests of Fixed Effects
    Effect    Pr > F
    eye       <-- too small
    power     <-- too large
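The variances of the eye and magnification comparisons follow from averaging over 7 persons and, respectively, 4 powers or 2 eyes: $\frac{2}{7}\tau_e^2 + \frac{2}{28}\sigma^2$ for eyes and $\frac{2}{7}\tau_m^2 + \frac{2}{14}\sigma^2$ for magnifications. The sketch below evaluates both. The variance components are hypothetical values chosen to mimic a large $\tau_e^2$ and a small $\tau_m^2$, not the estimates from the SAS output.

```python
def var_eye_difference(tau2_e, sigma2, n_persons=7, n_powers=4):
    """Variance of the difference between the two eye averages.

    Each eye average is taken over n_persons * n_powers measurements, but
    only n_persons independent person*eye effects.
    """
    return 2 * tau2_e / n_persons + 2 * sigma2 / (n_persons * n_powers)


def var_power_difference(tau2_m, sigma2, n_persons=7, n_eyes=2):
    """Variance of the difference between two magnification averages."""
    return 2 * tau2_m / n_persons + 2 * sigma2 / (n_persons * n_eyes)


# Hypothetical variance components.
tau2_e, tau2_m, sigma2 = 21.0, 3.5, 28.0

print("eye difference          :", var_eye_difference(tau2_e, sigma2))
print("magnification difference:", var_power_difference(tau2_m, sigma2))

# Setting the random-effect variance to zero gives the formulas used when
# correlations are ignored (where sigma2 would also be estimated differently).
print("eye, ignoring person*eye:", var_eye_difference(0.0, sigma2))
```

With a large person*eye component, most of the uncertainty in the eye comparison comes from the term $\frac{2}{7}\tau_e^2$, which more measurements per person cannot reduce.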

Systematic vs. random effects
Could the patients be treated as systematic here? Yes:
[SAS output: Covariance Parameter Estimates for eye and power (subject patient) and Residual; Type 3 Tests of Fixed Effects for patient, eye, power and eye*power; numeric values not preserved in the transcription]
Can you think why?

Example with only two time points (baseline and follow-up)
from Vickers, A.J. & Altman, D.G.: Analysing controlled clinical trials with baseline and follow-up measurements. British Medical Journal 2001; 323:
52 patients with shoulder pain are randomized to either
- Acupuncture (n=25)
- Placebo (n=27)
Pain is evaluated on a 100 point scale before and after treatment. High scores are good.

Results on pain scores
Comparison of the two groups:
                        Average pain score              Treatment effect
                        placebo        acupuncture      difference
                        (n=27)         (n=25)           (95% CI)              P-value
    Baseline            53.9 (14.0)    60.4 (12.3)
    Type of analysis
      Follow-up         62.3 (17.9)    79.6 (17.1)      17.3 (7.5; 27.1)
      Changes*          8.4 (14.6)     19.2 (16.1)      10.8 (2.3; 19.4)
      Ancova                                            12.7 (4.1; 21.3)
* results published in Kleinhenz et al., Pain 1999; 83:

Development of pain, actual and hypothetical
[Figure]

Approaches for pain score analysis
Baseline
- The acupuncture group lies somewhat above placebo
Follow-up
- We would expect the acupuncture group to be higher also after treatment
- Therefore, a direct comparison of follow-up values is unreasonable (we see too big a difference)

Approaches for pain score analysis, II
Change
- Low baseline implies an expected large positive change (regression to the mean)
- The placebo group is therefore expected to increase the most
- Therefore, a direct comparison of changes is unreasonable (we see too small a difference)

General approaches for handling baseline
Ancova
Analysis of covariance, a special case of multiple regression:
- Outcome: follow-up data
- Covariates:
  - treatment (factor: acupuncture/placebo)
  - baseline measurement (quantitative)
Repeated measurement analysis
- Treatment effect appears as an interaction between treatment and time

Recommendation
When can we use follow-up data?
- when we have a control group and proper randomisation
- when the correlation is low
When can we use differences?
- when we have a control group and proper randomisation
- when the correlation is large
When can we use analysis of covariance?
- always, as long as baseline imbalance is not related to treatment effect!
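The three approaches to the pain score data can be seen as one formula: the estimated treatment effect is the follow-up difference minus a slope times the baseline difference. A slope of 0 gives the follow-up comparison, a slope of 1 gives the comparison of changes, and Ancova uses the pooled within-group regression slope. The sketch below applies this to the group means from the published table. The function name is made up, and the Ancova slope is back-calculated here from the rounded table values, since it is not reported in the source.

```python
def adjusted_difference(diff_followup, diff_baseline, slope):
    """Treatment effect after removing slope * (baseline imbalance)."""
    return diff_followup - slope * diff_baseline


# Group means from the results table (acupuncture minus placebo).
diff_baseline = 60.4 - 53.9
diff_followup = 79.6 - 62.3

followup_only = adjusted_difference(diff_followup, diff_baseline, 0.0)
change_score = adjusted_difference(diff_followup, diff_baseline, 1.0)

# Slope implied by the reported Ancova estimate of 12.7 (an approximation
# derived from rounded numbers, not a value given in the source).
implied_slope = (diff_followup - 12.7) / diff_baseline

print("follow-up only:", round(followup_only, 1))
print("change scores :", round(change_score, 1))
print("implied slope :", round(implied_slope, 2))
```

The implied slope lies between 0 and 1, so the Ancova estimate falls between the follow-up comparison (too big a difference) and the comparison of changes (too small a difference).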

Example: Reading ability
as a function of age/training and cohort/age:
Longitudinal (within-individual, $\beta_W$) effect vs. cross-sectional (between-individual, $\beta_B$) effect.

Model
- Baseline level $a_{p1}$ at the age $x_{p1}$: $a_{p1} = \alpha + \beta_B x_{p1} + \delta_p$, with $\beta_B$ negative
- Baseline measurement: $y_{p1} = a_{p1} + \varepsilon_{p1}$
- Follow-up level at the age $x_{p2}$: $a_{p2} = a_{p1} + \beta_W (x_{p2} - x_{p1})$
- Follow-up measurement: $y_{p2} = a_{p2} + \varepsilon_{p2}$
- Difference: $y_{p2} - y_{p1} = \beta_W (x_{p2} - x_{p1}) + (\varepsilon_{p2} - \varepsilon_{p1})$, since the individual level $\delta_p$ cancels
- Model for all y-observations: $y_{pj} = \alpha + \beta_B x_{p1} + \beta_W (x_{pj} - x_{p1}) + \delta_p + \varepsilon_{pj}$

Actual analyses
Regression with inter- as well as intra-individual effect of age/time:
    proc mixed data=reading;
      class id;
      model read=age1 difage / s;
      random id;
    run;
[SAS output: Covariance Parameter Estimates for id and Residual; Solution for Fixed Effects for Intercept, age1 and difage; numeric values not preserved in the transcription]

Estimation results
Cross-sectional ($\beta_B$, cohort effect) and longitudinal ($\beta_W$, age effect) estimates; only the standard errors (in parentheses) are preserved in the transcription:
    Method                                               Standard errors
    $y_{i1}$ vs. $x_{i1}$                                     (0.458)
    $y_{i2}$ vs. $x_{i2}$                                     (0.534)
    $y_{ij}$ vs. $x_{ij}$, no individual effect               (0.384)
    $y_{i2} - y_{i1}$ vs. $x_{i2} - x_{i1}$, no intercept     (0.211)
    $y_{ij}$ vs. $x_{ij}$, random individual effect           (0.307)
    $y_{ij}$ vs. $x_{i1}$ and $(x_{i2} - x_{i1})$             (0.572) (0.312)
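The distinction between the cross-sectional effect $\beta_B$ and the longitudinal effect $\beta_W$ can be checked on noise-free toy data generated from the model $y_{pj} = \alpha + \beta_B x_{p1} + \beta_W (x_{pj} - x_{p1})$. The sketch below is an assumption-laden illustration: the parameter values and ages are hypothetical, the random terms $\delta_p$ and $\varepsilon_{pj}$ are set to zero to keep the example deterministic, and only the variable names age1 and difage are taken from the SAS code.

```python
def ols_slope(x, y):
    """Least squares slope of y on x, with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_through_origin(x, y):
    """Least squares slope of y on x, without an intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical parameters: negative cohort effect, positive age effect.
alpha, beta_B, beta_W = 100.0, -2.0, 5.0
age1 = [8.0, 9.0, 10.0, 11.0]       # age at baseline
difage = [1.0, 2.0, 1.0, 2.0]       # time from baseline to follow-up

y1 = [alpha + beta_B * a for a in age1]
y2 = [y + beta_W * d for y, d in zip(y1, difage)]
age2 = [a + d for a, d in zip(age1, difage)]

cross_sectional = ols_slope(age1, y1)
longitudinal = slope_through_origin(difage, [b - a for a, b in zip(y1, y2)])
pooled = ols_slope(age1 + age2, y1 + y2)   # ignores who is who

print("cross-sectional:", cross_sectional)
print("longitudinal   :", longitudinal)
print("pooled         :", round(pooled, 3))
```

The pooled regression of all $y_{ij}$ on $x_{ij}$ returns a slope between the two effects and estimates neither of them, which is why the model separates baseline age from time since baseline.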

Correlated data. Variance component models. Example: Evaluate vaccine. Traditional assumption so far. Faculty of Health Sciences

Correlated data. Variance component models. Example: Evaluate vaccine. Traditional assumption so far. Faculty of Health Sciences Faculty of Health Sciences Variance component models Definitions and motivation Correlated data Variance component models, I Lene Theil Skovgaard November 29, 2013 One-way anova with random variation The

More information

Correlated data. Overview. Cross-over study. Repetition. Faculty of Health Sciences. Variance component models, II. More on variance component models

Correlated data. Overview. Cross-over study. Repetition. Faculty of Health Sciences. Variance component models, II. More on variance component models Faculty of Health Sciences Overview Correlated data More on variance component models Variance component models, II Cross-over studies Non-normal data Comparing measurement devices Lene Theil Skovgaard

More information

Analysis of variance and regression. December 4, 2007

Analysis of variance and regression. December 4, 2007 Analysis of variance and regression December 4, 2007 Variance component models Variance components One-way anova with random variation estimation interpretations Two-way anova with random variation Crossed

More information

Swabs, revisited. The families were subdivided into 3 groups according to the factor crowding, which describes the space available for the household.

Swabs, revisited. The families were subdivided into 3 groups according to the factor crowding, which describes the space available for the household. Swabs, revisited 18 families with 3 children each (in well defined age intervals) were followed over a certain period of time, during which repeated swabs were taken. The variable swabs indicates how many

More information

Varians- og regressionsanalyse

Varians- og regressionsanalyse Faculty of Health Sciences Varians- og regressionsanalyse Variance component models Lene Theil Skovgaard Department of Biostatistics Variance component models Definitions and motivation One-way anova with

More information

Analysis of variance and regression. May 13, 2008

Analysis of variance and regression. May 13, 2008 Analysis of variance and regression May 13, 2008 Repeated measurements over time Presentation of data Traditional ways of analysis Variance component model (the dogs revisited) Random regression Baseline

More information

Correlated data. Overview. Variance component models. Terminology for correlated measurements. Faculty of Health Sciences. Variance component models

Correlated data. Overview. Variance component models. Terminology for correlated measurements. Faculty of Health Sciences. Variance component models Faculty of Health Sciences Overview Correlated data Variance component models Lene Theil Skovgaard & Julie Lyng Forman November 29, 2016 One-way anova with random variation The rabbit example Hierarchical

More information

Correlated data. Repeated measurements over time. Typical set-up for repeated measurements. Traditional presentation of data

Correlated data. Repeated measurements over time. Typical set-up for repeated measurements. Traditional presentation of data Faculty of Health Sciences Repeated measurements over time Correlated data NFA, May 22, 2014 Longitudinal measurements Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics University of

More information

Faculty of Health Sciences. Correlated data. More about LMMs. Lene Theil Skovgaard. December 4, / 104

Faculty of Health Sciences. Correlated data. More about LMMs. Lene Theil Skovgaard. December 4, / 104 Faculty of Health Sciences Correlated data More about LMMs Lene Theil Skovgaard December 4, 2015 1 / 104 Further topics Model check and diagnostics Cross-over studies Paired T-tests with missing values

More information

Variance component models part I

Variance component models part I Faculty of Health Sciences Variance component models part I Analysis of repeated measurements, 30th November 2012 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

Correlated data. Longitudinal data. Typical set-up for repeated measurements. Examples from literature, I. Faculty of Health Sciences

Correlated data. Longitudinal data. Typical set-up for repeated measurements. Examples from literature, I. Faculty of Health Sciences Faculty of Health Sciences Longitudinal data Correlated data Longitudinal measurements Outline Designs Models for the mean Covariance patterns Lene Theil Skovgaard November 27, 2015 Random regression Baseline

More information

Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman.

Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman. Faculty of Health Sciences Correlated data Variance component models Lene Theil Skovgaard & Julie Lyng Forman November 28, 2017 1 / 96 Overview One-way anova with random variation The rabbit example Hierarchical

More information

Correlated data. Overview. Variance component models. Terminology for correlated measurements. Faculty of Health Sciences. Variance component models

Correlated data. Overview. Variance component models. Terminology for correlated measurements. Faculty of Health Sciences. Variance component models Faculty of Health Sciences Overview Correlated data Variance component models Lene Theil Skovgaard & Julie Lyng Forman November 28, 2017 One-way anova with random variation The rabbit example Hierarchical

More information

Variance components and LMMs

Variance components and LMMs Faculty of Health Sciences Variance components and LMMs Analysis of repeated measurements, 4th December 2014 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman.

Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman. Faculty of Health Sciences Correlated data Variance component models Lene Theil Skovgaard & Julie Lyng Forman November 27, 2018 1 / 84 Overview One-way anova with random variation The rabbit example Hierarchical

More information

Correlated data. Overview. Example: Swelling due to vaccine. Variance component models. Faculty of Health Sciences. Variance component models

Correlated data. Overview. Example: Swelling due to vaccine. Variance component models. Faculty of Health Sciences. Variance component models Faculty of Health Sciences Overview Correlated data Variance component models One-way anova with random variation The rabbit example Hierarchical models with several levels Random regression Lene Theil

More information

Variance components and LMMs

Variance components and LMMs Faculty of Health Sciences Topics for today Variance components and LMMs Analysis of repeated measurements, 4th December 04 Leftover from 8/: Rest of random regression example. New concepts for today:

More information

Multi-factor analysis of variance

Multi-factor analysis of variance Faculty of Health Sciences Outline Multi-factor analysis of variance Basic statistics for experimental researchers 2015 Two-way ANOVA and interaction Mathed samples ANOVA Random vs systematic variation

More information

Answer to exercise: Blood pressure lowering drugs

Answer to exercise: Blood pressure lowering drugs Answer to exercise: Blood pressure lowering drugs The data set bloodpressure.txt contains data from a cross-over trial, involving three different formulations of a drug for lowering of blood pressure:

More information

Faculty of Health Sciences. Correlated data. Count variables. Lene Theil Skovgaard & Julie Lyng Forman. December 6, 2016

Faculty of Health Sciences. Correlated data. Count variables. Lene Theil Skovgaard & Julie Lyng Forman. December 6, 2016 Faculty of Health Sciences Correlated data Count variables Lene Theil Skovgaard & Julie Lyng Forman December 6, 2016 1 / 76 Modeling count outcomes Outline The Poisson distribution for counts Poisson models,

More information

STAT 5200 Handout #26. Generalized Linear Mixed Models

STAT 5200 Handout #26. Generalized Linear Mixed Models STAT 5200 Handout #26 Generalized Linear Mixed Models Up until now, we have assumed our error terms are normally distributed. What if normality is not realistic due to the nature of the data? (For example,

Linear mixed models. Faculty of Health Sciences. Analysis of repeated measurements, 10th March Julie Lyng Forman & Lene Theil Skovgaard Faculty of Health Sciences Linear mixed models Analysis of repeated measurements, 10th March 2015 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen 1 / 80 Program

Models for longitudinal data Faculty of Health Sciences Contents Models for longitudinal data Analysis of repeated measurements, NFA 2016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

SAS Syntax and Output for Data Manipulation: CLP 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (2015) chapter 5. We will be examining the extent

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */ CLP 944 Example 4 page 1 Within-Person Fluctuation in Symptom Severity over Time These data come from a study of weekly fluctuation in psoriasis severity. There was no intervention and no real reason

More about linear mixed models Faculty of Health Sciences Contents More about linear mixed models Analysis of repeated measurements, NFA 2016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

Correlated data. Non-normal outcomes. Reminder on binary data. Non-normal data. Faculty of Health Sciences. Non-normal outcomes Faculty of Health Sciences Non-normal outcomes Correlated data Non-normal outcomes Lene Theil Skovgaard December 5, 2014 Generalized linear models Generalized linear mixed models Population average models

Statistics for exp. medical researchers Regression and Correlation Faculty of Health Sciences Regression analysis Statistics for exp. medical researchers Regression and Correlation Lene Theil Skovgaard Sept. 28, 2015 Linear regression, Estimation and Testing Confidence

Variance component models Faculty of Health Sciences Variance component models Analysis of repeated measurements, NFA 2016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen Topics for

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Today's Class (or 3): Summary of steps in building unconditional models for time What happens to missing predictors Effects of time-invariant predictors

Linear mixed models. Program. What are repeated measurements? Outline. Faculty of Health Sciences. Analysis of repeated measurements, 10th March 2015 University of Copenhagen, Department of Biostatistics Program Faculty of Health Sciences Topics: Linear mixed models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Topics: What happens to missing predictors Effects of time-invariant predictors Fixed vs. systematically varying vs. random effects Model building strategies

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010 MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010 Part 1 of this document can be found at http://www.uvm.edu/~dhowell/methods/supplements/mixed Models for Repeated Measures1.pdf

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1 CLDP 944 Example 3a page 1 From Between-Person to Within-Person Models for Longitudinal Data The models for this example come from Hoffman (2015) chapter 3 example 3a. We will be examining the extent to

STAT 705 Generalized linear mixed models STAT 705 Generalized linear mixed models Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 24 Generalized Linear Mixed Models We have considered random

STAT 5200 Handout #23. Repeated Measures Example (Ch. 16) Motivating Example: Glucose STAT 5200 Handout #23 Repeated Measures Example (Ch. 16) An experiment is conducted to evaluate the effects of three diets on the serum glucose levels of human subjects. Twelve

Analysis of Count Data A Business Perspective. George J. Hurley Sr. Research Manager The Hershey Company Milwaukee June 2013 Analysis of Count Data A Business Perspective George J. Hurley Sr. Research Manager The Hershey Company Milwaukee June 2013 Overview Count data Methods Conclusions 2 Count data Count data Anything with

UNIVERSITY OF TORONTO Faculty of Arts and Science UNIVERSITY OF TORONTO Faculty of Arts and Science December 2013 Final Examination STA442H1F/2101HF Methods of Applied Statistics Jerry Brunner Duration - 3 hours Aids: Calculator Model(s): Any calculator

Models for binary data Faculty of Health Sciences Models for binary data Analysis of repeated measurements 2015 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen 1 / 63 Program for

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen 2 / 28 Preparing data for analysis The

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen Outline Data in wide and long format

Repeated Measures Modeling With PROC MIXED E. Barry Moser, Louisiana State University, Baton Rouge, LA Paper 188-29 Repeated Measures Modeling With PROC MIXED E. Barry Moser, Louisiana State University, Baton Rouge, LA ABSTRACT PROC MIXED provides a very flexible environment in which to model many types

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study [Figure: log(fev1) vs age, ages 6 to 19] Model 1: A simple broken stick model with knot at 14 fit with

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today's Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

Introduction to Random Effects of Time and Model Estimation Introduction to Random Effects of Time and Model Estimation Today's Class: The Big Picture Multilevel model notation Fixed vs. random effects of time Random intercept vs. random slope models How MLM =

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE Biostatistics Workshop 2008 Longitudinal Data Analysis Session 4 GARRETT FITZMAURICE Harvard University 1 LINEAR MIXED EFFECTS MODELS Motivating Example: Influence of Menarche on Changes in Body Fat Prospective

De-mystifying random effects models De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Today's Topics: What happens to missing predictors Effects of time-invariant predictors Fixed vs. systematically varying vs. random effects Model building

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

Time Invariant Predictors in Longitudinal Models Time Invariant Predictors in Longitudinal Models Longitudinal Data Analysis Workshop Section 9 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

Missing Data in Longitudinal Studies: Mixed-effects Pattern-Mixture and Selection Models Missing Data in Longitudinal Studies: Mixed-effects Pattern-Mixture and Selection Models Hedeker D & Gibbons RD (1997). Application of random-effects pattern-mixture models for missing data in longitudinal

Introduction to Within-Person Analysis and RM ANOVA Introduction to Within-Person Analysis and RM ANOVA Today's Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

Topic 23: Diagnostics and Remedies Topic 23: Diagnostics and Remedies Outline Diagnostics residual checks ANOVA remedial measures Diagnostics Overview We will take the diagnostics and remedial measures that we learned for regression and

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Topics: Summary of building unconditional models for time Missing predictors in MLM Effects of time-invariant predictors Fixed, systematically varying,

Analysis of variance and regression. November 22, 2007 Analysis of variance and regression November 22, 2007 Parametrisations: Choice of parameters Comparison of models Test for linearity Linear splines Lene Theil Skovgaard, Dept. of Biostatistics, Institute

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

Lab 11. Multilevel Models. Description of Data Lab 11 Multilevel Models Henian Chen, M.D., Ph.D. Description of Data MULTILEVEL.TXT is clustered data for 386 women distributed across 40 groups. ID: 386 women, id from 1 to 386, individual level (level

BIOSTATISTICAL METHODS BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH Cross-over Designs #: DESIGNING CLINICAL RESEARCH The subtraction of measurements from the same subject will mostly cancel or minimize effects

Describing Change over Time: Adding Linear Trends Describing Change over Time: Adding Linear Trends Longitudinal Data Analysis Workshop Section 7 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today's Class An

Analyzing Pilot Studies with Missing Observations Analyzing Pilot Studies with Missing Observations Monnie McGee mmcgee@smu.edu. Department of Statistical Science Southern Methodist University, Dallas, Texas Co-authored with N. Bergasa (SUNY Downstate

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington Analysis of Longitudinal Data Patrick J. Heagerty PhD Department of Biostatistics University of Washington 1 Auckland 2008 Session Three Outline Role of correlation Impact proper standard errors Used to weight

Approximate analysis of covariance in trials in rare diseases, in particular rare cancers Approximate analysis of covariance in trials in rare diseases, in particular rare cancers Stephen Senn (c) Stephen Senn 1 Acknowledgements This work is partly supported by the European Union's 7th Framework

Lecture 4 Multiple linear regression Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

Model Assumptions; Predicting Heterogeneity of Variance Model Assumptions; Predicting Heterogeneity of Variance Today's topics: Model assumptions Normality Constant variance Predicting heterogeneity of variance CLP 945: Lecture 6 1 Checking for Violations of

VIII. ANCOVA. A. Introduction VIII. ANCOVA A. Introduction In most experiments and observational studies, additional information on each experimental unit is available, information besides the factors under direct control or of interest.

Mixed-Effects Pattern-Mixture Models for Incomplete Longitudinal Data. Don Hedeker University of Illinois at Chicago Mixed-Effects Pattern-Mixture Models for Incomplete Longitudinal Data Don Hedeker University of Illinois at Chicago This work was supported by National Institute of Mental Health Contract N44MH32056. 1

Multi-factor analysis of variance Faculty of Health Sciences Outline Multi-factor analysis of variance Basic statistics for experimental researchers 2016 Two-way ANOVA and interaction Matched samples ANOVA Random vs systematic variation

CHL 5225 H Crossover Trials. CHL 5225 H Crossover Trials CHL 5225 H Crossover Trials The Two-sequence, Two-Treatment, Two-period Crossover Trial Definition A trial in which patients are randomly allocated to one of two sequences of treatments (either 1 then, or

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1 Psyc 945 Example 1 page 1 Example 1: Unconditional Models for Change in Number Match 3 Response Time (complete data, syntax, and output available for SAS, SPSS, and STATA electronically) These data come from

Practical Considerations Surrounding Normality Practical Considerations Surrounding Normality Prof. Kevin E. Thorpe Dalla Lana School of Public Health University of Toronto KE Thorpe (U of T) Normality 1 / 16 Objectives Objectives 1. Understand the

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial

BIOS 6649: Handout Exercise Solution BIOS 6649: Handout Exercise Solution NOTE: I encourage you to work together, but the work you submit must be your own. Any plagiarism will result in loss of all marks. This assignment is based on weight-loss

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

Research Design: Topic 18 Hierarchical Linear Modeling (Measures within Persons) 2010 R.C. Gardner, Ph.d. Research Design: Topic 18 Hierarchical Linear Modeling (Measures within Persons) R.C. Gardner, Ph.d. General Rationale, Purpose, and Applications Linear Growth Models HLM can also be used with repeated

Analysis of Incomplete Non-Normal Longitudinal Lipid Data Analysis of Incomplete Non-Normal Longitudinal Lipid Data Jiajun Liu*, Devan V. Mehrotra, Xiaoming Li, and Kaifeng Lu 2 Merck Research Laboratories, PA/NJ 2 Forrest Laboratories, NY *jiajun_liu@merck.com

Chapter 1. Modeling Basics Chapter 1. Modeling Basics What is a model? Model equation and probability distribution Types of model effects Writing models in matrix form Summary 1 What is a statistical model? A model is a mathematical

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 An Introduction to Multilevel Models PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 Today's Class Concepts in Longitudinal Modeling Between-Person vs. Within-Person

Mixed Models for Longitudinal Binary Outcomes. Don Hedeker Department of Public Health Sciences University of Chicago. Mixed Models for Longitudinal Binary Outcomes Don Hedeker Department of Public Health Sciences University of Chicago hedeker@uchicago.edu https://hedeker-sites.uchicago.edu/ Hedeker, D. (2005). Generalized

6. Multiple regression - PROC GLM Use of SAS - November 2016 6. Multiple regression - PROC GLM Karl Bang Christensen Department of Biostatistics, University of Copenhagen. http://biostat.ku.dk/~kach/sas2016/ kach@biostat.ku.dk, tel: 35327491

Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars) STAT:5201 Applied Statistic II Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars) School math achievement scores The data file consists of 7185 students

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

Package HGLMMM for Hierarchical Generalized Linear Models Package HGLMMM for Hierarchical Generalized Linear Models Marek Molas Emmanuel Lesaffre Erasmus MC Erasmus Universiteit - Rotterdam The Netherlands ERASMUSMC - Biostatistics 20-04-2010 1 / 52 Outline General

Statistical Methods. Missing Data  snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23 1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing

Overdispersion Workshop in generalized linear models Uppsala, June 11-12, Outline. Overdispersion Biostokastikum Overdispersion is not uncommon in practice. In fact, some would maintain that overdispersion is the norm in practice and nominal dispersion the exception McCullagh and Nelder (1989) Overdispersion

Whether to use MMRM as primary estimand. Whether to use MMRM as primary estimand. James Roger London School of Hygiene & Tropical Medicine, London. PSI/EFSPI European Statistical Meeting on Estimands. Stevenage, UK: 28 September 2015. 1 / 38

Bios 6648: Design & conduct of clinical research Bios 6648: Design & conduct of clinical research Section 2 - Formulating the scientific and statistical design designs 2.5(b) Binary 2.5(c) Skewed baseline (a) Time-to-event (revisited) (b) Binary (revisited)

with the usual assumptions about the error term. The two values of X 1 X 2 0 1 Sample questions 1. A researcher is investigating the effects of two factors, X 1 and X 2, each at 2 levels, on a response variable Y. A balanced two-factor factorial design is used with 1 replicate. The

Longitudinal Data Analysis of Health Outcomes Longitudinal Data Analysis of Health Outcomes Longitudinal Data Analysis Workshop Running Example: Days 2 and 3 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development

Well-developed and understood properties 1 INTRODUCTION TO LINEAR MODELS 1 THE CLASSICAL LINEAR MODEL Most commonly used statistical models Flexible models Well-developed and understood properties Ease of interpretation Building block for more

Shared Random Parameter Models for Informative Missing Data Shared Random Parameter Models for Informative Missing Data Dean Follmann NIAID NIAID/NIH p. A General Set-up Longitudinal data (Y ij,x ij,r ij ) Y ij = outcome for person i on visit j R ij = 1 if observed

Joint Modeling of Longitudinal Item Response Data and Survival Joint Modeling of Longitudinal Item Response Data and Survival Jean-Paul Fox University of Twente Department of Research Methodology, Measurement and Data Analysis Faculty of Behavioural Sciences Enschede,

A Practitioner's Guide to Cluster-Robust Inference A Practitioner's Guide to Cluster-Robust Inference A. C. Cameron and D. L. Miller presented by Federico Curci March 4, 2015 Cameron Miller Cluster Clinic II March 4, 2015 1 / 20 In the previous episode

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we've been considering the following Cox PH model: $\lambda(t \mid Z) = \lambda_0(t)\exp(\beta' Z) = \lambda_0(t)\exp\left(\sum_j \beta_j Z_j\right)$ where $\beta_j$ is the parameter for the

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to
