Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington
|
|
- Lucas Shields
- 5 years ago
- Views:
Transcription
1 Analsis of Longitudinal Data Patrick J. Heagert PhD Department of Biostatistics Universit of Washington 1 Auckland 2008
2 Session Three Outline Role of correlation Impact proper standard errors Used to weight individuals (clusters) Models for correlation / covariance Regression: Group-to-Group variation Random effects: Individual-to-Individual variation Serial correlation: Observation-to-Observation variation 2 Auckland 2008
3 Longitudinal Data Analsis INTRODUCTION to CORRELATION and WEIGHTING 3 Auckland 2008
4 Treatment of Lead-Exposed Children (TLC) Trial: In the 1990 s a placebo-controlled randomized trial of a new chelating agent, succimer, was conducted among children with lead levels 20-44µg/dL. Children received up to three 26-da courses of succimer or placebo and were followed for 3 ears. Data set with 100 children. m = 50 placebo; m = 50 active. Illustrate: naive analses and the impact of correlation. 4 Auckland 2008
5 TLC Trial Means Blood Lead Placebo Active Weeks 5 Auckland 2008
6 Simple (naive?) Analsis of Treatment Post Data Onl compare the mean blood lead after baseline in the TX and control groups using 3 measurements/person, and all 100 subjects. Issue(s) = Pre/Post Data compare the mean blood lead after baseline to the mean blood lead at baseline for the treatment subjects onl using 4 measurements/person, and onl 50 subjects. Issue(s) = 6 Auckland 2008
7 6-1 Auckland 2008 Simple Analsis: Post Onl Data week 0 week 1 week 4 week 6 Control β 0 β 0 β 0 Treatment β 0 + β 1 β 0 + β 1 β 0 + β 1
8 6-2 Auckland 2008 Post Data Onl. *** Analsis using POST DATA at weeks 1, 4, and 6.. regress tx if week>=1 Number of obs = Coef. Std. Err. t P> t [95% Conf. Interval] tx _cons
9 6-3 Auckland regress tx if week>=1, cluster(id) Number of obs = 300 Regression with robust standard errors Number of clusters (id) = Robust Coef. Std. Err. t P> t [95% Conf. Interval] tx _cons
10 Treatment β 0 β 0 + β 1 β 0 + β 1 β 0 + β Auckland 2008 Simple Analsis: Pre/Post for Treatment Group Onl Control week 0 week 1 week 4 week 6
11 6-5 Auckland 2008 Pre/Post Data, TX Group Onl.. *** Analsis using PRE/POST for treatment subjects.. regress post if tx==1 Number of obs = Coef. Std. Err. t P> t [95% Conf. Interval] post _cons
12 6-6 Auckland regress post if tx==1, cluster(id) Number of obs = 200 Regression with robust standard errors Number of clusters (id) = Robust Coef. Std. Err. t P> t [95% Conf. Interval] post _cons
13 Dependent Data and Proper Variance Estimates Let X ij = 0 denote placebo assignment and X ij = 1 denote active treatment. (1) Consider (Y i1, Y i2 ) with (X i1, X i2 ) = (0, 0) for i = 1 : n and (X i1, X i2 ) = (1, 1) for i = (n + 1) : 2n ˆµ 0 = 1 2n ˆµ 1 = 1 2n n 2 i=1 j=1 2n Y ij 2 i=n+1 j=1 var(ˆµ 1 ˆµ 0 ) = 1 n {σ2 (1 + ρ)} Y ij 7 Auckland 2008
14 7-1 Auckland 2008 Scenario 1 subject control treatment time 1 time 2 time 1 time 2 ID = 101 Y 1,1 Y 1,2 ID = 102 Y 2,1 Y 2,2 ID = 103 Y 3,1 Y 3,2 ID = 104 Y 4,1 Y 4,2 ID = 105 Y 5,1 Y 5,2 ID = 106 Y 6,1 Y 6,2
15 Dependent Data and Proper Variance Estimates (2) Consider (Y i1, Y i2 ) with (X i1, X i2 ) = (0, 1) for i = 1 : n and (X i1, X i2 ) = (1, 0) for i = (n + 1) : 2n ˆµ 0 = 1 2n ˆµ 1 = 1 2n { n Y i1 + i=1 { n Y i2 + i=1 2n i=n+1 2n i=n+1 Y i2 } Y i1 } var(ˆµ 1 ˆµ 0 ) = 1 n {σ2 (1 ρ)} 8 Auckland 2008
16 8-1 Auckland 2008 Scenario 2 subject control treatment time 1 time 2 time 1 time 2 ID = 101 Y 1,1 Y 1,2 ID = 102 Y 2,1 Y 2,2 ID = 103 Y 3,1 Y 3,2 ID = 104 Y 4,2 Y 4,1 ID = 105 Y 5,2 Y 5,1 ID = 106 Y 6,2 Y 6,1
17 Dependent Data and Proper Variance Estimates If we simpl had 2n independent observations on treatment (X = 1) and 2n independent observations on control then we d obtain var(ˆµ 1 ˆµ 0 ) = σ2 2n + σ2 2n = 1 n σ2 Q: What is the impact of dependence relative to the situation where all (2n + 2n) observations are independent? (1) positive dependence, ρ > 0, results in a loss of precision. (2) positive dependence, ρ > 0, results in an improvement in precision! 9 Auckland 2008
18 Therefore: Dependent data impacts proper statements of precision. Dependent data ma increase or decrease standard errors depending on the design. 10 Auckland 2008
19 Weighted Estimation Consider the situation where subjects report both the number of attempts and the number of successes: (Y i, N i ). Examples: live born (Y i ) in a litter (N i ) condoms used (Y i ) in sexual encounters (N i ) SAEs (Y i ) among total surgeries (N i ) Q: How to combine these data from i = 1 : m subjects to estimate a common rate (proportion) of successes? 11 Auckland 2008
20 Weighted Estimation Proposal 1: ˆp 1 = i Y i / i N i Proposal 2: Simple Example: ˆp 2 = 1 m Y i /N i i Data : (1, 10) (2, 100) ˆp 1 = (2 + 1)/(110) = ˆp 2 = 1 {1/10 + 2/100} = Auckland 2008
21 Weighted Estimation Note: Each of these estimators, ˆp 1, and ˆp 2, can be viewed as weighted estimators of the form: ˆp w = { i w i Y i N i } / i We obtain ˆp 1 b letting w i = N i, corresponding to equal weight given each to binar outcome, Y ij, Y i = N i j=1 Y ij. We obtain ˆp 2 b letting w i = 1, corresponding to equal weight given to each subject. w i Q: What s optimal? 13 Auckland 2008
22 Weighted Estimation A: Whatever weights are closest to 1/variance of Y i /N i (stat theor called Gauss-Markov ). If subjects are perfectl homogeneous then and ˆp 1 is best. V (Y i ) = N i p(1 p) If subjects are heterogeneous then, for example V (Y i ) = N i p(1 p){1 + (N i 1)ρ} and an estimator closer to ˆp 2 is best. 14 Auckland 2008
23 Summar: Role of Correlation Statistical inference must account for the dependence. correlation impacts standard errors! Consideration as to the choice of weighting will depend on the variance/covariance of the response variables. correlation impacts regression estimates! 15 Auckland 2008
24 Longitudinal Data Analsis INTRODUCTION to REGRESSION APPROACHES 16 Auckland 2008
25 Statistical Models Regression model: Groups mean response as a function of covariates. sstematic variation Random effects: Individuals variation from subject-to-subject in trajector. random between-subject variation Within-subject variation: Observations variation of individual observations over time random within-subject variation 17 Auckland 2008
26 Groups: Scientific Questions as Regression Questions concerning the rate of decline refer to the time slope for FEV1: E[ FEV1 X = age, gender, f508] = β 0 (X) + β 1 (X) time Time Scales Let age0 = age-at-entr, age i1 Let agel = time-since-entr, age ij age i1 18 Auckland 2008
27 CF Regression Model Model: E[FEV X i ] = β 0 +β 1 age0 + β 2 agel +β 3 female +β 4 f508 = 1 + β 5 f508 = 2 +β 6 female agel +β 7 f508 = 1 agel + β 8 f508 = 2 agel = β 0 (X i ) + β 1 (X i ) agel 19 Auckland 2008
28 19-1 Auckland 2008 Intercept f508=0 f508=1 f508=2 male β 0 + β 1 age0 β 0 + β 1 age0 β 0 + β 1 age0 +β 4 +β 5 female β 0 + β 1 age0 β 0 + β 1 age0 β 0 + β 1 age0 +β 3 +β 3 + β 4 +β 3 + β 5
29 19-2 Auckland 2008 Slope f508=0 f508=1 f508=2 male β 2 β 2 + β 7 β 2 + β 8 female β 2 β 2 + β 7 β 2 + β 8 +β 6 +β 6 +β 6
30 19-3 Auckland 2008 Gender Groups (f508==0) FEV Male Female Age (ears)
31 19-4 Auckland 2008 Genotpe Groups (male) FEV f508=0 f508=1 f508= Age (ears)
32 Define Y ij = FEV1 for subject i at time t ij X i = (X ij,..., X ini ) X ij = (X ij,1, X ij,2,..., X ij,p ) age0, agel, gender, genotpe Issue: Response variables measured on the same subject are correlated. cov(y ij, Y ik ) 0 20 Auckland 2008
33 Some Notation It is useful to have some notation that can be used to discuss the stack of data that correspond to each subject. Let n i denote the number of observations for subject i. Define: Y i = Y i1 Y i2. Y ini If the subjects are observed at a common set of times t 1, t 2,..., t m then E(Y ij ) = µ j denotes the mean of the population at time t j. 21 Auckland 2008
34 Dependence and Correlation Recall that observations are termed independent when deviation in one variable does not predict deviation in the other variable. Given two subjects with the same age and gender, then the blood pressure for patient ID=212 is not predictive of the blood pressure for patient ID=334. Observations are called dependent or correlated when one variable does predict the value of another variable. The LDL cholesterol of patient ID=212 at age 57 is predictive of the LDL cholesterol of patient ID=212 at age Auckland 2008
35 Dependence and Correlation Recall: The variance of a variable, Y ij (fix time t j for now) is defined as: σ 2 j = E [ (Y ij µ j ) 2] = E [(Y ij µ j )(Y ij µ j )] The variance measures the average distance that an observation falls awa from the mean. 23 Auckland 2008
36 Dependence and Correlation Define: The covariance of two variables, Y ij, and Y ik (fix t j and t k ) is defined as: σ jk = E [(Y ij µ j )(Y ik µ k )] The covariance measures whether, on average, departures in one variable, Y ij µ j, go together with departures in a second variable, Y ik µ k. In simple linear regression of Y ij on Y ik the regression coefficient β 1 in E(Y ij Y ik ) = β 0 + β 1 Y ik is the covariance divided b the variance of Y ik : β 1 = σ jk σ 2 k 24 Auckland 2008
37 Dependence and Correlation Define: The correlation of two variables, Y ij, and Y ik (fix t j and t k ) is defined as: ρ jk = E [(Y ij µ j )(Y ik µ k )] σ j σ k The correlation is a measure of dependence that takes values between -1 and +1. Recall that a correlation of 0.0 implies that the two measures are unrelated (linearl). Recall that a correlation of 1.0 implies that the two measures fall perfectl on a line one exactl predicts the other! 25 Auckland 2008
38 Wh interest in covariance and/or correlation? Recall that on earlier pages our standard error for the sample mean difference µ 1 µ 0 depends on ρ. In general a statistical model for the outcomes Y i = (Y i1, Y i2,..., Y ini ) requires the following: Means: µ j Variances: σ 2 j Covariances: σ jk, or correlations ρ jk. Therefore, one approach to making inferences based on longitudinal data is to construct a model for each of these three components. 26 Auckland 2008
39 Something new to model... cov(y i ) = var(y i1 ) cov(y i1, Y i2 )... cov(y i1, Y ini ) cov(y i2, Y i1 ) var(y i2 )... cov(y i2, Y ini ) cov(y ini, Y i1 ) cov(y ini, Y i2 )... var(y ini ) = σ 2 1 σ 1 σ 2 ρ σ 1 σ ni ρ 1ni σ 2 σ 1 ρ 21 σ σ 2 σ ni ρ 2ni σ ni σ 1 ρ ni 1 σ ni σ 2 ρ ni 2... σ 2 n i 27 Auckland 2008
40 TLC Trial Covariances Placebo Active Auckland 2008
41 TLC Trial Correlations Placebo Active Auckland 2008
42 Mean and Covariance Models for FEV1 Models: E(Y ij X i ) = µ ij (regression) Groups cov(y i X i ) = Σ i = between-subjects }{{} + within-subjects }{{} individual to individual observation to observation Q: What are appropriate covariance models for the FEV1 data? Individual-to-Individual variation? Observation-to-Observation variation? 30 Auckland 2008
43 age age 10 age age 14 age age 18 age age *10^ *10^1 6*10^1 age Auckland 2008
44 How to build models for correlation? Mixed models random effects between-subject variabilit within-subject similarit due to sharing trajector Serial correlation close in time implies strong similarit correlation decreases as time separation increases 31 Auckland 2008
45 Toward the Linear Mixed Model Regression model: mean response as a function of covariates. sstematic variation Random effects: Individuals variation from subject-to-subject in trajector. random between-subject variation Within-subject variation: variation of individual observations over time random within-subject variation 32 Auckland 2008
46 32-1 Auckland 2008 Two Subjects Response Months
47 Levels of Analsis We first consider the distribution of measurements within subjects: Y ij = β 0,i + β 1,i t ij + e ij e ij N (0, σ 2 ) E[Y i X i, β i ] = β 0,i + β 1,i t ij = [1, time ij ] β 0,i β 1,i = X i β i 33 Auckland 2008
48 Levels of Analsis We can equivalentl separate the subject-specific regression coefficients into the average coefficient and the specific departure for subject i: β 0,i = β 0 + b 0,i β 1,i = β 1 + b 1,i This allows another perspective: Y ij = β 0,i + β 1,i t ij + e ij = (β 0 + β 1 t ij ) + (b 0,i + b 1,i t ij ) + e ij E[Y i X i, β i ] = X i β }{{} + X i b }{{} i mean model between-subject 34 Auckland 2008
49 34-1 Auckland 2008 Sample of Lines FEV Age (ears)
50 34-2 Auckland 2008 Intercepts and Slopes Slope Deviation (b1) Intercept Deviation (b0)
51 Levels of Analsis Next we consider the distribution of patterns (parameters) among subjects: equivalentl β i N (β, D) b i N (0, D) Y i = X i β }{{} + X i b }{{} i + e }{{} i mean model between-subject within-subject 35 Auckland 2008
52 35-1 Auckland 2008 Fixed intercept, Fixed slope Random intercept, Fixed slope beta0 + beta1 * time beta0 + beta1 * time time Fixed intercept, Random slope time Random intercept, Random slope beta0 + beta1 * time beta0 + beta1 * time time time
53 Between-subject Variation We can use the idea of random effects to allow different tpes of between-subject heterogeneit: The magnitude of heterogeneit is characterized b D: b i = var(b i ) = b 0,i b 1,i D 11 D 12 D 21 D Auckland 2008
54 Between-subject Variation The components of D can be interpreted as: D 11 the tpical subject-to-subject deviation in the overall level of the response. D 22 the tpical subject-to-subject deviation in the change (time slope) of the response. D 12 the covariance between individual intercepts and slopes. If positive then subjects with high levels also have high rates of change. If negative then subjects with high levels have low rates of change. 37 Auckland 2008
55 37-1 Auckland 2008 Fixed intercept, Fixed slope Random intercept, Fixed slope beta0 + beta1 * time beta0 + beta1 * time time Fixed intercept, Random slope time Random intercept, Random slope beta0 + beta1 * time beta0 + beta1 * time time time
56 Between-subject Variation: Examples No random effects: Y ij = β 0 + β 1 t ij + e ij = [1, time ij ]β + e ij Random intercepts: Y ij = (β 0 + β 1 t ij ) + b 0,i + e ij = [1, time ij ]β + [ 1 ]b 0,i + e ij Random intercepts and slopes: Y ij = (β 0 + β 1 t ij ) + b 0,i + b 1,i t ij + e ij = [1, time ij ]β + [1, time ij ]b i + e ij 38 Auckland 2008
57 Toward the Linear Mixed Model Regression model: mean response as a function of covariates. sstematic variation Random effects: variation from subject-to-subject in trajector. random between-subject variation Within-subject variation: Observation variation of individual observations over time random within-subject variation 39 Auckland 2008
58 Covariance Models Serial Models Linear mixed models assume that each subject follows his/her own line. In some situations the dependence is more local meaning that observations close in time are more similar than those far apart in time. 40 Auckland 2008
59 Covariance Models Define e ij = ρ e ij 1 + ɛ ij ɛ ij N (0, σ 2 (1 ρ 2 ) ) ɛ i0 N (0, σ 2 ) This leads to autocorrelated errors: cov(e ij, e ik ) = σ 2 ρ j k 41 Auckland 2008
60 ID=1, rho=0.5 ID=2, rho=0.5 ID=3, rho=0.5 ID=4, rho=0.5 ID=5, rho=0.5 ID=6, rho=0.5 ID=7, rho=0.5 ID=8, rho=0.5 ID=9, rho=0.5 ID=10, rho=0.5 ID=11, rho=0.5 ID=12, rho=0.5 ID=13, rho=0.5 ID=14, rho=0.5 ID=15, rho=0.5 ID=16, rho=0.5 ID=17, rho=0.5 ID=18, rho=0.5 ID=19, rho=0.5 ID=20, rho= Auckland 2008
61 ID=1, rho=0.9 ID=2, rho=0.9 ID=3, rho=0.9 ID=4, rho=0.9 ID=5, rho=0.9 ID=6, rho=0.9 ID=7, rho=0.9 ID=8, rho=0.9 ID=9, rho=0.9 ID=10, rho=0.9 ID=11, rho=0.9 ID=12, rho=0.9 ID=13, rho=0.9 ID=14, rho=0.9 ID=15, rho=0.9 ID=16, rho=0.9 ID=17, rho=0.9 ID=18, rho=0.9 ID=19, rho=0.9 ID=20, rho= Auckland 2008
62 41-3 Auckland 2008 Two Subjects Response line for subject 1 average line line for subject 2 subject 1 dropout Months
63 Toward the Linear Mixed Model Regression model: mean response as a function of covariates. sstematic variation Random effects: variation from subject-to-subject in trajector. random between-subject variation Within-subject variation: variation of individual observations over time random within-subject variation 42 Auckland 2008
64 Session Three Summar Role of correlation Impact proper standard errors Used to weight individuals (clusters) Models for correlation / covariance Regression: Group-to-Group variation Random effects: Individual-to-Individual variation Serial correlation: Observation-to-Observation variation 43 Auckland 2008
65 43-1 Auckland 2008 EXTRA: Mixed Models and Covariances/Correlation Q: What is the correlation between outcomes Y ij and Y ik under these random effects models? Random Intercept Model Y ij = β 0 + β 1 t ij + b 0,i + e ij Y ik = β 0 + β 1 t ik + b 0,i + e ik var(y ij ) = var(b 0,i ) + var(e ij ) = D 11 + σ 2 cov(y ij, Y ik ) = cov(b 0,i + e ij, b 0,i + e ik ) = D 11
66 43-2 Auckland 2008 EXTRA: Mixed Models and Covariances/Correlation Random Intercept Model corr(y ij, Y ik ) = D 11 D11 + σ 2 D 11 + σ 2 = D 11 D 11 + σ 2 = between var between var + within var Therefore, an two outcomes have the same correlation. Doesn t depend on the specific times, nor on the distance between the measurements. Exchangeable correlation model. Assuming: var(e ij ) = σ 2, and cov(e ij, e ik ) = 0.
67 43-3 Auckland 2008 EXTRA: Mixed Models and Covariances/Correlation Random Intercept and Slope Model Y ij = (β 0 + β 1 t ij ) + (b 0,i + b 1,i t ij ) + e ij Y ik = (β 0 + β 1 t ik ) + (b 0,i + b 1,i t ik ) + e ik var(y ij ) = var(b 0,i + b 1,i t ij ) + var(e ij ) = D D 12 t ij + D 22 t 2 ij + σ 2 cov(y ij, Y ik ) = cov[(b 0,i + b 1,i t ij + e ij ), (b 0,i + b 1,i t ik + e ik )] = D 11 + D 12 (t ij + t ik ) + D 22 t ij t ik
68 43-4 Auckland 2008 EXTRA: Mixed Models and Covariances/Correlation Random Intercept and Slope Model ρ ijk = corr(y ij, Y ik ) = D 11 + D 12 (t ij + t ik ) + D 22 t ij t ik D D 12 t ij + D 22 t 2 ij + σ2 D D 12 t ik + D 22 t 2 ik + σ2 Therefore, two outcomes ma not have the same correlation. Correlation depends on the specific times for the observations, and does not have a simple form. Assuming: var(e ij ) = σ 2, and cov(e ij, e ik ) = 0.
Stat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 12 1 / 34 Correlated data multivariate observations clustered data repeated measurement
More informationIntroduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data
Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington
More informationAnalysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington
Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities
More informationLecture 3 Linear random intercept models
Lecture 3 Linear random intercept models Example: Weight of Guinea Pigs Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1 DLZ) The response is measures at n different times, or under
More informationBiostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE
Biostatistics Workshop 2008 Longitudinal Data Analysis Session 4 GARRETT FITZMAURICE Harvard University 1 LINEAR MIXED EFFECTS MODELS Motivating Example: Influence of Menarche on Changes in Body Fat Prospective
More informationTutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis. Acknowledgements:
Tutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis Anna E. Barón, Keith E. Muller, Sarah M. Kreidler, and Deborah H. Glueck Acknowledgements: The project was supported
More informationCovariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects
Covariance Models (*) Mixed Models Laird & Ware (1982) Y i = X i β + Z i b i + e i Y i : (n i 1) response vector X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More informationStatistical Methods III Statistics 212. Problem Set 2 - Answer Key
Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423
More informationIntroduction to mtm: An R Package for Marginalized Transition Models
Introduction to mtm: An R Package for Marginalized Transition Models Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington 1 Introduction Marginalized transition
More informationModule 6 Case Studies in Longitudinal Data Analysis
Module 6 Case Studies in Longitudinal Data Analysis Benjamin French, PhD Radiation Effects Research Foundation SISCR 2018 July 24, 2018 Learning objectives This module will focus on the design of longitudinal
More informationmultilevel modeling: concepts, applications and interpretations
multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models
More informationLecture 3.1 Basic Logistic LDA
y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data
More informationCorrelation and regression. Correlation and regression analysis. Measures of association. Why bother? Positive linear relationship
1 Correlation and regression analsis 12 Januar 2009 Monda, 14.00-16.00 (C1058) Frank Haege Department of Politics and Public Administration Universit of Limerick frank.haege@ul.ie www.frankhaege.eu Correlation
More informationECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests
ECON4150 - Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one
More informationAccounting for Baseline Observations in Randomized Clinical Trials
Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA October 6, 0 Abstract In clinical
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More informationOutline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes
Lecture 2.1 Basic Linear LDA 1 Outline Linear OLS Models vs: Linear Marginal Models Linear Conditional Models Random Intercepts Random Intercepts & Slopes Cond l & Marginal Connections Empirical Bayes
More informationBios 6649: Clinical Trials - Statistical Design and Monitoring
Bios 6649: Clinical Trials - Statistical Design and Monitoring Spring Semester 2015 John M. Kittelson Department of Biostatistics & Informatics Colorado School of Public Health University of Colorado Denver
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1
MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical
More informationAccounting for Baseline Observations in Randomized Clinical Trials
Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA August 5, 0 Abstract In clinical
More informationSTAT 526 Advanced Statistical Methodology
STAT 526 Advanced Statistical Methodology Fall 2017 Lecture Note 10 Analyzing Clustered/Repeated Categorical Data 0-0 Outline Clustered/Repeated Categorical Data Generalized Linear Mixed Models Generalized
More informationLecture 3: Multivariate Regression
Lecture 3: Multivariate Regression Rates, cont. Two weeks ago, we modeled state homicide rates as being dependent on one variable: poverty. In reality, we know that state homicide rates depend on numerous
More informationBios 6648: Design & conduct of clinical research
Bios 6648: Design & conduct of clinical research Section 2 - Formulating the scientific and statistical design designs 2.5(b) Binary 2.5(c) Skewed baseline (a) Time-to-event (revisited) (b) Binary (revisited)
More informationLecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression:
Biost 518 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture utline Choice of Model Alternative Models Effect of data driven selection of
More informationInference about the Slope and Intercept
Inference about the Slope and Intercept Recall, we have established that the least square estimates and 0 are linear combinations of the Y i s. Further, we have showed that the are unbiased and have the
More informationCorrelation and Simple Linear Regression
Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline
More informationSample Size and Power Considerations for Longitudinal Studies
Sample Size and Power Considerations for Longitudinal Studies Outline Quantities required to determine the sample size in longitudinal studies Review of type I error, type II error, and power For continuous
More information,..., θ(2),..., θ(n)
Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.
More informationHeteroscedasticity and Autocorrelation
Heteroscedasticity and Autocorrelation Carlo Favero Favero () Heteroscedasticity and Autocorrelation 1 / 17 Heteroscedasticity, Autocorrelation, and the GLS estimator Let us reconsider the single equation
More informationLongitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois
Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control
More informationMonday 7 th Febraury 2005
Monday 7 th Febraury 2 Analysis of Pigs data Data: Body weights of 48 pigs at 9 successive follow-up visits. This is an equally spaced data. It is always a good habit to reshape the data, so we can easily
More informationBIOS 2083: Linear Models
BIOS 2083: Linear Models Abdus S Wahed September 2, 2009 Chapter 0 2 Chapter 1 Introduction to linear models 1.1 Linear Models: Definition and Examples Example 1.1.1. Estimating the mean of a N(μ, σ 2
More informationBIOSTATISTICAL METHODS
BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH Cross-over Designs #: DESIGNING CLINICAL RESEARCH The subtraction of measurements from the same subject will mostly cancel or minimize effects
More informationSTAT 705 Generalized linear mixed models
STAT 705 Generalized linear mixed models Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 24 Generalized Linear Mixed Models We have considered random
More informationMATH ASSIGNMENT 2: SOLUTIONS
MATH 204 - ASSIGNMENT 2: SOLUTIONS (a) Fitting the simple linear regression model to each of the variables in turn yields the following results: we look at t-tests for the individual coefficients, and
More informationIntroduction to Linear Mixed Models: Modeling continuous longitudinal outcomes
1/64 to : Modeling continuous longitudinal outcomes Dr Cameron Hurst cphurst@gmail.com CEU, ACRO and DAMASAC, Khon Kaen University 4 th Febuary, 2557 2/64 Some motivational datasets Before we start, I
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More informationAcknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression
INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical
More informationEstimating compound expectation in a regression framework with the new cereg command
Estimating compound expectation in a regression framework with the new cereg command 2015 Nordic and Baltic Stata Users Group meeting Celia García-Pareja Matteo Bottai Unit of Biostatistics, IMM, KI September
More informationData Analysis 1 LINEAR REGRESSION. Chapter 03
Data Analysis 1 LINEAR REGRESSION Chapter 03 Data Analysis 2 Outline The Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression Other Considerations in Regression Model Qualitative
More informationST 732, Midterm Solutions Spring 2019
ST 732, Midterm Solutions Spring 2019 Please sign the following pledge certifying that the work on this test is your own: I have neither given nor received aid on this test. Signature: Printed Name: There
More informationFigure 36: Respiratory infection versus time for the first 49 children.
y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects
More informationBinomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials
Lecture : Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 27 Binomial Model n independent trials (e.g., coin tosses) p = probability of success on each trial (e.g., p =! =
More informationBiostatistics in Research Practice - Regression I
Biostatistics in Research Practice - Regression I Simon Crouch 30th Januar 2007 In scientific studies, we often wish to model the relationships between observed variables over a sample of different subjects.
More informationLecture 10: Introduction to Logistic Regression
Lecture 10: Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 2007 Logistic Regression Regression for a response variable that follows a binomial distribution Recall the binomial
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 14 1 / 64 Data structure and Model t1 t2 tn i 1st subject y 11 y 12 y 1n1 2nd subject
More informationRANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed. Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar
Paper S02-2007 RANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar Eli Lilly & Company, Indianapolis, IN ABSTRACT
More informationModels for Clustered Data
Models for Clustered Data Edps/Psych/Soc 589 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline Notation NELS88 data Fixed Effects ANOVA
More informationDiscrete Multivariate Statistics
Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are
More informationDe-mystifying random effects models
De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,
More informationModels for Clustered Data
Models for Clustered Data Edps/Psych/Stat 587 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Notation NELS88 data Fixed Effects ANOVA
More informationSemiparametric Mixed Effects Models with Flexible Random Effects Distribution
Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,
More information4 Introduction to modeling longitudinal data
4 Introduction to modeling longitudinal data We are now in a position to introduce a basic statistical model for longitudinal data. The models and methods we discuss in subsequent chapters may be viewed
More informationHandout 11: Measurement Error
Handout 11: Measurement Error In which you learn to recognise the consequences for OLS estimation whenever some of the variables you use are not measured as accurately as you might expect. A (potential)
More informationThursday Morning. Growth Modelling in Mplus. Using a set of repeated continuous measures of bodyweight
Thursday Morning Growth Modelling in Mplus Using a set of repeated continuous measures of bodyweight 1 Growth modelling Continuous Data Mplus model syntax refresher ALSPAC Confirmatory Factor Analysis
More information1/15. Over or under dispersion Problem
1/15 Over or under dispersion Problem 2/15 Example 1: dogs and owners data set In the dogs and owners example, we had some concerns about the dependence among the measurements from each individual. Let
More information1 The basics of panel data
Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge
More informationModel Mis-specification
Model Mis-specification Carlo Favero Favero () Model Mis-specification 1 / 28 Model Mis-specification Each specification can be interpreted of the result of a reduction process, what happens if the reduction
More informationMeasurement Error. Often a data set will contain imperfect measures of the data we would ideally like.
Measurement Error Often a data set will contain imperfect measures of the data we would ideally like. Aggregate Data: (GDP, Consumption, Investment are only best guesses of theoretical counterparts and
More informationRandom Intercept Models
Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept
More informationAnalysing longitudinal data when the visit times are informative
Analysing longitudinal data when the visit times are informative Eleanor Pullenayegum, PhD Scientist, Hospital for Sick Children Associate Professor, University of Toronto eleanor.pullenayegum@sickkids.ca
More informationLDA Midterm Due: 02/21/2005
LDA.665 Midterm Due: //5 Question : The randomized intervention trial is designed to answer the scientific questions: whether social network method is effective in retaining drug users in treatment programs,
More informationDesigns for Clinical Trials
Designs for Clinical Trials Chapter 5 Reading Instructions 5.: Introduction 5.: Parallel Group Designs (read) 5.3: Cluster Randomized Designs (less important) 5.4: Crossover Designs (read+copies) 5.5:
More informationModel and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data
The 3rd Australian and New Zealand Stata Users Group Meeting, Sydney, 5 November 2009 1 Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data Dr Jisheng Cui Public Health
More informationSection Inference for a Single Proportion
Section 8.1 - Inference for a Single Proportion Statistics 104 Autumn 2004 Copyright c 2004 by Mark E. Irwin Inference for a Single Proportion For most of what follows, we will be making two assumptions
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationSimulating Longer Vectors of Correlated Binary Random Variables via Multinomial Sampling
Simulating Longer Vectors of Correlated Binary Random Variables via Multinomial Sampling J. Shults a a Department of Biostatistics, University of Pennsylvania, PA 19104, USA (v4.0 released January 2015)
More informationOne-sample categorical data: approximate inference
One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution
More informationThe consequences of misspecifying the random effects distribution when fitting generalized linear mixed models
The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models John M. Neuhaus Charles E. McCulloch Division of Biostatistics University of California, San
More informationBusiness Statistics. Lecture 10: Correlation and Linear Regression
Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form
More informationREPEATED MEASURES. Copyright c 2012 (Iowa State University) Statistics / 29
REPEATED MEASURES Copyright c 2012 (Iowa State University) Statistics 511 1 / 29 Repeated Measures Example In an exercise therapy study, subjects were assigned to one of three weightlifting programs i=1:
More informationECON 497 Final Exam Page 1 of 12
ECON 497 Final Exam Page of 2 ECON 497: Economic Research and Forecasting Name: Spring 2008 Bellas Final Exam Return this exam to me by 4:00 on Wednesday, April 23. It may be e-mailed to me. It may be
More informationExample: Forced Expiratory Volume (FEV) Program L13. Example: Forced Expiratory Volume (FEV) Example: Forced Expiratory Volume (FEV)
Program L13 Relationships between two variables Correlation, cont d Regression Relationships between more than two variables Multiple linear regression Two numerical variables Linear or curved relationship?
More informationIntroduction to Econometrics
Introduction to Econometrics STAT-S-301 Introduction to Time Series Regression and Forecasting (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Introduction to Time Series Regression
More informationBIOS 6649: Handout Exercise Solution
BIOS 6649: Handout Exercise Solution NOTE: I encourage you to work together, but the work you submit must be your own. Any plagiarism will result in loss of all marks. This assignment is based on weight-loss
More informationECON Introductory Econometrics. Lecture 17: Experiments
ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.
More informationAdvantages of Mixed-effects Regression Models (MRM; aka multilevel, hierarchical linear, linear mixed models) 1. MRM explicitly models individual
Advantages of Mixed-effects Regression Models (MRM; aka multilevel, hierarchical linear, linear mixed models) 1. MRM explicitly models individual change across time 2. MRM more flexible in terms of repeated
More informationBIOS 312: Precision of Statistical Inference
and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample
More informationAn R # Statistic for Fixed Effects in the Linear Mixed Model and Extension to the GLMM
An R Statistic for Fixed Effects in the Linear Mixed Model and Extension to the GLMM Lloyd J. Edwards, Ph.D. UNC-CH Department of Biostatistics email: Lloyd_Edwards@unc.edu Presented to the Department
More informationEconometrics Summary Algebraic and Statistical Preliminaries
Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationGeneralized Estimating Equations (gee) for glm type data
Generalized Estimating Equations (gee) for glm type data Søren Højsgaard mailto:sorenh@agrsci.dk Biometry Research Unit Danish Institute of Agricultural Sciences January 23, 2006 Printed: January 23, 2006
More informationRegression #8: Loose Ends
Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch
More information5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is
Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do
More informationBIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke
BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart
More informationPractical Considerations Surrounding Normality
Practical Considerations Surrounding Normality Prof. Kevin E. Thorpe Dalla Lana School of Public Health University of Toronto KE Thorpe (U of T) Normality 1 / 16 Objectives Objectives 1. Understand the
More informationLecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016
Statistics 255 - Survival Analysis Presented March 8, 2016 Dan Gillen Department of Statistics University of California, Irvine 12.1 Examples Clustered or correlated survival times Disease onset in family
More informationCorrelation and simple linear regression S5
Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and
More informationSection I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem -
First Exam: Economics 388, Econometrics Spring 006 in R. Butler s class YOUR NAME: Section I (30 points) Questions 1-10 (3 points each) Section II (40 points) Questions 11-15 (10 points each) Section III
More informationLecture 18: Simple Linear Regression
Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength
More informationMixed models with correlated measurement errors
Mixed models with correlated measurement errors Rasmus Waagepetersen October 9, 2018 Example from Department of Health Technology 25 subjects where exposed to electric pulses of 11 different durations
More informationAuto correlation 2. Note: In general we can have AR(p) errors which implies p lagged terms in the error structure, i.e.,
1 Motivation Auto correlation 2 Autocorrelation occurs when what happens today has an impact on what happens tomorrow, and perhaps further into the future This is a phenomena mainly found in time-series
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution
More informationLinear Modelling in Stata Session 6: Further Topics in Linear Modelling
Linear Modelling in Stata Session 6: Further Topics in Linear Modelling Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 14/11/2017 This Week Categorical Variables Categorical
More informationLecture (chapter 13): Association between variables measured at the interval-ratio level
Lecture (chapter 13): Association between variables measured at the interval-ratio level Ernesto F. L. Amaral April 9 11, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015.
More informationPower for linear models of longitudinal data with applications to Alzheimer s Disease Phase II study design
Power for linear models of longitudinal data with applications to Alzheimer s Disease Phase II study design Michael C. Donohue, Steven D. Edland, Anthony C. Gamst Division of Biostatistics and Bioinformatics
More information