Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington


Analysis of Longitudinal Data. Patrick J. Heagerty PhD, Department of Biostatistics, University of Washington. Auckland 2008

Session Three Outline
- Role of correlation
  - Impacts proper standard errors
  - Used to weight individuals (clusters)
- Models for correlation / covariance
  - Regression: Group-to-Group variation
  - Random effects: Individual-to-Individual variation
  - Serial correlation: Observation-to-Observation variation

Longitudinal Data Analysis: INTRODUCTION to CORRELATION and WEIGHTING

Treatment of Lead-Exposed Children (TLC) Trial: In the 1990s a placebo-controlled randomized trial of a new chelating agent, succimer, was conducted among children with blood lead levels of 20-44 µg/dL. Children received up to three 26-day courses of succimer or placebo and were followed for 3 years. The data set used here has 100 children: m = 50 placebo; m = 50 active. It is used to illustrate naive analyses and the impact of correlation.

[Figure: TLC Trial Means. Mean blood lead versus weeks for the Placebo and Active groups.]

Simple (naive?) Analysis of Treatment
- Post Data Only: compare the mean blood lead after baseline in the TX and control groups, using 3 measurements/person and all 100 subjects. Issue(s) =
- Pre/Post Data: compare the mean blood lead after baseline to the mean blood lead at baseline for the treatment subjects only, using 4 measurements/person and only 50 subjects. Issue(s) =

Simple Analysis: Post Only Data (mean model by group and week)
            week 0    week 1               week 4               week 6
Control     -         $\beta_0$            $\beta_0$            $\beta_0$
Treatment   -         $\beta_0 + \beta_1$  $\beta_0 + \beta_1$  $\beta_0 + \beta_1$

Post Data Only
. *** Analysis using POST DATA at weeks 1, 4, and 6.
. regress y tx if week>=1
Number of obs = [value not preserved]
[Coefficient table with columns Coef., Std. Err., t, P>|t|, 95% Conf. Interval and rows tx, _cons; the numeric values were not preserved.]

. regress y tx if week>=1, cluster(id)
Regression with robust standard errors
Number of obs = 300
Number of clusters (id) = [value not preserved]
[Coefficient table with columns Coef., Robust Std. Err., t, P>|t|, 95% Conf. Interval and rows tx, _cons; the numeric values were not preserved.]
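The contrast between the two `regress` calls above (naive versus `cluster(id)`) can be sketched in Python. This is a minimal illustration, not the Stata implementation: the data are simulated with hypothetical parameter values, the function name `ols_naive_and_cluster_se` is an assumption, and the sandwich estimator below omits any finite-sample correction, so it would not match Stata's output exactly.

```python
import numpy as np

def ols_naive_and_cluster_se(X, y, cluster):
    """OLS coefficients with naive and cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)
    n, p = X.shape
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # Naive variance: treats all n residuals as independent with common variance.
    naive_var = bread * (resid @ resid) / (n - p)
    # Sandwich "meat": sum over clusters of (X_g' e_g)(X_g' e_g)'.
    meat = np.zeros((p, p))
    for g in np.unique(cluster):
        idx = cluster == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)
    robust_var = bread @ meat @ bread  # no finite-sample correction applied
    return beta, np.sqrt(np.diag(naive_var)), np.sqrt(np.diag(robust_var))

# Hypothetical data shaped like the post-only analysis:
# 100 subjects, 3 post-baseline measurements each, treatment fixed within subject.
rng = np.random.default_rng(0)
m, n_i = 100, 3
ids = np.repeat(np.arange(m), n_i)
tx = np.repeat((np.arange(m) >= m // 2).astype(float), n_i)
subject_effect = np.repeat(rng.normal(0.0, 2.0, size=m), n_i)  # induces within-subject correlation
y = 20.0 - 5.0 * tx + subject_effect + rng.normal(0.0, 1.0, size=m * n_i)
X = np.column_stack([np.ones(m * n_i), tx])

beta, se_naive, se_robust = ols_naive_and_cluster_se(X, y, ids)
print("tx coefficient:", beta[1])
print("naive SE:", se_naive[1], "cluster-robust SE:", se_robust[1])
```

Because treatment is constant within a subject and the repeated measurements are positively correlated, the cluster-robust standard error for `tx` should come out larger than the naive one in this setup.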

Simple Analysis: Pre/Post for Treatment Group Only (mean model by group and week)
            week 0      week 1               week 4               week 6
Control     -           -                    -                    -
Treatment   $\beta_0$   $\beta_0 + \beta_1$  $\beta_0 + \beta_1$  $\beta_0 + \beta_1$

Pre/Post Data, TX Group Only
. *** Analysis using PRE/POST for treatment subjects.
. regress y post if tx==1
Number of obs = [value not preserved]
[Coefficient table with columns Coef., Std. Err., t, P>|t|, 95% Conf. Interval and rows post, _cons; the numeric values were not preserved.]

. regress y post if tx==1, cluster(id)
Regression with robust standard errors
Number of obs = 200
Number of clusters (id) = [value not preserved]
[Coefficient table with columns Coef., Robust Std. Err., t, P>|t|, 95% Conf. Interval and rows post, _cons; the numeric values were not preserved.]

Dependent Data and Proper Variance Estimates
Let $X_{ij} = 0$ denote placebo assignment and $X_{ij} = 1$ denote active treatment.
(1) Consider $(Y_{i1}, Y_{i2})$ with $(X_{i1}, X_{i2}) = (0, 0)$ for $i = 1 : n$ and $(X_{i1}, X_{i2}) = (1, 1)$ for $i = (n + 1) : 2n$.
$$\hat\mu_0 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{2} Y_{ij}, \qquad \hat\mu_1 = \frac{1}{2n} \sum_{i=n+1}^{2n} \sum_{j=1}^{2} Y_{ij}$$
$$\operatorname{var}(\hat\mu_1 - \hat\mu_0) = \frac{1}{n} \{\sigma^2 (1 + \rho)\}$$
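The variance for design (1) is stated without its intermediate steps. A short derivation follows, assuming (as the slide appears to) a common variance $\sigma^2$ for every $Y_{ij}$, correlation $\rho$ between the two measurements on the same subject, and independence across subjects.

```latex
\begin{align*}
\operatorname{var}(Y_{i1} + Y_{i2})
  &= \sigma^2 + \sigma^2 + 2\rho\sigma^2 = 2\sigma^2(1+\rho), \\
\operatorname{var}(\hat\mu_0)
  &= \frac{1}{(2n)^2} \sum_{i=1}^{n} \operatorname{var}(Y_{i1} + Y_{i2})
   = \frac{n \cdot 2\sigma^2(1+\rho)}{4n^2}
   = \frac{\sigma^2(1+\rho)}{2n}, \\
\operatorname{var}(\hat\mu_1 - \hat\mu_0)
  &= \operatorname{var}(\hat\mu_1) + \operatorname{var}(\hat\mu_0)
   = 2 \cdot \frac{\sigma^2(1+\rho)}{2n}
   = \frac{1}{n}\,\sigma^2(1+\rho).
\end{align*}
```

The last line uses the fact that $\hat\mu_0$ and $\hat\mu_1$ are built from disjoint sets of subjects, so their covariance is zero.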

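Scenario 1 (each subject contributes both observations to one arm)
subject     control time 1   control time 2   treatment time 1   treatment time 2
ID = 101    $Y_{1,1}$        $Y_{1,2}$        -                  -
ID = 102    $Y_{2,1}$        $Y_{2,2}$        -                  -
ID = 103    $Y_{3,1}$        $Y_{3,2}$        -                  -
ID = 104    -                -                $Y_{4,1}$          $Y_{4,2}$
ID = 105    -                -                $Y_{5,1}$          $Y_{5,2}$
ID = 106    -                -                $Y_{6,1}$          $Y_{6,2}$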

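Dependent Data and Proper Variance Estimates
(2) Consider $(Y_{i1}, Y_{i2})$ with $(X_{i1}, X_{i2}) = (0, 1)$ for $i = 1 : n$ and $(X_{i1}, X_{i2}) = (1, 0)$ for $i = (n + 1) : 2n$.
$$\hat\mu_0 = \frac{1}{2n} \left\{ \sum_{i=1}^{n} Y_{i1} + \sum_{i=n+1}^{2n} Y_{i2} \right\}, \qquad \hat\mu_1 = \frac{1}{2n} \left\{ \sum_{i=1}^{n} Y_{i2} + \sum_{i=n+1}^{2n} Y_{i1} \right\}$$
$$\operatorname{var}(\hat\mu_1 - \hat\mu_0) = \frac{1}{n} \{\sigma^2 (1 - \rho)\}$$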

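Scenario 2 (each subject contributes one observation to each arm)
subject     control time 1   control time 2   treatment time 1   treatment time 2
ID = 101    $Y_{1,1}$        -                -                  $Y_{1,2}$
ID = 102    $Y_{2,1}$        -                -                  $Y_{2,2}$
ID = 103    $Y_{3,1}$        -                -                  $Y_{3,2}$
ID = 104    -                $Y_{4,2}$        $Y_{4,1}$          -
ID = 105    -                $Y_{5,2}$        $Y_{5,1}$          -
ID = 106    -                $Y_{6,2}$        $Y_{6,1}$          -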

Dependent Data and Proper Variance Estimates
If we simply had 2n independent observations on treatment (X = 1) and 2n independent observations on control, then we'd obtain
$$\operatorname{var}(\hat\mu_1 - \hat\mu_0) = \frac{\sigma^2}{2n} + \frac{\sigma^2}{2n} = \frac{1}{n}\sigma^2$$
Q: What is the impact of dependence relative to the situation where all (2n + 2n) observations are independent?
In scenario (1), positive dependence, $\rho > 0$, results in a loss of precision.
In scenario (2), positive dependence, $\rho > 0$, results in an improvement in precision!
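The two variance formulas can be checked numerically by writing $\hat\mu_1 - \hat\mu_0$ as a linear contrast of each subject's pair of observations. The sketch below is illustrative only: the function name `var_mean_difference` and the design labels "parallel" and "crossover" are assumptions, and it takes a common variance $\sigma^2$ and within-subject correlation $\rho$ as given.

```python
import numpy as np

def var_mean_difference(n, sigma2, rho, design):
    """Exact var(mu1_hat - mu0_hat) for 2n independent subjects with two observations each.

    design="parallel":  first n subjects have X = (0, 0), last n have X = (1, 1)  (scenario 1)
    design="crossover": first n subjects have X = (0, 1), last n have X = (1, 0)  (scenario 2)
    """
    # Covariance matrix of one subject's pair (Y_i1, Y_i2).
    S = sigma2 * np.array([[1.0, rho], [rho, 1.0]])
    if design == "parallel":
        x_first, x_last = np.array([0, 0]), np.array([1, 1])
    elif design == "crossover":
        x_first, x_last = np.array([0, 1]), np.array([1, 0])
    else:
        raise ValueError("design must be 'parallel' or 'crossover'")
    total = 0.0
    for x, count in ((x_first, n), (x_last, n)):
        # Each treated observation enters with weight +1/(2n), each control with -1/(2n).
        c = np.where(x == 1, 1.0, -1.0) / (2 * n)
        # Subjects are independent, so per-subject variances add.
        total += count * (c @ S @ c)
    return total

n, sigma2, rho = 50, 1.0, 0.5
print(var_mean_difference(n, sigma2, rho, "parallel"))   # sigma2 * (1 + rho) / n
print(var_mean_difference(n, sigma2, rho, "crossover"))  # sigma2 * (1 - rho) / n
print(sigma2 / n)                                        # independent-observations benchmark
```

Setting `rho=0` makes both designs agree with the independent-observations benchmark $\sigma^2/n$.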

Therefore: Dependent data impacts proper statements of precision. Dependent data may increase or decrease standard errors depending on the design.

Weighted Estimation
Consider the situation where subjects report both the number of attempts and the number of successes: $(Y_i, N_i)$.
Examples:
- live born ($Y_i$) in a litter ($N_i$)
- condoms used ($Y_i$) in sexual encounters ($N_i$)
- SAEs ($Y_i$) among total surgeries ($N_i$)
Q: How to combine these data from $i = 1 : m$ subjects to estimate a common rate (proportion) of successes?

Weighted Estimation
Proposal 1: $\hat p_1 = \sum_i Y_i / \sum_i N_i$
Proposal 2: $\hat p_2 = \frac{1}{m} \sum_i Y_i / N_i$
Simple Example: Data $(Y_i, N_i)$: (1, 10), (2, 100)
$\hat p_1 = (2 + 1)/(110) \approx 0.027$
$\hat p_2 = \frac{1}{2} \{1/10 + 2/100\} = 0.060$

Weighted Estimation
Note: Each of these estimators, $\hat p_1$ and $\hat p_2$, can be viewed as a weighted estimator of the form:
$$\hat p_w = \left\{ \sum_i w_i \frac{Y_i}{N_i} \right\} \Big/ \sum_i w_i$$
We obtain $\hat p_1$ by letting $w_i = N_i$, corresponding to equal weight given to each binary outcome $Y_{ij}$, where $Y_i = \sum_{j=1}^{N_i} Y_{ij}$.
We obtain $\hat p_2$ by letting $w_i = 1$, corresponding to equal weight given to each subject.
Q: What's optimal?

Weighted Estimation
A: Whatever weights are closest to 1/variance of $Y_i / N_i$ (the statistical theory is called "Gauss-Markov").
If subjects are perfectly homogeneous then
$$V(Y_i) = N_i \, p(1 - p)$$
and $\hat p_1$ is best.
If subjects are heterogeneous then, for example,
$$V(Y_i) = N_i \, p(1 - p)\{1 + (N_i - 1)\rho\}$$
and an estimator closer to $\hat p_2$ is best.
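The weighting argument can be made concrete with the slide's two-subject example. The following is a sketch under the heterogeneous variance model above, where $\operatorname{var}(Y_i/N_i) = p(1-p)\{1 + (N_i - 1)\rho\}/N_i$, so inverse-variance weights are proportional to $N_i / \{1 + (N_i - 1)\rho\}$. The function names are hypothetical.

```python
import numpy as np

def weighted_proportion(y, n, w):
    """Weighted estimator p_w = sum_i w_i (Y_i / N_i) / sum_i w_i."""
    y, n, w = (np.asarray(a, dtype=float) for a in (y, n, w))
    return np.sum(w * y / n) / np.sum(w)

def inverse_variance_weights(n, rho):
    """Weights proportional to 1 / var(Y_i / N_i) under V(Y_i) = N_i p(1-p){1 + (N_i - 1) rho}.

    The common factor p(1 - p) cancels in the weighted estimator, so it is omitted.
    """
    n = np.asarray(n, dtype=float)
    return n / (1.0 + (n - 1.0) * rho)

# Data from the simple example: (Y_i, N_i) = (1, 10), (2, 100).
y = [1, 2]
n = [10, 100]

p1 = weighted_proportion(y, n, n)        # w_i = N_i: pools all binary outcomes
p2 = weighted_proportion(y, n, [1, 1])   # w_i = 1: equal weight per subject
print(p1, p2)

# rho = 0 reproduces p1, rho = 1 reproduces p2, intermediate rho falls in between.
for rho in (0.0, 0.1, 0.5, 1.0):
    print(rho, weighted_proportion(y, n, inverse_variance_weights(n, rho)))
```

As $\rho$ grows, additional outcomes from the same subject carry less new information, which is why the weights flatten toward one per subject.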

Summary: Role of Correlation
- Statistical inference must account for the dependence: correlation impacts standard errors!
- Consideration as to the choice of weighting will depend on the variance/covariance of the response variables: correlation impacts regression estimates!

Longitudinal Data Analysis: INTRODUCTION to REGRESSION APPROACHES

Statistical Models
- Regression model (Groups): mean response as a function of covariates; systematic variation.
- Random effects (Individuals): variation from subject to subject in trajectory; random between-subject variation.
- Within-subject variation (Observations): variation of individual observations over time; random within-subject variation.

Groups: Scientific Questions as Regression
Questions concerning the rate of decline refer to the time slope for FEV1:
$$E[\text{FEV1} \mid X = \text{age, gender, f508}] = \beta_0(X) + \beta_1(X) \cdot \text{time}$$
Time Scales:
Let age0 = age-at-entry, $\text{age}_{i1}$.
Let agel = time-since-entry, $\text{age}_{ij} - \text{age}_{i1}$.

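CF Regression Model
Model (where $[\text{f508} = k]$ is an indicator equal to 1 when the genotype is $k$):
$$E[\text{FEV} \mid X_i] = \beta_0 + \beta_1 \text{age0} + \beta_2 \text{agel} + \beta_3 \text{female} + \beta_4 [\text{f508} = 1] + \beta_5 [\text{f508} = 2] + \beta_6 \, \text{female} \cdot \text{agel} + \beta_7 [\text{f508} = 1] \cdot \text{agel} + \beta_8 [\text{f508} = 2] \cdot \text{agel}$$
$$= \beta_0(X_i) + \beta_1(X_i) \cdot \text{agel}$$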

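Intercept, $\beta_0(X_i)$, by group
          f508=0                                        f508=1                                                  f508=2
male      $\beta_0 + \beta_1 \text{age0}$               $\beta_0 + \beta_1 \text{age0} + \beta_4$               $\beta_0 + \beta_1 \text{age0} + \beta_5$
female    $\beta_0 + \beta_1 \text{age0} + \beta_3$     $\beta_0 + \beta_1 \text{age0} + \beta_3 + \beta_4$     $\beta_0 + \beta_1 \text{age0} + \beta_3 + \beta_5$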

Slope, $\beta_1(X_i)$, by group
          f508=0                 f508=1                           f508=2
male      $\beta_2$              $\beta_2 + \beta_7$              $\beta_2 + \beta_8$
female    $\beta_2 + \beta_6$    $\beta_2 + \beta_7 + \beta_6$    $\beta_2 + \beta_8 + \beta_6$
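The intercept and slope tables are just the CF regression model evaluated at each combination of gender and genotype. A small sketch makes the bookkeeping explicit; the function name and the coefficient values are hypothetical and chosen only for illustration, not estimates from the CF data.

```python
def group_intercept_slope(beta, age0, female, f508):
    """Return (beta0(X), beta1(X)) for the CF regression model.

    beta: sequence (beta0, ..., beta8) in the order used by the model.
    female: 1 for female, 0 for male. f508: genotype code 0, 1, or 2.
    """
    b0, b1, b2, b3, b4, b5, b6, b7, b8 = beta
    g1 = 1.0 if f508 == 1 else 0.0   # indicator [f508 = 1]
    g2 = 1.0 if f508 == 2 else 0.0   # indicator [f508 = 2]
    intercept = b0 + b1 * age0 + b3 * female + b4 * g1 + b5 * g2
    slope = b2 + b6 * female + b7 * g1 + b8 * g2
    return intercept, slope

# Hypothetical coefficients (NOT fitted values).
beta_example = [100.0, -1.0, -2.0, -3.0, -4.0, -5.0, 0.1, 0.2, 0.3]

for female in (0, 1):
    for f508 in (0, 1, 2):
        print(female, f508, group_intercept_slope(beta_example, age0=10.0, female=female, f508=f508))
```

For a male with f508 = 0 the function returns $(\beta_0 + \beta_1 \text{age0}, \beta_2)$, matching the top-left cell of each table.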

[Figure: Gender Groups (f508==0). FEV versus age (years), with separate lines for Male and Female.]

[Figure: Genotype Groups (male). FEV versus age (years), with separate lines for f508=0, f508=1, and f508=2.]

Define
$Y_{ij}$ = FEV1 for subject $i$ at time $t_{ij}$
$X_i = (X_{i1}, \ldots, X_{in_i})$
$X_{ij} = (X_{ij,1}, X_{ij,2}, \ldots, X_{ij,p})$: age0, agel, gender, genotype
Issue: Response variables measured on the same subject are correlated.
$$\operatorname{cov}(Y_{ij}, Y_{ik}) \neq 0$$

Some Notation
It is useful to have some notation that can be used to discuss the stack of data that correspond to each subject. Let $n_i$ denote the number of observations for subject $i$. Define:
$$Y_i = \begin{pmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{in_i} \end{pmatrix}$$
If the subjects are observed at a common set of times $t_1, t_2, \ldots, t_m$, then $E(Y_{ij}) = \mu_j$ denotes the mean of the population at time $t_j$.

Dependence and Correlation
Recall that observations are termed independent when deviation in one variable does not predict deviation in the other variable. Given two subjects with the same age and gender, the blood pressure for patient ID=212 is not predictive of the blood pressure for patient ID=334.
Observations are called dependent or correlated when one variable does predict the value of another variable. The LDL cholesterol of patient ID=212 at age 57 is predictive of the LDL cholesterol of patient ID=212 at another age.

Dependence and Correlation
Recall: The variance of a variable $Y_{ij}$ (fix time $t_j$ for now) is defined as:
$$\sigma_j^2 = E\left[(Y_{ij} - \mu_j)^2\right] = E\left[(Y_{ij} - \mu_j)(Y_{ij} - \mu_j)\right]$$
The variance measures the average squared distance that an observation falls away from the mean.

Dependence and Correlation
Define: The covariance of two variables, $Y_{ij}$ and $Y_{ik}$ (fix $t_j$ and $t_k$), is defined as:
$$\sigma_{jk} = E\left[(Y_{ij} - \mu_j)(Y_{ik} - \mu_k)\right]$$
The covariance measures whether, on average, departures in one variable, $Y_{ij} - \mu_j$, go together with departures in a second variable, $Y_{ik} - \mu_k$.
In simple linear regression of $Y_{ij}$ on $Y_{ik}$, the regression coefficient $\beta_1$ in $E(Y_{ij} \mid Y_{ik}) = \beta_0 + \beta_1 Y_{ik}$ is the covariance divided by the variance of $Y_{ik}$:
$$\beta_1 = \frac{\sigma_{jk}}{\sigma_k^2}$$

Dependence and Correlation
Define: The correlation of two variables, $Y_{ij}$ and $Y_{ik}$ (fix $t_j$ and $t_k$), is defined as:
$$\rho_{jk} = \frac{E\left[(Y_{ij} - \mu_j)(Y_{ik} - \mu_k)\right]}{\sigma_j \sigma_k}$$
The correlation is a measure of dependence that takes values between -1 and +1.
Recall that a correlation of 0.0 implies that the two measures are unrelated (linearly).
Recall that a correlation of 1.0 implies that the two measures fall perfectly on a line: one exactly predicts the other!
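The relationships among covariance, regression slope, and correlation can be verified on a small data set. The numbers below are made up purely for illustration; `y_j` and `y_k` stand for measurements at two fixed times across five hypothetical subjects.

```python
import numpy as np

# Hypothetical paired measurements at times t_k and t_j for five subjects.
y_k = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_j = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Sample covariance sigma_jk and sample variances (all using the n - 1 divisor).
sigma_jk = np.cov(y_j, y_k)[0, 1]
var_j = np.var(y_j, ddof=1)
var_k = np.var(y_k, ddof=1)

# Regression slope of Y_j on Y_k: covariance divided by the variance of Y_k.
beta1 = sigma_jk / var_k

# Correlation: covariance divided by the product of standard deviations.
rho_jk = sigma_jk / np.sqrt(var_j * var_k)

# Cross-checks against NumPy's least-squares fit and correlation routines.
slope_ls = np.polyfit(y_k, y_j, 1)[0]
rho_np = np.corrcoef(y_j, y_k)[0, 1]

print(beta1, slope_ls)
print(rho_jk, rho_np)
```

The slope and the correlation share the same numerator; they differ only in how the covariance is scaled.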

Why interest in covariance and/or correlation?
Recall that on earlier pages our standard error for the sample mean difference $\hat\mu_1 - \hat\mu_0$ depends on $\rho$.
In general a statistical model for the outcomes $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})$ requires the following:
- Means: $\mu_j$
- Variances: $\sigma_j^2$
- Covariances: $\sigma_{jk}$, or correlations $\rho_{jk}$
Therefore, one approach to making inferences based on longitudinal data is to construct a model for each of these three components.

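Something new to model...
$$\operatorname{cov}(Y_i) = \begin{pmatrix} \operatorname{var}(Y_{i1}) & \operatorname{cov}(Y_{i1}, Y_{i2}) & \cdots & \operatorname{cov}(Y_{i1}, Y_{in_i}) \\ \operatorname{cov}(Y_{i2}, Y_{i1}) & \operatorname{var}(Y_{i2}) & \cdots & \operatorname{cov}(Y_{i2}, Y_{in_i}) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}(Y_{in_i}, Y_{i1}) & \operatorname{cov}(Y_{in_i}, Y_{i2}) & \cdots & \operatorname{var}(Y_{in_i}) \end{pmatrix}$$
$$= \begin{pmatrix} \sigma_1^2 & \sigma_1 \sigma_2 \rho_{12} & \cdots & \sigma_1 \sigma_{n_i} \rho_{1 n_i} \\ \sigma_2 \sigma_1 \rho_{21} & \sigma_2^2 & \cdots & \sigma_2 \sigma_{n_i} \rho_{2 n_i} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n_i} \sigma_1 \rho_{n_i 1} & \sigma_{n_i} \sigma_2 \rho_{n_i 2} & \cdots & \sigma_{n_i}^2 \end{pmatrix}$$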

[Table: TLC Trial Covariances for the Placebo and Active groups; the numeric values were not preserved.]

[Table: TLC Trial Correlations for the Placebo and Active groups; the numeric values were not preserved.]

Mean and Covariance Models for FEV1
Models:
$$E(Y_{ij} \mid X_i) = \mu_{ij} \quad \text{(regression; Groups)}$$
$$\operatorname{cov}(Y_i \mid X_i) = \Sigma_i = \underbrace{\text{between-subjects}}_{\text{individual to individual}} + \underbrace{\text{within-subjects}}_{\text{observation to observation}}$$
Q: What are appropriate covariance models for the FEV1 data? Individual-to-Individual variation? Observation-to-Observation variation?

[Figure: multi-panel plot with panels labelled by age (10, 14, 18, ...); axis values were not preserved.]

How to build models for correlation?
- Mixed models
  - random effects
  - between-subject variability
  - within-subject similarity due to sharing a trajectory
- Serial correlation
  - close in time implies strong similarity
  - correlation decreases as time separation increases

Toward the Linear Mixed Model
- Regression model: mean response as a function of covariates; systematic variation.
- Random effects (Individuals): variation from subject to subject in trajectory; random between-subject variation.
- Within-subject variation: variation of individual observations over time; random within-subject variation.

[Figure: Two Subjects. Response versus months.]

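Levels of Analysis
We first consider the distribution of measurements within subjects:
$$Y_{ij} = \beta_{0,i} + \beta_{1,i} t_{ij} + e_{ij}, \qquad e_{ij} \sim N(0, \sigma^2)$$
$$E[Y_{ij} \mid X_i, \beta_i] = \beta_{0,i} + \beta_{1,i} t_{ij} = [1, \text{time}_{ij}] \begin{pmatrix} \beta_{0,i} \\ \beta_{1,i} \end{pmatrix}$$
so that, stacking the observations for subject $i$, $E[Y_i \mid X_i, \beta_i] = X_i \beta_i$.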

Levels of Analysis
We can equivalently separate the subject-specific regression coefficients into the average coefficient and the specific departure for subject $i$:
$$\beta_{0,i} = \beta_0 + b_{0,i}, \qquad \beta_{1,i} = \beta_1 + b_{1,i}$$
This allows another perspective:
$$Y_{ij} = \beta_{0,i} + \beta_{1,i} t_{ij} + e_{ij} = (\beta_0 + \beta_1 t_{ij}) + (b_{0,i} + b_{1,i} t_{ij}) + e_{ij}$$
$$E[Y_i \mid X_i, \beta_i] = \underbrace{X_i \beta}_{\text{mean model}} + \underbrace{X_i b_i}_{\text{between-subject}}$$

[Figure: Sample of Lines. FEV versus age (years).]

[Figure: Intercepts and Slopes. Slope deviation (b1) versus intercept deviation (b0).]

Levels of Analysis
Next we consider the distribution of patterns (parameters) among subjects:
$$\beta_i \sim N(\beta, D) \quad \text{equivalently} \quad b_i \sim N(0, D)$$
$$Y_i = \underbrace{X_i \beta}_{\text{mean model}} + \underbrace{X_i b_i}_{\text{between-subject}} + \underbrace{e_i}_{\text{within-subject}}$$

[Figure: four panels plotting beta0 + beta1 * time versus time: "Fixed intercept, Fixed slope"; "Random intercept, Fixed slope"; "Fixed intercept, Random slope"; "Random intercept, Random slope".]

Between-subject Variation
We can use the idea of random effects to allow different types of between-subject heterogeneity. The magnitude of heterogeneity is characterized by $D$:
$$b_i = \begin{pmatrix} b_{0,i} \\ b_{1,i} \end{pmatrix}, \qquad \operatorname{var}(b_i) = D = \begin{pmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{pmatrix}$$

Between-subject Variation
The components of $D$ can be interpreted as:
- $D_{11}$: the variance of the subject-specific intercepts, describing the typical subject-to-subject deviation in the overall level of the response.
- $D_{22}$: the variance of the subject-specific slopes, describing the typical subject-to-subject deviation in the change (time slope) of the response.
- $D_{12}$: the covariance between individual intercepts and slopes. If positive, then subjects with high levels also have high rates of change. If negative, then subjects with high levels have low rates of change.

[Figure (repeated): four panels plotting beta0 + beta1 * time versus time: "Fixed intercept, Fixed slope"; "Random intercept, Fixed slope"; "Fixed intercept, Random slope"; "Random intercept, Random slope".]

Between-subject Variation: Examples
No random effects:
$$Y_{ij} = \beta_0 + \beta_1 t_{ij} + e_{ij} = [1, \text{time}_{ij}]\beta + e_{ij}$$
Random intercepts:
$$Y_{ij} = (\beta_0 + \beta_1 t_{ij}) + b_{0,i} + e_{ij} = [1, \text{time}_{ij}]\beta + [1] b_{0,i} + e_{ij}$$
Random intercepts and slopes:
$$Y_{ij} = (\beta_0 + \beta_1 t_{ij}) + b_{0,i} + b_{1,i} t_{ij} + e_{ij} = [1, \text{time}_{ij}]\beta + [1, \text{time}_{ij}] b_i + e_{ij}$$
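To see how the random intercept and slope model generates data, the sketch below simulates trajectories for a few subjects. Everything numeric here is a hypothetical choice for illustration (the values of `beta`, `D`, `sigma2`, and the measurement times), and the function name `simulate_subject` is an assumption.

```python
import numpy as np

def simulate_subject(times, beta, D, sigma2, rng):
    """Draw one subject's responses from Y_i = X_i beta + X_i b_i + e_i.

    b_i ~ N(0, D) is the subject's random intercept and slope;
    e_ij ~ N(0, sigma2) is independent within-subject error.
    """
    times = np.asarray(times, dtype=float)
    X = np.column_stack([np.ones_like(times), times])        # rows [1, time_ij]
    b = rng.multivariate_normal(np.zeros(2), np.asarray(D))  # (b_0i, b_1i)
    e = rng.normal(0.0, np.sqrt(sigma2), size=times.size)
    return X @ np.asarray(beta, dtype=float) + X @ b + e

rng = np.random.default_rng(1)
times = [0.0, 1.0, 2.0, 3.0, 4.0]
beta = [10.0, -0.5]                  # hypothetical average intercept and slope
D = [[4.0, -0.5], [-0.5, 0.25]]      # hypothetical between-subject covariance
sigma2 = 1.0                         # hypothetical within-subject variance

trajectories = np.array([simulate_subject(times, beta, D, sigma2, rng) for _ in range(5)])
print(trajectories.round(2))         # one row per subject, one column per time
```

Setting `D[1][1]` and `D[0][1]` to zero reduces this to the random intercepts model, in which subject lines are parallel shifts of the average line.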

Toward the Linear Mixed Model
- Regression model: mean response as a function of covariates; systematic variation.
- Random effects: variation from subject to subject in trajectory; random between-subject variation.
- Within-subject variation (Observations): variation of individual observations over time; random within-subject variation.

Covariance Models: Serial Models
Linear mixed models assume that each subject follows his/her own line. In some situations the dependence is more local, meaning that observations close in time are more similar than those far apart in time.

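Covariance Models
Define
$$e_{ij} = \rho \, e_{i,j-1} + \epsilon_{ij}, \qquad \epsilon_{ij} \sim N\left(0, \sigma^2 (1 - \rho^2)\right), \qquad \epsilon_{i0} \sim N(0, \sigma^2)$$
This leads to autocorrelated errors:
$$\operatorname{cov}(e_{ij}, e_{ik}) = \sigma^2 \rho^{|j - k|}$$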
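The autocorrelated covariance can be built directly from its closed form and cross-checked against the recursion that defines it. The sketch below assumes equally spaced observations indexed $j = 0, 1, \ldots$ and takes the initial error to be $e_{i0} = \epsilon_{i0}$, which is an interpretation of the slide rather than something it states; the function names are hypothetical.

```python
import numpy as np

def ar1_covariance(n_obs, sigma2, rho):
    """Closed form: cov(e_ij, e_ik) = sigma2 * rho ** |j - k|."""
    idx = np.arange(n_obs)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

def ar1_covariance_from_recursion(n_obs, sigma2, rho):
    """Covariance implied by e_j = rho * e_{j-1} + eps_j with e_0 = eps_0.

    Unrolling the recursion gives e = A @ eps, where A[j, k] = rho ** (j - k) for k <= j.
    """
    idx = np.arange(n_obs)
    lag = idx[:, None] - idx[None, :]
    A = np.where(lag >= 0, rho ** np.abs(lag), 0.0)
    innovation_var = np.full(n_obs, sigma2 * (1.0 - rho ** 2))
    innovation_var[0] = sigma2          # the initial term carries the full variance sigma2
    return A @ np.diag(innovation_var) @ A.T

V_closed = ar1_covariance(5, sigma2=2.0, rho=0.5)
V_recursion = ar1_covariance_from_recursion(5, sigma2=2.0, rho=0.5)
print(V_closed)
print(np.allclose(V_closed, V_recursion))
```

The innovation variance $\sigma^2(1 - \rho^2)$ is what keeps $\operatorname{var}(e_{ij})$ equal to $\sigma^2$ at every time point.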

[Figure: panels for ID = 1 through ID = 20, each labelled rho = 0.5.]

[Figure: panels for ID = 1 through ID = 20, each labelled rho = 0.9.]

[Figure: Two Subjects. Response versus months, showing the line for subject 1, the average line, the line for subject 2, and the dropout time for subject 1.]

Toward the Linear Mixed Model
- Regression model: mean response as a function of covariates; systematic variation.
- Random effects: variation from subject to subject in trajectory; random between-subject variation.
- Within-subject variation: variation of individual observations over time; random within-subject variation.

Session Three Summary
- Role of correlation
  - Impacts proper standard errors
  - Used to weight individuals (clusters)
- Models for correlation / covariance
  - Regression: Group-to-Group variation
  - Random effects: Individual-to-Individual variation
  - Serial correlation: Observation-to-Observation variation

EXTRA: Mixed Models and Covariances/Correlation
Q: What is the correlation between outcomes $Y_{ij}$ and $Y_{ik}$ under these random effects models?
Random Intercept Model
$$Y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{0,i} + e_{ij}$$
$$Y_{ik} = \beta_0 + \beta_1 t_{ik} + b_{0,i} + e_{ik}$$
$$\operatorname{var}(Y_{ij}) = \operatorname{var}(b_{0,i}) + \operatorname{var}(e_{ij}) = D_{11} + \sigma^2$$
$$\operatorname{cov}(Y_{ij}, Y_{ik}) = \operatorname{cov}(b_{0,i} + e_{ij}, \; b_{0,i} + e_{ik}) = D_{11}$$

EXTRA: Mixed Models and Covariances/Correlation
Random Intercept Model
$$\operatorname{corr}(Y_{ij}, Y_{ik}) = \frac{D_{11}}{\sqrt{D_{11} + \sigma^2} \, \sqrt{D_{11} + \sigma^2}} = \frac{D_{11}}{D_{11} + \sigma^2} = \frac{\text{between var}}{\text{between var} + \text{within var}}$$
Therefore, any two outcomes have the same correlation. It doesn't depend on the specific times, nor on the distance between the measurements. This is the exchangeable correlation model.
Assuming: $\operatorname{var}(e_{ij}) = \sigma^2$, and $\operatorname{cov}(e_{ij}, e_{ik}) = 0$.

EXTRA: Mixed Models and Covariances/Correlation
Random Intercept and Slope Model
$$Y_{ij} = (\beta_0 + \beta_1 t_{ij}) + (b_{0,i} + b_{1,i} t_{ij}) + e_{ij}$$
$$Y_{ik} = (\beta_0 + \beta_1 t_{ik}) + (b_{0,i} + b_{1,i} t_{ik}) + e_{ik}$$
$$\operatorname{var}(Y_{ij}) = \operatorname{var}(b_{0,i} + b_{1,i} t_{ij}) + \operatorname{var}(e_{ij}) = D_{11} + 2 D_{12} t_{ij} + D_{22} t_{ij}^2 + \sigma^2$$
$$\operatorname{cov}(Y_{ij}, Y_{ik}) = \operatorname{cov}\left[(b_{0,i} + b_{1,i} t_{ij} + e_{ij}), \; (b_{0,i} + b_{1,i} t_{ik} + e_{ik})\right] = D_{11} + D_{12}(t_{ij} + t_{ik}) + D_{22} t_{ij} t_{ik}$$

EXTRA: Mixed Models and Covariances/Correlation
Random Intercept and Slope Model
$$\rho_{ijk} = \operatorname{corr}(Y_{ij}, Y_{ik}) = \frac{D_{11} + D_{12}(t_{ij} + t_{ik}) + D_{22} t_{ij} t_{ik}}{\sqrt{D_{11} + 2 D_{12} t_{ij} + D_{22} t_{ij}^2 + \sigma^2} \; \sqrt{D_{11} + 2 D_{12} t_{ik} + D_{22} t_{ik}^2 + \sigma^2}}$$
Therefore, two outcomes may not have the same correlation. Correlation depends on the specific times for the observations, and does not have a simple form.
Assuming: $\operatorname{var}(e_{ij}) = \sigma^2$, and $\operatorname{cov}(e_{ij}, e_{ik}) = 0$.
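Both correlation results follow from the marginal covariance $Z_i D Z_i^T + \sigma^2 I$, where row $j$ of $Z_i$ is $[1, t_{ij}]$. The sketch below computes it for hypothetical values of `D`, `sigma2`, and the measurement times; the function names are assumptions. The random intercept model is recovered by setting $D_{12} = D_{22} = 0$.

```python
import numpy as np

def marginal_covariance(times, D, sigma2):
    """cov(Y_i) = Z D Z' + sigma2 * I, with rows of Z equal to [1, t_ij]."""
    times = np.asarray(times, dtype=float)
    Z = np.column_stack([np.ones_like(times), times])
    return Z @ np.asarray(D, dtype=float) @ Z.T + sigma2 * np.eye(times.size)

def marginal_correlation(times, D, sigma2):
    """Correlation matrix implied by the marginal covariance."""
    V = marginal_covariance(times, D, sigma2)
    sd = np.sqrt(np.diag(V))
    return V / np.outer(sd, sd)

times = [0.0, 1.0, 2.0, 4.0]   # hypothetical measurement times

# Random intercept only: every off-diagonal correlation is D11 / (D11 + sigma2) = 0.8.
R_exchangeable = marginal_correlation(times, [[4.0, 0.0], [0.0, 0.0]], sigma2=1.0)

# Random intercept and slope: the correlation varies with the pair of times.
D_full = [[4.0, -0.5], [-0.5, 0.25]]
V_full = marginal_covariance(times, D_full, sigma2=1.0)
R_full = marginal_correlation(times, D_full, sigma2=1.0)

print(R_exchangeable.round(3))
print(R_full.round(3))
```

Entry $(j, k)$ of `V_full` equals $D_{11} + D_{12}(t_{ij} + t_{ik}) + D_{22} t_{ij} t_{ik}$ off the diagonal, with $\sigma^2$ added on the diagonal, matching the formulas above.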

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 12 1 / 34 Correlated data multivariate observations clustered data repeated measurement

More information

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington

More information

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities

More information

Lecture 3 Linear random intercept models

Lecture 3 Linear random intercept models Lecture 3 Linear random intercept models Example: Weight of Guinea Pigs Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1 DLZ) The response is measures at n different times, or under

More information

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE Biostatistics Workshop 2008 Longitudinal Data Analysis Session 4 GARRETT FITZMAURICE Harvard University 1 LINEAR MIXED EFFECTS MODELS Motivating Example: Influence of Menarche on Changes in Body Fat Prospective

More information

Tutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis. Acknowledgements:

Tutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis. Acknowledgements: Tutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis Anna E. Barón, Keith E. Muller, Sarah M. Kreidler, and Deborah H. Glueck Acknowledgements: The project was supported

More information

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects Covariance Models (*) Mixed Models Laird & Ware (1982) Y i = X i β + Z i b i + e i Y i : (n i 1) response vector X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Introduction to mtm: An R Package for Marginalized Transition Models

Introduction to mtm: An R Package for Marginalized Transition Models Introduction to mtm: An R Package for Marginalized Transition Models Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington 1 Introduction Marginalized transition

More information

Module 6 Case Studies in Longitudinal Data Analysis

Module 6 Case Studies in Longitudinal Data Analysis Module 6 Case Studies in Longitudinal Data Analysis Benjamin French, PhD Radiation Effects Research Foundation SISCR 2018 July 24, 2018 Learning objectives This module will focus on the design of longitudinal

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Correlation and regression. Correlation and regression analysis. Measures of association. Why bother? Positive linear relationship

Correlation and regression. Correlation and regression analysis. Measures of association. Why bother? Positive linear relationship 1 Correlation and regression analsis 12 Januar 2009 Monda, 14.00-16.00 (C1058) Frank Haege Department of Politics and Public Administration Universit of Limerick frank.haege@ul.ie www.frankhaege.eu Correlation

More information

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests ECON4150 - Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one

More information

Accounting for Baseline Observations in Randomized Clinical Trials

Accounting for Baseline Observations in Randomized Clinical Trials Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA October 6, 0 Abstract In clinical

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes Lecture 2.1 Basic Linear LDA 1 Outline Linear OLS Models vs: Linear Marginal Models Linear Conditional Models Random Intercepts Random Intercepts & Slopes Cond l & Marginal Connections Empirical Bayes

More information

Bios 6649: Clinical Trials - Statistical Design and Monitoring

Bios 6649: Clinical Trials - Statistical Design and Monitoring Bios 6649: Clinical Trials - Statistical Design and Monitoring Spring Semester 2015 John M. Kittelson Department of Biostatistics & Informatics Colorado School of Public Health University of Colorado Denver

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical

More information

Accounting for Baseline Observations in Randomized Clinical Trials

Accounting for Baseline Observations in Randomized Clinical Trials Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA August 5, 0 Abstract In clinical

More information

STAT 526 Advanced Statistical Methodology

STAT 526 Advanced Statistical Methodology STAT 526 Advanced Statistical Methodology Fall 2017 Lecture Note 10 Analyzing Clustered/Repeated Categorical Data 0-0 Outline Clustered/Repeated Categorical Data Generalized Linear Mixed Models Generalized

More information

Lecture 3: Multivariate Regression

Lecture 3: Multivariate Regression Lecture 3: Multivariate Regression Rates, cont. Two weeks ago, we modeled state homicide rates as being dependent on one variable: poverty. In reality, we know that state homicide rates depend on numerous

More information

Bios 6648: Design & conduct of clinical research

Bios 6648: Design & conduct of clinical research Bios 6648: Design & conduct of clinical research Section 2 - Formulating the scientific and statistical design designs 2.5(b) Binary 2.5(c) Skewed baseline (a) Time-to-event (revisited) (b) Binary (revisited)

More information

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression:

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression: Biost 518 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture utline Choice of Model Alternative Models Effect of data driven selection of

More information

Inference about the Slope and Intercept

Inference about the Slope and Intercept Inference about the Slope and Intercept Recall, we have established that the least square estimates and 0 are linear combinations of the Y i s. Further, we have showed that the are unbiased and have the

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

Sample Size and Power Considerations for Longitudinal Studies

Sample Size and Power Considerations for Longitudinal Studies Sample Size and Power Considerations for Longitudinal Studies Outline Quantities required to determine the sample size in longitudinal studies Review of type I error, type II error, and power For continuous

More information

,..., θ(2),..., θ(n)

,..., θ(2),..., θ(n) Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.

More information

Heteroscedasticity and Autocorrelation

Heteroscedasticity and Autocorrelation Heteroscedasticity and Autocorrelation Carlo Favero Favero () Heteroscedasticity and Autocorrelation 1 / 17 Heteroscedasticity, Autocorrelation, and the GLS estimator Let us reconsider the single equation

More information

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

Monday 7 th Febraury 2005
