Lecture 3 Linear random intercept models

Example: Weight of Guinea Pigs

Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1, DLZ).

The response is measured at n different times, or under n different conditions. In the guinea pigs example the time of measurement is referred to as a "within-units" factor. For the pigs, n = 9.

Although the pigs example considers a single treatment factor, it is straightforward to extend the situation to one where the groups are formed as the result of a factorial design (for example, if the pigs were separated into males and females and then allocated to the diet groups).

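[Figure: Pig data, 48 pigs. Weight (kg) against week.]

A) Linear model with random intercept

$Y_{ij} = U_i + \beta_0 + \beta_1 t_j + \varepsilon_{ij}$, with $U_i \sim N(0, \nu^2)$ and $\varepsilon_{ij} \sim N(0, \tau^2)$

Intraclass correlation coefficient:

$\rho = \dfrac{\nu^2}{\nu^2 + \tau^2}$

where $\nu^2$ is the variance between pigs and $\tau^2$ is the variance within pigs.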

Pigs data model 1: OLS fit

    . regress weight time

          Source |       SS       df       MS              Number of obs =     432
    -------------+------------------------------           F(  1,   430) = 5757.41
           Model |  111060.882     1  111060.882           Prob > F      =  0.0000
        Residual |  8294.72677   430  19.2900622           R-squared     =  0.9305
    -------------+------------------------------           Adj R-squared =  0.9303
           Total |  119355.609   431  276.927167           Root MSE      =   4.392

          weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.209896   .0818409    75.88   0.000     6.049038    6.370754
           _cons |   19.35561   .4605447    42.03   0.000     18.45041    20.26081

OLS results

Pigs data model 1: IND fit

    . xtreg weight time, pa i(id) corr(ind)

    GEE population-averaged model                   Number of obs      =       432
    Group variable:                         Id      Number of groups   =        48
    Link:                             identity      Obs per group: min =         9
    Family:                           Gaussian                     avg =       9.0
    Correlation:                   independent                     max =         9
                                                    Wald chi2(1)       =   5784.19
    Scale parameter:                  19.20076      Prob > chi2        =    0.0000

    Pearson chi2(432):                 8294.73      Deviance           =   8294.73
    Dispersion (Pearson):             19.20076      Dispersion         =  19.20076

          weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.209896   .0816513    76.05   0.000     6.049862    6.369929
           _cons |   19.35561   .4594773    42.13   0.000     18.45505    20.25617

Independence correlation model results

Example: Weight of Pigs

For this type of repeated measures study we recognize two sources of random variation:

1. Between: There is heterogeneity between pigs, due for example to natural biological (genetic?) variation.
2. Within: There is random variation in the measurement process for a particular unit at any given time. For example, on any given day a particular guinea pig may yield different weight measurements due to differences in scale (equipment) and/or small fluctuations in weight during a day.

B) Marginal model with a uniform correlation structure

Model for the mean:

$E[Y_{ij}] = \beta_0 + \beta_1 t_j$

Model for the covariance matrix, where $Y_i = (Y_{i1}, \ldots, Y_{in})'$, $\mathbf{1}$ is a vector of ones and $I$ is the identity matrix:

$\mathrm{cov}(Y_i) = (\nu^2 + \tau^2)\,[\rho\,\mathbf{1}\mathbf{1}' + (1 - \rho)\,I]$

[Figure: Marginal model. Weight against time with fitted values; $E[Y_{ij}] = \beta_0 + \beta_1\,\text{time}$.]

[Figure: Random effects model. Weight against time with Xb and Xb + u[id]; $E[Y_{ij} \mid U_i] = \beta_0 + \beta_1\,\text{time} + U_i$ and $E[Y_{ij}] = \beta_0 + \beta_1\,\text{time}$.]

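Models A and B are equivalent

$E[Y_{ij} \mid U_i] = U_i + \beta_0 + \beta_1 t_j$

$E[Y_{ij}] = E[E[Y_{ij} \mid U_i]] = \beta_0 + \beta_1 t_j$

$\mathrm{cov}(Y_i) = \mathrm{cov}[E[Y_i \mid U_i]] + E[\mathrm{cov}[Y_i \mid U_i]]$

$\mathrm{cov}[E[Y_i \mid U_i]] = \mathrm{cov}(\mathbf{1} U_i) = \nu^2\,\mathbf{1}\mathbf{1}'$

$E[\mathrm{cov}[Y_i \mid U_i]] = E[\tau^2 I] = \tau^2 I$

$\mathrm{cov}(Y_i) = (\nu^2 + \tau^2)\,[\rho\,\mathbf{1}\mathbf{1}' + (1 - \rho)\,I]$, with $\sigma^2 = \nu^2 + \tau^2$

Pigs marginal model

    . xtreg weight time, pa i(id) corr(exch)

    Iteration 1: tolerance = 5.585e-15

    GEE population-averaged model                   Number of obs      =       432
    Group variable:                         Id      Number of groups   =        48
    Link:                             identity      Obs per group: min =         9
    Family:                           Gaussian                     avg =       9.0
    Correlation:                  exchangeable                     max =         9
                                                    Wald chi2(1)       =  25337.48
    Scale parameter:                  19.20076      Prob > chi2        =    0.0000

          weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.209896   .0390124   159.18   0.000     6.133433    6.286359
           _cons |   19.35561   .5974055    32.40   0.000     18.18472    20.52651

Population-average (marginal) model, exchangeable correlation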
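The equivalence of the two covariance structures can be checked numerically. The sketch below is only an illustration: the function names and the variance values (a between-unit variance `nu2` and a within-unit variance `tau2`) are made up for the example and are not estimates from the pigs data. It builds the covariance implied by the random intercept model and the uniform-correlation covariance of the marginal model, then compares them entry by entry.

```python
def random_intercept_cov(nu2, tau2, n):
    """Covariance implied by model A: nu2 * 11' + tau2 * I."""
    return [[nu2 + (tau2 if j == k else 0.0) for k in range(n)]
            for j in range(n)]


def uniform_corr_cov(sigma2, rho, n):
    """Covariance of model B: sigma2 * [rho * 11' + (1 - rho) * I]."""
    return [[sigma2 * (1.0 if j == k else rho) for k in range(n)]
            for j in range(n)]


# Hypothetical variance components, chosen only for illustration.
nu2, tau2, n = 4.0, 1.0, 3
sigma2 = nu2 + tau2      # total variance
rho = nu2 / sigma2       # intraclass correlation

cov_a = random_intercept_cov(nu2, tau2, n)
cov_b = uniform_corr_cov(sigma2, rho, n)

# Largest entrywise difference between the two matrices.
max_diff = max(abs(cov_a[j][k] - cov_b[j][k])
               for j in range(n) for k in range(n))
print(cov_a)
print(max_diff)
```

With these inputs both matrices have the total variance on the diagonal and the between-unit variance everywhere else, so the difference should be zero up to rounding.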

Pigs RE model

    . xtreg weight time, re i(id) mle

    Random-effects ML regression                    Number of obs      =       432
    Group variable (i): Id                          Number of groups   =        48

    Random effects u_i ~ Gaussian                   Obs per group: min =         9
                                                                   avg =       9.0
                                                                   max =         9

                                                    LR chi2(1)         =   1624.57
    Log likelihood  = -1014.9268                    Prob > chi2        =    0.0000

          weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.209896   .0390124   159.18   0.000     6.133433    6.286359
           _cons |   19.35561   .5974055    32.40   0.000     18.18472    20.52651
    -------------+----------------------------------------------------------------
        /sigma_u |    3.84935   .4058114                      3.130767    4.732863
        /sigma_e |   2.093625   .0755471                       1.95067    2.247056
             rho |    .771714   .0393959                      .6876303    .8413114

Pigs data model 1: GEE fit

    . xtgee weight time, i(id) corr(exch)

    Iteration 1: tolerance = 5.585e-15

    GEE population-averaged model                   Number of obs      =       432
    Group variable:                         Id      Number of groups   =        48
    Link:                             identity      Obs per group: min =         9
    Family:                           Gaussian                     avg =       9.0
    Correlation:                  exchangeable                     max =         9
                                                    Wald chi2(1)       =  25337.48
    Scale parameter:                  19.20076      Prob > chi2        =    0.0000

          weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.209896   .0390124   159.18   0.000     6.133433    6.286359
           _cons |   19.35561   .5974055    32.40   0.000     18.18472    20.52651

GEE fit: marginal model, exchangeable correlation
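As a quick arithmetic check on the random-effects output, the sketch below recomputes the total variance and the intraclass correlation from the reported /sigma_u and /sigma_e. The variable names are chosen here for illustration; the only inputs are the two standard deviations copied from the output.

```python
# Standard deviations copied from the xtreg, re mle output.
sigma_u = 3.84935    # between-pig standard deviation (/sigma_u)
sigma_e = 2.093625   # within-pig standard deviation (/sigma_e)

between_var = sigma_u ** 2
within_var = sigma_e ** 2

# Total variance: compare with "Scale parameter" in the GEE output.
total_var = between_var + within_var

# Intraclass correlation: compare with "rho" in the RE output and with
# the off-diagonal entries of the exchangeable working correlation.
icc = between_var / total_var

print(total_var, icc)
```

The two printed values can be compared with the scale parameter and with rho reported in the outputs; close agreement is what the equivalence of models A and B would lead one to expect.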

Pigs data model 1: GEE fit

    . xtgee weight time, i(id) corr(exch)
    . xtcorr

    Estimated within-id correlation matrix R:

            c1      c2      c3      c4      c5      c6      c7      c8      c9
    r1  1.0000
    r2  0.7717  1.0000
    r3  0.7717  0.7717  1.0000
    r4  0.7717  0.7717  0.7717  1.0000
    r5  0.7717  0.7717  0.7717  0.7717  1.0000
    r6  0.7717  0.7717  0.7717  0.7717  0.7717  1.0000
    r7  0.7717  0.7717  0.7717  0.7717  0.7717  0.7717  1.0000
    r8  0.7717  0.7717  0.7717  0.7717  0.7717  0.7717  0.7717  1.0000
    r9  0.7717  0.7717  0.7717  0.7717  0.7717  0.7717  0.7717  0.7717  1.0000

GEE fit: marginal model, exchangeable correlation

One group polynomial growth curve model

Similarly, if you want to fit a quadratic curve:

$E[Y_{ij} \mid U_i] = U_i + \beta_0 + \beta_1 t_j + \beta_2 t_j^2$

Pigs RE model, quadratic trend

    . gen timesq = time*time
    . xtreg weight time timesq, re i(id) mle

    Random-effects ML regression                    Number of obs      =       432
    Group variable (i): Id                          Number of groups   =        48

    Random effects u_i ~ Gaussian                   Obs per group: min =         9
                                                                   avg =       9.0
                                                                   max =         9

                                                    LR chi2(2)         =   1625.32
    Log likelihood  = -1014.5524                    Prob > chi2        =    0.0000

          weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.358818   .1763799    36.05   0.000      6.01312    6.704516
          timesq |  -.0148922    .017202    -0.87   0.387    -.0486075    .0188231
           _cons |   19.08259    .675483    28.25   0.000     17.75867    20.40651
    -------------+----------------------------------------------------------------
        /sigma_u |   3.849473   .4057983                      3.130909    4.732951
        /sigma_e |   2.091585   .0754733                      1.948769    2.244866
             rho |   .7720686   .0393503                      .6880712    .8415775

[Figure: Random effects model, quadratic trend. Weight against time with Xb and Xb + u[id]; $E[Y_{ij} \mid U_i] = \beta_0 + \beta_1\,\text{time} + \beta_2\,\text{time}^2 + U_i$ and $E[Y_{ij}] = ?$]
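One way to judge whether the quadratic term is needed is a likelihood ratio test between the linear and quadratic random intercept fits. The sketch below is an illustration added here, not part of the original slides: it uses the two log likelihoods reported in the ML outputs (-1014.9268 for the linear trend, -1014.5524 for the quadratic trend) and assumes the usual chi-square reference distribution with 1 degree of freedom, whose upper tail can be written with the complementary error function.

```python
import math

# Log likelihoods copied from the two xtreg, re mle outputs.
loglik_linear = -1014.9268      # weight on time
loglik_quadratic = -1014.5524   # weight on time and timesq

# Likelihood ratio statistic for adding the single parameter on timesq.
lr_stat = 2.0 * (loglik_quadratic - loglik_linear)

# For a chi-square variable with 1 degree of freedom,
# P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

print(lr_stat, p_value)
```

The statistic is small and the p-value is far from conventional significance levels, which agrees with the Wald test on timesq in the quadratic output (z = -0.87, p = 0.387).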

Pigs marginal model, quadratic trend

    . xtgee weight time timesq, i(id) corr(exch)

    GEE population-averaged model                   Number of obs      =       432
    Group variable:                         Id      Number of groups   =        48
    Link:                             identity      Obs per group: min =         9
    Family:                           Gaussian                     avg =       9.0
    Correlation:                  exchangeable                     max =         9
                                                    Wald chi2(2)       =  25387.68
    Scale parameter:                  19.19317      Prob > chi2        =    0.0000

          weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.358818   .1763801    36.05   0.000     6.013119    6.704517
          timesq |  -.0148922    .017202    -0.87   0.387    -.0486076    .0188231
           _cons |   19.08259   .6754833    28.25   0.000     17.75867    20.40651

    . xtcorr

    Estimated within-id correlation matrix R:

            c1      c2      c3      c4      c5      c6      c7      c8      c9
    r1  1.0000
    r2  0.7721  1.0000
    r3  0.7721  0.7721  1.0000
    r4  0.7721  0.7721  0.7721  1.0000
    r5  0.7721  0.7721  0.7721  0.7721  1.0000
    r6  0.7721  0.7721  0.7721  0.7721  0.7721  1.0000
    r7  0.7721  0.7721  0.7721  0.7721  0.7721  0.7721  1.0000
    r8  0.7721  0.7721  0.7721  0.7721  0.7721  0.7721  0.7721  1.0000
    r9  0.7721  0.7721  0.7721  0.7721  0.7721  0.7721  0.7721  0.7721  1.0000

GEE fit: marginal model, exchangeable correlation

Pigs marginal model: AR(1)

    . xtgee weight time, i(id) corr(ar1) t(time)

    GEE population-averaged model                   Number of obs      =       432
    Group and time vars:               Id time      Number of groups   =        48
    Link:                             identity      Obs per group: min =         9
    Family:                           Gaussian                     avg =       9.0
    Correlation:                         AR(1)                     max =         9
                                                    Wald chi2(1)       =   6254.91
    Scale parameter:                  19.26754      Prob > chi2        =    0.0000

          weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.272089   .0793052    79.09   0.000     6.116654    6.427524
           _cons |   18.84218   .6745715    27.93   0.000     17.52004    20.16431

GEE fit: marginal model with AR(1) correlation structure

Pigs RE model: AR(1)

    . xtregar weight time

    RE GLS regression with AR(1) disturbances       Number of obs      =       432
    Group variable (i): Id                          Number of groups   =        48

    R-sq:  within  = 0.9851                         Obs per group: min =         9
           between = 0.0000                                        avg =       9.0
           overall = 0.9305                                        max =         9

                                                    Wald chi2(2)       =  12688.55
    corr(u_i, Xb)      = 0 (assumed)                Prob > chi2        =    0.0000

          weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            time |   6.257651   .0555527   112.64   0.000      6.14877    6.366533
           _cons |   19.00945   .6281622    30.26   0.000     17.77827    20.24062
    -------------+----------------------------------------------------------------
          rho_ar |  .73091237   (estimated autocorrelation coefficient)
         sigma_u |   3.583343
         sigma_e |  1.5590851
         rho_fov |  .84082696   (fraction of variance due to u_i)
           theta |  .60838037
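To see how an AR(1) correlation structure differs from the exchangeable one, it can help to write both matrices out. The sketch below is a hedged illustration: the helper names are invented here, and the value 0.73 is only borrowed from the rho_ar estimate as a convenient example (the GEE working correlation for the AR(1) fit is not shown in the output). Under AR(1) the correlation between measurements j and k is rho raised to the power |j - k|, so it decays with the time gap, while under the exchangeable structure it is the same for every pair.

```python
def ar1_corr(rho, n):
    """AR(1) correlation matrix: entry (j, k) equals rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]


def exchangeable_corr(rho, n):
    """Exchangeable correlation matrix: rho off the diagonal, 1 on it."""
    return [[1.0 if j == k else rho for k in range(n)] for j in range(n)]


# Illustrative value only (assumption), with 9 weekly measurements.
rho, n = 0.73, 9
ar1 = ar1_corr(rho, n)
exch = exchangeable_corr(rho, n)

# First row of each matrix: correlation of week 1 with weeks 1..9.
print([round(value, 3) for value in ar1[0]])
print([round(value, 3) for value in exch[0]])
```

The first printed row falls geometrically with the lag, whereas the second stays constant after the leading 1, which is the practical difference between the two working correlation choices.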

[Figure: Random effects model, exchangeable correlation. Weight against time with Xb and Xb + u[id]; $E[Y_{ij} \mid U_i] = \beta_0 + \beta_1\,\text{time} + U_i$ and $E[Y_{ij}] = \beta_0 + \beta_1\,\text{time}$.]

Important Points

Modelling the correlation in longitudinal data is important in order to obtain correct inferences on the regression coefficients.

In the linear case, random effects models and marginal models correspond to each other, because the regression coefficients have the same interpretation as in standard linear regression.