Lab 4: Two-level Random Intercept Model

BIO 656 Lab4 2009

Data: Peak expiratory flow rate (pefr) measured twice, using two different instruments, for 17 subjects (from Chapter 1 of Multilevel and Longitudinal Modeling Using Stata).

Goals:
1. Review how to fit a random intercept model using xtreg, xtmixed and gllamm.
2. Interpret parameters in a random intercept model.
3. Model measurement error with a random intercept model.
4. Obtain predictions from a multilevel model.

PART I Exploratory Data Analysis

Data Structure:

         +----------------------------+
         | id   wp1   wp2   wm1   wm2 |
         |----------------------------|
      1. |  1   494   490   512   525 |
      2. |  2   395   397   430   415 |
      3. |  3   516   512   520   508 |
      4. |  4   434   401   428   444 |
      5. |  5   476   470   500   500 |

Variables
    id:  subject id
    wp1: Wright peak, occasion 1
    wp2: Wright peak, occasion 2
    wm1: Mini Wright, occasion 1
    wm2: Mini Wright, occasion 2

The dataset is in wide format. Repeated measurements of wp and wm are nested within subject. There are no missing data.

Exploratory Analysis (we will only work with wm for now):

First, calculate the overall mean lung function and store it in a local macro, wm_mean.

    . generate mean_wm = (wm1+wm2)/2
    . summarize mean_wm

        Variable |   Obs        Mean    Std. Dev.       Min        Max
    -------------+------------------------------------------------------
         mean_wm |    17    453.9118    111.2912     243.5        650

    . local wm_mean = r(mean)

Let's display the repeated Mini Wright meter measures of lung function for each subject, together with the overall mean lung function.

    . twoway (scatter wm1 id, msymbol(circle)) (scatter wm2 id, msymbol(circle_hollow)), xtitle(Subject Id) ytitle(Mini Wright Measurements) legend(order(1 "Occasion 1" 2 "Occasion 2")) yline(`wm_mean')

[Figure: scatter plot of Mini Wright measurements (y-axis, 200 to 700) against subject id (x-axis, 0 to 20), with separate markers for occasion 1 and occasion 2 and a horizontal line at the overall mean.]

Measurements taken from the same person cluster together. The means of the two observations for each individual appear to be scattered roughly normally around the overall mean.

Might this suggest a subject-level random intercept model?

(1) For an individual $i$, the two repeated Mini Wright values ($y_{i1}$ and $y_{i2}$) are trying to capture the same true peak expiratory flow rate ($\beta_i$), which is unobservable.

(2) Let's assume that what we actually measured is the true value ($\beta_i$) plus some random (measurement) error ($\varepsilon_{ij}$). So
    $y_{ij} = \beta_i + \varepsilon_{ij}$

(3) Note that this looks like our typical random-intercept model:
    $y_{ij} = \beta + v_i + \varepsilon_{ij}$, where $\beta_i = \beta + v_i$.
By writing $\beta_i$ this way, we also allow the model to accommodate pefr from different people.

(4) Now let's include the random components of our model:

    A measurement error distribution that is identical for each individual:
        $\varepsilon_{ij} \sim \text{Normal}(0, \sigma^2)$
    A distribution describing the variation in the true pefr in the population:
        $v_i \sim \text{Normal}(0, \tau^2)$

(5) Our final model:
    $y_{ij} = \beta + v_i + \varepsilon_{ij}$, $\varepsilon_{ij} \sim \text{Normal}(0, \sigma^2)$, $v_i \sim \text{Normal}(0, \tau^2)$

Note that here $\beta$ can be interpreted as the average true pefr in the population (similar to the red line in the above graph). How do the other model parameters, $\tau$ and $\sigma$, show up in the scatter plot above?

Reshape Data

We need to reshape the data to long format for the data analysis.

    . reshape long wm wp, i(id) j(occasion)
    (note: j = 1 2)

    Data                               wide   ->   long
    -----------------------------------------------------------------------------
    Number of obs.                       17   ->      34
    Number of variables                   5   ->       4
    j variable (2 values)                     ->   occasion
    xij variables:
                                    wm1 wm2   ->   wm
                                    wp1 wp2   ->   wp
    -----------------------------------------------------------------------------

         +---------------------+
         | id   occas~n    wm  |
         |---------------------|
      1. |  1         1   512  |   (i = 1, j = 1)
      2. |  1         2   525  |   (i = 1, j = 2)
      3. |  2         1   430  |   (i = 2, j = 1)
      4. |  2         2   415  |   (i = 2, j = 2)
      5. |  3         1   520  |

More Exploratory Analysis:

Let's check some of the distributional assumptions (note that we only have 17 people).

(1) Check $v_i \sim \text{Normal}(0, \tau^2)$:

    . sort id
    . by id: egen mean_wm = mean(wm)
    . hist mean_wm, norm

(2) Check $\varepsilon_{ij} \sim \text{Normal}(0, \sigma^2)$:

    . gen wm_resid = wm-mean_wm
    . hist wm_resid, norm

[Figure: two histograms with overlaid normal densities, one of mean_wm (x-axis 200 to 700) and one of wm_resid (x-axis -50 to 50).]

PART II Fitting the Model and Interpretation

Fitting the random intercept model with xtreg

    . xtreg wm, i(id) mle

    Iteration 0:   log likelihood = -187.89003
    Iteration 1:   log likelihood = -184.95979
    Iteration 2:   log likelihood = -184.76189
    Iteration 3:   log likelihood = -184.5855
    Iteration 4:   log likelihood = -184.5784
    Iteration 5:   log likelihood = -184.57839

    Random-effects ML regression                    Number of obs      =        34
    Group variable (i): id                          Number of groups   =        17

    Random effects u_i ~ Gaussian                   Obs per group: min =         2
                                                                   avg =       2.0
                                                                   max =         2

                                                    Wald chi2(0)       =      0.00
    Log likelihood  = -184.57839                    Prob > chi2        =         .

    ------------------------------------------------------------------------------
              wm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |   453.9118   26.18616    17.33   0.000     402.5878    505.2357
    -------------+----------------------------------------------------------------
        /sigma_u |   107.0464   18.67858                       76.0406    150.6949
        /sigma_e |   19.91083   3.414659                       14.2269     27.8656
             rho |   .9665602   .0159494                       .910943    .9878545
    ------------------------------------------------------------------------------
    Likelihood-ratio test of sigma_u=0: chibar2(01)=   46.27 Prob>=chibar2 = 0.000

Does the estimate of $\beta$ (_cons) = 453.9118 look familiar?
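The entries in the xtreg output are tied together by simple arithmetic. The sketch below, written in Python rather than Stata, recomputes the standard error, z statistic, and 95% confidence interval for _cons, plus rho, from the reported sigma_u and sigma_e. It assumes the balanced-design formula sqrt((sigma_u^2 + sigma_e^2/n)/J) for the standard error of the intercept, with n = 2 occasions and J = 17 subjects; the variable names are illustrative and not Stata's.

```python
import math

# Estimates read off the xtreg, mle output
beta_hat = 453.9118      # _cons
sigma_u = 107.0464       # SD of the random intercepts
sigma_e = 19.91083       # within-subject (residual) SD
n_subjects, n_occasions = 17, 2

# Standard error of the intercept in a balanced design (assumed formula)
se_beta = math.sqrt((sigma_u**2 + sigma_e**2 / n_occasions) / n_subjects)

# z statistic and normal-approximation 95% confidence interval
z = beta_hat / se_beta
ci_low = beta_hat - 1.96 * se_beta
ci_high = beta_hat + 1.96 * se_beta

# Share of the total variance that is between subjects
rho = sigma_u**2 / (sigma_u**2 + sigma_e**2)

print(f"se = {se_beta:.3f}, z = {z:.2f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"rho = {rho:.4f}")
```

Up to rounding (Stata uses a slightly more precise normal quantile than 1.96), these values should agree with the _cons row and the rho row of the output.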

In the output above, $\rho$ (rho) can be interpreted as either

the proportion of the total variance that is between subjects (or due to subjects):

    $\rho = \frac{\text{between variance}}{\text{total variance}} = \frac{\mathrm{Var}(v_i)}{\mathrm{Var}(y_{ij})} = \frac{\tau^2}{\tau^2 + \sigma^2}$

or the correlation between the measurements on different occasions for the same subject (intra-class correlation):

    $\rho = \mathrm{Corr}(y_{ij}, y_{ij'}) = \frac{\mathrm{Cov}(y_{ij}, y_{ij'})}{\sqrt{\mathrm{Var}(y_{ij})}\,\sqrt{\mathrm{Var}(y_{ij'})}} = \frac{\tau^2}{\sqrt{\tau^2 + \sigma^2}\,\sqrt{\tau^2 + \sigma^2}} = \frac{\tau^2}{\tau^2 + \sigma^2}$

This can be a little confusing because the covariance between measurements on different occasions for the same subject is $\tau^2$, which is itself the between-subject variance (sigma_u squared in the Stata output).

Interpretations

Notice that $\rho$ = .966 is very high! The repeated observations within individuals are highly correlated, and the proportion of the total variance that is between subjects is very large.

/sigma_u is 107.05, the estimate of the standard deviation of the random intercepts. Hence we expect about 95% of the random intercepts to fall within 200 (= approximately 107.05*2) units on either side of the estimated overall mean, 453.91, or in other words, between 250 and 650.

The estimated within-subject standard deviation is /sigma_e = 19.9. Hence we expect 95% of the repeated observations on an individual to fall within 40 (= approximately 19.9*2) units of the subject-specific mean.

The results from xtreg, mle are equivalent to those from xtmixed, mle. The difference between the two is that xtreg is designed for cross-sectional time-series linear regression and can only fit a random intercept. xtmixed, on the other hand, is designed for multilevel mixed-effects linear regression and can fit random coefficients and multiple levels of random effects.

Fitting the random intercept model with xtmixed

    . xtmixed wm || id:, mle

    Performing EM optimization:

    Performing gradient-based optimization:

    Iteration 0:   log likelihood = -184.57839
    Iteration 1:   log likelihood = -184.57839

    Computing standard errors:

    Mixed-effects ML regression                     Number of obs      =        34
    Group variable: id                              Number of groups   =        17

                                                    Obs per group: min =         2
                                                                   avg =       2.0
                                                                   max =         2

                                                    Wald chi2(0)       =         .
    Log likelihood = -184.57839                     Prob > chi2        =         .

    ------------------------------------------------------------------------------
              wm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |   453.9118   26.18616    17.33   0.000     402.5878    505.2357
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    id: Identity                 |
                       sd(_cons) |   107.0464   18.67857      76.0406    150.6949
    -----------------------------+------------------------------------------------
                    sd(Residual) |   19.91083   3.414679     14.22688    27.86565
    ------------------------------------------------------------------------------
    LR test vs. linear regression: chibar2(01) =    46.27 Prob >= chibar2 = 0.0000

Fitting the random intercept model with gllamm

    . gllamm wm, i(id) nip(12) adapt

    Running adaptive quadrature
    Iteration 0:    log likelihood = -207.70
    Iteration 1:    log likelihood = -205.79654
    Iteration 2:    log likelihood = -185.7467
    Iteration 3:    log likelihood = -184.63453
    Iteration 4:    log likelihood = -184.57846
    Iteration 5:    log likelihood = -184.5784

    Adaptive quadrature has converged, running Newton-Raphson
    Iteration 0:   log likelihood = -184.5784
    Iteration 1:   log likelihood = -184.57839

    number of level 1 units = 34
    number of level 2 units = 17

    Condition Number = 15.64774

    gllamm model

    log likelihood = -184.57839

    ------------------------------------------------------------------------------
              wm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |   453.9116   26.18394    17.34   0.000      402.592    505.2312
    ------------------------------------------------------------------------------

    Variance at level 1
    ------------------------------------------------------------------------------
      396.70879 (136.11609)

    Variances and covariances of random effects
    ------------------------------------------------------------------------------
    ***level 2 (id)
        var(1): 11456.88 (3997.7689)
    ------------------------------------------------------------------------------
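To compare the gllamm estimates with sd(_cons) and sd(Residual) from xtmixed, the variances that gllamm prints need to be put on the standard deviation scale. A minimal sketch of that conversion, in Python rather than Stata, using the two variance estimates as printed above (the variable names are illustrative):

```python
import math

# Variance estimates as printed in the gllamm output
var_level1 = 396.70879   # level-1 (residual) variance
var_level2 = 11456.88    # level-2 (random intercept) variance

# Square roots put them on the same scale as xtreg / xtmixed
sd_level1 = math.sqrt(var_level1)
sd_level2 = math.sqrt(var_level2)

print(f"level-1 SD = {sd_level1:.2f}  (compare with sd(Residual) = 19.91)")
print(f"level-2 SD = {sd_level2:.2f}  (compare with sd(_cons) = 107.05)")
```

The square roots land very close to the xtmixed standard deviations; the small differences are consistent with gllamm maximizing the likelihood by numerical quadrature.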

Note that gllamm returns variances and not standard deviations.

PART III Prediction

Goal 1: What is our best estimate of each subject's true peak expiratory flow rate?

Recall the model we constructed:

    $y_{ij} = \beta + v_i + \varepsilon_{ij}$, $\varepsilon_{ij} \sim \text{Normal}(0, \sigma^2)$, $v_i \sim \text{Normal}(0, \tau^2)$

So we'd like to obtain the estimated value of $\beta + v_i$ for each individual. $\beta$ is given in the output, so we need to extract the $v_i$'s.

Estimating the random intercepts using empirical Bayes and gllamm:

    . gllapred eb, u
    (means and standard deviations will be stored in ebm1 ebs1)
    Non-adaptive log-likelihood: -0.5846 -45.1480 -5.1857 -11.35 -199.5193 -190.8173 -186.50 -184.7457 -184.5784 -184.5784
    log-likelihood:-184.57839

Empirical Bayes estimate of the subject-specific mean, i.e. $\beta + v_i$:

    . gllapred eb, linpred
    (linear predictor will be stored in eb)
    Non-adaptive log-likelihood: -0.5846 -45.1480 -5.1857 -11.35 -199.5193 -190.8173 -186.50 -184.7457 -184.5784 -184.5784
    log-likelihood:-184.57839

    . reshape wide wm wp eb ebm1 ebs1, i(id) j(occasion)
    (note: j = 1 2)

    Data                               long   ->   wide
    -----------------------------------------------------------------------------
    Number of obs.                       34   ->      17
    Number of variables                   8   ->      12
    j variable (2 values)          occasion   ->   (dropped)
    xij variables:
                                         wm   ->   wm1 wm2
                                         wp   ->   wp1 wp2
                                         eb   ->   eb1 eb2
                                       ebm1   ->   ebm11 ebm12
                                       ebs1   ->   ebs11 ebs12
    -----------------------------------------------------------------------------

Let's plot the estimated peak expiratory flow rate:

    . twoway (scatter wm1 id, msymbol(circle)) (scatter wm2 id, msymbol(circle_hollow)) (scatter eb1 id, msymbol(x)), xtitle(Subject Id) ytitle(Mini Wright Measurements) legend(order(1 "Occasion 1" 2 "Occasion 2" 3 "EB Subject-Spec Intercept")) yline(`wm_mean')

[Figure: scatter plot of Mini Wright measurements (y-axis, 200 to 700) against subject id (x-axis, 0 to 20), with markers for occasion 1, occasion 2, and the EB subject-specific intercept (x).]

Note that the estimated peak expiratory flow rates (x) do not always fall between the measurements at occasion 1 and occasion 2! Why? (Hint: look at subjects 6 and 13.)

Let's check our model assumptions again with the estimated intercepts and residuals:

    . hist eb, norm
    . gen eb_resid = wm-eb
    . hist eb_resid, norm

[Figure: two histograms with overlaid normal densities, one of eb (x-axis 200 to 700) and one of eb_resid (x-axis -50 to 50).]
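One way to see why an empirical Bayes estimate can land outside a subject's two readings is to compute the shrinkage by hand. The sketch below, in Python rather than Stata, assumes the standard result for a random intercept model with known parameters: the EB prediction pulls the subject mean toward the overall mean by the factor tau^2/(tau^2 + sigma^2/n). The parameter values come from the xtreg output; the two readings are hypothetical and do not belong to any subject in the data.

```python
# Estimates from the xtreg / xtmixed output
beta_hat = 453.9118      # overall mean
sigma_u = 107.0464       # SD of the random intercepts (tau)
sigma_e = 19.91083       # within-subject SD (sigma)

def shrinkage_factor(n_readings):
    """Weight given to the subject's own mean (assumed EB formula)."""
    return sigma_u**2 / (sigma_u**2 + sigma_e**2 / n_readings)

def eb_subject_mean(readings):
    """Empirical Bayes prediction of beta + v_i for one subject."""
    subject_mean = sum(readings) / len(readings)
    weight = shrinkage_factor(len(readings))
    return beta_hat + weight * (subject_mean - beta_hat)

# Hypothetical subject: two readings close together, far above the overall mean
readings = [648.0, 652.0]
shrinkage = shrinkage_factor(len(readings))
eb = eb_subject_mean(readings)

print(f"shrinkage factor = {shrinkage:.4f}")
print(f"EB prediction = {eb:.2f}, readings = {readings}")
```

With a shrinkage factor just under 1, the prediction moves only slightly toward the overall mean, but for a subject whose two readings are close together and far from 453.91 that small move is enough to place the estimate below both readings.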

Goal 2: Based on our model, can we make predictions about a future observation, either a new measurement taken from an existing subject or a new measurement from a new subject?

Extra

The random effect model above is motivated by measurement error. It is similar to the usual LDA setting, where we can view the data as:

[Figure: wm (y-axis, 200 to 700) plotted against occasion (x-axis, 1 to 2).]

To incorporate both wp and wm measurements in a model, we can use a three-level random effect model:

    Subject (level 3)
    Method (level 2)
    Repeated measurements (level 1)

See textbook.
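Returning to Goal 2, one rough way to think about the two kinds of prediction is sketched below in Python rather than Stata. It treats the estimated parameters as known (so it ignores their sampling uncertainty) and assumes the usual normal-model results: a new subject's measurement varies around the overall mean with variance tau^2 + sigma^2, while a new measurement on an existing subject varies around that subject's EB mean with variance equal to the posterior variance of v_i plus sigma^2. The function and variable names are illustrative.

```python
import math

# Estimates from the xtreg / xtmixed output, treated as known
beta_hat = 453.9118
sigma_u = 107.0464       # tau
sigma_e = 19.91083       # sigma
n_occasions = 2          # readings already observed per existing subject

# New subject: between-subject plus within-subject variability
sd_new_subject = math.sqrt(sigma_u**2 + sigma_e**2)

# Existing subject: remaining uncertainty about v_i plus measurement error
shrinkage = sigma_u**2 / (sigma_u**2 + sigma_e**2 / n_occasions)
posterior_var = sigma_u**2 * (1 - shrinkage)
sd_existing_subject = math.sqrt(posterior_var + sigma_e**2)

def approx_interval(center, sd, z=1.96):
    """Approximate 95% prediction interval under normality."""
    return center - z * sd, center + z * sd

low, high = approx_interval(beta_hat, sd_new_subject)
print(f"new subject: SD = {sd_new_subject:.2f}, interval = ({low:.1f}, {high:.1f})")
print(f"existing subject: SD around the EB mean = {sd_existing_subject:.2f}")
```

The contrast between the two standard deviations reflects the high intra-class correlation: once two readings on a subject are available, a further reading on that subject is far more predictable than a reading on someone new.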