Multilevel Modeling of Non-Normal Data. Don Hedeker Department of Public Health Sciences University of Chicago.


1 Multilevel Modeling of Non-Normal Data
Don Hedeker, Department of Public Health Sciences, University of Chicago
Hedeker, D. (2005). Generalized linear mixed models. In B. Everitt & D. Howell (Eds.), Encyclopedia of Statistics in Behavioral Science. Wiley.

2 What are Multilevel Data?
- Data that are hierarchically structured, nested, clustered
- Data collected from units organized or observed within units at a higher level (from which data are also obtained)
    data collected on        who are clustered within
    students                 classrooms
    siblings                 families
    repeated observations    individuals
  ==> these are examples of two-level data
- level 1 (students): measurement of primary outcome and important mediating variables
- level 2 (classrooms): provides context or organization of level-1 units which may influence outcome; other mediating variables

3 What is Multilevel Data Analysis?
"any set of analytical procedures that involve data gathered from individuals and from the social structure in which they are embedded and are analyzed in a manner that models the multilevel structure" (L. Burstein, Units of Analysis, 1985, Int Ency of Educ)
Analysis that models the multilevel structure recognizes the influence of structure on individual outcome:
    structure     may influence response from
    classroom     students
    family        siblings
    individual    repeated observations

4 Why do Multilevel Data Analysis?
- assess amount of variability due to each level (e.g., family variance and individual variance)
- model level-1 outcome in terms of effects at both levels: individual var. = fn(individual var. + family var.)
- assess interaction between level effects (e.g., individual outcome influenced by family SES for males, not females)
- responses are not independent: individuals within clusters share influencing factors
Multilevel analysis is another example of the Golden Rule of Statistics: one person's error term is another person's (or many persons') career

5 Multilevel models, aka
- random-effects models
- random-coefficient models
- mixed-effects models
- hierarchical linear models
Useful for analyzing
- Clustered data: subjects (level-1) within clusters (level-2), e.g., clinics, hospitals, families, worksites, schools, classrooms, city wards
- Longitudinal data: repeated obs. (level-1) within subjects (level-2)

6 Layout of clustered data (one row per subject)
    columns: cluster | subject | cluster variables (tx group, size) | subject variables (outcome, sex, age)
    i = 1, ..., N clusters
    j = 1, ..., n_i subjects in cluster i

7 Layout of longitudinal data (one row per timepoint)
    columns: subject | time | time-invariant variables (tx group, sex, age) | time-varying variables (outcome, dose)
    i = 1, ..., N subjects
    j = 1, ..., n_i timepoints for subject i

8 Multilevel models for categorical outcomes
- dichotomous outcomes: mixed-effects logistic regression
- ordinal outcomes: mixed-effects ordinal logistic regression
  - proportional odds model
  - partial or non-proportional odds model
- nominal outcomes: mixed-effects nominal logistic regression
- discrete or grouped time-to-event data: mixed-effects dichotomous or ordinal regression
  - complementary log-log link for proportional (and non-proportional) hazards models

9 Logistic Regression Model
$\log\left[\frac{P(Y_i = 1)}{1 - P(Y_i = 1)}\right] = x_i'\beta$
- Dichotomous outcome (Y = 0 absence, Y = 1 presence).
- The function that links probabilities to regressors is the logit (or log odds) function $\log[P/(1 - P)]$. The logit is called the link function.
- The model can be written in terms of probabilities: $P(Y_i = 1) = \frac{1}{1 + \exp(-x_i'\beta)}$
- The model is a linear model for the logits, not for the probabilities. Logits can take on any value between negative and positive infinity; probabilities can only take on values between 0 and 1.
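The relationship between the logit and the probability on this slide can be checked numerically. The following is a minimal Python sketch; the function names and the coefficient values are illustrative assumptions and do not come from the slides.

```python
import math

def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Probability implied by a logit z: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept and slope, chosen only for illustration.
beta0, beta1 = -1.0, 0.5

for x in (0, 1, 2):
    z = beta0 + beta1 * x      # linear predictor: linear on the logit scale
    p = inv_logit(z)           # probability: always strictly between 0 and 1
    odds = p / (1.0 - p)       # odds = exp(z); each unit of x multiplies them by exp(beta1)
    print(x, round(z, 3), round(p, 3), round(odds, 3))
```

The loop makes the point of the slide concrete: the linear predictor moves in equal steps, while the probabilities do not.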


11 The model can also be written in terms of the odds:
$\frac{P(Y_i = 1)}{1 - P(Y_i = 1)} = \exp(x_i'\beta)$
- $\exp(\beta)$ = multiplicative change in the odds for Y per unit change of x
- $\beta = 0$ yields no effect on the odds
- $\beta > 0$ increases the odds that Y is present with increasing x
- $\beta < 0$ decreases the odds that Y is present with increasing x

12 Dichotomous Response and Threshold Concept
- Continuous $y_i$, an unobservable latent variable, is related to the dichotomous response $Y_i$ via the threshold concept
- A response occurs ($Y_i = 1$) if $\gamma < y_i$; otherwise, a response does not occur ($Y_i = 0$)

13 The Threshold Concept in Practice
How was your day? (What is your satisfaction level today?)
Satisfaction may be continuous, but we usually emit a dichotomous response.

14 Model for Latent Continuous Responses
Consider the model with p covariates for the latent response strength $y_i$ (i = 1, 2, ..., N):
$y_i = x_i'\beta + \varepsilon_i$
- probit: $\varepsilon_i \sim$ standard normal (mean = 0, variance = 1)
- logistic: $\varepsilon_i \sim$ standard logistic (mean = 0, variance = $\pi^2/3$)
- $\beta$ estimates from logistic regression are larger (in abs. value) than from probit regression by a factor of approximately $\sqrt{\pi^2/3} \approx 1.8$
The underlying latent variable is a useful way of thinking of the problem, but it is not an essential assumption of the model.
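The scaling factor between logistic and probit coefficients is the ratio of the two error standard deviations. A short Python check of that arithmetic (variable names are illustrative):

```python
import math

logistic_sd = math.sqrt(math.pi ** 2 / 3)   # SD of the standard logistic distribution
probit_sd = 1.0                              # SD of the standard normal distribution

scale = logistic_sd / probit_sd              # approximate ratio of logistic to probit betas
print(round(scale, 3))                       # 1.814
```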

15 Random-intercept Logistic Regression Model
Consider the model with p covariates for the response $Y_{ij}$ for subject j (j = 1, 2, ..., $n_i$) in cluster i (i = 1, 2, ..., N):
$\log\left[\frac{P(Y_{ij} = 1)}{1 - P(Y_{ij} = 1)}\right] = x_{ij}'\beta + \upsilon_{0i}$
where
- $Y_{ij}$ = dichotomous response for subject j in cluster i
- $x_{ij}$ = $(p + 1) \times 1$ covariate vector (includes 1 for intercept)
- $\beta$ = $(p + 1) \times 1$ vector of unknown parameters
- $\upsilon_{0i}$ = cluster effects distributed $NID(0, \sigma^2_\upsilon)$ and assumed independent of the x variables

16 Characteristics of $\upsilon_{0i} \sim NID(0, \sigma^2_\upsilon)$
- separates the model from the usual (fixed-effects) multiple logistic regression model
- takes on i = 1, 2, ..., N values
- assesses the impact of cluster i on individual outcome; represents degree of subject clustering
- common for each cluster member, but changes for each cluster
- if $\upsilon_{0i} = 0$, then cluster has no effect for cluster i
- if $\upsilon_{0i} = 0$ for all clusters, cluster structure has no impact on individual data ($\sigma^2_\upsilon = 0$): no need for multilevel approach, ordinary logistic regression is OK
- if subject clustering has a strong effect, estimates of $\upsilon_{0i} \neq 0$ and $\sigma^2_\upsilon$ will increase from 0

17 Model for Latent Continuous Responses
Consider the model with p covariates for the $n_i \times 1$ latent response strength $y_{ij}$:
$y_{ij} = x_{ij}'\beta + \upsilon_{0i} + \varepsilon_{ij}$
where assuming
- $\varepsilon_{ij} \sim$ standard normal (mean 0 and $\sigma^2 = 1$) leads to multilevel probit regression
- $\varepsilon_{ij} \sim$ standard logistic (mean 0 and $\sigma^2 = \pi^2/3$) leads to multilevel logistic regression

18 Underlying latent variable
- not an essential assumption of the model
- useful for obtaining the intra-class correlation (r) and the design effect (d)
$r = \frac{\sigma^2_\upsilon}{\sigma^2_\upsilon + \sigma^2}$
$d = \frac{\sigma^2_\upsilon + \sigma^2}{\sigma^2} = \frac{1}{1 - r}$
d is the ratio of the actual variance to the variance that would be obtained by simple random sampling (holding sample size constant)
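The two quantities defined here are simple functions of the variance components. A minimal Python sketch follows; the cluster variance of 0.25 is a hypothetical value chosen for illustration, and the function names are not from the slides.

```python
import math

def icc(var_cluster, var_resid):
    """Intra-class correlation r = cluster variance / total variance."""
    return var_cluster / (var_cluster + var_resid)

def design_effect(var_cluster, var_resid):
    """Design effect d = total variance / level-1 variance."""
    return (var_cluster + var_resid) / var_resid

var_resid = math.pi ** 2 / 3   # level-1 variance under the logistic model
var_cluster = 0.25             # hypothetical random-intercept variance

r = icc(var_cluster, var_resid)
d = design_effect(var_cluster, var_resid)
print(round(r, 3), round(d, 3))
```

Under these definitions d and 1/(1 - r) agree for any pair of positive variances, which is the identity stated on the slide.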

19 Scaling of regression coefficients
Fixed-effects model: $\beta$ estimates from logistic regression are larger (in abs. value) than from probit regression by approximately
$\sqrt{\frac{\pi^2/3}{1}}$
because
- $V(y) = \sigma^2 = \pi^2/3$ for logistic
- $V(y) = \sigma^2 = 1$ for probit

20 Mixed-effects model: $\beta$ estimates from the mixed-effects model are larger (in abs. value) than from the fixed-effects model by approximately
$\sqrt{d} = \sqrt{\frac{\sigma^2_\upsilon + \sigma^2}{\sigma^2}}$
because
- $V(y) = \sigma^2_\upsilon + \sigma^2$ in the mixed-effects model
- $V(y) = \sigma^2$ in the fixed-effects model
The difference depends on the size of the random-effects variance $\sigma^2_\upsilon$.

21 Within-Clusters / Between-Clusters models
Within-clusters model - level 1 (j = 1, ..., $n_i$)
    observed response: $\log\left[\frac{P(Y_{ij} = 1)}{1 - P(Y_{ij} = 1)}\right] = b_{0i} + b_{1i} Sex_{ij}$
    latent response: $y_{ij} = b_{0i} + b_{1i} Sex_{ij} + \varepsilon_{ij}$
Between-clusters model - level 2 (i = 1, ..., N)
    $b_{0i} = \beta_0 + \beta_2 Grp_i + \upsilon_{0i}$
    $b_{1i} = \beta_1 + \beta_3 Grp_i$
with $\upsilon_{0i} \sim NID(0, \sigma^2_\upsilon)$ and $\varepsilon_{ij} \sim LID(0, \pi^2/3)$

22 Put together,
$logit_{ij} = b_{0i} + b_{1i} Sex_{ij}$
$= (\beta_0 + \beta_2 Grp_i + \upsilon_{0i}) + (\beta_1 + \beta_3 Grp_i) Sex_{ij}$
$= \beta_0 + \beta_1 Sex_{ij} + \beta_2 Grp_i + \beta_3 (Grp_i \times Sex_{ij}) + \upsilon_{0i}$
- $\beta_0$ = logit when Sex = Grp = 0
- $\beta_1$ = Sex effect when Grp = 0
- $\beta_2$ = Grp effect when Sex = 0
- $\beta_3$ = difference between the Sex effect for Grp = 1 vs Grp = 0; or difference between the Grp effect for Sex = 1 vs Sex = 0
Coding of variables is very important for correct interpretation. Also, these effects are controlling for the cluster effect ("cluster-specific" effects).
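The interpretation of the four coefficients under 0/1 coding can be verified by evaluating the combined equation in each Sex by Grp cell. This is a minimal Python sketch; the coefficient values are hypothetical and the helper name `cell_logit` is an assumption made for illustration.

```python
# Hypothetical coefficients (not estimates from the slides).
b0, b1, b2, b3 = -0.5, 0.4, 0.3, 0.2

def cell_logit(sex, grp, u=0.0):
    """Cluster-specific logit for a subject with the given 0/1 codes.

    u is the random cluster effect; u = 0 corresponds to an average cluster.
    """
    return b0 + b1 * sex + b2 * grp + b3 * (grp * sex) + u

sex_effect_grp0 = cell_logit(1, 0) - cell_logit(0, 0)   # equals b1
sex_effect_grp1 = cell_logit(1, 1) - cell_logit(0, 1)   # equals b1 + b3
grp_effect_sex0 = cell_logit(0, 1) - cell_logit(0, 0)   # equals b2

print(sex_effect_grp0, sex_effect_grp1, grp_effect_sex0)
```

Because the same u enters both terms of each difference, the comparisons hold within any cluster, which is what "cluster-specific" refers to.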

23 Effects of a School-based Intervention
The Television School and Family Smoking Prevention and Cessation Project (Flay, et al., 1988); a subsample:
- sample: 1,600 7th-graders, 135 classes, 28 schools; 1 to 13 classes per school, 2 to 28 students per class
- outcome: knowledge of the effects of tobacco use
- timing: students tested at pre- and post-intervention
- design: schools exposed to
  - a social-resistance classroom curriculum (CC)
  - a media (television) intervention (TV)
  - CC combined with TV
  - a no-treatment control group

24 Main question of interest: what is the influence of the intervention on the tobacco and health knowledge scale (THKS) scores?
Challenges in the analysis:
- outcome variable (THKS) is number correct of 7 items
- controlling for intra-school and intra-class variability
- potential explanatory variables are at different levels

25 Tobacco and Health Knowledge Scale: Post-Intervention Scores 3 (out of 7)
Subgroup Descriptive Statistics
                   CC = no              CC = yes
                   TV=no    TV=yes      TV=no    TV=yes
    n
    proportions
    odds
    logits

26 Within-Clusters / Between-Clusters components
Within-clusters model - level 1 (j = 1, ..., $n_i$ subjects)
    $logit_{ij} = b_{0i}$
Between-clusters model - level 2 (i = 1, ..., N clusters)
    $b_{0i} = \beta_0 + \beta_1 CC_i + \beta_2 TV_i + \beta_3 (CC_i \times TV_i) + \upsilon_{0i}$
    $\upsilon_{0i} \sim NID(0, \sigma^2_\upsilon)$

27 $\beta_0$ = THKS logit for CC=no TV=no subgroup
$\beta_1$ = logit diff. between CC=yes vs CC=no (for TV=no)
    $b_{0i} = \beta_0 + (\beta_1 + \beta_3 TV_i) CC_i + \beta_2 TV_i + \upsilon_{0i}$
$\beta_2$ = logit diff. between TV=yes vs TV=no (for CC=no)
    $b_{0i} = \beta_0 + (\beta_2 + \beta_3 CC_i) TV_i + \beta_1 CC_i + \upsilon_{0i}$
$\beta_3$ = difference in logit attributable to interaction
$\upsilon_{0i}$ = random cluster deviation
Note: interpretation depends on coding of variables, and the $\beta$s are adjusted for the cluster effects ("cluster-specific" effects).

28 3-level model
Within-classrooms (and schools) model - level 1 (k = 1, ..., $n_{ij}$ students)
    $logit_{ijk} = b_{0ij}$
Between-classrooms (within-schools) model - level 2 (j = 1, ..., $n_i$ classrooms)
    $b_{0ij} = b_{0i} + \upsilon_{0ij}$
Between-schools model - level 3 (i = 1, ..., N schools)
    $b_{0i} = \beta_0 + \beta_1 CC_i + \beta_2 TV_i + \beta_3 (CC_i \times TV_i) + \upsilon_{0i}$
$\upsilon_{0ij} \sim NID(0, \sigma^2_{\upsilon(2)})$ and $\upsilon_{0i} \sim NID(0, \sigma^2_{\upsilon(3)})$

29 $\beta_0$ = THKS logit for CC=no TV=no subgroup
$\beta_1$ = logit diff. between CC=yes vs CC=no (for TV=no)
$\beta_2$ = logit diff. between TV=yes vs TV=no (for CC=no)
$\beta_3$ = difference in logit attributable to interaction
$\upsilon_{0ij}$ = random classroom deviation
$\upsilon_{0i}$ = random school deviation

30 Stata for multilevel analysis of dichotomous outcomes: melogit (version 13 and thereafter)
- Multiple levels of nesting, crossed random effects
- Full likelihood estimation using numerical quadrature for integration over the random effects
  - non-adaptive, mode/curvature adaptive, mean/variance adaptive (default except for crossed random effects)
  - 7 points per dimension are the default; more points provide greater accuracy, but also more computation time
- Laplace approximation (default for crossed random effects models)
  - same as mode/curvature adaptive with one point
  - can produce biased estimates, especially when the ICC is high and the number of clusters and/or subjects is small
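To make "numerical quadrature for integration over the random effects" concrete, the following Python sketch evaluates the marginal likelihood of one cluster in a random-intercept logistic model using a fixed (non-adaptive) 3-point Gauss-Hermite rule. It illustrates the idea only and is not Stata's implementation; the function names, the choice of 3 points, and the example data are assumptions.

```python
import math

# 3-point Gauss-Hermite rule for a standard normal random effect:
# nodes 0 and +/- sqrt(3), weights 2/3 and 1/6 (the weights sum to 1).
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

def cluster_likelihood(y, xb, sd):
    """Approximate marginal likelihood of one cluster's binary responses.

    y  : list of 0/1 outcomes for the subjects in the cluster
    xb : list of fixed-effect linear predictors x'beta (same length as y)
    sd : standard deviation of the random intercept
    """
    total = 0.0
    for node, weight in zip(NODES, WEIGHTS):
        cond = 1.0  # likelihood conditional on this value of the random effect
        for yj, xbj in zip(y, xb):
            p = inv_logit(xbj + sd * node)
            cond *= p if yj == 1 else 1.0 - p
        total += weight * cond  # weighted sum replaces the integral
    return total

# Hypothetical cluster with three subjects.
y = [1, 0, 1]
xb = [0.2, -0.1, 0.5]
print(cluster_likelihood(y, xb, sd=1.0))
```

With sd = 0 the random effect vanishes and the result reduces to the ordinary logistic likelihood for the cluster. Adaptive quadrature, as described on the slide, additionally recenters and rescales the nodes for each cluster, which this sketch does not do.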

31 Stata Example: tvsfp_binary.do
    log using u:\stata_examples\tvsfp_binary.log, replace
    infile school class thkso thksb ones thkspre cc tv cctv using clear
    summarize
    codebook school class
    * ordinary logistic regression
    logit thksb cc tv cctv, nolog
    * 2-level logistic regression
    melogit thksb cc tv cctv, nolog || class:
    scalar m2 = e(ll)
    estat icc
    * 3-level logistic regression
    melogit thksb cc tv cctv, nolog || school: || class:
    scalar m3 = e(ll)
    estat icc

32 Stata Example: tvsfp_binary.do (continued)
    * random school and class effects with std errors
    predict u3 u2, reffects reses(u3se u2se)
    * assign a value of 1 for one obs in each class
    * & rank the RE estimates & class ids
    egen pick1class = tag(class)
    egen u2rank = rank(u2) if pick1class==1
    egen classrank = rank(class) if pick1class==1
    list class u2 u2se u2rank if pick1class==1 & classrank <= 10
    * histogram of class random effects
    histogram u2 if pick1class==1, normal
    * std error bar chart (caterpillar plot) of class random effects
    serrbar u2 u2se u2rank if pick1class==1, scale(1.96) yline(0)
    * get LR test for comparing 2- and 3-level models
    display "chibar2(01) = " 2*(m3-m2)
    display "Prob > chibar2(01) = " chi2tail(1, 2*(m3-m2))/2
    log close

33 Stata output
    . infile school class thkso thksb ones thkspre cc tv cctv using clear
    '*' cannot be read as a number for school[1601]
    (eof not at end of obs)
    (1,601 observations read)
    . summarize
    Variable | Obs   Mean   Std. Dev.   Min   Max
    (rows: school, class, thkso, thksb, ones, thkspre, cc, tv, cctv; the numeric values are not preserved in this transcription)

34 Stata output
    . codebook school class
    school
        type: numeric (float)
        range: [193,515]          units: 1
        unique values: 28         missing .: 1/1,601
        mean, std. dev, percentiles (10%, 25%, 50%, 75%, 90%): values not preserved
    class
        type: numeric (float)
        range: [193101,515113]    units: 1
        unique values: 135        missing .: 1/1,601
        mean, std. dev, percentiles (10%, 25%, 50%, 75%, 90%): values not preserved

35 Stata output
    . * ordinary logistic regression
    . logit thksb cc tv cctv, nolog
    Logistic regression              Number of obs = 1,600
                                     LR chi2(3)    =
                                     Prob > chi2   =
    Log likelihood =                 Pseudo R2     =
    thksb | Coef.   Std. Err.   z   P>|z|   [95% Conf. Interval]
    (rows: cc, tv, cctv, _cons; the numeric values are not preserved in this transcription)

36 Stata output
    . * 2-level logistic regression
    . melogit thksb cc tv cctv, nolog || class:
    Mixed-effects logistic regression    Number of obs    = 1,600
    Group variable: class                Number of groups = 135
                                         Obs per group: min = 1, avg = 11.9, max = 28
    Integration method: mvaghermite      Integration pts. = 7
                                         Wald chi2(3)     =
    Log likelihood =                     Prob > chi2      =
    thksb | Coef.   Std. Err.   z   P>|z|   [95% Conf. Interval]
    (rows: cc, tv, cctv, _cons, then class var(_cons); the numeric values are not preserved in this transcription)
    LR test vs. logistic model: chibar2(01) =      Prob >= chibar2 =

37 Stata output
    . scalar m2 = e(ll)
    . estat icc
    Residual intraclass correlation
    Level | ICC   Std. Err.   [95% Conf. Interval]
    (row: class; the numeric values are not preserved in this transcription)

38 Stata output
    . * 3-level logistic regression
    . melogit thksb cc tv cctv, nolog || school: || class:
    Mixed-effects logistic regression    Number of obs = 1,600
    Group Variable | No. of Groups | Observations per Group: Minimum, Average, Maximum
    (rows: school, class)
    Integration method: mvaghermite      Integration pts. = 7
                                         Wald chi2(3)     =
    Log likelihood =                     Prob > chi2      =
    thksb | Coef.   Std. Err.   z   P>|z|   [95% Conf. Interval]
    (rows: cc, tv, cctv, _cons; the numeric values are not preserved in this transcription)

39 Stata output (continued)
    school: var(_cons)
    school>class: var(_cons)
    LR test vs. logistic model: chi2(2) =      Prob > chi2 =
    Note: LR test is conservative and provided only for reference.
    . scalar m3 = e(ll)
    . estat icc
    Residual intraclass correlation
    Level | ICC   Std. Err.   [95% Conf. Interval]
    (rows: school, class|school; the numeric values are not preserved in this transcription)

40 Stata output
    . * random school and class effects with std errors
    . predict u3 u2, reffects reses(u3se u2se)
    . * assign a value of 1 for one obs in each class
    . * & rank the RE estimates & class ids
    . egen pick1class = tag(class)
    . egen u2rank = rank(u2) if pick1class==1
    . egen classrank = rank(class) if pick1class==1
    . list class u2 u2se u2rank if pick1class==1 & classrank <= 10
    (listing columns: class, u2, u2se, u2rank; the numeric values are not preserved in this transcription)

41 Figure: histogram of class random effects
    histogram u2 if pick1class==1, normal
    (x-axis: empirical Bayes means for _cons[school>class]; y-axis: Density)

42 Figure: standard error bar chart (caterpillar plot) of class random effects
    serrbar u2 u2se u2rank if pick1class==1, scale(1.96) yline(0)
    (x-axis: rank of (u2); y-axis: empirical Bayes means for _cons[school>class])

43 Model comparisons - Likelihood Ratio (LR) tests
Comparing mixed logistic to ordinary (fixed) logistic regression
    LR test vs. logistic regression: chibar2(01) =      Prob >= chibar2 =
- $H_0: \sigma^2_{\upsilon(2)} = 0$, $H_A: \sigma^2_{\upsilon(2)} > 0$ (one-sided test)
- chibar2(01) refers to a 50:50 mixture of a $\chi^2_0$ and a $\chi^2_1$ distribution (the chi-bar-square distribution); the p-value is obtained from $\chi^2_1$, but is halved
Comparing 3-level mixed logistic to ordinary (fixed) logistic regression
    LR test vs. logistic regression: chi2(2) =      Prob > chi2 =
    Note: LR test is conservative and provided only for reference.
- $H_0: \sigma^2_{\upsilon(2)} = \sigma^2_{\upsilon(3)} = 0$

44 Comparing 3-level to 2-level mixed logistic
    * 2-level logistic regression
    melogit thksb cc tv cctv || class:
    scalar m2 = e(ll)
    * 3-level logistic regression
    melogit thksb cc tv cctv || school: || class:
    scalar m3 = e(ll)
m2 = 2-level log likelihood =
m3 = 3-level log likelihood =
    display "chibar2(01) = " 2*(m3-m2)
    chibar2(01) =
    display "Prob > chibar2(01) = " chi2tail(1, 2*(m3-m2))/2
    Prob > chibar2(01) =
$H_0: \sigma^2_{\upsilon(3)} = 0$, $H_A: \sigma^2_{\upsilon(3)} > 0$ (one-sided test)
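The halved p-value for a single variance component can be reproduced outside Stata. The sketch below is a Python illustration using only the standard library; the two log-likelihood values are hypothetical placeholders (the actual values are not preserved in this transcription), and the function names are assumptions.

```python
import math

def chi2_tail_1df(x):
    """Upper-tail probability of a chi-square variate with 1 df.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

def chibar2_01_pvalue(ll_null, ll_alt):
    """P-value from a 50:50 mixture of chi-square(0) and chi-square(1)."""
    lr = 2.0 * (ll_alt - ll_null)
    if lr <= 0.0:
        return 1.0           # no improvement in fit: the mixture gives p = 1
    return 0.5 * chi2_tail_1df(lr)

# Hypothetical log likelihoods for the 2-level and 3-level models.
m2, m3 = -1050.0, -1047.5
print("chibar2(01) =", 2.0 * (m3 - m2))
print("Prob > chibar2(01) =", chibar2_01_pvalue(m2, m3))
```

This mirrors the `chi2tail(1, 2*(m3-m2))/2` expression in the do-file: the chi-square(0) half of the mixture contributes nothing to the tail for a positive test statistic, so only half of the 1-df tail area remains.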

45 THKS Post-Int (dichotomized) Scores - LR Estimates (std errs)
(point estimates are not preserved in this transcription except where shown; standard errors in parentheses)
                  Fixed      Multilevel 2-level    Multilevel 3-level
    intercept     (.099)     (.140)                (.192)
    CC            (.145)     (.203)                (.278)
    TV            (.139)     (.199)                (.270)
    CC x TV       (.204)     (.287)                (.390)
    class var                (.087)                (.081)
    school var                                     .120 (.077)
    -2 log L
Significance levels flagged: p < .01, p < .05, p < .10 (Wald tests not done for vars)

46 SAS for multilevel analysis of dichotomous outcomes
PROC GLIMMIX (version and thereafter)
- Multiple levels of nesting, crossed random effects
- Pseudo-likelihood estimation (by default)
  - Linearization to avoid integration over the random effects
  - Produces biased estimates if the number of level-1 or level-2 units is small and/or the ICC is large
- Full likelihood estimation using numerical quadrature for integration over the random effects
  - METHOD=QUAD; however, 3-level models can only use METHOD=QUAD(QPOINTS=1) or METHOD=LAPLACE (these are equivalent)
PROC NLMIXED
- Full likelihood estimation using numerical quadrature for integration over the random effects
- Only for 2-level models; allows programming features (can do 3-level models with SAS/STAT 13.2; 2nd maintenance release for SAS 9.4)

47 SAS Example: tvsfp_binary.sas
    FILENAME TvsfpDat URL ;
    DATA one; INFILE TvsfpDat;
    INPUT sid cid thkso thksb int thkspre cc tv cctv;
This sometimes doesn't seem to work:
    ERROR: The connection has timed out..
    NOTE: The SAS System stopped processing this step because of errors.
In this case, it is easiest just to go to the URL and download the data:
    FILENAME TvsfpDat 'u:/mixdemo/tvsfpors.dat';
    DATA one; INFILE TvsfpDat;
    INPUT sid cid thkso thksb int thkspre cc tv cctv;
    RUN;

48 SAS Example: tvsfp_binary.sas (continued)
    /* logistic regression ignoring clustering */
    PROC LOGISTIC;
    MODEL thksb (DESCENDING) = cc tv cctv;
    /* GLIMMIX: students in classrooms - Quasi-Like */
    PROC GLIMMIX NOCLPRINT;
    CLASS cid;
    MODEL thksb (DESCENDING) = cc tv cctv / DIST=BINARY SOLUTION;
    RANDOM INTERCEPT / SUBJECT=cid TYPE=CHOL;
    RUN;
TYPE=CHOL requests estimation of the cluster standard deviation ($\sigma_\upsilon$) rather than the variance ($\sigma^2_\upsilon$). This is more stable computationally if the variance is close to zero.

49 SAS Example: tvsfp_binary.sas (continued)
    /* GLIMMIX: students in classrooms - Full-Like */
    PROC GLIMMIX NOCLPRINT METHOD=QUAD;
    CLASS cid;
    MODEL thksb (DESCENDING) = cc tv cctv / DIST=BINARY SOLUTION;
    RANDOM INTERCEPT / SUBJECT=cid TYPE=CHOL SOLUTION;
    COVTEST 'class variance' GLM;
    ODS OUTPUT SOLUTIONR=ClassEffects;
    RUN;
- METHOD=QUAD requests full-likelihood estimation (using numerical quadrature)
- SOLUTION on the RANDOM statement produces estimates of the random classroom effects; the ODS statement directs these to the data set ClassEffects
- The COVTEST 'class variance' GLM statement yields a likelihood ratio test of $H_0: \sigma^2_{\upsilon(2)} = 0$, $H_A: \sigma^2_{\upsilon(2)} > 0$ (one-sided test)

50 The GLIMMIX Procedure
Model Information
    Data Set                      WORK.ONE
    Response Variable             thksb
    Response Distribution         Binary
    Link Function                 Logit
    Variance Function             Default
    Variance Matrix Blocked By    cid
    Estimation Technique          Maximum Likelihood
    Likelihood Approximation      Gauss-Hermite Quadrature
    Degrees of Freedom Method     Containment
    Number of Observations Read   1600
    Number of Observations Used

51 Response Profile
    Ordered Value | thksb | Total Frequency (values not preserved in this transcription)
The GLIMMIX procedure is modeling the probability that thksb = 1.
Dimensions
    G-side Cov. Parameters      1
    Columns in X                4
    Columns in Z per Subject    1
    Subjects (Blocks in V)      135
    Max Obs per Subject         28

52 Optimization Information
    Optimization Technique        Dual Quasi-Newton
    Parameters in Optimization    5
    Lower Boundaries              1
    Upper Boundaries              0
    Fixed Effects                 Not Profiled
    Starting From                 GLM estimates
    Quadrature Points             3
Iteration History
    Iteration | Restarts | Evaluations | Objective Function | Change | Max Gradient (values not preserved in this transcription)
Convergence criterion (GCONV=1E-8) satisfied.

53 Fit Statistics (values not preserved in this transcription except where shown)
    -2 Log Likelihood
    AIC (smaller is better)
    AICC (smaller is better)
    BIC (smaller is better)
    CAIC (smaller is better)
    HQIC (smaller is better)
Fit Statistics for Conditional Distribution
    -2 log L(thksb | r. effects)
    Pearson Chi-Square
    Pearson Chi-Square / DF       0.93
Covariance Parameter Estimates
    Cov Parm | Subject | Estimate | Standard Error
    CHOL(1,1)  cid

54 Solutions for Fixed Effects (values not preserved in this transcription except where shown)
    Effect | Estimate | Standard Error | DF | t Value | Pr > |t|
    Intercept
    cc                                                  <.0001
    tv
    cctv
Type III Tests of Fixed Effects
    Effect | Num DF | Den DF | F Value | Pr > F
    cc                                   <.0001
    tv
    cctv

55 Solution for Random Effects
    Effect | Subject | Estimate | Std Err Pred | DF | t Value | Pr > |t|
    (one row per classroom, each with Effect = Intercept and Subject = cid; the cid labels and numeric values are not preserved in this transcription)

56 Tests of Covariance Parameters Based on the Likelihood
    Label | DF | -2 Log Like | ChiSq | Pr > ChiSq | Note
    class variance                     <.0001       MI
    MI: P-value based on a mixture of chi-squares.
Comparing mixed logistic to ordinary (fixed) logistic regression:
- $H_0: \sigma^2_{\upsilon(2)} = 0$, $H_A: \sigma^2_{\upsilon(2)} > 0$ (one-sided test)
- "mixture" refers to a 50:50 mixture of a $\chi^2_0$ and a $\chi^2_1$ distribution (the chi-bar-square distribution); the p-value is obtained from $\chi^2_1$, but is halved

57 SAS Example: tvsfp_binary.sas (continued)
    /* NLMIXED: students in classrooms - Full-Like */
    PROC NLMIXED DATA=one;
    PARMS b0=-.34 b_cc=.88 b_tv=.27 b_cctv=-.39 sd=1;
    z = b0 + b_cc*cc + b_tv*tv + b_cctv*cctv + sd*u;
    IF (thksb=1) THEN p = 1/(1 + EXP(-z));
    ELSE IF (thksb=0) THEN p = 1 - (1/(1 + EXP(-z)));
    logl = LOG(p);
    MODEL thksb ~ GENERAL(logl);
    RANDOM u ~ NORMAL(0,1) SUBJECT=cid;
    RUN;
The programming features of PROC NLMIXED make it very flexible, though somewhat difficult to use; it is not really necessary for the 2-level mixed logistic model.

58 SAS Example: tvsfp_binary.sas (continued)
    /* GLIMMIX: 3-level - quasi-likelihood */
    PROC GLIMMIX NOCLPRINT DATA=one;
    CLASS cid sid;
    MODEL thksb (DESCENDING) = cc tv cctv / DIST=BINARY SOLUTION;
    RANDOM INTERCEPT / SUBJECT=cid(sid) TYPE=CHOL;
    RANDOM INTERCEPT / SUBJECT=sid TYPE=CHOL;
    RUN;
    /* GLIMMIX: 3-level - full-likelihood */
    PROC GLIMMIX NOCLPRINT METHOD=QUAD(QPOINTS=1) DATA=one;
    CLASS cid sid;
    MODEL thksb (DESCENDING) = cc tv cctv / DIST=BINARY SOLUTION;
    RANDOM INTERCEPT / SUBJECT=cid(sid) TYPE=CHOL;
    RANDOM INTERCEPT / SUBJECT=sid TYPE=CHOL;
    COVTEST 'class & school variances' GLM;
    COVTEST 'school variance' . 0;
    RUN;

59 Tests of Covariance Parameters Based on the Likelihood
    Label | DF | -2 Log Like | ChiSq | Pr > ChiSq | Note
    class & school variances           <            --
    school variance                                 MI
    MI: P-value based on a mixture of chi-squares.
    --: Standard test with unadjusted p-values.
- COVTEST 'class & school variances' GLM compares the 3-level mixed logistic to ordinary (fixed) logistic regression (test of independence): $H_0: \sigma^2_{\upsilon(2)} = \sigma^2_{\upsilon(3)} = 0$
- COVTEST 'school variance' . 0 compares the 3-level mixed logistic to the 2-level mixed logistic with random classroom effects. "Mixture" refers to a 50:50 mixture of a $\chi^2_0$ and a $\chi^2_1$ distribution. $H_0: \sigma^2_{\upsilon(3)} = 0$, $H_A: \sigma^2_{\upsilon(3)} > 0$ (one-sided test)

60 SAS Example: tvsfp_binary.sas (continued)
    PROC SGPLOT DATA=ClassEffects;
    HISTOGRAM Estimate;
    DENSITY Estimate;
    RUN;

61 THKS Post-Int (dichotomized) Scores - LR Estimates (std errs)
(point estimates are not preserved in this transcription; standard errors in parentheses)
                  Fixed      GLIMMIX full          GLIMMIX quasi
                             2-level   3-level     2-level   3-level
    intercept     (.099)     (.140)    (.190)      (.137)    (.204)
    CC            (.145)     (.203)    (.277)      (.199)    (.293)
    TV            (.139)     (.199)    (.268)      (.195)    (.286)
    CC x TV       (.204)     (.287)    (.387)      (.281)    (.409)
    class sd                 (.083)    (.097)      (.093)    (.078)
    school sd                          (.110)                (.115)
    -2 log L
Significance levels flagged: p < .01, p < .05, p < .10 (Wald tests not done for sds)


63 Under SSI, Inc > SuperMix (English) or SuperMix (English) Student: under File, click on Open Spreadsheet and open C:\SuperMixEn Examples\Workshop\Binary\tvsfpors.ss3 (or C:\SuperMixEn Student Examples\Workshop\Binary\tvsfpors.ss3)

64 C:\SuperMixEn Examples\Workshop\Binary\tvsfpors.ss3

65 Under File, click on Open Existing Model Setup and open C:\SuperMixEn Examples\Workshop\Binary\tvbc.mum (or C:\SuperMixEn Student Examples\Workshop\Binary\tvbc.mum)

66 Note: Dependent Variable Type should be binary

67 For the moment, unselect PreTHKS as an explanatory variable

68 Note: Optimization Method should be adaptive quadrature


72 Empirical Bayes Estimates of Random Effects
Select Analysis > View Level-2 Bayes Results
Columns: Class ID, random effect number, estimate, variance, name

73 Select File > Model-based Graphs > Confidence Intervals

74 The order of classes on the x-axis is the same as the order in the dataset

75 Under File, click on Open Existing Model Setup and open C:\SuperMixEn Examples\Workshop\Binary\tvbsc.mum (or C:\SuperMixEn Student Examples\Workshop\Binary\tvbsc.mum)

76 Note: Dependent Variable Type should be binary

77 For the moment, unselect PreTHKS as an explanatory variable

78 Note: Optimization Method should be adaptive quadrature


82 Empirical Bayes Estimates of Random Class Effects
Select Analysis > View Level-2 Bayes Results
Columns: School ID, Class ID, random effect number, estimate, variance, name

83 Empirical Bayes Estimates of Random School Effects
Select Analysis > View Level-3 Bayes Results
Columns: School ID, random effect number, estimate, variance, name

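84 Calculation of ICC - 2 level model
$r = \frac{\sigma^2_\upsilon}{\sigma^2_\upsilon + \sigma^2}$
Random classrooms model, with $\sigma^2 = \pi^2/3$: the estimated class variance and the resulting r are not preserved in this transcription; r gives the percentage of the unexplained variation that is at the classroom level.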

85 Calculation of ICC - 3 level model (with $\sigma^2 = \pi^2/3$)
Level-3 (likeness of students in the same school)
    $r = \frac{\sigma^2_{\upsilon(3)}}{\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)} + \sigma^2} = .034$
Level-2 (likeness of students in same classroom & school)
    $r = \frac{\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)}}{\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)} + \sigma^2} = .081$
Level-2 (likeness of classes in the same school)
    $r = \frac{\sigma^2_{\upsilon(3)}}{\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)}} = .415$
- r < .5: the school level contributes slightly less to variability than the class level
- average classroom post THKS scores are moderately similar within schools
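The three intraclass correlations follow directly from the two random-effect variances and the level-1 logistic variance. Below is a minimal Python sketch. The school variance of .120 is the 3-level estimate reported in the estimates table earlier in the deck; the class variance of .170 is an assumed value chosen for illustration, because the estimate itself is not preserved in this transcription. The function and key names are also assumptions.

```python
import math

def three_level_iccs(var_school, var_class, var_resid=math.pi ** 2 / 3):
    """Return the three ICCs of a 3-level random-intercept logistic model."""
    total = var_school + var_class + var_resid
    return {
        # likeness of students in the same school
        "students_same_school": var_school / total,
        # likeness of students in the same classroom and school
        "students_same_class": (var_school + var_class) / total,
        # likeness of classes in the same school
        "classes_same_school": var_school / (var_school + var_class),
    }

# var_class=0.170 is an assumption for illustration only.
iccs = three_level_iccs(var_school=0.120, var_class=0.170)
for name, value in iccs.items():
    print(name, round(value, 3))
```

With these inputs the results land close to the values shown on the slide, though exact agreement is not expected given the assumed class variance.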

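86 Estimated probabilities by subgroup
Columns: CC | TV | logistic $\Psi(z) = [1 + \exp(-z)]^{-1}$ | estimate
(most numeric entries are not preserved in this transcription)
Fixed-effects model
    0 0   $\Psi(.341)$
    remaining rows: $\Psi(\cdot)$ entries lost; last estimate .603
Random-classrooms model, $\hat{d} = (\ldots + \pi^2/3)/(\pi^2/3)$
    0 0   $\Psi((.384)/\sqrt{\hat{d}})$
    remaining rows: entries lost
3-level model, $\hat{d} = (\ldots + \pi^2/3)/(\pi^2/3)$
    0 0   $\Psi((.391)/\sqrt{\hat{d}})$
    remaining rows: entries lost; last estimate .597
d = design effect = $(\sigma^2_\upsilon + \sigma^2)/\sigma^2$ or = $(\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)} + \sigma^2)/\sigma^2$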

87 Stata mata script: tvsfp_binary_mataest1.do
    * 3-level model with intercept, cc, tv, cc*tv
    mata
    beta = ( \ \ \ )
    xmat = (1, 0, 0, 0 \ 1, 0, 1, 0 \ 1, 1, 0, 0 \ 1, 1, 1, 1)
    xbeta = xmat*beta
    varc =
    vars =
    var1 = pi()^2/3
    d = (vars + varc + var1)/var1
    xbetad = xbeta/sqrt(d)
    estprob = invlogit(xbetad)
    estprob
    end
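The Mata script rescales the cluster-specific linear predictors by the square root of the design effect before applying the inverse logit, which gives approximate marginal probabilities for the four CC by TV subgroups. The same calculation can be sketched in Python. All numeric inputs below (the coefficients and both variances) are hypothetical stand-ins, since the estimates are not preserved in this transcription.

```python
import math

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical cluster-specific estimates: intercept, CC, TV, CC*TV.
beta = [-0.4, 0.9, 0.2, -0.3]

# Design rows for the subgroups (CC, TV) = (0,0), (0,1), (1,0), (1,1),
# in the same order as xmat in the Mata script.
xmat = [
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]

var_class = 0.17               # hypothetical class-level variance
var_school = 0.12              # hypothetical school-level variance
var1 = math.pi ** 2 / 3        # level-1 logistic variance

d = (var_school + var_class + var1) / var1          # design effect
xbeta = [sum(x * b for x, b in zip(row, beta)) for row in xmat]
estprob = [inv_logit(v / math.sqrt(d)) for v in xbeta]

for row, p in zip(xmat, estprob):
    print(row[1], row[2], round(p, 3))
```

Dividing by sqrt(d) shrinks each logit toward zero, so the marginal probabilities sit closer to 0.5 than the cluster-specific ones computed from the unscaled logits.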

88 Supermix population average estimates
Population Average Estimates
    Parameter | Estimate | Standard Error | z Value | P Value
    (rows: intercept, CC, TV, CC*TV; the numeric values are not preserved in this transcription)
Population average estimates from 3-level analysis
    Columns: CC | TV | logistic $\Psi(z) = [1 + \exp(-z)]^{-1}$ | estimate
    0 0   $\Psi(.367)$
    remaining rows: entries lost

89 Stata mata script: tvsfp_binary_mataest1b.do
    * 3-level model with intercept, cc, tv, cc*tv - PA estimates
    mata
    beta = ( \ \ \ )
    xmat = (1, 0, 0, 0 \ 1, 0, 1, 0 \ 1, 1, 0, 0 \ 1, 1, 1, 1)
    xbeta = xmat*beta
    estprob = invlogit(xbeta)
    estprob
    end

90 Within-Clusters / Between-Clusters components
Within-clusters model - level 1 (j = 1, ..., $n_i$ subjects)
    $logit_{ij} = b_{0i} + b_{1i} PRETHKS_{ij}$
Between-clusters model - level 2 (i = 1, ..., N clusters)
    $b_{0i} = \beta_0 + \beta_2 CC_i + \beta_3 TV_i + \beta_4 (CC_i \times TV_i) + \upsilon_{0i}$
    $b_{1i} = \beta_1$
    $\upsilon_{0i} \sim NID(0, \sigma^2_\upsilon)$

91 $\beta_0$ = (PRETHKS adjusted) logit for CC=no TV=no subgroup
$\beta_1$ = effect of PRETHKS on POSTTHKS
$\beta_2$ = (PRETHKS adjusted) logit diff. between CC=yes vs CC=no (for TV=no)
$\beta_3$ = (PRETHKS adjusted) logit diff. between TV=yes vs TV=no (for CC=no)
$\beta_4$ = (PRETHKS adjusted) difference in logit attributable to interaction
$\upsilon_{0i}$ = random cluster deviation

92 3-level model
Within-classrooms (and schools) model - level 1 (k = 1, ..., $n_{ij}$ students)
    $logit_{ijk} = b_{0ij} + b_{1ij} PRETHKS_{ijk}$
Between-classrooms (within-schools) model - level 2 (j = 1, ..., $n_i$ classrooms)
    $b_{0ij} = b_{0i} + \upsilon_{0ij}$
    $b_{1ij} = b_{1i}$
Between-schools model - level 3 (i = 1, ..., N schools)
    $b_{0i} = \beta_0 + \beta_2 CC_i + \beta_3 TV_i + \beta_4 (CC_i \times TV_i) + \upsilon_{0i}$
    $b_{1i} = \beta_1$
$\upsilon_{0ij} \sim NID(0, \sigma^2_{\upsilon(2)})$ and $\upsilon_{0i} \sim NID(0, \sigma^2_{\upsilon(3)})$

93 Stata code: tvsfp_binary.do
Add thkspre to the explanatory variable list:
    melogit thksb thkspre cc tv cctv || class:
    melogit thksb thkspre cc tv cctv || school: || class:
or
    meqrlogit thksb thkspre cc tv cctv || class:, intp(11)
    meqrlogit thksb thkspre cc tv cctv || school: || class:, intp(11)
- meqrlogit uses the Cholesky (matrix square root) of the random-effects variance-covariance matrix in estimation (more stable if variances are close to zero)
- intp(11) changes the default of 7 quadrature points to 11

94 SAS code: tvsfp_binary.sas
Add thkspre to the explanatory variable list on the MODEL statement:
    /* GLIMMIX: 3-level - full-likelihood */
    PROC GLIMMIX NOCLPRINT METHOD=QUAD(QPOINTS=1);
    CLASS cid sid;
    MODEL thksb (DESCENDING) = thkspre cc tv cctv / DIST=BINARY SOLUTION;
    RANDOM INTERCEPT / SUBJECT=cid(sid) TYPE=CHOL;
    RANDOM INTERCEPT / SUBJECT=sid TYPE=CHOL;
    RUN;

95 In Supermix, reopen TVBSC.mum and select PreTHKS as an explanatory variable

96 THKS Post-Int (dichotomized) Scores - LR Estimates (std err)
(point estimates are not preserved in this transcription except where shown; standard errors in parentheses)
                  Fixed      Multilevel 2-level    Multilevel 3-level
    intercept     (.141)     (.170)                (.196)
    PRETHKS       (.044)     (.046)                (.046)
    CC            (.150)     (.197)                (.245)
    TV            (.143)     (.192)                (.236)
    CC x TV       (.210)     (.277)                (.343)
    class var                (.080)                (.081)
    school var                                     .063 (.062)
    -2 log L
Significance levels flagged: p < .01, p < .05, p < .10 (Wald tests not done for vars)

97 Calculation of ICC - 2 level models
$r = \frac{\sigma^2_\upsilon}{\sigma^2_\upsilon + \sigma^2}$
Random classrooms model: $r = \frac{.219}{.219 + \pi^2/3} \approx .062$, so about 6.2% of the unexplained variation is at the classroom level

98 Calculation of ICC - 3 level model (with $\sigma^2 = \pi^2/3$)
Level-3 (likeness of students in the same school)
    $r = \frac{\sigma^2_{\upsilon(3)}}{\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)} + \sigma^2} = .018$
Level-2 (likeness of students in same classroom & school)
    $r = \frac{\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)}}{\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)} + \sigma^2} = .063$
Level-2 (likeness of classes in the same school)
    $r = \frac{\sigma^2_{\upsilon(3)}}{\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)}} = .276$
- r < .5: the school level contributes less to variability than the class level
- average classroom post THKS scores are moderately similar within schools

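99 Estimated probabilities by subgroup (PRETHKS-adjusted models)
Columns: CC | TV | logistic $\Psi(z) = [1 + \exp(-z)]^{-1}$ | estimate
(most numeric entries are not preserved in this transcription)
Fixed-effects model
    rows: $\Psi(\cdot)$ entries lost; last estimate .610
Random-classrooms model, $\hat{d} = (\ldots + \pi^2/3)/(\pi^2/3)$
    rows: $\Psi((\cdot)/\sqrt{\hat{d}})$ entries lost
3-level model, $\hat{d} = (\ldots + \pi^2/3)/(\pi^2/3)$
    rows: $\Psi((\cdot)/\sqrt{\hat{d}})$ entries lost; last estimate .605
d = design effect = $(\sigma^2_\upsilon + \sigma^2)/\sigma^2$ or = $(\sigma^2_{\upsilon(3)} + \sigma^2_{\upsilon(2)} + \sigma^2)/\sigma^2$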

100 Supermix population average estimates
Population Average Estimates
    Parameter | Estimate | Standard Error | z Value | P Value
    (rows: intercept, CC, TV, CC*TV, PreTHKS; the numeric values are not preserved in this transcription)
Population average estimates from 3-level analysis
    Columns: CC | TV | logistic $\Psi(z) = [1 + \exp(-z)]^{-1}$ | estimate
    (entries lost)

101 Stata mata script: tvsfp_binary_mataest2.do
    * 3-level model with intercept, prethks, cc, tv, cc*tv
    mata
    beta = ( \ \ \ \ )
    xmat = (1, 2.152, 0, 0, 0 \ 1, 2.087, 0, 1, 0 \ 1, 2.050, 1, 0, 0 \ 1, 1.979, 1, 1, 1)
    xbeta = xmat*beta
    varc =
    vars =
    var1 = pi()^2/3
    d = (vars + varc + var1)/var1
    xbetad = xbeta/sqrt(d)
    estprob = invlogit(xbetad)
    estprob
    end

102 Stata mata script: tvsfp_binary_mataest2b.do
    * 3-level model with int, prethks, cc, tv, cc*tv - PA estimates
    mata
    beta = ( \ \ \ \ )
    xmat = (1, 2.152, 0, 0, 0 \ 1, 2.087, 0, 1, 0 \ 1, 2.050, 1, 0, 0 \ 1, 1.979, 1, 1, 1)
    xbeta = xmat*beta
    estprob = invlogit(xbeta)
    estprob
    end


More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Lab 11. Multilevel Models. Description of Data

Lab 11. Multilevel Models. Description of Data Lab 11 Multilevel Models Henian Chen, M.D., Ph.D. Description of Data MULTILEVEL.TXT is clustered data for 386 women distributed across 40 groups. ID: 386 women, id from 1 to 386, individual level (level

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Application of Item Response Theory Models for Intensive Longitudinal Data

Application of Item Response Theory Models for Intensive Longitudinal Data Application of Item Response Theory Models for Intensive Longitudinal Data Don Hedeker, Robin Mermelstein, & Brian Flay University of Illinois at Chicago hedeker@uic.edu Models for Intensive Longitudinal

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

ML estimation: Random-intercepts logistic model. and z

ML estimation: Random-intercepts logistic model. and z ML estimation: Random-intercepts logistic model log p ij 1 p = x ijβ + υ i with υ i N(0, συ) 2 ij Standardizing the random effect, θ i = υ i /σ υ, yields log p ij 1 p = x ij β + σ υθ i with θ i N(0, 1)

More information

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1 CLDP 944 Example 3a page 1 From Between-Person to Within-Person Models for Longitudinal Data The models for this example come from Hoffman (2015) chapter 3 example 3a. We will be examining the extent to

More information

GEE for Longitudinal Data - Chapter 8

GEE for Longitudinal Data - Chapter 8 GEE for Longitudinal Data - Chapter 8 GEE: generalized estimating equations (Liang & Zeger, 1986; Zeger & Liang, 1986) extension of GLM to longitudinal data analysis using quasi-likelihood estimation method

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

Generalized Models: Part 1

Generalized Models: Part 1 Generalized Models: Part 1 Topics: Introduction to generalized models Introduction to maximum likelihood estimation Models for binary outcomes Models for proportion outcomes Models for categorical outcomes

More information

Introduction to Within-Person Analysis and RM ANOVA

Introduction to Within-Person Analysis and RM ANOVA Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides

More information

Missing Data in Longitudinal Studies: Mixed-effects Pattern-Mixture and Selection Models

Missing Data in Longitudinal Studies: Mixed-effects Pattern-Mixture and Selection Models Missing Data in Longitudinal Studies: Mixed-effects Pattern-Mixture and Selection Models Hedeker D & Gibbons RD (1997). Application of random-effects pattern-mixture models for missing data in longitudinal

More information

Introduction to Generalized Models

Introduction to Generalized Models Introduction to Generalized Models Today s topics: The big picture of generalized models Review of maximum likelihood estimation Models for binary outcomes Models for proportion outcomes Models for categorical

More information

Mixed Models for Longitudinal Ordinal and Nominal Data

Mixed Models for Longitudinal Ordinal and Nominal Data Mixed Models for Longitudinal Ordinal and Nominal Data Hedeker, D. (2008). Multilevel models for ordinal and nominal variables. In J. de Leeuw & E. Meijer (Eds.), Handbook of Multilevel Analysis. Springer,

More information

Simple logistic regression

Simple logistic regression Simple logistic regression Biometry 755 Spring 2009 Simple logistic regression p. 1/47 Model assumptions 1. The observed data are independent realizations of a binary response variable Y that follows a

More information

Advantages of Mixed-effects Regression Models (MRM; aka multilevel, hierarchical linear, linear mixed models) 1. MRM explicitly models individual

Advantages of Mixed-effects Regression Models (MRM; aka multilevel, hierarchical linear, linear mixed models) 1. MRM explicitly models individual Advantages of Mixed-effects Regression Models (MRM; aka multilevel, hierarchical linear, linear mixed models) 1. MRM explicitly models individual change across time 2. MRM more flexible in terms of repeated

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Recent Developments in Multilevel Modeling

Recent Developments in Multilevel Modeling Recent Developments in Multilevel Modeling Roberto G. Gutierrez Director of Statistics StataCorp LP 2007 North American Stata Users Group Meeting, Boston R. Gutierrez (StataCorp) Multilevel Modeling August

More information

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study 1.4 0.0-6 7 8 9 10 11 12 13 14 15 16 17 18 19 age Model 1: A simple broken stick model with knot at 14 fit with

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

You can specify the response in the form of a single variable or in the form of a ratio of two variables denoted events/trials.

You can specify the response in the form of a single variable or in the form of a ratio of two variables denoted events/trials. The GENMOD Procedure MODEL Statement MODEL response = < effects > < /options > ; MODEL events/trials = < effects > < /options > ; You can specify the response in the form of a single variable or in the

More information

Lecture 4: Generalized Linear Mixed Models

Lecture 4: Generalized Linear Mixed Models Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 An example with one random effect An example with two nested random effects

More information

Review of CLDP 944: Multilevel Models for Longitudinal Data

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

More information

Binary Dependent Variables

Binary Dependent Variables Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

A Re-Introduction to General Linear Models (GLM)

A Re-Introduction to General Linear Models (GLM) A Re-Introduction to General Linear Models (GLM) Today s Class: You do know the GLM Estimation (where the numbers in the output come from): From least squares to restricted maximum likelihood (REML) Reviewing

More information

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

Topic 20: Single Factor Analysis of Variance

Topic 20: Single Factor Analysis of Variance Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory

More information

2. We care about proportion for categorical variable, but average for numerical one.

2. We care about proportion for categorical variable, but average for numerical one. Probit Model 1. We apply Probit model to Bank data. The dependent variable is deny, a dummy variable equaling one if a mortgage application is denied, and equaling zero if accepted. The key regressor is

More information

Models for binary data

Models for binary data Faculty of Health Sciences Models for binary data Analysis of repeated measurements 2015 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen 1 / 63 Program for

More information

Latent class analysis and finite mixture models with Stata

Latent class analysis and finite mixture models with Stata Latent class analysis and finite mixture models with Stata Isabel Canette Principal Mathematician and Statistician StataCorp LLC 2017 Stata Users Group Meeting Madrid, October 19th, 2017 Introduction Latent

More information

Homework Solutions Applied Logistic Regression

Homework Solutions Applied Logistic Regression Homework Solutions Applied Logistic Regression WEEK 6 Exercise 1 From the ICU data, use as the outcome variable vital status (STA) and CPR prior to ICU admission (CPR) as a covariate. (a) Demonstrate that

More information

Topic 23: Diagnostics and Remedies

Topic 23: Diagnostics and Remedies Topic 23: Diagnostics and Remedies Outline Diagnostics residual checks ANOVA remedial measures Diagnostics Overview We will take the diagnostics and remedial measures that we learned for regression and

More information

Variance component models part I

Variance component models part I Faculty of Health Sciences Variance component models part I Analysis of repeated measurements, 30th November 2012 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 An Introduction to Multilevel Models PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 Today s Class Concepts in Longitudinal Modeling Between-Person vs. +Within-Person

More information

Random Intercept Models

Random Intercept Models Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept

More information

STAT 705 Generalized linear mixed models

STAT 705 Generalized linear mixed models STAT 705 Generalized linear mixed models Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 24 Generalized Linear Mixed Models We have considered random

More information

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes Lecture 2.1 Basic Linear LDA 1 Outline Linear OLS Models vs: Linear Marginal Models Linear Conditional Models Random Intercepts Random Intercepts & Slopes Cond l & Marginal Connections Empirical Bayes

More information

Categorical and Zero Inflated Growth Models

Categorical and Zero Inflated Growth Models Categorical and Zero Inflated Growth Models Alan C. Acock* Summer, 2009 *Alan C. Acock, Department of Human Development and Family Sciences, Oregon State University, Corvallis OR 97331 (alan.acock@oregonstate.edu).

More information

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

More information

Introduction to Random Effects of Time and Model Estimation

Introduction to Random Effects of Time and Model Estimation Introduction to Random Effects of Time and Model Estimation Today s Class: The Big Picture Multilevel model notation Fixed vs. random effects of time Random intercept vs. random slope models How MLM =

More information

Multilevel Methodology

Multilevel Methodology Multilevel Methodology Geert Molenberghs Interuniversity Institute for Biostatistics and statistical Bioinformatics Universiteit Hasselt, Belgium geert.molenberghs@uhasselt.be www.censtat.uhasselt.be Katholieke

More information

Multilevel/Mixed Models and Longitudinal Analysis Using Stata

Multilevel/Mixed Models and Longitudinal Analysis Using Stata Multilevel/Mixed Models and Longitudinal Analysis Using Stata Isaac J. Washburn PhD Research Associate Oregon Social Learning Center Summer Workshop Series July 2010 Longitudinal Analysis 1 Longitudinal

More information

Multilevel Modeling Day 2 Intermediate and Advanced Issues: Multilevel Models as Mixed Models. Jian Wang September 18, 2012

Multilevel Modeling Day 2 Intermediate and Advanced Issues: Multilevel Models as Mixed Models. Jian Wang September 18, 2012 Multilevel Modeling Day 2 Intermediate and Advanced Issues: Multilevel Models as Mixed Models Jian Wang September 18, 2012 What are mixed models The simplest multilevel models are in fact mixed models:

More information

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" = -/\<>*"; ODS LISTING;

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR= = -/\<>*; ODS LISTING; dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" ---- + ---+= -/\*"; ODS LISTING; *** Table 23.2 ********************************************; *** Moore, David

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen 2 / 28 Preparing data for analysis The

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1 Psyc 945 Example page Example : Unconditional Models for Change in Number Match 3 Response Time (complete data, syntax, and output available for SAS, SPSS, and STATA electronically) These data come from

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Spring RMC Professional Development Series January 14, Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations

Spring RMC Professional Development Series January 14, Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations Spring RMC Professional Development Series January 14, 2016 Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations Ann A. O Connell, Ed.D. Professor, Educational Studies (QREM) Director,

More information

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011) Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

More information

Fixed effects results...32

Fixed effects results...32 1 MODELS FOR CONTINUOUS OUTCOMES...7 1.1 MODELS BASED ON A SUBSET OF THE NESARC DATA...7 1.1.1 The data...7 1.1.1.1 Importing the data and defining variable types...8 1.1.1.2 Exploring the data...12 Univariate

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen Outline Data in wide and long format

More information

PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata

PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata Consider the following two-level model random coefficient logit model. This is a Supreme Court decision making model,

More information

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester

Modelling Rates. Mark Lunt. Arthritis Research UK Epidemiology Unit University of Manchester Modelling Rates Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 05/12/2017 Modelling Rates Can model prevalence (proportion) with logistic regression Cannot model incidence in

More information

An Application of a Mixed-Effects Location Scale Model for Analysis of Ecological Momentary Assessment (EMA) Data

An Application of a Mixed-Effects Location Scale Model for Analysis of Ecological Momentary Assessment (EMA) Data An Application of a Mixed-Effects Location Scale Model for Analysis of Ecological Momentary Assessment (EMA) Data Don Hedeker, Robin Mermelstein, & Hakan Demirtas University of Illinois at Chicago hedeker@uic.edu

More information

A Journey to Latent Class Analysis (LCA)

A Journey to Latent Class Analysis (LCA) A Journey to Latent Class Analysis (LCA) Jeff Pitblado StataCorp LLC 2017 Nordic and Baltic Stata Users Group Meeting Stockholm, Sweden Outline Motivation by: prefix if clause suest command Factor variables

More information

Introducing Generalized Linear Models: Logistic Regression

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

More information

Interpreting and using heterogeneous choice & generalized ordered logit models

Interpreting and using heterogeneous choice & generalized ordered logit models Interpreting and using heterogeneous choice & generalized ordered logit models Richard Williams Department of Sociology University of Notre Dame July 2006 http://www.nd.edu/~rwilliam/ The gologit/gologit2

More information

Chapter 1. Modeling Basics

Chapter 1. Modeling Basics Chapter 1. Modeling Basics What is a model? Model equation and probability distribution Types of model effects Writing models in matrix form Summary 1 What is a statistical model? A model is a mathematical

More information

SAS Analysis Examples Replication C8. * SAS Analysis Examples Replication for ASDA 2nd Edition * Berglund April 2017 * Chapter 8 ;

SAS Analysis Examples Replication C8. * SAS Analysis Examples Replication for ASDA 2nd Edition * Berglund April 2017 * Chapter 8 ; SAS Analysis Examples Replication C8 * SAS Analysis Examples Replication for ASDA 2nd Edition * Berglund April 2017 * Chapter 8 ; libname ncsr "P:\ASDA 2\Data sets\ncsr\" ; data c8_ncsr ; set ncsr.ncsr_sub_13nov2015

More information

SAS Syntax and Output for Data Manipulation:

SAS Syntax and Output for Data Manipulation: CLP 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (2015) chapter 5. We will be examining the extent

More information

Sociology 362 Data Exercise 6 Logistic Regression 2

Sociology 362 Data Exercise 6 Logistic Regression 2 Sociology 362 Data Exercise 6 Logistic Regression 2 The questions below refer to the data and output beginning on the next page. Although the raw data are given there, you do not have to do any Stata runs

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator As described in the manuscript, the Dimick-Staiger (DS) estimator

More information

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */ CLP 944 Example 4 page 1 Within-Personn Fluctuation in Symptom Severity over Time These data come from a study of weekly fluctuation in psoriasis severity. There was no intervention and no real reason

More information

Logistic & Tobit Regression

Logistic & Tobit Regression Logistic & Tobit Regression Different Types of Regression Binary Regression (D) Logistic transformation + e P( y x) = 1 + e! " x! + " x " P( y x) % ln$ ' = ( + ) x # 1! P( y x) & logit of P(y x){ P(y

More information

More Mixed-Effects Models for Ordinal & Nominal Data. Don Hedeker University of Illinois at Chicago

More Mixed-Effects Models for Ordinal & Nominal Data. Don Hedeker University of Illinois at Chicago More Mixed-Effects Models for Ordinal & Nominal Data Don Hedeker University of Illinois at Chicago This work was supported by National Institute of Mental Health Contract N44MH32056. 1 Proportional and

More information

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1 Mediation Analysis: OLS vs. SUR vs. ISUR vs. 3SLS vs. SEM Note by Hubert Gatignon July 7, 2013, updated November 15, 2013, April 11, 2014, May 21, 2016 and August 10, 2016 In Chap. 11 of Statistical Analysis

More information

Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data

Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data The 3rd Australian and New Zealand Stata Users Group Meeting, Sydney, 5 November 2009 1 Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data Dr Jisheng Cui Public Health

More information

A Re-Introduction to General Linear Models

A Re-Introduction to General Linear Models A Re-Introduction to General Linear Models Today s Class: Big picture overview Why we are using restricted maximum likelihood within MIXED instead of least squares within GLM Linear model interpretation

More information

Using the same data as before, here is part of the output we get in Stata when we do a logistic regression of Grade on Gpa, Tuce and Psi.

Using the same data as before, here is part of the output we get in Stata when we do a logistic regression of Grade on Gpa, Tuce and Psi. Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 14, 2018 This handout steals heavily

More information

Generalized Multilevel Models for Non-Normal Outcomes

Generalized Multilevel Models for Non-Normal Outcomes Generalized Multilevel Models for Non-Normal Outcomes Topics: 3 parts of a generalized (multilevel) model Models for binary, proportion, and categorical outcomes Complications for generalized multilevel

More information

Answer to exercise: Blood pressure lowering drugs

Answer to exercise: Blood pressure lowering drugs Answer to exercise: Blood pressure lowering drugs The data set bloodpressure.txt contains data from a cross-over trial, involving three different formulations of a drug for lowering of blood pressure:

More information

Section 9c. Propensity scores. Controlling for bias & confounding in observational studies

Section 9c. Propensity scores. Controlling for bias & confounding in observational studies Section 9c Propensity scores Controlling for bias & confounding in observational studies 1 Logistic regression and propensity scores Consider comparing an outcome in two treatment groups: A vs B. In a

More information
