Mixed Models for Longitudinal Ordinal and Nominal Data

Similar documents
Mixed Models for Longitudinal Ordinal and Nominal Outcomes

Why analyze as ordinal? Mixed Models for Longitudinal Ordinal Data Don Hedeker University of Illinois at Chicago

Mixed Models for Longitudinal Ordinal and Nominal Outcomes. Don Hedeker Department of Public Health Sciences University of Chicago

More Mixed-Effects Models for Ordinal & Nominal Data. Don Hedeker University of Illinois at Chicago

ML estimation: Random-intercepts logistic model. and z

Mixed Models for Longitudinal Binary Outcomes. Don Hedeker Department of Public Health Sciences University of Chicago.

Missing Data in Longitudinal Studies: Mixed-effects Pattern-Mixture and Selection Models

Mixed-Effects Pattern-Mixture Models for Incomplete Longitudinal Data. Don Hedeker University of Illinois at Chicago

Generalized Linear Models for Non-Normal Data

Multilevel Modeling of Non-Normal Data. Don Hedeker Department of Public Health Sciences University of Chicago.

GEE for Longitudinal Data - Chapter 8

STAT 705 Generalized linear mixed models

Advantages of Mixed-effects Regression Models (MRM; aka multilevel, hierarchical linear, linear mixed models) 1. MRM explicitly models individual

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models:

Example 7b: Generalized Models for Ordinal Longitudinal Data using SAS GLIMMIX, STATA MEOLOGIT, and MPLUS (last proportional odds model only)

Application of Item Response Theory Models for Intensive Longitudinal Data

Generalized Models: Part 1

Introduction (Alex Dmitrienko, Lilly) Web-based training program

An Application of a Mixed-Effects Location Scale Model for Analysis of Ecological Momentary Assessment (EMA) Data

Introduction to Generalized Models

Lecture 3.1 Basic Logistic LDA

SAS PROC NLMIXED Mike Patefield The University of Reading 12 May

Longitudinal Modeling with Logistic Regression

Models for binary data

8 Nominal and Ordinal Logistic Regression

Introduction to Random Effects of Time and Model Estimation

Generalized Multilevel Models for Non-Normal Outcomes

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Introduction to Within-Person Analysis and RM ANOVA

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Spring RMC Professional Development Series January 14, Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations

A Re-Introduction to General Linear Models (GLM)

Comparing IRT with Other Models

Recent Developments in Multilevel Modeling

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

You can specify the response in the form of a single variable or in the form of a ratio of two variables denoted events/trials.

Lab 11. Multilevel Models. Description of Data

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

Review of CLDP 944: Multilevel Models for Longitudinal Data

Model Estimation Example

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1

Correspondence Analysis of Longitudinal Data

Simple logistic regression

Estimation: Problems & Solutions

Data-analysis and Retrieval Ordinal Classification

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

SAS Code: Joint Models for Continuous and Discrete Longitudinal Data

Fitting mixed-effects models for repeated ordinal outcomes with the NLMIXED procedure

Sample Size Estimation for Longitudinal Studies. Don Hedeker University of Illinois at Chicago hedeker

Models for Binary Outcomes

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

Generalized Linear Modeling - Logistic Regression

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

Homework Solutions Applied Logistic Regression

STAT 7030: Categorical Data Analysis

Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research

Estimation: Problems & Solutions

RESEARCH ARTICLE. Likelihood-based Approach for Analysis of Longitudinal Nominal Data using Marginalized Random Effects Models

Economics 583: Econometric Theory I A Primer on Asymptotics

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data

Ronald Heck Week 14 1 EDEP 768E: Seminar in Categorical Data Modeling (F2012) Nov. 17, 2012

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Chapter 1. Modeling Basics

2. We care about proportion for categorical variable, but average for numerical one.

SAS Syntax and Output for Data Manipulation:

Review of Multinomial Distribution If n trials are performed: in each trial there are J > 2 possible outcomes (categories) Multicategory Logit Models

PQL Estimation Biases in Generalized Linear Mixed Models

DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS

Stat 642, Lecture notes for 04/12/05 96

Introduction to mtm: An R Package for Marginalized Transition Models

Correlated data. Non-normal outcomes. Reminder on binary data. Non-normal data. Faculty of Health Sciences. Non-normal outcomes

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples

Categorical Data Analysis Chapter 3

Beyond GLM and likelihood

Investigating Models with Two or Three Categories

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

Journal of Statistical Software

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

Multilevel Methodology

Multinomial Logistic Regression Models

ECON 594: Lecture #6

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE

A Re-Introduction to General Linear Models

STAT 526 Advanced Statistical Methodology

Limited Dependent Variable Models II

Growth models for categorical response variables: standard, latent-class, and hybrid approaches

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti

Models for Ordinal Response Data

Lecture 15 (Part 2): Logistic Regression & Common Odds Ratio, (With Simulations)

Outline for today. Computation of the likelihood function for GLMMs. Likelihood for generalized linear mixed model

Random Effects. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology. university of illinois at urbana-champaign

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

De-mystifying random effects models

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models

Transcription:

Mixed Models for Longitudinal Ordinal and Nominal Data Hedeker, D. (2008). Multilevel models for ordinal and nominal variables. In J. de Leeuw & E. Meijer (Eds.), Handbook of Multilevel Analysis. Springer, New York. Hedeker, D. (2005). Generalized linear mixed models. In B. Everitt & D. Howell (Eds.), Encyclopedia of Statistics in Behavioral Science. Wiley. Chapters 10 & 11 in Hedeker, D. & Gibbons, R.D. (2006). Longitudinal Data Analysis. Wiley. 1

Why analyze as ordinal? Efficiency: Armstrong & Sloan (1989, Amer Jrn of Epid) report efficiency losses between 89% to 99% comparing an ordinal to continuous outcome, depending on the number of categories and distribution within the ordinal categories. Bias: continuous model can yield correlated residuals and regressors when applied to ordinal outcomes, because the continuous model does not take into account the ceiling and floor effects of the ordinal outcome. This can result in biased estimates of regression coefficients and is most critical when the ordinal variables is highly skewed. Logic: continuous model can yield predicted values outside of the range of the ordinal variable. 2

Proportional Odds Model - McCullagh (1980) log P (Y c) = γ c x β 1 P (Y c) c = 1,..., C 1 for the C categories of the ordinal outcome x = vector of explanatory variables (plus the intercept) γ c = thresholds; reflect cumulative odds when x = 0 (for identification: γ 1 = 0 or β 0 = 0) positive association between x and Y is reflected by β > 0 the effect of x is assumed to be the same for each cumulative odds ratio odds that the response is greater than or equal to c (for fixed c) is multiplied by e β for every unit change in x: 1 P (Y c) P (Y c) = e γ c (e β ) x 3

Ordinal Model for Dichotomous Response: same as it ever was! log P (Y = 0) 1 P (Y = 0) = 0 x β P (Y = 0) 1 P (Y = 0) = exp(0 x β) 1 P (Y = 0) P (Y = 0) = [exp(0 x β)] 1 1 P (Y = 0) P (Y = 0) = exp(x β) log P (Y = 1) 1 P (Y = 1) = x β 4

Ordinal Response and Threshold Concept Continuous y i - unobservable latent variable - related to ordinal response Y i via threshold concept series of threshold values γ 1, γ 2,..., γ C 1 C = number of ordered categories γ 0 = and γ C = Response occurs in category c, Y i = c if γ c 1 < y i < γ c 5

The Threshold Concept in Practice How was your day? (what is your level of satisfaction today?) Satisfaction may be continuous, but we sometimes emit an ordinal response: 6

Model for Latent Continuous Responses Consider the model with p covariates for the latent response strength y i (i = 1, 2,..., N): y i = x i β + ε i probit: ε i standard normal (mean=0, variance=1) logistic: ε i standard logistic (mean=0, variance=π 2 /3) β estimates from logistic regression are larger (in abs. value) than from probit regression by approximately π 2 /3 = 1.8 Underlying latent variable useful way of thinking of the problem not an essential assumption of the model 7

Mixed-effects ordinal logistic regression model (Hedeker & Gibbons, 1994, 1996) i = 1,... N level-2 units (clusters or subjects) j = 1,..., n i level-1 units (subjects or repeated observations) c = 1, 2,..., C response categories Y ij = ordinal response of level-2 unit i and level-1 unit j How was your day? (asked repeatedly each day for a week) 8

Mixed-effects ordinal logistic regression model λ ijc = log P ijc = γ c (x (1 P ijc ) ijβ + z ijυ i ) P ijc = Pr (Y ij c υ ; γ c, β, Σ υ ) = 1 1+exp( λ ijc ) p ijc = Pr (Y ij = c υ ; γ c, β, Σ υ ) = P ijc P ijc 1 C 1 strictly increasing model thresholds γ c x ij = p 1 covariate vector z ij = r 1 design vector for random effects β = p 1 fixed regression parameters υ i = r 1 random effects for level-2 unit i N(0, Σ υ ) 9

Model for Latent Continuous Responses Model with p covariates for the latent response strength y ij : y ij = x ijβ + υ 0i + ε ij where υ 0i N(0, σ 2 υ), and assuming ε ij standard normal (mean 0 and σ 2 = 1) leads to mixed-effects ordinal probit regression ε ij standard logistic (mean 0 and σ 2 = π 2 /3) leads to mixed-effects ordinal logistic regression 10

Underlying latent variable not an essential assumption of the model useful for obtaining intra-class correlation (r) and for design effect (d) r = σ2 υ σ 2 υ + σ 2 d = σ2 υ + σ2 σ 2 = 1/(1 r) ratio of actual variance to the variance that would be obtained by simple random sampling (holding sample size constant) 11

Scaling of regression coefficients Fixed-effects model β estimates from logistic regression are larger (in abs. value) than from probit regression by approximately because π 2 /3 1 V (y) = σ 2 = π 2 /3 for logistic V (y) = σ 2 = 1 for probit = 1.8 12

Mixed-effects model β estimates from mixed-effects (random intercepts) model are larger (in abs. value) than from fixed-effects model by approximately d = συ 2 + σ 2 σ 2 because V (y) = σ 2 υ + σ2 in mixed-effects (random intercepts) model V (y) = σ 2 in fixed-effects model difference depends on size of random-effects variance σ 2 υ more complex for models with multiple random effects 13

Standardization of the random effects With υ = T θ (where T T = Σ υ is the Cholesky) λ ijc = γ c (x ijβ + z ijt θ i ) θ i are distributed multivariate standard normal P ijc = Pr (Y ij c θ ; γ c, β, T ) = 1 1+exp( λ ijc ) p ijc = Pr (Y ij = c θ ; γ c, β, T ) = P ijc P ijc 1 14

Conditional probability for response vector y i = n i 1 vector of responses from level-2 unit i Conditional Probability l(y i θ; γ c, β, T ) = l i = n i j=1 C c=1 (p ijc) d ijc where d ijc = 1 if Y ij = c and 0 otherwise Conditional Independence: responses from level-2 unit i are independent given θ 15

Maximum (Marginal) Likelihood Estimation Marginal Probability h(y i ) = h i = θ l i g(θ) dθ g(θ) = multivariate standard normal density Maximize the marginal log-likelihood from N level-2 units log L = N log h i with respect to η = [ γ c. β. T ] i Full-likelihood approach found in SAS PROC NLMIXED & GLIMMIX, STATA, SUPERMIX, MIXOR 16

Numerical Quadrature for integration over θ method to numerically perform an integration θ f(θ)g(θ)dθ Q q=1 f(b q)a(b q ) where B q (q = 1,..., Q) are the quadrature nodes or points A(B q ) (q = 1,..., Q) are the weights (sum = 1) the more points you use, the more accurate the approximation, but the more time it takes For standard normal distribution, Gauss-Hermite quadrature does yield a likelihood value that can be used for LR tests 17

Quadrature: integration over θ Standard normal distribution - tabled values of Q nodes B q and weights A(B q ) logit λ ijcq = γ c (x ij β + z ij T B q) probabilities p ijcq = 1 1 + exp( λ ijcq ) 1 1 + exp( λ ij, c 1, q ) likelihoods l(y i B q ; γ c, β, T ) = l iq = n i j=1 C c=1 (p ijcq) d ijc h(y i ) Q q=1 l iq A(B q ) 18

Empirical Bayes estimates (univariate case) ˆθ i = E(θ i y i ) = 1 h i θ θ i l i g(θ) dθ 1 h i Q q=1 B q l iq A(B q ) The variance of this estimator is obtained as: V (ˆθ i y i ) = 1 h i θ (θ i ˆθ i ) 2 l i g(θ) dθ 1 h i Q q=1 (B q ˆθ i ) 2 l iq A(B q ) At convergence, one more round of quadrature and the values of h i = h(y i ) which vary by i units l iq = l(y i B q ; γ c, β, σ υ ) which vary by i units and quad pts 19

Subject-specific probabilities estimated logits given θ i ˆλ ijc (θ i ) = ˆγ c x ij ˆβ + z ij ˆT θi estimated probabilities ˆp ijc (θ i ) = Marginal probabilities 1 1 + exp( ˆλ ijc (θ i )) 1 1 + exp( ˆλ ijc 1 (θ i )) ˆp ijc = θ ˆp ijc(θ i ) g(θ) dθ Q q=1 ˆp ijc(b q ) A(B q ) 20

Adaptive Quadrature adapt quadrature points and weights for each subject, and at each iteration, using EB estimates of their location ˆθ i and uncertainty s 2 i = V (ˆθ i y i ) requires fewer points to obtain accurate solution especially useful if subject random effects are very spread out (i.e., ICC is high) Adapted quadrature points and weights, from original points B q and weights A q, where φ( ) = normal pdf B iq = ˆθ i + s i B q A iq = 2π s i exp(b 2 q/2) φ(b iq ) A q 21

Multiple Random Effects quadrature solution must integrate over each random effect dimension (r = number of random effects) B q = (B q1, B q2,..., B qr ) = r dimension quad pt vector A(B q ) = r h=1 A(B qh) = product of univariate weights curse of dimensionality: Q r total points, where Q is the number of points per dimension (e.g., Q = 10 and r = 3 leads to evaluation at 1000 points) adaptive quadrature especially useful here, since Q can be lower 22

Other methods for integration of θ Methods based on first- or second-order Taylor series expansions Marginal quasi-likelihood (MQL) involves expansion around the fixed part of the model Penalized or predictive quasi-likelihood (PQL) also includes the random part in its expansion Both are available in the SAS PROC GLIMMIX and MLwiN fast, but doesn t yield a likelihood for LR tests can yield downwardly biased estimates in certain situations (if N and/or n is small, or ICC is high), especially for MQL 23

Laplace approximation - Raudenbush et. al., (2000) a combination of a fully multivariate Taylor series expansion and Laplace approximation fast and computationally accurate yields a likelihood for LR tests available in HLM, though not for all models Other methods Markov Chain Monte Carlo (MCMC) Bayesian approach (in BUGS) Maximum Simulated Likelihood (in some STATA programs) in econometric, transportation, political science literatures 24

Treatment-Related Change Across Time Data from the NIMH Schizophrenia collaborative study on treatment related changes in overall severity. IMPS item 79, Severity of Illness, was scored as: 1 = normal or borderline mentally ill 2 = mildly or moderately ill 3 = markedly ill 4 = severely or among the most extremely ill The experimental design and corresponding sample sizes: Sample size at Week Group 0 1 2 3 4 5 6 completers PLC (n=108) 107 105 5 87 2 2 70 65% DRUG (n=329) 327 321 9 287 9 7 265 81% Drug = Chlorpromazine, Fluphenazine, or Thioridazine Main question of interest: Was there differential improvement for the drug groups relative to the control group? 25

26

27

Within-Subjects / Between-Subjects components Within-subjects model - level 1 (j = 1,..., n i obs) λ ijc = γ c [b 0i + b 1i W eekj ] Between-subjects model - level 2 (i = 1,..., N subjects) b 0i = β 0 + β 2 Grp i + υ 0i b 1i = β 1 + β 3 Grp i υ 0i N ID(0, σ 2 υ) 28

0 β 0 = γ1 = week 0 IMPS79 1st logit (1 vs 2-4) γ 2 β 0 = γ2 = week 0 IMPS79 2nd logit (1-2 vs 3-4) γ 3 β 0 = γ3 = week 0 IMPS79 3rd logit (1-3 vs 4) β 1 β 2 β 3 υ 0i = IMPS79 (sqrt) weekly logit change for PLC patients (Grp = 0) = difference in week 0 IMPS79 logit for DRUG patients (Grp = 1) = difference in IMPS79 (sqrt) weekly logit change for DRUG patients (Grp = 1) = individual deviation from group trend 29

NIMH Schizophrenia Study: Severity of Illness (N = 437) Ordinal Logistic ML Estimates (se) - random intercept model ML estimates se z p < threshold 1-5.859 0.332-17.66.001 threshold 2-2.826 0.290-9.75.001 threshold 3-0.709 0.275-2.58.01 Drug (0 = plc; 1 = drug) -0.058 0.314-0.19.86 Time (sqrt week) -0.766 0.131-5.86.001 Drug by Time -1.206 0.153-7.90.001 Intercept variance 3.774 0.465 Intra-person correlation = 3.774/(3.774 + π 2 /3) =.53 2 log L = 3402.8 30

Within-Subjects / Between-Subjects components Within-subjects model - level 1 (j = 1,..., n i obs) logit cij = γ c [b 0i + b 1i W eekj ] Between-subjects model - level 2 (i = 1,..., N subjects) b 0i = β 0 + β 2 Grp i + υ 0i b 1i = β 1 + β 3 Grp i + υ 1i υ i N ID(0, Σ) 31

Ordinal Logistic Random Intercept and Trend Model ML estimates se z p < threshold 1-7.321 0.473-15.49.001 threshold 2-3.419 0.386-8.86.001 threshold 3-0.813 0.351-2.32.02 Drug (0=plc; 1 = drug) 0.057 0.391 0.15.88 Time (sqrt week) -0.883 0.218-4.05.001 Drug by Time -1.695 0.252-6.72.001 Intercept var 7.001 1.320 Int-Time covar -1.510 0.532 (r υ0 υ 1 =.40) Time var 2.010 0.419 2 log L = 3325.5, χ 2 2 = 77.3, p <.001 32

Model Fit of Observed Proportions 33

SAS parameterization of ordinal model PROC LOGISTIC, PROC GLIMMIX log P (Y c) = γ c + x β or P (Y c) = Ψ(γ c + x β) 1 P (Y c) where Ψ(z) = 1/[1 + exp( z)] here, negative association between x and Y if β > 0 use DESCENDING option to yield: log P (Y c) = γ c + x β or P (Y c) = Ψ(γ c + x β) 1 P (Y c) SAS labels γ c as Intercepts 34

SAS ordinal example without covariates PROC LOGISTIC DESCENDING; MODEL thkso = ; Response Profile Ordered Total Value thkso Frequency 1 4 447 2 3 400 3 2 398 4 1 355 Analysis of Maximum Likelihood Estimates Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept 4 1-0.9476 0.0557 289.2237 <.0001 Intercept 3 1 0.1176 0.0501 5.5161 0.0188 Intercept 2 1 1.2548 0.0602 434.9187 <.0001 35

Intercept 4 = -0.9476 = logit of response in category 4 Intercept 3 = 0.1176 = logit of response in category 3 or 4 Intercept 2 = 1.2548 = logit of response in category 2, 3 or 4 Probability of response in category 4 = 1/ [1 + exp( ( 0.9476))] =.2794 = 447/1600 Probability of response in category 3 or 4 = 1/ [1 + exp( (0.1176))] =.5294 = (400 + 447)/1600 Probability of response in category 2, 3 or 4 = 1/ [1 + exp( (1.2548))] =.7781 = (398 + 400 + 447)/1600 category specific probabilities are obtained by subtraction Probability of response in category 3 =.5294 -.2794 =.25 Intercept 2 > Intercept 3 > Intercept 4 36

SAS SYNTAX: SCHZONL.SAS - ordinal fixed- and mixed-effects logistic regression DATA one; INFILE c:\schizx1.dat ; INPUT id imps79 imps79b imps79o int tx week sweek txswk ; /* get rid of observations with missing values */ IF imps79 > -9; PROC FORMAT; VALUE tx 0 = placebo 1 = drug ; /* ordinal fixed-effects logistic regression */ PROC LOGISTIC DATA=one DESCENDING; MODEL imps79o = tx sweek tx*sweek; RUN; /* ordinal random-intercept logistic regression via GLIMMIX */ PROC GLIMMIX DATA=one METHOD=QUAD(QPOINTS=21) NOCLPRINT; CLASS id; MODEL imps79o(desc) = tx sweek tx*sweek / SOLUTION DIST=MULTINOMIAL LINK=CUMLOGIT; RANDOM INTERCEPT / SUBJECT=id; RUN; 37

/* ordinal random-intercept logistic regression via NLMIXED */ PROC NLMIXED DATA=one QPOINTS=21; PARMS b tx=0 b sweek=-.54 b txswk=-.75 varu=1 g1=-3.8 g2=-1.8 g3=-.42; z = b tx*tx + b sweek*sweek + b txswk*tx*sweek + u; IF (imps79o=1) THEN p = 1 / (1 + EXP(-(g1-z))); ELSE IF (imps79o=2) THEN p = (1/(1 + EXP(-(g2-z)))) - (1/(1 + EXP(-(g1-z)))); ELSE IF (imps79o=3) THEN p = (1/(1 + EXP(-(g3-z)))) - (1/(1 + EXP(-(g2-z)))); ELSE IF (imps79o=4) THEN p = 1 - (1 / (1 + EXP(-(g3-z)))); loglike = LOG(p); MODEL imps79o GENERAL(loglike); RANDOM u NORMAL(0,varu) SUBJECT=id; ESTIMATE icc varu/((((atan(1)*4)**2)/3)+varu); RUN; 38

/* ordinal random trend logistic regression via GLIMMIX */ PROC GLIMMIX DATA=one METHOD=QUAD(QPOINTS=11) NOCLPRINT; CLASS id; MODEL imps79o(desc) = tx sweek tx*sweek / SOLUTION DIST=MULTINOMIAL LINK=CUMLOGIT; RANDOM INTERCEPT sweek / SUBJECT=id TYPE=UN GCORR SOLUTION; ODS LISTING EXCLUDE SOLUTIONR; ODS OUTPUT SOLUTIONR=ebest2; RUN; /* ordinal random trend logistic regression via NLMIXED */ PROC NLMIXED DATA=one QPOINTS=11; PARMS b tx=-.1 b sweek=-.8 b txswk=-1.2 v0=3.8 c01=0 v1=1 g1=-6 g2=-3 g3=-.7; z = b tx*tx + b sweek*sweek + b txswk*tx*sweek + u0 + u1*sweek; IF (imps79o=1) THEN p = 1 / (1 + EXP(-(g1-z))); ELSE IF (imps79o=2) THEN p = (1/(1 + EXP(-(g2-z)))) - (1/(1 + EXP(-(g1-z)))); ELSE IF (imps79o=3) THEN p = (1/(1 + EXP(-(g3-z)))) - (1/(1 + EXP(-(g2-z)))); ELSE IF (imps79o=4) THEN p = 1 - (1 / (1 + EXP(-(g3-z)))); loglike = LOG(p); MODEL imps79o GENERAL(loglike); RANDOM u0 u1 NORMAL([0,0], [v0,c01,v1]) SUBJECT=id out=ebest2b; ESTIMATE re corr c01/sqrt(v0*v1); RUN; 39

Model fit of observed marginal proportions 1. ŷ i = X i ˆβ 2. calculate marginalization vector ŝ = 1 σ Diag( ˆV (yi )) 1/2 ˆV (y i ) = Z i ˆΣZ i + σ 2 I i σ = 1 for probit and σ = π/ 3 for logistic Z i = design matrix for random effects for random-intercepts model Z i = 1 i, and so, ŝ = ˆσ 2 υ/σ 2 + 1 40

3. perform element-wise division ẑ i = ŷ i /. ŝ 4. ˆp i = Φ(ẑ i ) for probit and ˆp i = Ψ(ẑ i ) for logistic 5. In practice, for logistic, (15π)/(16 3) works better than π/ 3 as σ (Zeger et al., 1988, Biometrics) 6. Logistic is approximate; relies on cumulative Gaussian approximation to the logistic function 41

SAS IML code: SCHZOFIT.SAS - computing marginal probabilities - ordinal models TITLE1 NIMH Schizophrenia Data - Estimated Marginal Probabilities ; PROC IML; /* Results from NLMIXED analysis: random intercept model */; x0 = { 0 0.00000 0, 0 1.00000 0, 0 1.73205 0, 0 2.44949 0}; x1 = { 1 0.00000 0.00000, 1 1.00000 1.00000, 1 1.73205 1.73205, 1 2.44949 2.44949}; varu = {3.774}; beta = {-.058, -.766, -1.206}; thresh = {-5.859, -2.826, -.709}; 42

/* Approximate Marginalization Method */; pi = ATAN(1)*4; nt = 4; ivec = J(nt,1,1); zvec = J(nt,1,1); evec = (15/16)**2 * (pi**2)/3 * ivec; /* nt by nt matrix with evec on the diagonal and zeros elsewhere */; emat = diag(evec); /* variance-covariance matrix of underlying latent variable */; vary = zvec * varu * T(zvec) + emat; sdy = sqrt(vecdiag(vary) / vecdiag(emat)); 43

za0 = (thresh[1] - x0*beta) / sdy ; zb0 = (thresh[2] - x0*beta) / sdy; zc0 = (thresh[3] - x0*beta) / sdy; za1 = (thresh[1] - x1*beta) / sdy; zb1 = (thresh[2] - x1*beta) / sdy; zc1 = (thresh[3] - x1*beta) / sdy; grp0a = 1 / ( 1 + EXP(- za0)); grp0b = 1 / ( 1 + EXP(- zb0)); grp0c = 1 / ( 1 + EXP(- zc0)); grp1a = 1 / ( 1 + EXP(- za1)); grp1b = 1 / ( 1 + EXP(- zb1)); grp1c = 1 / ( 1 + EXP(- zc1)); print Random intercept model ; print Approximate Marginalization Method ; print marginal prob for group 0 - catg 1 grp0a [FORMAT=8.4]; print marginal prob for group 0 - catg 2 (grp0b-grp0a) [FORMAT=8.4]; print marginal prob for group 0 - catg 3 (grp0c-grp0b) [FORMAT=8.4]; print marginal prob for group 0 - catg 4 (1-grp0c) [FORMAT=8.4]; print marginal prob for group 1 - catg 1 grp1a [FORMAT=8.4]; print marginal prob for group 1 - catg 2 (grp1b-grp1a) [FORMAT=8.4]; print marginal prob for group 1 - catg 3 (grp1c-grp1b) [FORMAT=8.4]; print marginal prob for group 1 - catg 4 (1-grp1c) [FORMAT=8.4]; 44

/* Random Intercept and Trend Model */; varu = {7.001-1.510, -1.510 2.010}; beta = {.057, -.883, -1.695}; thresh = {-7.321, -3.419, -.813}; zmat = {1 0.00000, 1 1.00000, 1 1.73205, 1 2.44949}; /* variance-covariance matrix of underlying latent variable */; vary = zmat * varu * T(zmat) + emat; sdy = sqrt(vecdiag(vary) / vecdiag(emat)); 45

za0 = (thresh[1] - x0*beta) / sdy ; zb0 = (thresh[2] - x0*beta) / sdy; zc0 = (thresh[3] - x0*beta) / sdy; za1 = (thresh[1] - x1*beta) / sdy; zb1 = (thresh[2] - x1*beta) / sdy; zc1 = (thresh[3] - x1*beta) / sdy; grp0a = 1 / ( 1 + EXP(- za0)); grp0b = 1 / ( 1 + EXP(- zb0)); grp0c = 1 / ( 1 + EXP(- zc0)); grp1a = 1 / ( 1 + EXP(- za1)); grp1b = 1 / ( 1 + EXP(- zb1)); grp1c = 1 / ( 1 + EXP(- zc1)); print Random intercept and trend model ; print Approximate Marginalization Method ; print marginal prob for group 0 - catg 1 grp0a [FORMAT=8.4]; print marginal prob for group 0 - catg 2 (grp0b-grp0a) [FORMAT=8.4]; print marginal prob for group 0 - catg 3 (grp0c-grp0b) [FORMAT=8.4]; print marginal prob for group 0 - catg 4 (1-grp0c) [FORMAT=8.4]; print marginal prob for group 1 - catg 1 grp1a [FORMAT=8.4]; print marginal prob for group 1 - catg 2 (grp1b-grp1a) [FORMAT=8.4]; print marginal prob for group 1 - catg 3 (grp1c-grp1b) [FORMAT=8.4]; print marginal prob for group 1 - catg 4 (1-grp1c) [FORMAT=8.4]; 46

Proportional and Non-proportional Odds Proportional Odds model log P (Y ij c) = γ c [ x 1 P (Y ij c) ijβ + z ] ijυ i with υ i N(0, T T = Σ) = γ c [ x ij β + z ij T θ ] i with θ i N(0, I) relationship between the explanatory variables and the cumulative logits does not depend on c effects of x variables DO NOT vary across the C 1 cumulative logits 47

Hedeker & Mermelstein (1998, Mult Behav Res) extension: log P (Y ij c) 1 P (Y ij c) = γ c(0) [ u ij γ c + x ij β + z ij T θ i u ij = h 1 vector for the set of h covariates for which proportional odds is not assumed effects of u variables DO vary across the C 1 cumulative logits more flexible model for ordinal response relations ] 48

Proportional Odds Assumption: covariate effects are the same across all cumulative logits Response group Absent Mild Severe total control 27 46 27 100 cumulative odds 27 73 =.37 73 27 = 2.7 logit -1 1 treatment 38 44 18 100 cumulative odds 38 62 =.61 82 18 = 4.6 logit -.5 1.5 group difference =.5 for both cumulative logits 49

log log P (Y ij = 1) = γ 1 x β 1 P (Y ij = 2 or 3) P (Y ij = 1 or 2) P (Y ij = 3) = γ 2 x β 1 γ 1 = 1, γ 2 = 1, β 1 =.5 (covariate effect is same for both cumulative logits) 50

Non-Proportional Odds: covariate effects vary across the cumulative logits Response group Absent Mild Severe total control 27 46 27 100 cumulative odds 27 73 =.37 73 27 = 2.7 logit -1 1 treatment 28 60 12 100 cumulative odds 28 72 =.39 88 12 = 7.3 logit -.95 2 UNEQUAL group difference across cumulative logits 51

log log P (Y ij = 1) = γ P (Y ij = 2 or 3) 1 (0) u γ 1 P (Y ij = 1 or 2) P (Y ij = 3) = γ 2 (0) u γ 2 γ 1 (0) = 1, γ 2(0) = 1, γ 1 =.05, γ 2 =.1 (covariate effect varies across the cumulative logits) 52

NIMH Schiz Study: Severity of Illness (N = 437) Ordinal LR Estimates (se) - random intercept and trend model Proportional Non-Proportional Odds Model Odds Model 1 vs 2-4 1,2 vs 3,4 1-3 vs 4 Drug (0=plc; 1 = drug).056-1.067 -.326.271 (.391) (1.235) (.510) (.433) Time (sqrt week) -.884-1.387-1.171 -.729 (.218) (.598) (.283) (.243) Drug by Time -1.694-1.177-1.420-1.818 (.253) (.632) (.317) (.297) 2 log L 3325.49 3321.75 Proportional Odds accepted (χ 2 6 = 3325.49 3321.75 = 3.74) 53

San Diego Homeless Research Project (Hough) 361 mentally ill subjects who were homeless or at very high risk of becoming homeless 2 conditions: HUD Section 8 rental certificates (yes/no) baseline and 6, 12, and 24 month follow-ups Categorical outcome: housing status streets / shelters (Y = 1) community / institutions (Y = 2) independent (Y = 3) Question: Do Section 8 certificates influence housing status across time? 54

55

56

Housing status across time: 1289 observations within 361 subjects Ordinal Mixed Regression Model estimates and standard errors (se) Proportional Odds Non-Proportional Odds Model Non-street 1 Independent 2 term estimate se estimate se estimate se threshold 1.220.198.322.207 threshold 2 2.964.230 2.700.298 t1 (6 month) 1.736.235 2.297.303 1.080.343 t2 (12 month) 2.315.247 3.346.387 1.646.340 t3 (24 month) 2.499.253 2.823.348 2.147.337 section 8 (y=1).497.276.592.294.324.394 section 8 by t1 1.408.341.567.467 2.022.471 section 8 by t2 1.173.353 -.959.501 2.016.476 section 8 by t3.638.349 -.369.480 1.071.464 subject variance 2.128.354 2.124.353 (ICC.4) 2 log L 2274.4 2222.2 (χ 2 7 = 52.2) bold indicates p <.05 italic indicates.05 < p <.10 1 = independent + community vs street 2 = independent vs community + street 57

58

SAS NLMIXED code: SANDONL.SAS - random-intercepts proportional odds model DATA one; INFILE c:\sdhouse.dat ; INPUT Id housing int section8 time1 time2 time3 sect8t1 sect8t2 sect8t3; /* recode missing values */ IF housing = 999 THEN housing =.; PROC NLMIXED DATA=one QPOINTS=21; PARMS btime1=1.4 btime2=1.8 btime3=2.0 bsection8=.4 bsect8t1=.9 bsect8t2=.9 bsect8t3=.4 g1=.2 g2=2.3 varu=1; z = btime1*time1 + btime2*time2 + btime3*time3 + bsection8*section8 + bsect8t1*sect8t1 + bsect8t2*sect8t2 + bsect8t3*sect8t3+ u; IF (Housing=0) THEN p = 1 / (1 + EXP(-(g1-z))); ELSE IF (Housing=1) THEN p = (1/(1 + EXP(-(g2-z)))) - (1/(1 + EXP(-(g1-z)))); ELSE IF (Housing=2) THEN p = 1 - (1 / (1 + EXP(-(g2-z)))); loglike = LOG(p); MODEL housing GENERAL(loglike); RANDOM u NORMAL(0,varu) SUBJECT=Id; ESTIMATE icc varu/((((atan(1)*4)**2)/3) + varu); RUN; 59

SAS NLMIXED: SANDONL.SAS - random-intercepts non-proportional odds model using data from the earlier NLMIXED example PROC NLMIXED DATA=one QPOINTS=21; PARMS g1 Time1=1.4 g1 TIme2=1.8 g1 TIme3=2.0 g1 Section8=.4 g1 Sect8T1=.9 g1 Sect8T2=.9 g1 Sect8T3=.4 g2 Time1=1.4 g2 TIme2=1.8 g2 TIme3=2.0 g2 Section8=.4 g2 Sect8T1=.9 g2 Sect8T2=.9 g2 Sect8T3=.4 g1=.2 g2=2.3 varu=1; z1 = g1 Time1*Time1 + g1 Time2*Time2 + g1 Time3*Time3 + g1 Section8*Section8 + g1 Sect8T1*Sect8T1 + g1 Sect8T2*Sect8T2 + g1 Sect8T3*Sect8T3+ u; z2 = g2 Time1*Time1 + g2 Time2*Time2 + g2 Time3*Time3 + g2 Section8*Section8 + g2 Sect8T1*Sect8T1 + g2 Sect8T2*Sect8T2 + g2 Sect8T3*Sect8T3+ u; IF (Housing=0) THEN p = 1 / (1 + EXP(-(g1-z1))); ELSE IF (Housing=1) THEN p = (1/(1 + EXP(-(g2-z2)))) - (1/(1 + EXP(-(g1-z1)))); ELSE IF (Housing=2) THEN p = 1 - (1 / (1 + EXP(-(g2-z2)))); loglike = LOG(p); MODEL Housing GENERAL(loglike); RANDOM u NORMAL(0,varu) SUBJECT=Id; ESTIMATE icc varu/((((atan(1)*4)**2)/3) + varu); RUN; 60

SAS IML code: SANDOFIT.SAS - computing marginal probabilities - ordinal models TITLE1 San Diego Homeless Data - Estimated Marginal Probabilities ; PROC IML; /* Results from NLMIXED analysis: proportional odds model */; x0 = { 0 0 0 0 0 0 0, 1 0 0 0 0 0 0, 0 1 0 0 0 0 0, 0 0 1 0 0 0 0}; x1 = { 0 0 0 1 0 0 0, 1 0 0 1 1 0 0, 0 1 0 1 0 1 0, 0 0 1 1 0 0 1}; varu = {2.128}; beta = {1.736, 2.315, 2.499, 0.497, 1.408, 1.173,.638}; thresh = {0.220, 2.964}; 61

/* number of quadrature points, quadrature nodes & weights */ nq = 10; bq = { -4.85946282833231, -3.58182348355193, -2.48432584163895, -1.46598909439116, -0.48493570751550, 0.48493570751550, 1.46598909439116, 2.48432584163895, 3.58182348355193, 4.85946282833231}; wq = { 0.00000431065265, 0.00075807095698, 0.01911158107317, 0.13548370704150, 0.34464234526294, 0.34464234526294, 0.13548370704150, 0.01911158107317, 0.00075807095698, 0.00000431065265}; /* initialize to zero */ grp0a = J(4,1,0); grp0b = J(4,1,0); grp1a = J(4,1,0); grp1b = J(4,1,0); 62

DO q = 1 to nq; END; za0 = thresh[1] - (x0*beta + SQRT(varu)*bq[q]); zb0 = thresh[2] - (x0*beta + SQRT(varu)*bq[q]); za1 = thresh[1] - (x1*beta + SQRT(varu)*bq[q]); zb1 = thresh[2] - (x1*beta + SQRT(varu)*bq[q]); grp0a = grp0a + ( 1 / ( 1 + EXP(- za0)))*wq[q]; grp0b = grp0b + ( 1 / ( 1 + EXP(- zb0)))*wq[q]; grp1a = grp1a + ( 1 / ( 1 + EXP(- za1)))*wq[q]; grp1b = grp1b + ( 1 / ( 1 + EXP(- zb1)))*wq[q]; print Proportional odds model ; print Quadrature method - 10 points ; print marginal prob for group 0 - catg 1 grp0a [FORMAT=8.4]; print marginal prob for group 0 - catg 2 (grp0b-grp0a) [FORMAT=8.4]; print marginal prob for group 0 - catg 3 (1-grp0b) [FORMAT=8.4]; print marginal prob for group 1 - catg 1 grp1a [FORMAT=8.4]; print marginal prob for group 1 - catg 2 (grp1b-grp1a) [FORMAT=8.4]; print marginal prob for group 1 - catg 3 (1-grp1b) [FORMAT=8.4]; 63

/* Non-Proportional Odds Model */; varu = {2.124}; gam1 = {2.297, 3.346, 2.823,.592,.567, -.959, -.369}; gam2 = {1.080, 1.646, 2.147,.324, 2.022, 2.016, 1.071}; thresh = {.322, 2.700}; /* initialize to zero */ grp0a = J(4,1,0); grp0b = J(4,1,0); grp1a = J(4,1,0); grp1b = J(4,1,0); 64

DO q = 1 to nq; END; za0 = thresh[1] - (x0*gam1 + SQRT(varu)*bq[q]); zb0 = thresh[2] - (x0*gam2 + SQRT(varu)*bq[q]); za1 = thresh[1] - (x1*gam1 + SQRT(varu)*bq[q]); zb1 = thresh[2] - (x1*gam2 + SQRT(varu)*bq[q]); grp0a = grp0a + ( 1 / ( 1 + EXP(- za0)))*wq[q]; grp0b = grp0b + ( 1 / ( 1 + EXP(- zb0)))*wq[q]; grp1a = grp1a + ( 1 / ( 1 + EXP(- za1)))*wq[q]; grp1b = grp1b + ( 1 / ( 1 + EXP(- zb1)))*wq[q]; print Non-proportional odds model ; print Quadrature method - 10 points ; print marginal prob for group 0 - catg 1 grp0a [FORMAT=8.4]; print marginal prob for group 0 - catg 2 (grp0b-grp0a) [FORMAT=8.4]; print marginal prob for group 0 - catg 3 (1-grp0b) [FORMAT=8.4]; print marginal prob for group 1 - catg 1 grp1a [FORMAT=8.4]; print marginal prob for group 1 - catg 2 (grp1b-grp1a) [FORMAT=8.4]; print marginal prob for group 1 - catg 3 (1-grp1b) [FORMAT=8.4]; 65

Mixed-effects Multinomial Logistic Regression Model for Nominal Responses (Hedeker, 2003) Y ij = nominal response of level-2 unit i and level-1 unit j log p ijc p ij1 = u ijγ c + z ijυ ic C 1 contrasts to reference cell (c = 1) regression effects γ c vary across contrasts (C 1) r random-effects N(0, Σ υ ) c = 2, 3,... C For example, with C = 3 contrast ordinal nominal c1 2 & 3 vs 1 2 vs 1 c2 3 vs 1 & 2 3 vs 1 66

Model in terms of the category probabilities p ijc = Pr(Y ij = c υ ic ) = exp(z ijc ) 1 + C h=2 exp(z ijh ) for c = 2, 3,..., C p ij1 = Pr(Y ij = 1 υ ic ) = 1 1 + C h=2 exp(z ijh ) where the multinomial logit z ijc = u ij γ c + z ij υ ic 67

68

Housing status across time: 1289 observations within 361 subjects Nominal Mixed Regression Model estimates & standard errors (se) Community vs Street Independent vs Street term estimate se estimate se intercept -.635.198-2.600.353 t1 (6 month) 1.914.299 2.332.410 t2 (12 month) 2.686.373 3.677.468 t3 (24 month) 1.982.342 3.735.444 section 8 (y=1).518.276.675.445 section 8 by t1 -.639.469 1.886.591 section 8 by t2-2.499.534.556.617 section 8 by t3-1.234.491.256.589 subject variance 1.099.325 3.249.674 2 log L 2214.35 bold indicates p <.05 italic indicates.05 < p <.10 69

Housing status across time: 1289 observations within 361 subjects Nominal Mixed Regression Model estimates & standard errors (se) Community vs Street Independent vs Street term estimate se estimate se intercept -.622.233-2.600.384 t1 (6 month) 2.374.348 2.888.454 t2 (12 month) 3.345.442 4.394.530 t3 (24 month) 2.589.402 4.310.496 section 8 (y=1).652.328.831.494 section 8 by t1 -.331.523 1.969.644 section 8 by t2-2.475.590.287.677 section 8 by t3-1.160.544.202.644 subject variance 2.702.325 5.821 1.136 covariance 2.884.765 r =.73 2 log L 2180.9 χ 2 1 = 33.5 bold indicates p <.05 italic indicates.05 < p <.10 70

71

72

SAS code: SANDNNL.SAS - Nominal models with reference cell contrasts /* Nominal logistic regression via LOGISTIC */ PROC LOGISTIC DATA=one; MODEL Housing(REF=FIRST) = Time1 Time2 Time3 Section8 Sect8T1 Sect8T2 Sect8T3 / LINK=GLOGIT; /* Nominal random intercept (uncorrelated) logistic regression via GLIMMIX */ PROC GLIMMIX DATA=one METHOD=QUAD(QPOINTS=11) NOCLPRINT; CLASS Id Housing; MODEL Housing(REF=FIRST) = Time1 Time2 Time3 Section8 Sect8T1 Sect8T2 Sect8T3 / SOLUTION DIST=MULTINOMIAL LINK=GLOGIT; RANDOM INTERCEPT / SUBJECT=Id GROUP=Housing; /* Nominal random intercept (correlated) logistic regression via NLMIXED */ PROC NLMIXED DATA=one QPOINTS=11; PARMS g1 0=-.49 g1 Time1=1.63 g1 Time2=2.37 g1 Time3=1.8 g1 Section8=.43 g1 Sect8T1=-.46 g1 Sect8T2=-2.12 g1 Sect8T3=-1.1 var1=1 g2 0=-1.7 g2 Time1=1.90 g2 Time2=2.97 g2 Time3=2.9 g2 Section8=.54 g2 Sect8T1=1.13 g2 Sect8T2=-.04 g2 Sect8T3=-.07 var2=1 cov12=0; z1 = g1 0 + g1 Time1*Time1 + g1 Time2*Time2 + g1 Time3*Time3 + g1 Section8*Section8 + g1 Sect8T1*Sect8T1 + g1 Sect8T2*Sect8T2 + g1 Sect8T3*Sect8T3 + u1; z2 = g2 0 + g2 Time1*Time1 + g2 Time2*Time2 + g2 Time3*Time3 + g2 Section8*Section8 + g2 Sect8T1*Sect8T1 + g2 Sect8T2*Sect8T2 + g2 Sect8T3*Sect8T3 + u2; IF (Housing=0) THEN p = 1 / (1 + EXP(z1) + EXP(z2)); ELSE IF (Housing=1) THEN p = EXP(z1) / (1 + EXP(z1) + EXP(z2)); ELSE IF (Housing=2) THEN p = EXP(z2) / (1 + EXP(z1) + EXP(z2)); LogLike = LOG(p); MODEL Housing GENERAL(LogLike); RANDOM u1 u2 NORMAL([0,0],[var1,cov12,var2]) SUBJECT=Id; ESTIMATE RE corr cov12 / SQRT(var1*var2); RUN; 73

SAS IML code: SANDNFIT.SAS - computing marginal probabilities - nominal model TITLE1 San Diego Homeless Data - Estimated Marginal Probabilities ; PROC IML; /* Results from GLIMMIX analysis - uncorrelated random effects */; u0 = { 1 0 0 0 0 0 0 0, 1 1 0 0 0 0 0 0, 1 0 1 0 0 0 0 0, 1 0 0 1 0 0 0 0}; u1 = { 1 0 0 0 1 0 0 0, 1 1 0 0 1 1 0 0, 1 0 1 0 1 0 1 0, 1 0 0 1 1 0 0 1}; varu = {1.099, 3.249}; gam1 = {1.914, 2.686, 1.982,.518, -.639, -2.499, -1.234}; gam2 = {-2.600, 2.332, 3.677, 3.735,.675, 1.886,.556,.256}; 74

/* number of quadrature points, quadrature nodes & weights */ nq = 10; bq = { -4.85946282833231, -3.58182348355193, -2.48432584163895, -1.46598909439116, -0.48493570751550, 0.48493570751550, 1.46598909439116, 2.48432584163895, 3.58182348355193, 4.85946282833231}; wq = { 0.00000431065265, 0.00075807095698, 0.01911158107317, 0.13548370704150, 0.34464234526294, 0.34464234526294, 0.13548370704150, 0.01911158107317, 0.00075807095698, 0.00000431065265}; /* initialize to zero */ grp0a = J(4,1,0); grp0b = J(4,1,0); grp0c = J(4,1,0); grp1a = J(4,1,0); grp1b = J(4,1,0); grp1c = J(4,1,0); 75

DO q1 = 1 to nq; DO q2 = 1 to nq; z1 0 = u0*gam1 + SQRT(varu[1])*bq[q1]; z2 0 = u0*gam2 + SQRT(varu[2])*bq[q2]; z1 1 = u1*gam1 + SQRT(varu[1])*bq[q1]; z2 1 = u1*gam2 + SQRT(varu[2])*bq[q2]; denom0 = 1 + EXP(z1 0) + EXP(z2 0); denom1 = 1 + EXP(z1 1) + EXP(z2 1); grp0a = grp0a + (1 / denom0)*wq[q1]*wq[q2]; grp0b = grp0b + (EXP(z1 0) / denom0)*wq[q1]*wq[q2]; grp0c = grp0c + (EXP(z2 0) / denom0)*wq[q1]*wq[q2]; grp1a = grp1a + (1 / denom1)*wq[q1]*wq[q2]; grp1b = grp1b + (EXP(z1 1) / denom1)*wq[q1]*wq[q2]; grp1c = grp1c + (EXP(z2 1) / denom1)*wq[q1]*wq[q2]; END;END; print Uncorrelated random effects model fit - Quadrature method - 10 points ; print marginal prob for group 0 - category 1 grp0a [FORMAT=8.4]; print marginal prob for group 0 - category 2 grp0b [FORMAT=8.4]; print marginal prob for group 0 - category 3 grp0c [FORMAT=8.4]; print marginal prob for group 1 - category 1 grp1a [FORMAT=8.4]; print marginal prob for group 1 - category 2 grp1b [FORMAT=8.4]; print marginal prob for group 1 - category 3 grp1c [FORMAT=8.4]; 76

/* Results from NLMIXED analysis - correlated random effects */; varu = {2.702 2.884, 2.884 5.821}; gam1 = {-0.622, 2.374, 3.345, 2.589, 0.652, -0.331, -2.475, -1.160}; gam2 = {-2.599, 2.888, 4.394, 4.310, 0.831, 1.969, 0.287, 0.202}; /* get the Cholesky in lower triangular form */ chol = T(ROOT(varu)); /* select the two rows of the Cholesky for the two category contrasts */ chol1 = chol[1,]; chol2 = chol[2,]; /* initialize to zero */ grp0a = J(4,1,0); grp0b = J(4,1,0); grp0c = J(4,1,0); grp1a = J(4,1,0); grp1b = J(4,1,0); grp1c = J(4,1,0); 77

DO q1 = 1 to nq; DO q2 = 1 to nq; bqq = bq[q1] // bq[q2]; z1 0 = u0*gam1 + chol1*bqq; z2 0 = u0*gam2 + chol2*bqq; z1 1 = u1*gam1 + chol1*bqq; z1 1 = u1*gam2 + chol2*bqq; denom0 = 1 + EXP(z1 0) + EXP(z2 0); denom1 = 1 + EXP(z1 1) + EXP(z2 1); grp0a = grp0a + (1 / denom0)*wq[q1]*wq[q2]; grp0b = grp0b + (EXP(z1 0) / denom0)*wq[q1]*wq[q2]; grp0c = grp0c + (EXP(z2 0) / denom0)*wq[q1]*wq[q2]; grp1a = grp1a + (1 / denom1)*wq[q1]*wq[q2]; grp1b = grp1b + (EXP(z1 1) / denom1)*wq[q1]*wq[q2]; grp1c = grp1c + (EXP(z2 1) / denom1)*wq[q1]*wq[q2]; END;END; print Correlated random effects model fit - Quadrature method - 10 points ; print marginal prob for group 0 - category 1 grp0a [FORMAT=8.4]; print marginal prob for group 0 - category 2 grp0b [FORMAT=8.4]; print marginal prob for group 0 - category 3 grp0c [FORMAT=8.4]; print marginal prob for group 1 - category 1 grp1a [FORMAT=8.4]; print marginal prob for group 1 - category 2 grp1b [FORMAT=8.4]; print marginal prob for group 1 - category 3 grp1c [FORMAT=8.4]; 78

Reference cell coding (3 categories) category z1 z2 1 housing=0 0 0 2 housing=1 1 0 3 housing=2 0 1 Helmert contrasts (3 categories) category z1 z2 1 housing=0-2/3 0 2 housing=1 1/3-1/2 3 housing=2 1/3 1/2 contrasts z1 and z2 sum to zero and represent unit differences 79

Housing status across time: 1289 observations within 361 subjects Nominal mixed model estimates and std errors (se) - Helmert Contrasts Ind & Comm Independent vs Street vs Community term estimate se estimate se intercept -1.610.260-1.976.363 t1 (6 month vs base) 2.631.356.514.385 t2 (12 month vs base) 3.870.448 1.049.388 t3 (24 month vs base) 3.450.407 1.721.390 section 8 (yes=1 no=0).741.355.179.447 section 8 by t1.819.524 2.300.525 section 8 by t2-1.094.572 2.762.550 section 8 by t3 -.479.536 1.361.523 subject variance 3.574.776 2.755.589 covariance 1.559.477 r =.50 2 log L 2180.9 bold indicates p <.05 italic indicates.05 < p <.10 80

SAS NLMIXED code: SANDNNL.SAS - random-intercepts nominal model with Helmert contrasts PROC NLMIXED DATA=one QPOINTS=11; PARMS h1 0=-.49 h1 Time1=1.63 h1 Time2=2.37 h1 Time3=1.8 h1 Section8=.43 h1 Sect8T1=-.46 h1 Sect8T2=-2.12 h1 Sect8T3=-1.1 var1=1 h2 0=-1.7 h2 Time1=1.90 h2 Time2=2.97 h2 Time3=2.9 h2 Section8=.54 h2 Sect8T1=1.13 h2 Sect8T2=-.04 h2 Sect8T3=-.07 var2=1 cov12=0; z1 = h1 0 + h1 Time1*Time1 + h1 Time2*Time2 + h1 Time3*Time3 + h1 Section8*Section8 + h1 Sect8T1*Sect8T1 + h1 Sect8T2*Sect8T2 + h1 Sect8T3*Sect8T3 + u1; z2 = h2 0 + h2 Time1*Time1 + h2 Time2*Time2 + h2 Time3*Time3 + h2 Section8*Section8 + h2 Sect8T1*Sect8T1 + h2 Sect8T2*Sect8T2 + h2 Sect8T3*Sect8T3 + u2; IF (Housing=0) THEN p = EXP(-2/3*z1) / (EXP(-2/3*z1) + EXP(1/3*z1-1/2*z2) + EXP(1/3*z1 + 1/2*z2)); ELSE IF (Housing=1) THEN p = EXP(1/3*z1-1/2*z2) / (EXP(-2/3*z1) + EXP(1/3*z1-1/2*z2) + EXP(1/3*z1 + 1/2*z2)); ELSE IF (Housing=2) THEN p = EXP(1/3*z1 + 1/2*z2) / (EXP(-2/3*z1) + EXP(1/3*z1-1/2*z2) + EXP(1/3*z1 + 1/2*z2); LogLike = LOG(p); MODEL Housing GENERAL(LogLike); RANDOM u1 u2 NORMAL([0,0],[var1,cov12,var2]) SUBJECT=Id; ESTIMATE RE corr cov12 / SQRT(var1*var2); RUN; 81

Summary Models for longitudinal categorical data as developed as models for continuous data Proportional odds models Non and partial proportional odds models Nominal models (with Reference-cell or Helmert contrasts) Scaling models (Hedeker, Berbaum, & Mermelstein, 2006; Hedeker, Demirtas, & Mermelstein, 2009) Available software includes SAS PROCs GLIMMIX & NLMIXED, STATA, SuperMix, MIXOR, MIXNO,... 82

SuperMix Free student and 15-day trial editions http://www.ssicentral.com/supermix/downloads.html Datasets and examples http://www.ssicentral.com/supermix/examples.html Manual and documentation in PDF form http://www.ssicentral.com/supermix/resources.html 83