Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Chapter 1: Background
Nominal, ordinal, and interval data. Distributions: Poisson, binomial, multinomial. Basic ideas of MLE theory, CIs, and testing. Pneumonia in calves example. You should know basic MLE/CLT theory as applied to the binomial distribution and be able to construct simple Wald, LRT, and score tests. You should be able to perform a simple hypothesis test and form a CI from an estimate $\hat\beta_j$ and its standard error $\mathrm{se}(\hat\beta_j)$. Recall $z_{0.025} = 1.96$.
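As a minimal sketch of the last point, the Wald test and 95% CI built from an estimate and its standard error can be computed as below. The function name and the numbers are hypothetical, not taken from the course examples.

```python
import math

def wald_summary(estimate, se, null_value=0.0, z_crit=1.96):
    """Wald z statistic, two-sided p-value, and CI from an estimate and its SE."""
    z = (estimate - null_value) / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (estimate - z_crit * se, estimate + z_crit * se)
    return z, p_value, ci

# Hypothetical estimate and standard error
z, p, (lo, hi) = wald_summary(estimate=0.50, se=0.20)
print(f"z = {z:.2f}, p = {p:.4f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

The default `z_crit=1.96` is the $z_{0.025}$ value recalled above; pass a different critical value for other confidence levels.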

Chapter 2: $I \times J$ tables
$2 \times 2$ tables: odds ratio (OR) $\theta$; $X \perp Y \iff \theta = 1$. Differences in proportions, relative risk. Estimability of the odds ratio from case/control data; interpretation flips. Types of sampling: multinomial, product multinomial, Poisson. $2 \times 2 \times K$ tables. Simpson's paradox: the marginal association has a different direction than the conditional association. Homogeneous association $\theta_{XY(1)} = \cdots = \theta_{XY(K)}$; conditional independence $X \perp Y \mid Z$. Conditional independence does not imply marginal independence. Death penalty example; infection cream across 8 clinics example. On to $I \times J$ tables. Ordinal trends: $\gamma$ measure of concordance and polychoric correlation $\rho_P$.
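The three $2 \times 2$ measures named above can be sketched as follows. The counts are hypothetical and the function name is an assumption for illustration. The last lines show why the odds ratio is estimable from case/control data: transposing the table (conditioning on $Y$ instead of $X$) leaves it unchanged, whereas the relative risk changes.

```python
def two_by_two_measures(n11, n12, n21, n22):
    """Rows are groups (X = 1, 2); columns are outcomes (Y = 1, 2)."""
    p1 = n11 / (n11 + n12)   # P(Y = 1 | X = 1)
    p2 = n21 / (n21 + n22)   # P(Y = 1 | X = 2)
    return {
        "diff": p1 - p2,                       # difference in proportions
        "rr": p1 / p2,                         # relative risk
        "or": (n11 * n22) / (n12 * n21),       # odds ratio (cross-product ratio)
    }

# Hypothetical counts
original = two_by_two_measures(30, 70, 10, 90)
transposed = two_by_two_measures(30, 10, 70, 90)  # roles of X and Y swapped
print(original)
print(transposed)
```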

Chapter 3: Estimation and testing in $I \times J$ tables
Estimation: OR, RR, and difference in proportions for $2 \times 2$ tables. Testing independence: Pearson and LRT statistics, with a large-sample $\chi^2_{(I-1)(J-1)}$ approximation. If $H_0: X \perp Y$ is rejected, find out why: residuals and partitioning $\chi^2$. $I \times J$ tables with ordinal outcomes. Focusing on a measure of association: $\hat\gamma$, polychoric correlation $\hat\rho_P$, or Pearson correlation $\hat\rho$ based on replacing outcomes with scores; all in PROC FREQ. Exact (incl. Fisher) tests of $H_0: X \perp Y$ by conditioning on sufficient statistics (marginal totals).
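A short sketch of the two independence statistics, assuming the usual definitions $X^2 = \sum (n_{ij} - \hat\mu_{ij})^2 / \hat\mu_{ij}$ and $G^2 = 2 \sum n_{ij} \log(n_{ij} / \hat\mu_{ij})$ with $\hat\mu_{ij} = n_{i+} n_{+j} / n$. The table is hypothetical.

```python
import math

def independence_stats(table):
    """Pearson X^2, LRT G^2, and df for testing independence in an I x J table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    x2 = g2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n   # fitted count under independence
            x2 += (obs - expected) ** 2 / expected
            if obs > 0:                              # 0 * log(0) is taken as 0
                g2 += 2.0 * obs * math.log(obs / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, g2, df

# Hypothetical 2 x 3 table
x2, g2, df = independence_stats([[20, 30, 50], [40, 30, 30]])
print(f"X2 = {x2:.3f}, G2 = {g2:.3f}, df = {df}")
```

Both statistics are referred to the $\chi^2_{(I-1)(J-1)}$ distribution in large samples.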

Chapter 4: How do categorical responses or counts change with predictors? GLMs
GLMs: basic notation. Binomial and Poisson regression; identity and canonical links. Crab satellite data! Deviance $G^2$, saturated model. Negative binomial regression (just mentioned). A bit of GLM theory: moments, fitting procedures, residuals. Quasi-likelihood adds a dispersion parameter $\phi$. MOM estimation of $\phi$ from large-sample theory; under overdispersion, inflate the MLE SEs via $\hat\phi$. SCALE=PEARSON or SCALE=DEVIANCE.
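A minimal sketch of the quasi-likelihood adjustment, assuming the usual Pearson-based moment estimator $\hat\phi = X^2 / (n - p)$ and standard errors multiplied by $\sqrt{\hat\phi}$. The function name and all numbers are hypothetical.

```python
import math

def quasi_likelihood_se(pearson_x2, n_obs, n_params, mle_ses):
    """Estimate the dispersion from the Pearson statistic and inflate the MLE SEs."""
    phi_hat = pearson_x2 / (n_obs - n_params)          # moment estimate of phi
    inflated = [se * math.sqrt(phi_hat) for se in mle_ses]
    return phi_hat, inflated

# Hypothetical fit: Pearson X^2 = 300 on 102 observations with 2 parameters
phi_hat, ses = quasi_likelihood_se(300.0, 102, 2, [0.10, 0.02])
print(phi_hat, ses)
```

Using the deviance $G^2$ in place of the Pearson statistic gives the deviance-based analogue.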

Chapter 5: Logistic regression I
Logistic regression with one predictor. Parameter estimates give odds ratios. Case/control (retrospective) studies don't change the slope estimates (only the intercept estimate). More crab analyses. GOF for logistic regression. Grouped versus ungrouped data. Hosmer and Lemeshow. Categorical predictors; interactions; quadratic effects. Multiple predictors; type 3 tests. A bit on fitting.
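The link between parameter estimates and odds ratios can be checked numerically. This sketch assumes the one-predictor model $\mathrm{logit}\,P(Y = 1 \mid x) = \alpha + \beta x$; the coefficients are hypothetical.

```python
import math

def fitted_probability(alpha, beta, x):
    """P(Y = 1 | x) under logit P(Y = 1 | x) = alpha + beta * x."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def odds(p):
    return p / (1.0 - p)

def odds_ratio(beta, delta=1.0):
    """Odds ratio for a delta-unit increase in x."""
    return math.exp(beta * delta)

# Hypothetical coefficients
alpha, beta = -3.0, 0.5
ratio_from_probs = (odds(fitted_probability(alpha, beta, 5.0))
                    / odds(fitted_probability(alpha, beta, 4.0)))
print(odds_ratio(beta), ratio_from_probs)   # the two agree
```

The ratio does not involve `alpha`, which is why a shifted intercept under case/control sampling leaves the odds ratio interpretation intact.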

Chapter 6: Logistic regression II
Building models. Hierarchical models. Backwards elimination; stepwise procedures. AIC. Crab data (yet again). Diagnostics: residuals (Pearson and standardized Pearson $r_i$), Cook's distance-type influence statistic $c_i$, $\mathrm{Dfbeta}_{ij}$. LOESS smooth plot of logistic regression residuals. Predictive ability assessment: ROC curve, CTABLE, default.

Chapter 6: Logistic regression II
$2 \times 2 \times K$ tables: CMH (Cochran-Mantel-Haenszel) versus logistic approach. Estimation of stratum (block) effects (useful for model checking in GLMM!). Testing $X \perp Y \mid Z$: additive versus interaction alternatives. Clinical trial data on infection cream. Additive = homogeneous association: one overall treatment effect. Finite $\hat\beta$. Sample size and power in study design.
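For the CMH side of the comparison, a sketch under the standard definitions: the Mantel-Haenszel common odds ratio $\sum_k (n_{11k} n_{22k} / n_k) / \sum_k (n_{12k} n_{21k} / n_k)$ and the CMH statistic $[\sum_k (n_{11k} - \mu_{11k})]^2 / \sum_k \mathrm{var}(n_{11k})$ without continuity correction. The strata below are hypothetical, not the infection cream data.

```python
def mantel_haenszel(strata):
    """strata: list of 2x2 tables [[n11, n12], [n21, n22]], one per stratum.

    Returns (Mantel-Haenszel odds ratio, CMH statistic on 1 df)."""
    num = den = 0.0      # numerator and denominator of the MH odds ratio
    diff = var = 0.0     # sum of (observed - expected) and of hypergeometric variances
    for (n11, n12), (n21, n22) in strata:
        n = n11 + n12 + n21 + n22
        row1, row2 = n11 + n12, n21 + n22
        col1, col2 = n11 + n21, n12 + n22
        num += n11 * n22 / n
        den += n12 * n21 / n
        diff += n11 - row1 * col1 / n
        var += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    return num / den, diff * diff / var

# Two hypothetical strata
or_mh, cmh = mantel_haenszel([[[30, 70], [10, 90]], [[15, 35], [8, 42]]])
print(f"MH odds ratio = {or_mh:.3f}, CMH = {cmh:.3f}")
```

The logistic analogue fits one treatment effect plus stratum effects; the CMH test targets the same additive (homogeneous association) alternative.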

Chapter 7: Logistic regression III, adding flexibility
Alternate links: probit, complementary log-log, Cauchy. Left out: nonparametric estimation of the link. Small-sample testing of $\beta_j$ in logistic regression. The Bayesian approach works for small samples. Use of the Jeffreys prior asymptotically (FIRTH in LOGISTIC) and for small samples (BAYES COEFPRIOR=JEFFREYS in GENMOD). Generalized additive models.

Chapter 8: Extending the logistic regression model to nominal and ordinal multinomial outcomes
Baseline-category logit models for nominal multinomial response. Alligator food!!! Know how to write down the model and obtain probabilities. Cumulative logit (proportional odds) models for ordinal multinomial response. $\log \frac{P(Y \le j \mid x_1)/P(Y > j \mid x_1)}{P(Y \le j \mid x_2)/P(Y > j \mid x_2)} = \beta'(x_1 - x_2)$ is the log cumulative odds ratio. Latent variable motivation. Mental impairment example. Skipped discrete survival. Discrete choice model.
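Obtaining probabilities from the two model types can be sketched as follows. The parameterizations are assumptions: the last category is the baseline with linear predictor zero, and the cumulative logit model is written as $\mathrm{logit}\,P(Y \le j \mid x) = \alpha_j + \beta' x$ with increasing cutpoints $\alpha_j$. All numeric values are hypothetical.

```python
import math

def baseline_category_probs(etas):
    """etas[j] = log(pi_j / pi_J) for categories 1..J-1; category J is the baseline."""
    expo = [math.exp(e) for e in etas] + [1.0]    # exp(0) = 1 for the baseline
    total = sum(expo)
    return [e / total for e in expo]

def cumulative_logit_probs(alphas, beta_x):
    """Category probabilities when logit P(Y <= j | x) = alphas[j] + beta_x.

    alphas must be increasing; beta_x is the value of beta'x."""
    cum = [1.0 / (1.0 + math.exp(-(a + beta_x))) for a in alphas] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Hypothetical linear predictors
print(baseline_category_probs([0.4, -0.2]))
print(cumulative_logit_probs([-1.0, 0.5], beta_x=0.3))
```

Because `beta_x` enters every cumulative logit identically, the log cumulative odds ratio between two covariate values is the same for every cutpoint $j$, which is the proportional odds property.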

Chapter 11: Matched pairs and marginal versus conditional modeling
Marginal analysis of dependent proportions. Prime minister approval rating data!!! McNemar's test of marginal homogeneity for a $2 \times 2$ table. Conditional logistic regression. Matched case/control studies: these give a different conditional likelihood than unmatched case/control data. Introduces the idea of subject-specific effects $u_i$. In PROC LOGISTIC add a STRATA statement. From the text: "Conditional ML is also appropriate with retrospective sampling. In that case, bias can occur with a random effects approach because the clusters are not randomly sampled."
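McNemar's test uses only the discordant pairs. A minimal sketch, assuming the statistic $(n_{12} - n_{21})^2 / (n_{12} + n_{21})$ referred to a $\chi^2_1$ distribution; the counts are hypothetical, not the approval rating data.

```python
import math

def mcnemar(n12, n21):
    """McNemar statistic and chi-square(1) p-value from the two discordant counts."""
    stat = (n12 - n21) ** 2 / (n12 + n21)
    # For 1 df, P(chi2 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical discordant counts
stat, p = mcnemar(25, 15)
print(f"statistic = {stat:.3f}, p = {p:.4f}")
```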

$I \times I$ tables
Marginal homogeneity (Stuart-Maxwell) in an $I \times I$ table. Symmetry. $\kappa$ statistic for rater agreement.
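A sketch of the $\kappa$ statistic, assuming the usual (unweighted) Cohen definition $\kappa = (P_o - P_e)/(1 - P_e)$, where $P_o$ is the observed agreement and $P_e$ the agreement expected under independence. The ratings table is hypothetical.

```python
def cohen_kappa(table):
    """Unweighted kappa for a square table of counts (rater A in rows, rater B in columns)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    p_obs = sum(table[i][i] for i in range(k)) / n                 # observed agreement
    row = [sum(table[i]) / n for i in range(k)]                     # rater A margins
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]  # rater B margins
    p_exp = sum(row[i] * col[i] for i in range(k))                  # chance agreement
    return (p_obs - p_exp) / (1.0 - p_exp)

# Hypothetical 3 x 3 ratings table
kappa = cohen_kappa([[20, 5, 0], [4, 15, 6], [1, 4, 25]])
print(f"kappa = {kappa:.3f}")
```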

Chapter 12: Marginal modeling of clustered data: GEE approach
GEE approach to marginal modeling. Focuses on estimation of population-averaged (marginal) effects. Working correlation structures: exchangeable, AR(1), etc. The sandwich estimator $\widehat{\mathrm{cov}}(\hat\beta)$ uses the estimated working covariance matrix as well as an empirical estimate; it requires proper specification of the mean $E(Y_{ij}) = g^{-1}(x_{ij}'\beta)$ to be valid (as most models do). Longitudinal mental depression data. Interaction of time and treatment. QIC. Markov transitional modeling for time-series-type Bernoulli data.
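To see what two of the working correlation structures imply about a cluster, the matrices can be written out directly. This sketch assumes the standard definitions: exchangeable sets $\mathrm{corr}(Y_{ij}, Y_{ik}) = \alpha$ for all $j \ne k$, and AR(1) sets $\mathrm{corr}(Y_{ij}, Y_{ik}) = \alpha^{|j-k|}$. The value of $\alpha$ is hypothetical.

```python
def exchangeable(n, alpha):
    """Every pair of outcomes in a cluster of size n is equally correlated."""
    return [[1.0 if i == j else alpha for j in range(n)] for i in range(n)]

def ar1(n, alpha):
    """Correlation decays geometrically with the lag between occasions."""
    return [[alpha ** abs(i - j) for j in range(n)] for i in range(n)]

for name, matrix in (("exchangeable", exchangeable(4, 0.5)), ("AR(1)", ar1(4, 0.5))):
    print(name)
    for row in matrix:
        print("  ", [round(v, 3) for v in row])
```

Exchangeable suits clusters with no natural ordering; AR(1) suits equally spaced repeated measurements where nearby occasions are more alike.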

Chapter 13: Conditional modeling of clustered data: GLMMs
GLMMs are used a lot, and random effects and latent variable models are in widespread use in general. Basic idea: the random effect $u_i$ induces positive correlation among the repeated measurements in cluster $i$: $(Y_{i1}, \ldots, Y_{in_i})$. It can represent latent, unmeasured covariates or a predisposition toward the event being modeled, e.g. level of sleeplessness, tolerance for pain, clinic population effect. Only looked at univariate $u_i$. Logistic-normal model. Marginal from conditional: $P(Y_{ij} = 1) = E(Y_{ij}) \approx e^{c x_{ij}'\beta} / (1 + e^{c x_{ij}'\beta})$ where $c = 1/\sqrt{1 + 0.6\sigma^2}$.
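The marginal-from-conditional approximation can be evaluated directly. This sketch implements the formula exactly as stated above, including the constant 0.6, which is exposed as a parameter rather than verified here. The function names and numeric inputs are hypothetical.

```python
import math

def attenuation(sigma, k=0.6):
    """c = 1 / sqrt(1 + k * sigma^2); k = 0.6 is the constant stated in the slide."""
    return 1.0 / math.sqrt(1.0 + k * sigma ** 2)

def approx_marginal_prob(x_beta, sigma, k=0.6):
    """Approximate P(Y = 1) after averaging over u ~ N(0, sigma^2)."""
    c = attenuation(sigma, k)
    return 1.0 / (1.0 + math.exp(-c * x_beta))

# Hypothetical conditional linear predictor and random-effect SD
for sigma in (0.0, 1.0, 2.0):
    print(sigma, round(attenuation(sigma), 3), round(approx_marginal_prob(1.0, sigma), 3))
```

Since $c \le 1$, marginal (population-averaged) effects are attenuated toward zero relative to the conditional (subject-specific) effects, more so as $\sigma$ grows.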

More on GLMMs
Longitudinal mental depression example again. Differences in interpretation between the GEE approach and GLMM. Clinical trials example again. Checking normality of random effects. $2 \times 2 \times K$ tables where the stratum effect $u_i$ is modeled explicitly via random effects (homework problem). Testing $H_0: \sigma = 0$ in the logistic-normal model by fitting the model with and without random effects. Is the Wald test from the table of coefficients okay here? Left out: nonparametric modeling of random effects $u_1, \ldots, u_n$. Diagnostics. Multilevel models with layers of random effects. Other correlation models, e.g. temporal, spatial. PQL approach (fast, easy, and inaccurate; similar to GLIMMIX).
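For the test of $H_0: \sigma = 0$, a hedged sketch: because $\sigma = 0$ lies on the boundary of the parameter space, a commonly used large-sample reference distribution for the LRT with a single variance component is a 50:50 mixture of a point mass at 0 and $\chi^2_1$, so the naive $\chi^2_1$ p-value is halved. Whether this is exactly the test used in the course is an assumption; the log-likelihood values are hypothetical.

```python
import math

def lrt_variance_component(loglik_without, loglik_with):
    """LRT of sigma = 0 comparing fits without and with a single random effect.

    Returns (statistic, p-value under the 50:50 mixture of 0 and chi-square(1))."""
    stat = max(0.0, 2.0 * (loglik_with - loglik_without))
    if stat == 0.0:
        return stat, 1.0                                  # no evidence against sigma = 0
    naive_p = math.erfc(math.sqrt(stat / 2.0))            # P(chi2_1 > stat)
    return stat, 0.5 * naive_p

# Hypothetical maximized log-likelihoods
stat, p = lrt_variance_component(loglik_without=-105.0, loglik_with=-103.0)
print(f"LRT = {stat:.2f}, mixture p-value = {p:.4f}")
```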

Chapters 9 and 10: Log-linear models
Back to tables, but of higher order than $I \times J$, e.g. $I \times J \times K \times L$. Model the cell counts directly as the outcome. Every model implies a conditional dependence structure. Shorthand [ABD][CBD] implies? Collapsibility theorem. Diagnostics: $\{r_{ijkl}\}$ and $G^2$.

Omitted or only briefly mentioned...
Various models: quasi-symmetry, quasi-independence, Bradley-Terry model (Chapter 11); adjacent-categories logits, etc. Additive models: various alternative fitting approaches, interaction surfaces, etc. Marginal approach to tables via MLE (12.1). Much, much more... we scratched the surface but covered a lot of ground.

You should be able to...
Briefly describe in words what the polychoric correlation $\hat\rho_P$, the gamma statistic $\hat\gamma$, and the Pearson statistic $\hat\rho$ (based on scores) measure. For what type of data are these measures valid? Show how the odds ratio interpretation flips in a $2 \times 2$ table. Interpret a logistic regression model with numerous categorical predictors involving interactions. Coherently interpret the residual and partitioning approaches to following up tests of independence in $I \times J$ tables. Patterns of residual signs? Describe in words what the quasi-likelihood approach to modeling overdispersion does for Poisson and binomial data. That is, what is $\mathrm{var}(y_i)$ modeled as under the actual (real) probability models versus the quasi-likelihood approaches?

You should be able to...
Interpret output for binomial regression models with logit and identity links. Interpret output for Poisson regression models with log and identity links. Have a good working knowledge of what the deviance and Pearson GOF tests measure and when you can trust the p-values. Have an idea of what the Hosmer and Lemeshow GOF test measures. What are the null and alternative hypotheses in all of these tests? Have an idea of when asymptotic $\chi^2$ tests for $H_0: X \perp Y$ are valid.

You should be able to...
Know these models: (a) logistic regression with continuous and categorical predictors, (b) baseline-category logit for nominal and ordinal, (c) proportional odds for ordinal, (d) marginal and conditional (i.e. random effects) versions of logistic regression. Obtain odds and probabilities, relative risks, et cetera from these models for any covariate combination. Marginal approaches (Chapters 11 and 12): the course focused more on the GEE approach of Chapter 12. Understand what a handful of working correlation structures imply about clusters of outcomes. Interpret SAS output. Understand the difference in meaning and interpretation between marginal and conditional approaches.

Fixed vs. random blocks...
Conditional approaches (Chapters 11 and 13): the course focused more on the maximum likelihood approach to a fully specified model ($u_1, \ldots, u_n$ iid $N(0, \sigma^2)$). Correct test for $H_0: \sigma = 0$. Interpretation and comparison to the fixed effects analogue. I think of random effects as a sample from some large (theoretically infinite) population; if iid, they imply exchangeability. Fixed effects are used if you do not have exchangeability (e.g. age groups) or there are only a few of them. Either way, these effects are usually of secondary interest (they imply blocks) compared with treatment or population effects.

We're done!
Thanks for taking STAT 770 and wading through to the end. Thanks to Shiwen Shen for being an outstanding TA! Have a great break!