Interactions among Categorical Predictors

Similar documents
Interactions among Continuous Predictors

A Re-Introduction to General Linear Models (GLM)

A Re-Introduction to General Linear Models

Simple, Marginal, and Interaction Effects in General Linear Models

Time-Invariant Predictors in Longitudinal Models

Simple, Marginal, and Interaction Effects in General Linear Models: Part 1

Time-Invariant Predictors in Longitudinal Models

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1

Review of CLDP 944: Multilevel Models for Longitudinal Data

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models

Time Invariant Predictors in Longitudinal Models

Introduction to Within-Person Analysis and RM ANOVA

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Interactions between Binary & Quantitative Predictors

Generalized Linear Models for Non-Normal Data

Three Factor Completely Randomized Design with One Continuous Factor: Using SPSS GLM UNIVARIATE R. C. Gardner Department of Psychology

Review of the General Linear Model

Designing Multilevel Models Using SPSS 11.5 Mixed Model. John Painter, Ph.D.

Linear Modelling in Stata Session 6: Further Topics in Linear Modelling

Describing Within-Person Fluctuation over Time using Alternative Covariance Structures

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

ANOVA Longitudinal Models for the Practice Effects Data: via GLM

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

General Linear Model (Chapter 4)

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes

Statistical Distribution Assumptions of General Linear Models

SAS Syntax and Output for Data Manipulation:

Introduction to Generalized Models

Advanced Quantitative Data Analysis

SPSS Output. ANOVA a b Residual Coefficients a Standardized Coefficients

Introduction to Regression

Maximum Likelihood Estimation; Robust Maximum Likelihood; Missing Data with Maximum Likelihood

Power Analysis. Ben Kite KU CRMDA 2015 Summer Methodology Institute

Review of Multiple Regression

Practical Biostatistics

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

Generalized Models: Part 1

STAT 7030: Categorical Data Analysis

8 Analysis of Covariance

Binary Logistic Regression

Regression With a Categorical Independent Variable

Research Design - - Topic 13a Split Plot Design with Either a Continuous or Categorical Between Subjects Factor 2008 R.C. Gardner, Ph.D.

MULTILEVEL MODELS. Multilevel-analysis in SPSS - step by step

4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES

Regression With a Categorical Independent Variable

The inductive effect in nitridosilicates and oxysilicates and its effects on 5d energy levels of Ce 3+

Describing Within-Person Change over Time

Lecture 18 Miscellaneous Topics in Multiple Regression

Suppose we obtain a MLR equation as follows:

Review of Unconditional Multilevel Models for Longitudinal Data

Regression With a Categorical Independent Variable

Formula for the t-test

A discussion on multiple regression models

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

Simple logistic regression

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Longitudinal Data Analysis of Health Outcomes

Logistic Regression. Continued Psy 524 Ainsworth

Lecture 9: Linear Regression

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression

Example. Multiple Regression. Review of ANOVA & Simple Regression /749 Experimental Design for Behavioral and Social Sciences

Correlation and Regression Bangkok, 14-18, Sept. 2015

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors.

Research Design: Topic 18 Hierarchical Linear Modeling (Measures within Persons) 2010 R.C. Gardner, Ph.d.

An R # Statistic for Fixed Effects in the Linear Mixed Model and Extension to the GLMM

Data Analysis 1 LINEAR REGRESSION. Chapter 03

Correlation and simple linear regression S5

Supplemental Materials. In the main text, we recommend graphing physiological values for individual dyad

MANOVA is an extension of the univariate ANOVA as it involves more than one Dependent Variable (DV). The following are assumptions for using MANOVA:

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

THE PEARSON CORRELATION COEFFICIENT

Confidence Intervals, Testing and ANOVA Summary

Ronald Heck Week 14 1 EDEP 768E: Seminar in Categorical Data Modeling (F2012) Nov. 17, 2012

Introduction to Random Effects of Time and Model Estimation

Introduction to Matrix Algebra and the Multivariate Normal Distribution

Lecture 18: Simple Linear Regression

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Least Squares Analyses of Variance and Covariance

Mean Vector Inferences

Example name. Subgroups analysis, Regression. Synopsis

Econometrics. 4) Statistical inference

Intro to Linear Regression

Online Appendix for Sterba, S.K. (2013). Understanding linkages among mixture models. Multivariate Behavioral Research, 48,

Review of Statistics 101

Class Introduction and Overview; Review of ANOVA, Regression, and Psychological Measurement

Intro to Linear Regression

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

CHAPTER 7 - FACTORIAL ANOVA

MLMED. User Guide. Nicholas J. Rockwood The Ohio State University Beta Version May, 2017

Introduction to SAS proc mixed

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

CAMPBELL COLLABORATION

Self-Assessment Weeks 6 and 7: Multiple Regression with a Qualitative Predictor; Multiple Comparisons

Chapter 14 Student Lecture Notes 14-1

Transcription:

Interactions among Categorical Predictors Today s Class: Reviewing significance tests Manual contrasts for categorical predictors Program-created contrasts for categorical predictors SPLH 861: Lecture 3 1

Testing Significance of Fixed Effects in the Model for the Means Any single (df=1) fixed effect has 4-5 pieces of output: Estimate = best guess for the fixed effect from our data Standard Error = precision of fixed effect estimate (quality of most likely estimate) t-value or z-value = Estimate / Standard Error Wald test p-value = probability that fixed effect estimate is 0 95% Confidence Interval = Estimate ± 1.96*SE = range in which true (population) value of estimate is expected to fall 95% of the time Compare Wald test statistic to critical value at chosen level of significance (known as alpha) Whether the p-value is based on t or z varies by program SPLH 861: Lecture 3 2

Evaluating Significance of Fixed Effects Fixed effects can be tested via Wald tests: the ratio of its estimate/se forms a statistic we compare to a distribution Numerator DF = 1 (Univariate Wald Test) Denominator DF is assumed infinite use z distribution (Mplus, STATA) Denominator DF is estimated instead use t distribution (SAS, SPSS) Numerator DF > 1 (test 2+ effects at once) (Multivariate Wald Test) ( Omnibus F-Test use χ 2 distribution (Mplus, STATA) F * df = χ 2 use F distribution (SAS, SPSS) SPLH 861: Lecture 3 3

Multivariate Wald Tests: F or χ 2 Tests of more than effect at once (numerator df>1) are seen in many different contexts: Test of significance of Model R 2 from 0 test of whether all regression coefficients are 0 simultaneously Test of significance of change in Model R 2 test of whether all *new* regression coefficients are 0 simultaneously Omnibus ANOVA Test of whether there are any differences (in main effects or interactions) across 3+ groups Provided by default for predictors designated as categorical, but can also be requested for any combination of predictors (and their main effects and interactions) via SPSS TEST, SAS CONTRAST, and STATA TEST (for c. predictors) or CONTRAST (for i. predictors) WILL NOT BE USEFUL WHEN INTERACTIONS ARE PRESENT! SPLH 861: Lecture 3 4

Categorical Predictors (3+ Groups) Two alternatives for how to include categorical predictors 1. Create and include manual dummy-coded contrasts Need g 1 contrasts for g groups, added all at once, treated as continued (WITH in SPSS, by default in SAS, c. in STATA) Corresponds more directly to linear more representation Easier to set own reference group and contrasts of interest 2. Let the program create and include contrasts for you Treat as categorical: BY in SPSS, CLASS in SAS, i. in STATA SPSS and SAS: reference = highest/last group; STATA: reference = lowest/first group More convenient if you have many groups, want many contrasts, or have interactions among categorical predictors Program marginalizes over these effects when estimating other effects SPLH 861: Lecture 3 5

Categorical Predictors: Manual Coding Model: Treatgroup variable: Control=1, TreatA=2, TreatB=3, TreatC=4 New variables da= 0, 1, 0, 0 difference between Control and TA to be created db= 0, 0, 1, 0 difference between Control and TB for the model: dc= 0, 0, 0, 1 difference between Control and TC How does the model give us all possible group differences? By determining each group s mean, and then the difference Control Mean (Reference) Treatment A Mean Treatment B Mean Treatment C Mean + + + The model for the 4 groups directly provides 3 differences (control vs. each treatment), and indirectly provides another 3 differences (differences between treatments) SPLH 861: Lecture 3 6

Group Differences from Dummy Codes Model: Control Mean (Reference) Treatment A Mean Treatment B Mean Treatment C Mean + + + Alt Group Ref Group Difference Control vs. TA = Control vs. TB = Control vs. TC = TA vs. TB = TA vs. TC = TB vs. TC = SPLH 861: Lecture 3 7

TESTs when using dummy codes Alt Group Ref Group Difference Control vs. TA = β β β β Control vs. TB = β β β β Control vs. TC = β β β β TA vs. TB = β β β β β β TA vs. TC = β β β β β β TB vs. TC = β β β β β β ECHO 'Differences among 4 groups'. MIXED y WITH da db dc /METHOD = REML /PRINT = SOLUTION TESTCOV /FIXED = da db dc /TEST = "Omnibus F-test" da 1; db 1; dc 1 /TEST = "Control Mean" intercept 1 da 0 db 0 dc 0 /TEST = "TA Mean" intercept 1 da 1 db 0 dc 0 /TEST = "TB Mean" intercept 1 da 0 db 1 dc 0 /TEST = "TC Mean" intercept 1 da 0 db 0 dc 1 /TEST = "Mean: Control vs. TA" da 1 db 0 dc 0 /TEST = "Mean: Control vs. TB" da 0 db 1 dc 0 /TEST = "Mean: Control vs. TC" da 0 db 0 dc 1 /TEST = "Mean: TA vs. TB" da -1 db 1 dc 0 /TEST = "Mean: TA vs. TC" da -1 db 0 dc 1 /TEST = "Mean: TB vs. TC" da 0 db -1 dc 1. Note the order of the equations: the reference group mean is subtracted from the alternative group mean. In SAS ESTIMATE statements (or SPSS TEST or STATA LINCOM), the variables refer to their betas; the numbers refer to the operations of their betas. Intercepts are used only in predicted outcomes. Positive values indicate addition; negative values indicate subtraction. SPLH 861: Lecture 3 8

Interactions with manual group differences When doing manual contrasts, interactions have to be specified with each group contrast as well For example, adding interaction with age (0=85): y β β da β db β dc β Age 85 β da Age 85 β db Age 85 β dc Age 85 e ECHO 'Group by Age (0=85)'. MIXED y WITH da db dc age /METHOD = REML /PRINT = SOLUTION TESTCOV /FIXED = da db dc age da*age db*age dc*age /TEST = "Omnibus main effect F-test" da 1; db 1; dc 1 /TEST = "Omnibus interaction F-test" da*age 1; db*age 1; dc*age 1 /TEST = "Age Slope for Control" age 1 da*age 0 db*dage 0 dc*age 0 /TEST = "Age Slope for Treat A" age 1 da*age 1 db*dage 0 dc*age 0 /TEST = "Age Slope for Treat B" age 1 da*age 0 db*dage 1 dc*age 0 /TEST = "Age Slope for Treat C" age 1 da*age 0 db*dage 0 dc*age 1 /TEST = "Age Slope: Control vs. Treat A" da*age 1 db*dage 0 dc*age 0 /TEST = "Age Slope: Control vs. Treat B" da*age 0 db*dage 1 dc*age 0 /TEST = "Age Slope: Control vs. Treat C" da*age 0 db*dage 0 dc*age 1 /TEST = "Age Slope: Treat A vs. Treat B" da*age -1 db*dage 1 dc*age 0 /TEST = "Age Slope: Treat A vs. Treat C" da*age -1 db*dage 0 dc*age 1 /TEST = "Age Slope: Treat B vs. Treat C" da*age 0 db*dage -1 dc*age 1. SPLH 861: Lecture 3 9

Using BY/CLASS/i. statements instead Designate as categorical in program syntax If you let SAS/SPSS do the dummy coding via CLASS/BY, then the highest/last group is default reference Hard to change reference group (must re-code variable) Type III test of fixed effects provide omnibus tests by default LSMEANS/EMMEANS can be used to get all means and comparisons without specifying each individual contrast If you let STATA do the dummy coding via i.group, then the lowest/first group is reference Easy to change reference group, e.g., last = ref ib(last).group CONTRAST used to get omnibus tests instead of TEST MARGINS can be used to get all means and comparisons with much less code than describing each individual contrast SPLH 861: Lecture 3 10

Main Effects of Categorical Predictors ECHO 'Differences among 4 groups'. MIXED y BY treatgroup /METHOD = REML /PRINT = SOLUTION TESTCOV /FIXED = treatgroup /EMMEANS = TABLES(treatgroup) COMPARE(treatgroup) OR write all of the below instead of EMMEANS line note that one value has to be given for each possible level of the categorical predictor /TEST = "Control Mean" intercept 1 treatgroup 1 0 0 0 /TEST = "T1 Mean" intercept 1 treatgroup 0 1 0 0 /TEST = "T2 Mean" intercept 1 treatgroup 0 0 1 0 /TEST = "T3 Mean" intercept 1 treatgroup 0 0 0 1 Here, 1 means for that group only /TEST = "Control vs. T1" treatgroup -1 1 0 0 /TEST = "Control vs. T2" treatgroup -1 0 1 0 /TEST = "Control vs. T3" treatgroup -1 0 0 1 /TEST = "T1 vs. T2" treatgroup 0-1 1 0 /TEST = "T1 vs. T3" treatgroup 0-1 0 1 /TEST = "T2 vs. T3" treatgroup 0 0-1 1 Contrasts must sum to 0; here 1 = ref, 1 = alt, and 0 = ignore Can also make up whatever contrasts you feel like: /TEST = "Treat ABC Mean" intercept 1 treatgroup 0 1 1 1 DIVISOR=3 /TEST = "Control vs. Mean of ABC" treatgroup -3 1 1 1 DIVISOR=3 SPLH 861: Lecture 3 11

Interactions with Categorical Predictors ECHO 'Group by Age (same model via categorical group)'. MIXED y BY treatgroup WITH age /METHOD = REML /PRINT = SOLUTION /FIXED = treatgroup age treatgroup*age In requesting means, have to specify at what level of the other predictors: /EMMEANS = TABLES(treatgroup) COMPARE(treatgroup) WITH(age=0) In requesting anything, always have to say for what group(s): /TEST = "Age Slope: Control" age 1 treatgroup*age 1 0 0 0 /TEST = "Age Slope: Treat A" age 1 treatgroup*age 0 1 0 0 /TEST = "Age Slope: Treat B" age 1 treatgroup*age 0 0 1 0 /TEST = "Age Slope: Treat C" age 1 treatgroup*age 0 0 0 1 Here, 1 means for that group only, but now it s referring to the age slope /TEST = "Age Slope: Control vs. TA" treatgroup*age -1 1 0 0 /TEST = "Age Slope: Control vs. TB" treatgroup*age -1 0 1 0 /TEST = "Age Slope: Control vs. TC" treatgroup*age -1 0 0 1 /TEST = "Age Slope: TA vs. TB" treatgroup*age 0-1 1 0 /TEST = "Age Slope: TA vs. TC" treatgroup*age 0-1 0 1 /TEST = "Age Slope: TB vs. TC" treatgroup*age 0 0-1 1 Contrasts must sum to 0; here 1 = ref, 1 = alt, and 0 = ignore Can also make up whatever contrasts you feel like: /TEST = "Age Slope: Mean across ABC" age 1 treatgroup*age 0 1 1 1 DIVISOR=3 /TEST = "Age Slope: Control vs ABC" treatgroup*age -3 1 1 1 DIVISOR=3 SPLH 861: Lecture 3 12

Categorical Predictors = Marginal Effects Letting the program build contrasts for categorical predictors (instead of creating manual dummy codes) does the following: Allows LSMEANS/EMMEANS/MARGINS (for cell means and differences) Provides omnibus (multiple df) group F-tests (or χ 2 tests) Marginalizes the group effect across interacting predictors omnibus F-tests represent marginal main effects (instead of simple) e.g., /FIXED = Treatgroup Gender Treatgroup*Gender (in which Treatgroup is always categorical ) Type 3 Tests of Fixed Effects Interpretation if gender is continuous Interpretation if gender is categorical Gender Marginal gender diff Marginal gender diff Treatgroup Group diff if gender=0 Marginal group diff Treatgroup*Gender Interaction Interaction SPLH 861: Lecture 3 13

Interactions: Interaction = Moderation: the effect of a predictor depends on the value of the interacting predictor Interactions among categorical predictors are commonly evaluated (e.g., ANOVA), but by default: Estimate all possible interactions among categorical predictors Software does this for you; nonsignificant interactions usually still are kept in the model (even if only significant interactions are interpreted) Omnibus marginal main effects are provided But are basically useless if given significant interactions Omnibus interaction effects are provided But are basically useless in actually understanding the interaction Let s see how to make software give us more useful info SPLH 861: Lecture 3 14