Applied Epidemiologic Analysis

Patricia Cohen, Ph.D., and Henian Chen, M.D., Ph.D.
Teaching assistants: Julie Kranick, Chelsea Morroni, Sylvia Taylor, Judith Weissman

Lecture 13: Interactional questions and analyses

Goals:
- To understand how the interpretation of an interaction effect depends on the data-analytic model as well as on the data.
- To understand why lower-order terms are always required even when they do not contribute (and the enduring value of centering).
- To understand how to graph conditional effects, and the frequent and useful conventions for doing so.

Biologic vs. statistical interaction: What interactions are relevant to epidemiology? Synergism and other conceptually suggestive labels such as moderation, augmentation, and conditionality are used to translate statistical interactions into substantive contexts. Independently of the statistical model, interaction means that the relationship or effect of some variable on the dependent variable is conditional on (depends on or varies with) the values of another variable. However, how it varies depends on the model: for example, whether it varies as a multiplicative or as an additive function will depend on the model.

The role of the statistical model in determining the meaning of an interaction

- For models that are linear in a log function, the increases in the outcome variable per unit increase in the predictor are multiplicative when translated into the original units (e.g., odds or probability).
- For models that are linear in other link functions (e.g., the identity function, as in OLS, or the probit function), the interaction indicates changes in the relationship of one variable to the DV as an additive or linear function of the other variable. (A worked contrast of the additive and multiplicative cases appears after this list.)

Statisticians differ from researchers

- Statistician: uses mathematical probability models to determine what kind of mathematical treatment of the data will produce proper standard errors. Whether the model estimates reflect how nature works is not her problem.
- Researcher: the goal is understanding the phenomena being studied. Checks the data for conformity to the assumptions, should be cautious about changing the data to fit the statistical requirements, and also needs caution in interpretation whenever the assumptions of the model may not fit the substantive problem.

Epidemiological practices that can help identify the model that best represents the data

- Examination of changes in relationships across strata. If the sample is large enough, one can get the OR or other measures of effect by stratum, and trends in the OR are often examined across ordinal strata such as age. Such an examination would also reveal the shape of a conditional effect: does it change as a multiplicative function of the other variable?
- Alternative categorizations of some variables in order to see conditionality.
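To make the additive-vs-multiplicative distinction concrete, here is a small worked contrast. The coefficients b0 through b3 are generic symbols for illustration, not estimates from any study discussed in this lecture.

```latex
% Identity link (OLS): the effect of a one-unit increase in X is additive,
% and the interaction shifts that additive effect with Z.
E[Y] = b_0 + b_1 X + b_2 Z + b_3 XZ
\quad\Rightarrow\quad
E[Y \mid X{+}1] - E[Y \mid X] = b_1 + b_3 Z

% Log link (e.g., logit): the same linear predictor, once exponentiated, acts
% multiplicatively on the original scale (here, the odds).
\log(\text{odds}) = b_0 + b_1 X + b_2 Z + b_3 XZ
\quad\Rightarrow\quad
\frac{\text{odds}(X{+}1)}{\text{odds}(X)} = e^{\,b_1 + b_3 Z}
```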

Interactions and their relationship to curvilinear relationships, scaling, and scale transformations

- Curvilinear relationships may be thought of as effects of a variable (e.g., an exposure) that are conditional on its own value.
- Curvilinear relationships may sometimes (but not always appropriately) be made linear by a transform of the predictor. Although the models we have examined use different link functions, that is, different forms of the dependent variable, data problems may also be solved by using different functions of the independent variables, including log or other transforms.
- Some relationships are intrinsically non-linear: there is no transform that will both linearize the relationship and produce homoscedastic error, i.e., differences between the predicted and observed values of the DV (residuals) that average zero and have equal variance at all points on the regression line.

Interaction effects in OLS: interactions among categorical variables (e.g., strata)

- What you are examining depends on how you coded the categories. Most epidemiological studies code strata with dummy-variable coding, but that is not always optimal. Other methods of coding categorical variables may be more appropriate to one's hypotheses, and more statistically powerful.

Interaction effects in OLS: scale-by-scale interactions

- It is almost always useful to center variables before investigating interaction or curvilinear relationships. Subtracting the mean from the two variables involved will remove the correlation between those variables and their product (except the correlation due to skew in one or both variables); a small simulation sketch of this point follows below.
- It is always prudent and usually illuminating to graph significant interactions. If the scale units do not have a well-understood meaning, the effect of one variable is often examined at values of the other variable one SD above and one SD below the mean.
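A minimal simulation sketch of the centering point above. The data are hypothetical; the variable names E1 and E2, the means and SDs, and the sample size are illustrative rather than taken from the lecture's data set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two roughly symmetric predictors with nonzero means, as on a raw exposure scale.
e1 = rng.normal(3.0, 1.5, size=5000)
e2 = rng.normal(7.0, 3.0, size=5000)

# Correlation of a main effect with the raw product term vs. with the centered product.
raw_product = e1 * e2
centered_product = (e1 - e1.mean()) * (e2 - e2.mean())

print(f"corr(E1, E1*E2)   = {np.corrcoef(e1, raw_product)[0, 1]:.2f}")       # large
print(f"corr(E1, E1c*E2c) = {np.corrcoef(e1, centered_product)[0, 1]:.2f}")  # near 0
# With symmetric predictors the centered product is nearly uncorrelated with the main
# effects; some correlation returns if E1 or E2 is skewed, as the slide notes.
```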

Suppose we had the following analytic findings in an OLS analysis:

Y: mean = 5, range -6.5 to 20
E1: mean = 3, SD = 1.5
E2: mean = 7, SD = 3

Regression equation: Y = 1.778 + .290 (E1) + .336 (E2), R^2 = .198
Adding the interaction of E1 and E2: Y = 8.538 - 1.938 (E1) - .663 (E2) + .327 (E1xE2), R^2 = .546

Graphing this relationship
Y = 8.538 - 1.938 (E1) - .663 (E2) + .327 (E1xE2)
We are going to plot two lines. One will represent cases with E1 one SD below the mean; the other, cases with E1 one SD above the mean. For each line, we choose two arbitrary points for E2 (because two calculated points define a straight line).

First, for E1 one standard deviation below the mean (E1 mean = 3, SD = 1.5, so low E1 = 3 - 1.5 = 1.5), with arbitrary values 6 and 8 for E2:
Point 1: Y = 8.538 - 1.938 (1.5) - .663 (6) + .327 (1.5 * 6) = 4.596
Point 2: Y = 8.538 - 1.938 (1.5) - .663 (8) + .327 (1.5 * 8) = 4.251

[Figure: the low-E1 line, with Y (1 to 10) on the vertical axis and E2 (1 to 13) on the horizontal axis.]

Graphing Y = 8.538 - 1.938 (E1) - .663 (E2) + .327 (E1xE2)
Now, for E1 one standard deviation above the mean (high E1 = 3 + 1.5 = 4.5), at the same arbitrary E2 points 6 and 8:
Point 1: Y = 8.538 - 1.938 (4.5) - .663 (6) + .327 (4.5 * 6) = 4.668
Point 2: Y = 8.538 - 1.938 (4.5) - .663 (8) + .327 (4.5 * 8) = 6.285

[Figure: the high-E1 line, with Y on the vertical axis and E2 on the horizontal axis.]

Now putting the two lines together:

[Figure: both lines plotted against E2; the high-E1 line rises more steeply than the low-E1 line.]

Centering the variables
Suppose we center the E1 and E2 variables by subtracting the mean from each value. We then also use the product of these centered variables to represent the interaction. Doing so has absolutely no effect on the relationship of these variables to Y. However, it will make the interaction term (almost) uncorrelated with the main effects, that is, with the centered E1 and E2. The equation for these centered variables is:
Y = 4.948 + .351 * E1c + .318 * E2c + .327 * E1c E2c
Note that the value of the interaction term remains the same: this will always be the case for the highest-order interaction term in the model. (A short sketch below checks that the centered and uncentered equations give the same plotted points.)
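A minimal sketch reproducing the four plotted points from the uncentered equation above and checking them against the centered equation just introduced. This is pure arithmetic on the reported coefficients; small third-decimal differences reflect rounding of those coefficients.

```python
def y_uncentered(e1, e2):
    # Interaction equation from the slides: Y = 8.538 - 1.938*E1 - 0.663*E2 + 0.327*E1*E2
    return 8.538 - 1.938 * e1 - 0.663 * e2 + 0.327 * e1 * e2

def y_centered(e1c, e2c):
    # Same model after mean-centering E1 (mean 3) and E2 (mean 7)
    return 4.948 + 0.351 * e1c + 0.318 * e2c + 0.327 * e1c * e2c

for e1 in (1.5, 4.5):                  # one SD below / above the E1 mean
    for e2 in (6, 8):                  # the two arbitrary E2 points
        print(f"E1={e1}, E2={e2}: uncentered {y_uncentered(e1, e2):.3f}, "
              f"centered {y_centered(e1 - 3, e2 - 7):.3f}")
# Output: 4.596/4.594, 4.251/4.249, 4.668/4.666, 6.285/6.283 -- centering changes the
# coefficients but not the fitted values, and the E2 slope is steeper when E1 is high.
```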

Graphing Y = 4.948 + .351 * E1c + .318 * E2c + .327 * E1c E2c
First, for E1c one standard deviation below the mean (E1c mean = 0, SD = 1.5, so low E1c = -1.5), with arbitrary values -1 and 1 for E2c:
Point 1: Y = 4.948 + .351 (-1.5) + .318 (-1) + .327 (-1.5 * -1) = 4.594
Point 2: Y = 4.948 + .351 (-1.5) + .318 (1) + .327 (-1.5 * 1) = 4.249

Now, for E1c one standard deviation above the mean (high E1c = 1.5), at the same E2c values -1 and 1:
Point 1: Y = 4.948 + .351 (1.5) + .318 (-1) + .327 (1.5 * -1) = 4.666
Point 2: Y = 4.948 + .351 (1.5) + .318 (1) + .327 (1.5 * 1) = 6.283

[Figure: the low- and high-E1 lines replotted; the horizontal axis shows both the centered E2c scale (-6 to 6) and the original E2 scale (1 to 13).]

Note also that we can express the relationship of E1c to Y for the entire group more easily, since the mean of E2c is 0. When E2c = 0, the product term is also 0, so
Y = 4.948 + .351 * E1c + .318 * E2c + .327 * E1c E2c
becomes
Y = 4.948 + .351 * E1c

Alternative means of coding categorical variables to examine and test interactions

One of the problems when we have a serious interest in conditional relationships is the relatively low statistical power such analyses often have. This means we need to do everything possible to increase our statistical power to find such a relationship. One methodological way to increase statistical power is to represent categorical data in the analysis in the form most likely to represent these changes in relationship.

Hypothetical study 2 (see the data set provided): type of work, commuting, and back pain
Hypothesis: the less the occupation involves physical exertion, the more commute time will be related to back pain.

Occupational categories, with unskilled labor as the reference group:
UL = unskilled labor, SK = skilled labor, SA = sales, WC = white collar, P = professional
C = commuting time (confounders are assumed to have been measured, although they are not in this data set)

est. pain = B0 + Bc C + Bsk SK + Bsa SA + Bwc WC + Bp P + Bskxc SKxC + Bsaxc SAxC + Bwcxc WCxC + Bpxc PxC + Bcon Confounders

Suppose that we have coded the occupational categories as dummy variables with unskilled labor as the reference:
OD1 = 1 if unskilled labor, otherwise 0
OD2 = 1 if skilled labor, otherwise 0
OD3 = 1 if sales, otherwise 0
OD4 = 1 if white collar, otherwise 0
OD5 = 1 if professional, otherwise 0
Remember that we always need one fewer variable than there are groups to represent all the categories (there are g - 1 df), so OD1, the reference category, is omitted from the equation (a code sketch of this coding appears below).

est. pain = 4.568 + .131 Commute - .236 OD2 - .739 OD3 - 1.701 OD4 - 1.278 OD5 + .037 OD2xC + .115 OD3xC + .219 OD4xC + .200 OD5xC, R^2 = .128
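A sketch of this dummy-plus-interaction coding in Python. The data frame, file name, column names (pain, commute, occupation), category labels, and the use of statsmodels are all illustrative assumptions; the lecture does not specify the software or file layout.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file with columns: pain (scale), commute (time), and occupation
# (one of "unskilled", "skilled", "sales", "white_collar", "professional").
df = pd.read_csv("backpain.csv")

# Treatment (dummy) coding with unskilled labor as the reference group; the '*'
# operator expands to the occupation dummies, the commute main effect, and the
# commute-by-occupation product terms (OD2..OD5 and OD2xC..OD5xC above).
model = smf.ols(
    "pain ~ commute * C(occupation, Treatment(reference='unskilled'))",
    data=df,
).fit()
print(model.summary())
```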

Graphing Pain = 4.568 + .131 C - .236 OD2 - .739 OD3 - 1.701 OD4 - 1.278 OD5 + .037 OD2xC + .115 OD3xC + .219 OD4xC + .200 OD5xC

Now we wish to have a line for each of the occupational groups. Since when any one of the occupational dummy variables is 1 the others are zero, most of the terms in the equation drop out in each calculation. For OD2 (skilled labor), for example, all the terms involving OD3, OD4, and OD5 are zero. Recalling that any two points will do to plot a linear relationship, to plot the commute effect for OD2 we pick arbitrary points Commute = 3 and 5:

OD2 Point 1: Y = 4.568 + .131(3) - .236 + .037(3) = 4.836
OD2 Point 2: Y = 4.568 + .131(5) - .236 + .037(5) = 5.172
OD3 Point 1: Y = 4.568 + .131(3) - .739 + .115(3) = 4.567
OD3 Point 2: Y = 4.568 + .131(5) - .739 + .115(5) = 5.059
OD4 Point 1: Y = 4.568 + .131(3) - 1.701 + .219(3) = 3.917
OD4 Point 2: Y = 4.568 + .131(5) - 1.701 + .219(5) = 4.617
OD5 Point 1: Y = 4.568 + .131(3) - 1.278 + .200(3) = 4.283
OD5 Point 2: Y = 4.568 + .131(5) - 1.278 + .200(5) = 4.945

(A code sketch computing these points follows below.)

[Figure: predicted pain (0 to 10) plotted against commute time (3 to 6), with one line each for skilled labor, sales, white collar, and professional.]

And the standard errors of the effects in the interaction analysis:

Effect      Estimate   SE       t        p
Intercept    4.568     0.202   22.598   0.000
Commute      0.131     0.049    2.670   0.008
OD2         -0.236     0.285   -0.826   0.409
OD3         -0.739     0.291   -2.539   0.011
OD4         -1.701     0.300   -5.679   0.000
OD5         -1.278     0.293   -4.369   0.000
OD2xC        0.037     0.069    0.531   0.595
OD3xC        0.115     0.071    1.629   0.103
OD4xC        0.219     0.073    3.025   0.003
OD5xC        0.200     0.071    2.819   0.005

So we note that only the commute (interaction) effects for white collar and professional workers are significantly different from the reference group of unskilled labor.
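A minimal sketch that reproduces the eight plotted points directly from the coefficients above; the reference group, unskilled labor, would use only the intercept and commute terms.

```python
# Fitted coefficients from the back-pain equation above.
coef = {"intercept": 4.568, "commute": 0.131,
        "OD2": -0.236, "OD3": -0.739, "OD4": -1.701, "OD5": -1.278,
        "OD2xC": 0.037, "OD3xC": 0.115, "OD4xC": 0.219, "OD5xC": 0.200}

groups = {"skilled labor": "OD2", "sales": "OD3",
          "white collar": "OD4", "professional": "OD5"}

for name, od in groups.items():
    for commute in (3, 5):                     # two arbitrary commute values per line
        y = (coef["intercept"] + coef["commute"] * commute
             + coef[od] + coef[od + "xC"] * commute)
        print(f"{name}, commute={commute}: predicted pain = {y:.3f}")
# Reproduces the eight plotted points (e.g., 4.836 and 5.172 for skilled labor).
```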

Using simple slopes

An alternative way of getting these graphs, which is easier and which provides a standard error for the slope of each group, is the following method of coding for the simple slopes: omit the main effect for the scaled variable and include all g interactions between the scaled variable and the dummy variables. For our illustration this results in the following equation:

Pain = 5.091 - .089 OD2 - .279 OD3 - .824 OD4 - .477 OD5 + .131 OD1xC + .168 OD2xC + .246 OD3xC + .350 OD4xC + .331 OD5xC

Here we can determine readily which slopes are significantly greater than zero. (The intercept and the dummy-variable coefficients change under this parameterization, but each group's simple slope is just the earlier commute effect plus that group's interaction term, e.g., .131 + .037 = .168 for skilled labor.)

Now let us examine the significance test on the interactive effects.

Without interaction components:
Source        Sum-of-Squares    df    Mean-Square   F-ratio   P
Regression         284.853       5       56.971      55.294   0.000
Residual          2054.476    1994        1.030

With interaction components:
Source        Sum-of-Squares    df    Mean-Square   F-ratio   P
Regression         299.786       9       33.310      32.500   0.000
Residual          2039.543    1990        1.025

Interaction effect: 299.79 - 284.85 = 14.94 on 4 df, so 14.94 / 4 = 3.74, and F = 3.74 / 1.02 = 3.64 (p < .01). (A code sketch of this test follows below.)

Because this example used a large sample (n = 2000), this effect is statistically significant. With a smaller sample of, say, 500, it would likely not be significant even though the effect of commuting was more than three times as large in some groups as in others. The problem in testing this set of interactions may be that the test we have carried out requires 4 df in the numerator, when our a priori hypothesis was much simpler.
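A sketch of the same incremental F test in code, using the sums of squares reported above; scipy is used only to obtain the p-value.

```python
from scipy import stats

ss_reg_reduced, ss_reg_full = 284.853, 299.786   # regression SS without / with interactions
df_extra = 9 - 5                                  # 4 interaction terms added
ms_resid_full = 2039.543 / 1990                   # residual mean square of the full model

F = ((ss_reg_full - ss_reg_reduced) / df_extra) / ms_resid_full
p = stats.f.sf(F, df_extra, 1990)
print(f"F({df_extra}, 1990) = {F:.2f}, p = {p:.4f}")   # about F = 3.64, p < .01
```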

But suppose we had coded the variables differently? There are many different ways to do so (in fact, an infinite number). Perhaps we have the hypothesis that the more one's job involves physical labor, the less will be the impact of commuting time on back pain. In this case we may choose to code our categorical variable to reflect that major hypothesis, using a set of orthogonal contrasts as follows:

Contrast 1: contrasts skilled and unskilled labor with white collar and professional. Code: SK and UL = 1, SA = 0, WC and PR = -1
Contrast 2: contrasts sales with all others. Code: SA = .8, all others = -.2
Contrast 3: contrasts skilled with unskilled labor. Code: UL = -.5, SK = .5, all others = 0
Contrast 4: contrasts white collar with professional. Code: WC = -.5, PR = .5, all others = 0

(A sketch verifying that these codes are orthogonal appears below.)

Effects and their standard errors for the new codes (note that only the first contrast reflects our a priori hypothesis):

Effect      Coefficient   Std Error
CONSTANT       4.758        0.023
COMMUTEC       0.245        0.023
OCONT1        -0.303        0.025
OCONT2         0.069        0.057
OCONT3        -0.089        0.072
OCONT4         0.347        0.072
OCCOM1         0.096        0.025
OCCOM2         0.001        0.057
OCCOM3         0.037        0.069
OCCOM4        -0.019        0.074

In subsequent analyses we may reasonably remove the last three contrasts from consideration, their presence being only a check on the limits of our hypothesis.

The moral of the story

The moral of this story is that there is often a problem of statistical power when we are examining interactive effects. One way to cope with this problem is to code so that statistical power is concentrated on the strongest hypotheses. Often in epidemiology the variable that may be affecting the magnitude of the relationship of another variable (e.g., an exposure) with the outcome (e.g., disease) is age or another ordinal or scaled variable. It may be the case that our strongest hypothesis is that the effect of the other variable changes roughly linearly with age; such a test is most powerfully carried out without categorizing age. Another less powerful but reasonable option is to test the trend for the relationship to increase or decrease over the ordered categories.
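A minimal sketch checking that the four contrast codes listed above are orthogonal (pairwise dot products of zero, which makes the contrast estimates uncorrelated when group sizes are equal). The group ordering UL, SK, SA, WC, PR is only for illustration.

```python
import numpy as np
from itertools import combinations

# Rows: contrasts 1-4; columns: UL, SK, SA, WC, PR
contrasts = np.array([
    [ 1.0,  1.0,  0.0, -1.0, -1.0],   # labor vs. white collar / professional
    [-0.2, -0.2,  0.8, -0.2, -0.2],   # sales vs. all others
    [-0.5,  0.5,  0.0,  0.0,  0.0],   # skilled vs. unskilled labor
    [ 0.0,  0.0,  0.0, -0.5,  0.5],   # white collar vs. professional
])

for i, j in combinations(range(4), 2):
    print(f"contrast {i+1} . contrast {j+1} = {contrasts[i] @ contrasts[j]:.2f}")
# All pairwise products are 0, so each contrast can be tested independently
# (exactly so with equal group n's).
```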

Cigarette Smoking and Alcohol Consumption in Relation to Cognitive Performance in Middle Age
Sandra Kalmijn, Martin P. J. van Boxtel, Monique W. M. Verschuren, Jelle Jolles, and Lenore J. Launer, American Journal of Epidemiology, 156, 936-944.

Goals of the study: to determine the relationship between smoking history (pack-years), alcohol consumption, and cognitive function in men and women at an average age of 56. N = 1,947 participants from the Netherlands.

Findings: the primary differences were in speeded tests, for which negative effects were found for current smoking, along with an interaction between sex and alcohol consumption.

TABLE 4. Adjusted beta coefficients reflecting the difference in the standardized score, according to alcohol drinking categories, among men and women (portion of table only). Note that the cognitive measure is standardized to a mean of zero and an SD of one, and no mean differences between men and women are reported.

[Figure: cognitive speed (standardized, -1.0 to 1.0) by alcohol-consumption group (0, <1, 1-2, 2 to <4, 4 to <8, 8+), plotted separately for men and women.]

Interactions involving 3 or more predictors

Even if you have a number of predictors, an important interaction should be at least faintly visible at some categorical level involving the IVs and the DV (if the DV is a scale, use mean scores). If a study is really intended to find a hypothesized triple interaction, statistical power problems should be taken very seriously in both the study design and the analysis.

Interaction effects in logistic regression

The link function connecting the dependent variable and the sum of weighted independent variables is the logit, or log odds, of the outcome. To translate back into the original units of the DV, we exponentiate both sides of the equation: the predicted odds of the DV (e.g., disease) are then the product of the exponentiated weighted terms (see the worked equation below). When interactions involve different effects in different populations (e.g., sex or age groups), they are generally examined by displaying effects such as the OR for the exposure in each stratum.

Low-Dose Exposure to Asbestos and Lung Cancer: Dose-Response Relations and Interaction with Smoking in a Population-based Case-Referent Study in Stockholm, Sweden
Per Gustavsson, Fredrik Nyberg, Göran Pershagen, Patrik Schéele, Robert Jakobsson, and Nils Plato, American Journal of Epidemiology, 155, 1016-1022.

Study goals: to determine whether there was a synergistic effect of asbestos and smoking on the risk of lung cancer.
Study sample: 1,038 incident male lung cancer cases in Sweden and 2,359 referents matched on age and inclusion year, for whom occupational exposures were determined.
Findings: lung cancer risk increased almost linearly with cumulative dose of asbestos rather than exponentially, as predicted by the logistic model. To remedy this, the asbestos dose was transformed logarithmically (dose = ln(fiber-years + 1)). For smoking, the dose response was between linear and exponential, and the smoking measure was transformed to the square root of grams/day, which fit much better.
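The worked equation referred to above, written for a generic exposure E and stratifying variable Z; the coefficients b0 through b3 are generic symbols, not estimates from the asbestos study.

```latex
% Logit (log-odds) scale: the interaction enters additively.
\log\!\left(\frac{p}{1-p}\right) = b_0 + b_1 E + b_2 Z + b_3 EZ

% Exponentiating both sides: on the odds scale the terms multiply.
\frac{p}{1-p} = e^{b_0}\, e^{b_1 E}\, e^{b_2 Z}\, e^{b_3 EZ}

% Hence the odds ratio for a one-unit increase in E depends on the stratum Z,
% which is why the exposure OR is displayed separately for each stratum.
OR_{E \mid Z} = e^{\,b_1 + b_3 Z}
```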

Low-Dose Exposure to Asbestos and Lung Cancer (continued)

Subsequent analyses using various functions of the two exposure variables all showed that the effect of the interaction (product) term was less than one (e.g., OR = .85, with confidence limits not including 1.0), indicating that the combination of the two risks did not have the multiplicative effect assumed by the logistic model, although it was more than additive. (Note: they did not show this graphically.)

Interaction effects in multilevel logistic regression

For longitudinal data the clusters represent the multiple assessments of individual respondents. The interactional questions that may be asked of these data include the following:
- To what extent are the changes over time conditional on some stable characteristic of the respondents?
- Does the relationship between the (changing) outcome variable and some (changing) predictor vary as a function of some characteristic of the respondents?
In each of these analyses the principles of centering and graphing continue to be critical to understanding the findings.