Spring RMC Professional Development Series, January 14, 2016. Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations


Spring RMC Professional Development Series, January 14, 2016
Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations
Ann A. O'Connell, Ed.D.
Professor, Educational Studies (QREM)
Director, Research Methodology Center
College of Education and Human Ecology, OSU

What are GLMs?
- Generalized linear models refer to an approach used to model response variables that are:
  - Discrete (dichotomous, ordinal, nominal) [logistic or logit models]
  - Limited within a range (proportions or rates, time to events) [binomial, or hazard or survival models]
  - Counts of events [Poisson or negative binomial models]
- When the (conditional) distribution is normal, we end up with our familiar linear model
  - For example, ANOVA or regression analysis; this is a special case called the general linear model

Clustered Data
- Designs in education and the social or behavioral sciences often rely on clusters (classrooms, schools, hospitals, neighborhoods) from which to collect data
- These hierarchies bring an additional level of complexity to studies analyzing discrete or limited outcomes

Why worry about clustering?
- Research has consistently demonstrated that people within the same setting or context tend to be more similar to each other than they are to people in a different group or context
  - Violates the assumption of independence in the data
  - Repeated observations share this same phenomenon (correlated observations)
- Statistically, the standard errors for parameter estimates in a model that ignores clustering are extremely biased downwards
  - Too small: every effect looks statistically significant!

HLMs versus HGLMs (1)
What are HLMs?
- HLMs are used to fit models to hierarchical data with normally distributed errors at level one
  - Linear mixed model: fixed and random effects
- Hierarchical models capture the structure of data obtained from many naturalistic settings
- Allows us to model data that are not independent (i.e., can model the covariance structure)
  - Clustering, or repeated measures
- HLM is a special case of the HGLM

HLMs versus HGLMs (2)
What are HGLMs?
- Hierarchical generalized linear models are used to fit models to hierarchical data where the errors at level one are not (necessarily) normally distributed
  - Mixed model, includes both fixed and random effects
  - Multilevel logistic, multilevel Poisson, etc.
- Generalized Linear Mixed Models
  - HGLM is the general case and GLMM is a special case
  - In GLMM, the higher-order random effects are assumed to be normally distributed

Examples
Examples of outcomes where the application of GLMM would be appropriate:
- Data on whether or not an adolescent drops out of school, across multiple schools
- Data on proportion of school-aged children attending public schools across multiple neighborhoods
- Counts of number and type of suspensions in a school in past year
- Student proficiency in reading or math (below basic, basic, proficient, goal, advanced), across schools
- Community differences in time to first drinking episode among adolescents (age, months, days, years)

Goals for today
- Provide an introduction to GLMMs
- Clarify some of the complexity involved in estimating and interpreting GLMMs
- Illustrate several models using HLM v7.1 software
  - Multilevel logistic, ordinal, Poisson, overdispersed Poisson
- Highlight challenges and limitations to working with these kinds of models

Background
- Approach to building GLMMs parallels that of GLMs in single-level (fixed-effects) analyses
- Some non-GLM strategies for dealing with non-normally distributed outcomes involved transformations of the DV:
  - Severe skew: $\ln(y)$, $1/y$
  - Proportions: $2\arcsin(p^{1/2})$, $\ln[p/(1-p)]$ (logit), probit (inverse normal)
  - Counts: $y^{1/2}$
- But: empirical transformations don't always work well; and for dichotomies, there is no fix towards normality!

GLM Approach
- Embed information about the distribution of the DV and a desired transformation directly within the statistical model
- Three related model components:
  - A. Sampling model: What is the distribution of interest?
  - B. Link function: What transformation might be used to link the predicted values to the observed values for Y, and appropriately constrain model results to be within a certain interval (i.e., between 0 and 1 for proportions)?
  - C. Structural (linear) component: How are the predicted values from the link function related to the covariates in the models?

Estimation Concerns
- GLMs (like logistic regression, etc.) are estimated through ML methods
  - Most use a distribution from the exponential family (canonical links)
- Multilevel models are also estimated through ML
- Combined complexity for GLMMs leads to complex models as well as complex estimation procedures
- There are pros and cons to the available choices of estimation
  - Choice affects other aspects of the model

Issues Particular to GLMM
- Estimation
- Deviance
- Variance partitioning and estimating the ICC
- Unit-specific versus population-average inferences
- Over- or underdispersion (for counts)
- I will talk about these briefly; we can revisit them after looking at some examples

Model Representation
- Some software (including HLM) uses separate models to describe relationships among predictors at each level
- Others use a mixed-model approach
  - Conceptually, substitute level-two equations into level-one model, gather fixed and random effects:
    $\eta = X\beta + Zu$
  - $X$ and $Z$ are the fixed and random effects design matrices, and $\beta$ and $u$ are vectors for the fixed and random effects parameters

Distribution of Random Effects
- The random effects, $u$, are assumed to follow a normal distribution: $u \sim N(0, G)$
- $G$ represents the covariance structure for the random effects
- Need to estimate the fixed effects and the elements of $G$
- Dispersion at level one is given by the underlying distribution (for logistic, this is Bernoulli; for ordinal, it's multinomial; for counts, it's Poisson or some variation on Poisson)

Estimation
- In the normal case, $V = \mathrm{var}(y) = ZGZ^{T} + R$, where $R$ is the covariance matrix for the level-one residuals
- For GLMMs, $V$ is not as easily specified
  - Likelihood function is a non-linear function of the fixed effects, and the residuals at level one are heteroscedastic and depend on the mean
  - The integration required for a solution is intractable
- Two approaches dominant in literature:
  - Approximate the model
  - Approximate the log-likelihood

Approximating the Model
- Pseudo-likelihood techniques
  - PL: pseudo-likelihood
  - PQL: penalized quasi-likelihood, compensates for initial estimation of the variance structure
- Linearization methods (i.e., Taylor series) serve as the algorithm for a model based on pseudo-data
  - i.e., starting with a normal mixed model; iterates through the process until convergence is reached
- Advantages: generally converges quickly; default in many statistical packages
- Disadvantages: does not yield a reliable deviance for model comparison (Snijders & Bosker, 2012)
  - Research has also shown these estimates to be biased (in logistic HLM), more so in small samples; suggestion of carry-over to GLMMs of other forms
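The normal-case expression $V = ZGZ^{T} + R$ can be made concrete with a small sketch. It assumes the simplest setting, a single cluster of three observations with a random intercept only; the helper name `marginal_covariance` and the variance values are hypothetical and chosen only for illustration.

```python
def marginal_covariance(Z, G, R):
    """Return V = Z G Z^T + R, with matrices given as nested lists."""
    n, q = len(Z), len(G)
    # ZG is n x q
    ZG = [[sum(Z[i][k] * G[k][j] for k in range(q)) for j in range(q)]
          for i in range(n)]
    # (ZG) Z^T is n x n
    ZGZt = [[sum(ZG[i][k] * Z[j][k] for k in range(q)) for j in range(n)]
            for i in range(n)]
    return [[ZGZt[i][j] + R[i][j] for j in range(n)] for i in range(n)]


# Hypothetical values (not from the slides): intercept variance and
# level-one residual variance for one cluster with three members.
tau00 = 0.5
sigma2 = 2.0

Z = [[1.0], [1.0], [1.0]]   # random-intercept design: a column of ones
G = [[tau00]]               # covariance of the random effects
R = [[sigma2 if i == j else 0.0 for j in range(3)] for i in range(3)]

V = marginal_covariance(Z, G, R)
for row in V:
    print(row)
```

Under these assumptions every diagonal entry of V is tau00 + sigma2 and every off-diagonal entry is tau00, which is the within-cluster covariance that a model ignoring clustering would treat as zero.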

Approximating the log-likelihood
- The integral in the log-likelihood is approximated numerically
  - Quadrature methods
    - Gauss-Hermite quadrature
    - Adaptive quadrature
  - Laplace
  - MCMC
- Advantages: these methods yield deviance statistics that can be used for LR tests
- Disadvantages: may have convergence problems

Deviance
- Deviance comparisons may be unreliable under PQL
- PQL not recommended when number of groups is small
- HLM has an option for Laplace iterations to obtain a model deviance for some models
- HLM can also fit models through adaptive quadrature, although this is often hard to converge with complex or poorly specified models

ICC
- ICC for non-normal models
  - Raudenbush & Bryk (2002); Snijders and Bosker (1999); Goldstein, Browne & Rasbash (2002); Browne, Subramanian, Jones & Goldstein (2005)
- Assume outcome is a latent variable that was discretized
- Implicit in the logit model, the level-1 errors are heteroscedastic, but assumed to have a standard logistic distribution with mean 0 and variance $\pi^2/3$
  - From scaling of the probability density and cumulative distribution functions for the logistic distribution
  - Can be used for ICC calculation for logistic

Unit Specific vs. Population Average Inferences
- Situation that occurs with non-linear link functions
- Decision on choice is based on research aims
- The unit-specific model (hierarchically structured model) describes a process that's occurring within each group or level-2 unit
  - In most situations we want to see how these processes vary across the level-2 units
- Population-average models ask a different question, not focusing on specifics occurring within contexts
  - May only want an average for the population without regard to group: i.e., how does risk of being at or below category 3 differ by gender (averaging across all schools)
- Is one characteristic of a unit-specific model
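The latent-variable ICC for a logistic model can be sketched in a few lines. This assumes the usual form in which the level-1 variance is fixed at $\pi^2/3$ and the level-2 intercept variance is $\tau_{00}$; the function name `latent_icc` is hypothetical, and the value .340 is the intercept variance reported later for the final logistic model, so the result is a residual ICC for that model, not the empty-model ICC.

```python
import math

# Variance of the standard logistic distribution (about 3.29)
LOGISTIC_VARIANCE = math.pi ** 2 / 3


def latent_icc(tau00):
    """ICC on the latent (logit) scale: tau00 / (tau00 + pi^2 / 3)."""
    return tau00 / (tau00 + LOGISTIC_VARIANCE)


# Intercept variance from the final logistic model in these slides
residual_icc = latent_icc(0.340)
print(round(residual_icc, 4))
```

Because the level-1 variance is a constant here, the ICC depends only on the estimated intercept variance.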

Illustrations of GLMMs
- Methodological considerations and structure of the models
  - Logistic for dichotomous outcomes
  - Cumulative logistic for ordinal outcomes
  - Poisson and over-dispersed Poisson for counts
- Substantive examples using the ECLS-K
  - Numeracy proficiency for first-graders at the end of first grade
  - 0, 1, 2, 3, 4, 5 = 6 proficiency levels
  - 6539 students
  - 569 schools

Proficiency Categories
Proficiency Category: Brief Description
- 0: Did not pass level 1
- 1: Can identify shapes and numbers
- 2: Can understand relative size and recognize patterns
- 3: Can understand ordinality and sequencing
- 4: Can solve simple addition and subtraction problems
- 5: Can solve simple multiplication and division problems

Student Level Descriptives
Student Level Descriptive Statistics for the Analytic Sample, N = 6539

ProfMath          0        1        2        3        4        5        Total
n                 9        53       192      1231     3188     1866     6539
Cum. P            .11%     .95%     3.88%    22.71%   71.47%   100%     ---
% Male            33.33%   52.83%   48.44%   49.39%   46.55%   54.39%   49.4%
NumRisks M (SD)   (1.17)   (1.06)   (.97)    (.82)    (.72)    (.50)    (.72)

School Level Variables
School Level Descriptive Statistics for the Analytic Sample, J = 569
Columns: Variable, Mean, SD, Minimum, Maximum
- Neighborhood Problems (NBHOODCLIM)
- Private (PUBPRIV2): 18%
  - Public: Pubpriv2 = 0
  - Private: Pubpriv2 = 1

Proficiency Outcomes
- For multilevel logistic models I used the highest two categories as the target of interest (Prof45)
- For the multilevel ordinal models, I used the entire scale, 0 to 5 (profmath)
- For the count models, I assumed the outcomes represented actual counts (for demonstration)
- All variables measured at the end of first grade

Multilevel Logistic Model
- We want to model proficiency in terms of scoring in categories 4 or 5 for children within schools
- $Y_{ij}$ = response for the i-th child in the j-th school
  - $Y_{ij}$ = 0 or 1 (1 = success, i.e., score of 4 or 5)
- Are there child- or school-level characteristics that can help explain the likelihood of success?

Building the Model
- We specify three features: sampling model (level one), link function, and structural model
- For HLM (normally distributed residuals), the sampling model at level one is Normal:
  $E(Y_{ij} \mid \mu_{ij}) = \mu_{ij}$, $Y_{ij} \mid \mu_{ij} \sim NID(\mu_{ij}, \sigma^2)$
- The link is the identity link because no transformation is necessary
- The structural model is the regression model:
  $Y_{ij} = \beta_{0j} + \beta_{1j}X_{1ij} + \beta_{2j}X_{2ij} + \dots + \beta_{pj}X_{pij} + r_{ij}$

Dichotomous Data: A. Sampling Model
- For binary outcomes, the sampling model is Bernoulli, which is a special case of the Binomial distribution
- A binomial random variable counts the number of successes in m trials
- When m = 1, the outcome is binary (0, 1), rather than a count

Binomial Distribution
  $Y_{ij} \mid p_{ij} \sim B(m_{ij}, p_{ij})$
- Where $m_{ij}$ refers to the number of trials (for Bernoulli, m = 1)
- $p_{ij}$ refers to the probability of success on each trial, i.e., the success probability for the i-th person in the j-th group
- Specifying the sampling distribution as Binomial (or Bernoulli) identifies the nature of the level-1 variation as Binomial

Expected Values and Variance for Bernoulli Models
  $E(Y_{ij} \mid p_{ij}) = p_{ij}$
  $Var(Y_{ij} \mid p_{ij}) = p_{ij}(1 - p_{ij})$
- Level-one errors are heteroscedastic: the variance depends on each person's P(success)
- Not constant across individuals; depends on $\hat{p}$

Dichotomous Data: B. Level-1 Link function
- We call the transformed predicted values $\eta_{ij}$
- This transformation process is called linking
- For binary outcomes, the link is (typically) the logit link (the canonical link for binary data)
- Indicates how the transformed variable relates back to the original data:
  $\eta_{ij} = \mathrm{logit}(p_{ij}) = \ln\left(\frac{p_{ij}}{1 - p_{ij}}\right)$

Dichotomous Data: C. Structural Model
- The structural model describes how the transformed predicted value is related to the predictors (a linear structural model)
- In this example, I am just looking at level one for now:
  $\eta_{ij} = \beta_{0j} + \beta_{1j}X_{1ij} + \beta_{2j}X_{2ij} + \dots + \beta_{pj}X_{pij}$
- Rarely do we need to write these three components out for a normal distribution (i.e., HLM), but for GLMM the distinction regarding these three elements of the model (sample, link, structure) becomes quite useful

Level 2 Models
- Level 2 models for GLMM have the same form as the standard HLM
- Coefficients from the level-1 model can be fixed, randomly varying, or non-randomly varying
- Similar to the standard HLM, model building usually starts with the empty model to approximate the ICC, and estimate overall probability of success
  - More on ICC later

Level 1 and Level 2 models
- Clusters are considered as a random sample from some population of clusters
- The success probabilities within the clusters, $p_{ij}$, are regarded as random variables
- Thus we have our level 1 and level 2 models:
  $\eta_{ij} = \beta_{0j} + \beta_{1j}X_{1ij} + \beta_{2j}X_{2ij} + \dots + \beta_{pj}X_{pij}$
  $\beta_{qj} = \gamma_{q0} + \sum_{s=1}^{S_q} \gamma_{qs}W_{sj} + u_{qj}$
- Level-2 random effects assumed normally distributed, in the GLMM

Interpreting the logit
- $\hat{\eta}_{ij}$ is the model's logit prediction
- If logit = 0, P(success) = P(failure); Odds = 1
- If logit < 0, P(success) < P(failure); Odds < 1
- If logit > 0, P(success) > P(failure); Odds > 1

Estimating Probability
- To get from $\hat{\eta}$ (predicted logit) to estimated probability, use back transformation:
  $Odds = \exp(\hat{\eta})$
  $\hat{p} = \frac{odds}{1 + odds}$
  (or, use: $\hat{p} = \frac{1}{1 + \exp(-\hat{\eta})}$)
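The back transformation from a predicted logit to odds and to an estimated probability can be sketched as follows. The function names are hypothetical and the logit values are arbitrary illustrations, not model output.

```python
import math


def logit_to_odds(eta):
    """Odds = exp(predicted logit)."""
    return math.exp(eta)


def logit_to_probability(eta):
    """p-hat = 1 / (1 + exp(-eta)), equivalent to odds / (1 + odds)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Arbitrary example logits: negative, zero, positive
for eta in (-1.2, 0.0, 1.2):
    odds = logit_to_odds(eta)
    print(eta, round(odds, 3), round(logit_to_probability(eta), 3))
```

Both routes to the probability agree, so either form can be used when building a table of predicted probabilities from fitted coefficients.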

Example: ECLS-K data
- $Y_{ij}$ = whether or not the i-th child in the j-th school is proficient in numeracy at end of first grade (prof45)
- IVs at level 1 are:
  - Number of family risk factors (Numrisks)
  - Gender (n.s.)
- IVs at level 2 are:
  - NbhoodClimate
  - PubPriv2 (private = 1, public = 0)

ICC
- Based on hierarchical linear probability model: $ICC = \tau_{00} / (\tau_{00} + \sigma^2)$ = [...]
- Based on mean, variance of logistic distribution: $ICC = \tau_{00} / (\tau_{00} + \pi^2/3)$ = .1435

Final Logistic Model Results

Fixed Effects                        Coefficient (SE)    Odds Ratio
Model for the Intercepts (β0)
  Intercept (γ00)                    [...] (.055)        [...] **
  Neighborhood Problems (γ01)        -.102 (.016)        .903 **
  Public/Private (γ02)               .473 (.111)         [...] **
Model for NumRisks Slope (β1)
  Intercept (γ10)                    [...] (.043)        .719 **

Random Effects (Var. Components)     Variance
  Intercept (τ00)                    .340 **

Probability Estimates
Explanatory variables: NumRisks, PubPriv2, NbhoodClim
Model predictions: Logits, Odds, Estimated Probability
Rows: three Public school profiles (the first with NumRisks = 0) and three Private school profiles
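The link between a logit coefficient and its odds ratio is exp(coefficient), and a coefficient can be recovered from a reported odds ratio with the natural log. A minimal sketch using only figures that appear in these slides (.473 for Public/Private, odds ratios of .903 and .719); the variable names are hypothetical.

```python
import math

# Coefficient reported for Public/Private (gamma_02)
b_private = 0.473
or_private = math.exp(b_private)

# Odds ratios reported for Neighborhood Problems and NumRisks
or_nbhood = 0.903
or_numrisks = 0.719
b_nbhood = math.log(or_nbhood)       # implied logit coefficient
b_numrisks = math.log(or_numrisks)   # implied logit coefficient

print(round(or_private, 3), round(b_nbhood, 3), round(b_numrisks, 3))
```

An odds ratio above 1 goes with a positive coefficient and one below 1 with a negative coefficient, which is how the signs in the summary of this model can be read off the table.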

Summary: Logistic Model
- NumRisks has a negative effect on being in higher categories 4 and 5, γ10 = [...]
  - Children with more family risks are less likely to be in the two higher categories
- NbhoodClim, γ01 = -.102; with more severe climate, less likely to be in highest proficiency categories
- PubPriv2, γ02 = .473; children in private schools more likely to be in higher two categories

MULTILEVEL ORDINAL MODEL

Examples of Ordinal Variables
- ECLS-K proficiency in early numeracy (6 levels)
- Teachers' stages of concern for adoption of an instructional innovation (8 levels)
- CBO capacity for implementation of effective HIV prevention interventions (8 levels)
- No Child Left Behind: States set proficiency standards based on educational assessments (5 levels)
- Transtheoretical model of behavior change (5 levels)

Proportional Odds Model
- One of several regression models appropriate for ordinal data, and also the most common, is the proportional or cumulative odds model
- Model predicts the logit = ln(odds) of being in category k or below
- These ln(odds) can be back-transformed into odds and then into cumulative probabilities
- We are generally interested in the odds (or probability) of being at or below a specific category (relative to being in higher categories)

Student-level Responses
- $R_{ij}$ = proficiency of the i-th student in the j-th school
- Need K-1 dummy variables, $Y_k$, such that: $Y_k = 1$ if $R < k$, and 0 otherwise
- With K = 6 proficiency categories (0 to 5), we have:
  - $Y_1 = 1$ if $R = 0$
  - $Y_2 = 1$ if $R \le 1$
  - $Y_3 = 1$ if $R \le 2$
  - $Y_4 = 1$ if $R \le 3$
  - $Y_5 = 1$ if $R \le 4$
  - [$Y_6 = 1$ if $R \le 5$; $Y_6 = 1$ always!]

Cumulative Probabilities
- Using this approach, the probabilities of response are cumulative probabilities:
  - $P(Y_1 = 1) = P(R = 0)$
  - $P(Y_2 = 1) = P(R \le 1)$
  - $P(Y_3 = 1) = P(R \le 2)$
  - $P(Y_4 = 1) = P(R \le 3)$
  - $P(Y_5 = 1) = P(R \le 4)$
  - $P(Y_6 = 1) = P(R \le 5) = 1$
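The cumulative dummy coding can be sketched for a single response. This assumes categories coded 0 to 5 as in the ECLS-K example; the function name `cumulative_indicators` is hypothetical.

```python
def cumulative_indicators(r, n_categories=6):
    """Return [Y_1, ..., Y_{K-1}], where Y_k = 1 if R <= k - 1, else 0.

    Y_K is omitted because it equals 1 for every response.
    """
    return [1 if r <= k else 0 for k in range(n_categories - 1)]


# A child scoring in proficiency category 2 is "at or below" categories 2, 3, 4
print(cumulative_indicators(2))
```

Each response contributes K-1 binary indicators, which is what lets the ordinal model be read as a set of simultaneous cumulative splits of the outcome.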

Cumulative Odds
- The odds of an event is a ratio of the probability that the event happens to the probability that it does not happen:
  $Odds = \frac{P(R \le k)}{1 - P(R \le k)} = \frac{P(R \le k)}{P(R > k)}$
- If Odds = 1.0, 50/50 chance of an event occurring
- If Odds < 1.0, numerator is less likely than denominator, so there is a higher probability that the event does not occur: [Consider .4/.6 = .67]
- If Odds > 1.0, numerator is more likely than denominator, indicating higher probability that event does occur: [Consider .6/.4 = 1.5]

Cumulative Comparisons
Columns: Category; Cumulative Probability; Cumulative Odds [$Y'_{kj}$]; Probability Comparison
- k = 0 (Proficiency 0); $P(R \le 0)$; $P(R \le 0) / P(R > 0)$; Proficiency 0 versus all levels above
- k = 1 (Proficiency 1); $P(R \le 1)$; $P(R \le 1) / P(R > 1)$; Proficiency 0 and 1 combined versus all levels above
- k = 2 (Proficiency 2); $P(R \le 2)$; $P(R \le 2) / P(R > 2)$; Proficiency 0, 1, 2 combined versus 3, 4, 5 combined
- k = 3 (Proficiency 3); $P(R \le 3)$; $P(R \le 3) / P(R > 3)$; Proficiency 0, 1, 2, 3 combined versus 4, 5 combined
- k = 4 (Proficiency 4); $P(R \le 4)$; $P(R \le 4) / P(R > 4)$; Proficiency 0, 1, 2, 3, 4 versus proficiency 5
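As a worked illustration of these cumulative splits, the observed cumulative proportions and cumulative odds can be computed from the category counts reported in the student-level descriptives (9, 53, 192, 1231, 3188, 1866). These are raw sample quantities, not model-based estimates; the variable names are hypothetical.

```python
from itertools import accumulate

# Students in proficiency categories 0..5, from the descriptives table
counts = [9, 53, 192, 1231, 3188, 1866]
total = sum(counts)

# P(R <= k) for k = 0..5
cum_probs = [c / total for c in accumulate(counts)]

# Odds of being at or below k versus above k, for k = 0..4
# (the last split is omitted because P(R <= 5) = 1)
cum_odds = [p / (1 - p) for p in cum_probs[:-1]]

for k, (p, o) in enumerate(zip(cum_probs, cum_odds)):
    print(k, round(p, 4), round(o, 4))
```

The odds cross 1.0 only at the last split, which matches the cumulative percentages: fewer than half the children are at or below category 3, while most are at or below category 4.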

Proportionality Assumption
- Proportional odds, sometimes referred to as the equal slopes assumption
- Effect of an explanatory variable remains the same across all simultaneous comparisons or splits to the DV
- Very restrictive, but parsimonious, assumption
- Straightforward test in single-level models
- Ad-hoc approaches for multilevel

Level-1 Model
  $\eta_{kij} = \ln\left(\frac{P(R_{ij} \le k)}{P(R_{ij} > k)}\right) = \beta_{0j} + \sum_{q=1}^{Q} \beta_{qj}X_{qij} + \sum_{k=2}^{K-1} D_{kij}\delta_k$
- This model assumes proportional odds, which means that the effect of the predictor variables on the odds doesn't depend on the category k ("parallel odds")
- The deltas are the thresholds (like intercepts) for each category (the common $\beta_{0j}$ is the intercept for the first category)

Level-2 Model
  $\beta_{qj} = \gamma_{q0} + \sum_{s=1}^{S_q} \gamma_{qs}W_{sj} + u_{qj}$
- Assume the random effects are multivariate normal
  $var(u_{qj}) = \tau_{qq}$

Logit for the Cumulative Distribution
- Similar to logistic, the ordinal model uses the logit link
- $\eta_{kij}$ = logit prediction for being at or below the k-th category for the i-th child in the j-th school
- And to estimate cumulative probability:
  $P(R_{ij} \le k) = \frac{\exp(\eta_{kij})}{1 + \exp(\eta_{kij})}$
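The step from cumulative logits to cumulative and category probabilities can be sketched as below. It assumes the parameterization described for the level-1 model, where the first split uses the intercept alone and each later split adds a threshold δ; the function names and all numeric values are hypothetical and are not estimates from the ECLS-K models.

```python
import math


def inv_logit(eta):
    """Cumulative probability from a cumulative logit."""
    return 1.0 / (1.0 + math.exp(-eta))


def category_probabilities(base_logit, deltas):
    """Return (cumulative probabilities, category probabilities).

    base_logit: predicted logit for the first split (R <= lowest category)
    deltas: thresholds added for the second and later splits
    """
    cum_logits = [base_logit] + [base_logit + d for d in deltas]
    cum_probs = [inv_logit(e) for e in cum_logits] + [1.0]
    probs = [cum_probs[0]] + [
        cum_probs[k] - cum_probs[k - 1] for k in range(1, len(cum_probs))
    ]
    return cum_probs, probs


# Hypothetical first-split logit and four increasing thresholds (6 categories)
cum_probs, probs = category_probabilities(-6.0, [2.0, 3.5, 5.5, 7.5])
print([round(p, 3) for p in cum_probs])
print([round(p, 3) for p in probs])
```

Differencing adjacent cumulative probabilities gives the probability of each single category, which is often easier to communicate than the cumulative values.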

Series of Models
- Ordinal empty model
- Ordinal contextual model with NumRisks as a fixed level-1 predictor and NbhoodClim and PubPriv2 as school-level predictors of the intercept
  - Model parallel to the earlier logistic model

ICC
- Based on hierarchical linear probability model: $ICC = \tau_{00} / (\tau_{00} + \sigma^2)$ = [...]
- Based on mean, variance of logistic distribution: $ICC = \tau_{00} / (\tau_{00} + \pi^2/3)$ = .1488

Ordinal Contextual Model
Level 1:
  $\eta_{kij} = \ln\left(\frac{P(R_{ij} \le k)}{P(R_{ij} > k)}\right) = \beta_{0j} + \beta_{1j}NUMRISKS_{ij} + \sum_{k=2}^{K-1} D_{kij}\delta_k$
Level 2:
  $\beta_{0j} = \gamma_{00} + \gamma_{01}NBHOODCLIM_j + \gamma_{02}PUBPRIV2_j + u_{0j}$
  $\beta_{1j} = \gamma_{10}$

Ordinal Model Results

Fixed Effects                        Coefficient (SE)    Odds Ratio
Model for the Intercepts (β0)
  Intercept (γ00)                    [...] (.344)        .001 **
  Neighborhood Problems (γ01)        .122 (.015)         [...] **
  Public/Private (γ02)               -.424 (.088)        .655 **
Model for NUMRISKS Slope (β1)
  Intercept (γ10)                    .394 (.038)         [...] **
For thresholds:
  δ                                  [...] (.315)        **
  δ                                  [...] (.335)        **
  δ                                  [...] (.340)        **
  δ                                  [...] (.342)        **

Random Effects (Var. Components)     Variance
  Var. in Intercepts (τ00)           .309 **

Probability Estimates
Columns: School Type, Num Risks, Nbhood Clim, P(R ≤ cat. 0), P(R ≤ cat. 1), P(R ≤ cat. 2), P(R ≤ cat. 3), P(R ≤ cat. 4)
Rows: Public, Private

Summary: Ordinal Model
- NumRisks has a positive effect, γ10 = .394
  - As the number of family risks increases, the probability of being at or below a given category, rather than beyond that category, tends to increase
  - Children with greater family risks are more likely to be at or below a given category
- NbhoodClim, γ01 = .122; with more severe climate, increased likelihood of being at or below a given category
- PubPriv2, γ02 = -.424; children in private schools less likely to be at or below a given category

Ad-hoc Investigation of PO
Response estimated: R ≤ 0, R ≤ 1, R ≤ 2, R ≤ 3, R ≤ 4 (one OR column per split)
Fixed Effects
Model for the Intercepts (β0)
- Intercept (γ00): 0.00 **, 0.01 **, .03 **, 0.22 **, 1.97 **
- NBHOODCLIM (γ01): [...] *, 1.10 **, 1.11 **, 1.16 **
- PUBPRIV2 (γ02): [...] *, 0.26 **, 0.62 **, 0.69 **
Model for the NUMRISKS slopes (β1)
- Intercept (γ10): 1.99 *, 1.95 **, 1.76 **, 1.39 **, 1.68 **
Entry is the OR for each split. Average tends to match OR for the Cumulative Model.
Example: Average for NBHOODCLIM = 1.11; Cum. OR = [...]
Note: some software (SuperMix, SAS) can do this test directly.

MODELS FOR COUNTS

A. Sampling Model: Poisson Distribution
  $Y_{ij} \mid \lambda_{ij} \sim P(m_{ij}, \lambda_{ij})$
- Where $Y_{ij}$ = number of events occurring during an interval of length $m_{ij}$
- $m_{ij}$ = interval for the rate (i.e., time-span, length, population size, etc.)
  - Must be greater than zero
  - May be constant for every unit
  - Referred to as exposure, i.e., the interval during which you could be exposed to or experience the event (included in model as an "offset")
- $\lambda_{ij}$ = event rate (i.e., 5 times per past year)

Expected Values and Variance for Poisson Models
  $E(Y_{ij} \mid \lambda_{ij}) = m_{ij}\lambda_{ij}$
  $Var(Y_{ij} \mid \lambda_{ij}) = m_{ij}\lambda_{ij}$
- Mean and variance are assumed to be equal
- The smaller the mean event rate, the smaller the variability of the counts
- Often leads to situation where we have data that are overdispersed (more variability than expected if data followed a true Poisson process); could also yield underdispersion (need replication in order to estimate overdispersion parameter)
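The equality of the Poisson mean and variance, both equal to the exposure times the rate, can be checked numerically from the probability mass function. The exposure and rate below are hypothetical, and the sum is truncated at 59 events, where the remaining tail probability is negligible for a mean of 3.

```python
import math


def poisson_pmf(y, mean):
    """P(Y = y) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** y / math.factorial(y)


# Hypothetical exposure m and event rate lambda (not from the slides)
m, lam = 2.0, 1.5
mu = m * lam  # expected count

ys = range(60)
mean = sum(y * poisson_pmf(y, mu) for y in ys)
var = sum((y - mean) ** 2 * poisson_pmf(y, mu) for y in ys)
print(round(mean, 6), round(var, 6))
```

Real count data with a variance noticeably above or below the mean would signal over- or underdispersion relative to this benchmark.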

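B. Level-1 Link function
- For binary outcomes, the link is (typically) the logit link
- For counts, the link is the log link
- Indicates how the transformed variable relates back to the original data:
  - Bernoulli: $\eta_{ij} = \mathrm{logit}(p_{ij}) = \ln\left(\frac{p_{ij}}{1 - p_{ij}}\right)$
  - Poisson: $\eta_{ij} = \log(\lambda_{ij})$

C. Structural Model
- Similar to previous (logistic models), but link is now the log link:
  $\eta_{ij} = \beta_{0j} + \beta_{1j}X_{1ij} + \beta_{2j}X_{2ij} + \dots + \beta_{pj}X_{pij}$
  $\beta_{qj} = \gamma_{q0} + \sum_{s=1}^{S_q} \gamma_{qs}W_{sj} + u_{qj}$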

Interpreting the log(event rate)
- $\hat{\eta}_{ij}$ is the Poisson model's prediction
- $\hat{\eta}_{ij} = \log(\hat{\lambda}_{ij})$ = the log of the event rate (expected counts) for the collection of covariates
- $\hat{\lambda}_{ij} = \exp(\hat{\eta}_{ij})$ = the rate parameter, or the average number of events expected in the time period
- If log = 0, event rate = 1
- If log < 0, event rate < 1 (but non-negative)
- If log > 0, event rate > 1

Estimating Event Rates for Poisson Model (1)
(assuming constant unit of exposure)
- Constant term (level one) gives us predictions for log of event rate when all predictors are zero (baseline rate)
  - May not be substantively meaningful for the set of predictors
- Coefficient of X is the expected difference in log(event rate) when X increases by one unit

Estimating Event Rates for Poisson Model (2)
- Exp(b) = expected multiplicative increase in the rate (i.e., on the expected number of events)
  - This is why exp(b) is referred to as an event rate ratio
- Percent fewer, or percent more, in terms of number of events
  - $(1 - \exp(b)) \times 100\%$ = percent change in the rate for an increase of one unit on X (a positive value is a decrease, a negative value an increase)

Results - Poisson

Fixed Effects                        Coefficient (SE)    Event Rate Ratio
Model for the Intercepts (β0)
  Intercept (γ00)                    [...] (.009)        4.13 **
  Neighborhood Problems (γ01)        [...] (.003)        .988 **
  Public/Private (γ02)               .042 (.016)         [...] **
Model for NumRisks Slope (β1)
  Intercept (γ10)                    [...] (.010)        .947 **

Random Effects (Var. Components)     Variance
  Intercept (τ00)                    .000 n.s.
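The event rate ratio and the associated percent change can be sketched with the two figures that survive in the Poisson results: the Public/Private coefficient of .042 and the NumRisks event rate ratio of .947. The function names are hypothetical, and the percent change is computed here as (exp(b) - 1) × 100 so that a negative value means fewer events; this is the slide's (1 - exp(b)) expression with the sign flipped.

```python
import math


def event_rate_ratio(b):
    """Multiplicative change in the event rate per one-unit increase in X."""
    return math.exp(b)


def percent_change(b):
    """Signed percent change in the rate: negative means fewer events."""
    return (math.exp(b) - 1.0) * 100.0


# Public/Private coefficient from the Poisson results table
err_private = event_rate_ratio(0.042)
pct_private = percent_change(0.042)

# NumRisks is reported as an event rate ratio, so convert it directly
pct_numrisks = (0.947 - 1.0) * 100.0

print(round(err_private, 3), round(pct_private, 1), round(pct_numrisks, 1))
```

Read this way, each additional family risk goes with about 5 percent fewer expected events, and private school attendance with about 4 percent more.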

Example: Poisson Model Estimates
Columns: NumRisks, PubPriv2, NbhoodClim, $\hat{\eta}$ = ln(event rate), $\hat{\lambda}$ = expected no. of events
Rows: three Public school profiles (the first with NumRisks = 0) and three Private school profiles
- Note that Poisson assumptions truly do not fit this example; it is for demonstration and interpretation
- Expected number of events mimics expected score for a child

Over- or Underdispersion
- An issue for non-linear models, particularly for counts
- Actual data may not follow the strict Poisson model, where variance is equal to the expected value
- HLM allows you to specify a scale factor for the level-1 variance
  - $Var = \sigma^2 \lambda$ for Poisson
  - Note: $\sigma^2$ is not a variance, it's a scaling factor!!
- If no over/under dispersion, scaling factor = 1; if under- or over-dispersed, it's less than or greater than 1, respectively

Scaling factor
- The scaling factor is used to better estimate the standard errors and variances in the model
- Little change in fixed effects
- For our example, the intercept variance was 0 in the Poisson model, and is still small but significant in the scaled model
  - Page 13 handout

Factors affecting dispersion
- Overdispersion
  - Unaccounted-for clustering at level 1
  - Extreme outliers, or missing levels, or small group sizes (<3)
- Underdispersion
  - Level-one variance may be smaller than assumed
- Misspecification, such as omission of important variables or large interaction effects
- In our example, we see under-dispersion: $\sigma^2$ = [...]
- You can see in handout, standard errors on page 12 are now slightly smaller than they were in the Poisson model
- Better option: negative binomial (for another day)
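One common way to gauge dispersion in a Poisson model is the Pearson chi-square statistic divided by its degrees of freedom. The sketch below uses that estimator as an assumption; it is not necessarily how HLM computes its level-1 scale factor. The function name, the counts, and the fitted means are all hypothetical.

```python
def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square / (n - number of estimated parameters).

    Values near 1 are consistent with Poisson variation; values below 1
    suggest underdispersion and values above 1 suggest overdispersion.
    """
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi2 / (len(counts) - n_params)


# Hypothetical counts that cluster tightly around a fitted mean of 4
counts = [3, 4, 4, 5, 4, 4]
fitted = [4.0] * len(counts)
phi = pearson_dispersion(counts, fitted, n_params=1)
print(round(phi, 3))
```

A value well below 1, as in this toy example, is the pattern described above as underdispersion: the counts vary less than a Poisson process with that mean would imply.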

THANK YOU!

SOME EXTRA

SuperMix test for Proportional Odds
- Create interaction terms between predictors and thresholds
- SuperMix has an option to perform this test directly
  - PO model: -2LL = 14,[...] (9 params)
  - Non-PO model: -2LL = 14,[...] (21 params)
  - DIFF = $\chi^2(12)$ = 51.16, p < .0001
- Evidence for non-proportional odds across at least one predictor
- Based on previous slide, effect for private schools steadily increasing
  - Effect of private schooling not noticeable for kids at lower levels, but increasingly strong for private-school kids to be beyond, rather than at or below, in higher categories

Summary Considerations (1)
- We know from logistic HLM that pseudo-likelihood methods are not recommended when number of groups is small, or target probabilities are either very small or very large
- AQ recommended for logistic HLM, but Pinheiro & Chao (2006) warn against assuming this is also the case for other GLMMs
- More research is needed to establish estimation validity under PQL versus AQ or other methods for non-dichotomous models
- Student (freeware) version of HLM makes it an attractive option
  - PQL default in v6 and v7
- Researchers need to be wary of defaults
  - Consider and be aware of implications of choice for estimation
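The likelihood-ratio comparison in the SuperMix proportional-odds test can be checked by hand. The sketch below takes the reported difference in deviance (51.16) and the difference in parameter counts (21 - 9 = 12) and evaluates the chi-square upper-tail probability with the closed form that holds for even degrees of freedom, so no statistics library is needed. The function name is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, even df only.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )


df = 21 - 9                      # extra parameters in the non-PO model
p_value = chi2_sf_even_df(51.16, df)
print(df, p_value)
```

The resulting p-value is far below .0001, in line with the slide's conclusion that proportional odds does not hold for at least one predictor.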

Summary Considerations (2)
- Choice of software may impact quality of model/inferences
  - Vary in terms of default estimation methods and options for additional estimation approaches
  - As with most statistical packages, the default may not be the best strategy to pursue!
  - Deviances often reported under PL options, so caution is required
- Stata, HLM v7, SuperMix, R, GLIMMIX
  - All have different estimation options, different output characteristics
  - Laplace, AQ may have convergence problems, often take longer to reach a solution
  - SAS PROC GLIMMIX likely has the most extensive array of options, although it is a challenge to learn/use; syntax mirrors PROC MIXED and SPSS MIXED
  - Good reference: Stroup (2013)


More information

Mixed Models for Longitudinal Ordinal and Nominal Outcomes

Mixed Models for Longitudinal Ordinal and Nominal Outcomes Mixed Models for Longitudinal Ordinal and Nominal Outcomes Don Hedeker Department of Public Health Sciences Biological Sciences Division University of Chicago hedeker@uchicago.edu Hedeker, D. (2008). Multilevel

More information

Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research

Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research Research Methods Festival Oxford 9 th July 014 George Leckie

More information

Comparing IRT with Other Models

Comparing IRT with Other Models Comparing IRT with Other Models Lecture #14 ICPSR Item Response Theory Workshop Lecture #14: 1of 45 Lecture Overview The final set of slides will describe a parallel between IRT and another commonly used

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

Review of Multiple Regression

Review of Multiple Regression Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate

More information

Poisson regression: Further topics

Poisson regression: Further topics Poisson regression: Further topics April 21 Overdispersion One of the defining characteristics of Poisson regression is its lack of a scale parameter: E(Y ) = Var(Y ), and no parameter is available to

More information

Chapter 1. Modeling Basics

Chapter 1. Modeling Basics Chapter 1. Modeling Basics What is a model? Model equation and probability distribution Types of model effects Writing models in matrix form Summary 1 What is a statistical model? A model is a mathematical

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 An Introduction to Multilevel Models PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 Today s Class Concepts in Longitudinal Modeling Between-Person vs. +Within-Person

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Today s Class: Review of 3 parts of a generalized model Models for discrete count or continuous skewed outcomes Models for two-part

More information

Estimation and Centering

Estimation and Centering Estimation and Centering PSYED 3486 Feifei Ye University of Pittsburgh Main Topics Estimating the level-1 coefficients for a particular unit Reading: R&B, Chapter 3 (p85-94) Centering-Location of X Reading

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

More information

A Re-Introduction to General Linear Models

A Re-Introduction to General Linear Models A Re-Introduction to General Linear Models Today s Class: Big picture overview Why we are using restricted maximum likelihood within MIXED instead of least squares within GLM Linear model interpretation

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model EPSY 905: Multivariate Analysis Lecture 1 20 January 2016 EPSY 905: Lecture 1 -

More information

WU Weiterbildung. Linear Mixed Models

WU Weiterbildung. Linear Mixed Models Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes

More information

ML estimation: Random-intercepts logistic model. and z

ML estimation: Random-intercepts logistic model. and z ML estimation: Random-intercepts logistic model log p ij 1 p = x ijβ + υ i with υ i N(0, συ) 2 ij Standardizing the random effect, θ i = υ i /σ υ, yields log p ij 1 p = x ij β + σ υθ i with θ i N(0, 1)

More information

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a Chapter 9 Regression with a Binary Dependent Variable Multiple Choice ) The binary dependent variable model is an example of a a. regression model, which has as a regressor, among others, a binary variable.

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

A Re-Introduction to General Linear Models (GLM)

A Re-Introduction to General Linear Models (GLM) A Re-Introduction to General Linear Models (GLM) Today s Class: You do know the GLM Estimation (where the numbers in the output come from): From least squares to restricted maximum likelihood (REML) Reviewing

More information

Model Assumptions; Predicting Heterogeneity of Variance

Model Assumptions; Predicting Heterogeneity of Variance Model Assumptions; Predicting Heterogeneity of Variance Today s topics: Model assumptions Normality Constant variance Predicting heterogeneity of variance CLP 945: Lecture 6 1 Checking for Violations of

More information

Random Intercept Models

Random Intercept Models Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Linear Regression With Special Variables

Linear Regression With Special Variables Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:

More information

Stat 642, Lecture notes for 04/12/05 96

Stat 642, Lecture notes for 04/12/05 96 Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do

More information

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation NELS 88 Table 2.3 Adjusted odds ratios of eighth-grade students in 988 performing below basic levels of reading and mathematics in 988 and dropping out of school, 988 to 990, by basic demographics Variable

More information

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson Lecture 10: Alternatives to OLS with limited dependent variables PEA vs APE Logit/Probit Poisson PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

More information

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure).

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure). 1 Neuendorf Logistic Regression The Model: Y Assumptions: 1. Metric (interval/ratio) data for 2+ IVs, and dichotomous (binomial; 2-value), categorical/nominal data for a single DV... bear in mind that

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates Madison January 11, 2011 Contents 1 Definition 1 2 Links 2 3 Example 7 4 Model building 9 5 Conclusions 14

More information

Multilevel Modeling: A Second Course

Multilevel Modeling: A Second Course Multilevel Modeling: A Second Course Kristopher Preacher, Ph.D. Upcoming Seminar: February 2-3, 2017, Ft. Myers, Florida What this workshop will accomplish I will review the basics of multilevel modeling

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Generalized Linear Models

Generalized Linear Models York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear

More information

11. Generalized Linear Models: An Introduction

11. Generalized Linear Models: An Introduction Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and

More information

ISQS 5349 Spring 2013 Final Exam

ISQS 5349 Spring 2013 Final Exam ISQS 5349 Spring 2013 Final Exam Name: General Instructions: Closed books, notes, no electronic devices. Points (out of 200) are in parentheses. Put written answers on separate paper; multiple choices

More information

STAT 705 Generalized linear mixed models

STAT 705 Generalized linear mixed models STAT 705 Generalized linear mixed models Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 24 Generalized Linear Mixed Models We have considered random

More information

Package HGLMMM for Hierarchical Generalized Linear Models

Package HGLMMM for Hierarchical Generalized Linear Models Package HGLMMM for Hierarchical Generalized Linear Models Marek Molas Emmanuel Lesaffre Erasmus MC Erasmus Universiteit - Rotterdam The Netherlands ERASMUSMC - Biostatistics 20-04-2010 1 / 52 Outline General

More information

2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 When and why do we use logistic regression? Binary Multinomial Theory behind logistic regression Assessing the model Assessing predictors

More information

Simple logistic regression

Simple logistic regression Simple logistic regression Biometry 755 Spring 2009 Simple logistic regression p. 1/47 Model assumptions 1. The observed data are independent realizations of a binary response variable Y that follows a

More information

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form: Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic

More information

A multivariate multilevel model for the analysis of TIMMS & PIRLS data

A multivariate multilevel model for the analysis of TIMMS & PIRLS data A multivariate multilevel model for the analysis of TIMMS & PIRLS data European Congress of Methodology July 23-25, 2014 - Utrecht Leonardo Grilli 1, Fulvia Pennoni 2, Carla Rampichini 1, Isabella Romeo

More information

Binary Choice Models Probit & Logit. = 0 with Pr = 0 = 1. decision-making purchase of durable consumer products unemployment

Binary Choice Models Probit & Logit. = 0 with Pr = 0 = 1. decision-making purchase of durable consumer products unemployment BINARY CHOICE MODELS Y ( Y ) ( Y ) 1 with Pr = 1 = P = 0 with Pr = 0 = 1 P Examples: decision-making purchase of durable consumer products unemployment Estimation with OLS? Yi = Xiβ + εi Problems: nonsense

More information

Review of CLDP 944: Multilevel Models for Longitudinal Data

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates 2011-03-16 Contents 1 Generalized Linear Mixed Models Generalized Linear Mixed Models When using linear mixed

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

More information

Generalized logit models for nominal multinomial responses. Local odds ratios

Generalized logit models for nominal multinomial responses. Local odds ratios Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π

More information

Introduction to Within-Person Analysis and RM ANOVA

Introduction to Within-Person Analysis and RM ANOVA Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides

More information

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA Topics: Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA What are MI and DIF? Testing measurement invariance in CFA Testing differential item functioning in IRT/IFA

More information

Introduction To Logistic Regression

Introduction To Logistic Regression Introduction To Lecture 22 April 28, 2005 Applied Regression Analysis Lecture #22-4/28/2005 Slide 1 of 28 Today s Lecture Logistic regression. Today s Lecture Lecture #22-4/28/2005 Slide 2 of 28 Background

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Chapter 22: Log-linear regression for Poisson counts

Chapter 22: Log-linear regression for Poisson counts Chapter 22: Log-linear regression for Poisson counts Exposure to ionizing radiation is recognized as a cancer risk. In the United States, EPA sets guidelines specifying upper limits on the amount of exposure

More information

Repeated Measures ANOVA Multivariate ANOVA and Their Relationship to Linear Mixed Models

Repeated Measures ANOVA Multivariate ANOVA and Their Relationship to Linear Mixed Models Repeated Measures ANOVA Multivariate ANOVA and Their Relationship to Linear Mixed Models EPSY 905: Multivariate Analysis Spring 2016 Lecture #12 April 20, 2016 EPSY 905: RM ANOVA, MANOVA, and Mixed Models

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

More information

Lecture 1 Introduction to Multi-level Models

Lecture 1 Introduction to Multi-level Models Lecture 1 Introduction to Multi-level Models Course Website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm All lecture materials extracted and further developed from the Multilevel Model course

More information

Recent Developments in Multilevel Modeling

Recent Developments in Multilevel Modeling Recent Developments in Multilevel Modeling Roberto G. Gutierrez Director of Statistics StataCorp LP 2007 North American Stata Users Group Meeting, Boston R. Gutierrez (StataCorp) Multilevel Modeling August

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Introduction to GSEM in Stata

Introduction to GSEM in Stata Introduction to GSEM in Stata Christopher F Baum ECON 8823: Applied Econometrics Boston College, Spring 2016 Christopher F Baum (BC / DIW) Introduction to GSEM in Stata Boston College, Spring 2016 1 /

More information

Research Design: Topic 18 Hierarchical Linear Modeling (Measures within Persons) 2010 R.C. Gardner, Ph.d.

Research Design: Topic 18 Hierarchical Linear Modeling (Measures within Persons) 2010 R.C. Gardner, Ph.d. Research Design: Topic 8 Hierarchical Linear Modeling (Measures within Persons) R.C. Gardner, Ph.d. General Rationale, Purpose, and Applications Linear Growth Models HLM can also be used with repeated

More information

The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs

The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs Chad S. Briggs, Kathie Lorentz & Eric Davis Education & Outreach University Housing Southern Illinois

More information

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington

More information

Categorical and Zero Inflated Growth Models

Categorical and Zero Inflated Growth Models Categorical and Zero Inflated Growth Models Alan C. Acock* Summer, 2009 *Alan C. Acock, Department of Human Development and Family Sciences, Oregon State University, Corvallis OR 97331 (alan.acock@oregonstate.edu).

More information

An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they represent.

An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they represent. Statistical Methods in Business Lecture 6. Binomial Logistic Regression An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they

More information

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models Optimum Design for Mixed Effects Non-Linear and generalized Linear Models Cambridge, August 9-12, 2011 Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

More information

Centering Predictor and Mediator Variables in Multilevel and Time-Series Models

Centering Predictor and Mediator Variables in Multilevel and Time-Series Models Centering Predictor and Mediator Variables in Multilevel and Time-Series Models Tihomir Asparouhov and Bengt Muthén Part 2 May 7, 2018 Tihomir Asparouhov and Bengt Muthén Part 2 Muthén & Muthén 1/ 42 Overview

More information

Sampling and Sample Size. Shawn Cole Harvard Business School

Sampling and Sample Size. Shawn Cole Harvard Business School Sampling and Sample Size Shawn Cole Harvard Business School Calculating Sample Size Effect Size Power Significance Level Variance ICC EffectSize 2 ( ) 1 σ = t( 1 κ ) + tα * * 1+ ρ( m 1) P N ( 1 P) Proportion

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012

More information

The Basic Two-Level Regression Model

The Basic Two-Level Regression Model 7 Manuscript version, chapter in J.J. Hox, M. Moerbeek & R. van de Schoot (018). Multilevel Analysis. Techniques and Applications. New York, NY: Routledge. The Basic Two-Level Regression Model Summary.

More information

More Statistics tutorial at Logistic Regression and the new:

More Statistics tutorial at  Logistic Regression and the new: Logistic Regression and the new: Residual Logistic Regression 1 Outline 1. Logistic Regression 2. Confounding Variables 3. Controlling for Confounding Variables 4. Residual Linear Regression 5. Residual

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models General structural model Part 2: Categorical variables and beyond Psychology 588: Covariance structure and factor models Categorical variables 2 Conventional (linear) SEM assumes continuous observed variables

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Soc 589 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline Notation NELS88 data Fixed Effects ANOVA

More information

Advanced Quantitative Data Analysis

Advanced Quantitative Data Analysis Chapter 24 Advanced Quantitative Data Analysis Daniel Muijs Doing Regression Analysis in SPSS When we want to do regression analysis in SPSS, we have to go through the following steps: 1 As usual, we choose

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Today s Class (or 3): Summary of steps in building unconditional models for time What happens to missing predictors Effects of time-invariant predictors

More information

9 Generalized Linear Models

9 Generalized Linear Models 9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Stat 587 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Notation NELS88 data Fixed Effects ANOVA

More information