Correlational Structure in the Random-Effect Structure of Mixed Models

1 Correlational Structure in the Random-Effect Structure of Mixed Models March 25, 2009

2 Outline

3 random effects in mixed-effects modeling
goal: to provide an intuitive guide to understanding the role of correlation parameters in the random-effects part of a mixed model
two examples
  example 1: variation in the realization of
    correlational structure involving a fixed factorial predictor
    fixed factor levels nested under the random-effect factor
  example 2: self-paced reading latencies of
    correlational structure involving covariates
    subject covariates crossed with item
    item covariates crossed with subject

4 Outline

5 the -a/aj alternative forms
                            forms with -a   forms with -aj
infinitive                  maxat           maxat
masculine sg past           maxal           maxal
1sg present                 mašu            maxaju
2sg present                 mašeš           maxa(j)eš
3sg present                 mašet           maxa(j)et
1pl present                 mašem           maxa(j)em
2pl present                 mašete          maxa(j)ete
3pl present                 mašut           maxajut
imperative                  maši(te)        maxaj(te)
present active participle   mašuščij        maxajuščij
gerund                      maša            maxaja

6 systematicities in this variation
[Figure: bar charts of Count for -a and -aj by paradigm slot (s, p, f, i, a, g) and by place of articulation (dental, labial, velar)]
Counts of -a (black) and -aj (white) realizations for six paradigm slots (left) and place of articulation of the final consonant of the root (right). a: active present participle, p: third person plural, s: third person singular, f: first/second person, i: infinitive, g: gerund.

7 different verbs show different patterns
[Figure: dotplot of logit by paradigm slot (s, p, f, i, a, g), one panel per verb: xnykat, zhazhdat, schepat, schipat, stonat, pleskat, poloskat, prjatat, pryskat, kudaxtat, kurlykat, maxat, kapat, klepat, klikat, svistat, metat, kloxtat, tykat, pyxat, murlykat, mykat, kolebat, vnimat, ryskat, schekotat, kolyxat, xlestat, paxat, krapat, alkat, blistat, bryzgat, cherpat, dremat, dvigat, glodat]
The log odds (of -a versus -aj) for each of the six paradigm slots. A log odds greater than zero indicates a preference for -a, a log odds smaller than zero a preference for -aj.
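
The log odds described here can be computed directly from the counts of the two realizations. A minimal sketch in base R follows; the data frame `counts`, its column names, and all numbers are made up for illustration and are not the data behind the slides.

```r
# Hypothetical counts of -a and -aj realizations for one verb in three slots.
counts <- data.frame(
  slot = c("a", "g", "i"),
  a    = c(30, 10, 20),
  aj   = c(10, 30, 20)
)

# Log odds of -a versus -aj: log(a / aj).
counts$logit <- log(counts$a / counts$aj)

# Equivalent route via the proportion of -a realizations and qlogis().
counts$logit2 <- qlogis(counts$a / (counts$a + counts$aj))

print(counts)
```

A positive value indicates a preference for -a, a negative value a preference for -aj. The log odds is undefined when either count is zero, so real data may need some adjustment that this sketch does not attempt.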

8 a model with random intercepts for verbs
contrast coding for Paradigm and Place
  reference level (Active participle for Paradigm, dental for Place)
  contrasts (group mean differences) with respect to the reference level (e.g., Gerund versus Active Participle, Labial versus Dental)
we begin with a model with random intercepts for Verb, thereby allowing the verbs to differ in the extent to which they prefer -a over -aj (equally across all forms in the paradigm)
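
To make the reference-level coding concrete, here is a small sketch in base R, assuming R's default treatment contrasts for unordered factors. The factor `paradigm` and the toy values in `means` are illustrative assumptions only.

```r
# Paradigm slot as a factor with the active participle ("a") as reference level.
paradigm <- factor(c("a", "f", "g", "i", "p", "s"))
paradigm <- relevel(paradigm, ref = "a")

# Treatment (dummy) contrasts: one indicator column per non-reference level.
print(contrasts(paradigm))

# Toy group means on the log odds scale (made-up numbers).
means <- c(a = 1.0, f = 0.4, g = -0.2, i = 0.1, p = 0.6, s = 0.5)
fit <- lm(means ~ paradigm)

# Intercept = value for the reference level; every other coefficient is the
# difference between that level and the reference level.
print(coef(fit))
```

With one observation per level the fit is exact, so the coefficients simply restate the toy means as an intercept plus differences from the reference level.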

9 a model with random intercepts for verbs
> russian.lmer = lmer(cbind(a, aj) ~ Paradigm +
+     Place + (1 | Verb), data = russian, family = "binomial")
Random effects:
 Groups Name        Variance Std.Dev.
 Verb   (Intercept)
Number of obs: 222, groups: Verb, 37
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)
Paradigmf
Paradigmg
Paradigmi
Paradigmp
Paradigms
Placelabial
Placevelar

10 problems with this initial model
differences between verbs are restricted to just the intercept
our dotplot suggests, however, that verbs may differ with respect to the paradigm slots for which they prefer or disprefer -a versus -aj
furthermore, we have assumed that the likelihood of a given variant for a given verb in one paradigm slot is independent of the likelihood of a given variant for that same verb in another paradigm slot, which seems unlikely

11 the observations across paradigm cells for a given verb are not independent (dots represent verbs)
[Figure: pairwise scatterplot matrix of the log odds for the six paradigm slots (a, f, g, i, p, s), with Pearson (r) and Spearman (rs) correlations and p-values in the panels]
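
The r and rs values in such a matrix are Pearson and Spearman correlations. A minimal sketch of how one pair could be computed in base R, using simulated stand-in values (`verb`, `slot_p`, and `slot_s` are hypothetical names and the numbers are not the slides' data):

```r
set.seed(1)
# Hypothetical log odds for 37 verbs in two paradigm slots that share a
# verb-specific component, which induces a positive correlation.
verb   <- rnorm(37)
slot_p <- verb + rnorm(37, sd = 0.7)
slot_s <- verb + rnorm(37, sd = 0.7)

r  <- cor(slot_p, slot_s)                       # Pearson correlation
rs <- cor(slot_p, slot_s, method = "spearman")  # Spearman rank correlation
p  <- cor.test(slot_p, slot_s)$p.value          # p-value for the Pearson r

print(c(r = r, rs = rs, p = p))
```

Spearman's rs is the Pearson correlation of the ranks, which makes it less sensitive to a few extreme verbs.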

12 anticipating the consequences of contrast coding for the random effects structure
we are modeling Paradigm with contrast coding
bringing flexibility into the model for verb-specific preferences across the paradigm will therefore be implemented in terms of
  adjustments to the intercept and
  adjustments to contrast coefficients
to anticipate the correlational structure of these adjustments, we redo our previous plot,
  retaining the log odds for the reference level, Active participle
  but for all other levels of Paradigm, we replace the observed log odds by its difference with the corresponding value for the reference level

13 anticipating the consequences of contrast coding for the random effects structure
the correlations involving the reference level change sign; the other correlations remain positive
this pattern is strongest for weakly correlated random variables

14 simulated data before contrasts
[Figure: scatterplots of y against x, z against x, and y against z for three simulated data sets, each panel with its correlation r]

15 simulated data after contrasts
[Figure: scatterplots of y - x against x, z - x against x, and y - x against z - x for the same three simulated data sets, each panel with its correlation r]
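
The sign change can be reproduced with a small simulation. This sketch assumes three variables with unit variance and a common pairwise correlation `rho` (an assumption of this example, not necessarily the setup used for the slides). Under that assumption cor(x, y - x) = -sqrt((1 - rho) / 2), which is far from zero when rho is small and shrinks toward zero as rho approaches 1, while cor(y - x, z - x) = 0.5.

```r
set.seed(42)
n   <- 1000
rho <- 0.2  # weak positive correlation among x, y, and z

# A shared component gives x, y, z unit variance and pairwise correlation rho.
shared <- rnorm(n)
x <- sqrt(rho) * shared + sqrt(1 - rho) * rnorm(n)
y <- sqrt(rho) * shared + sqrt(1 - rho) * rnorm(n)
z <- sqrt(rho) * shared + sqrt(1 - rho) * rnorm(n)

# Before contrasts: all pairwise correlations are positive.
before <- c(xy = cor(x, y), xz = cor(x, z), yz = cor(y, z))

# After contrasts with x as reference level: y and z become differences from x.
after <- c(x_yx = cor(x, y - x), x_zx = cor(x, z - x),
           yx_zx = cor(y - x, z - x))

print(round(before, 2))
print(round(after, 2))
```

Rerunning with `rho` close to 1 shows the negative correlations with the reference level moving toward zero, in line with the pattern being strongest for weakly correlated variables.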

16 anticipating the consequences of contrast coding for the random effects structure (dots represent verbs)
[Figure: the pairwise scatterplot matrix redone after contrast coding: log odds for the reference level a, and differences from a for f, g, i, p, s, with Pearson (r) and Spearman (rs) correlations and p-values in the panels]

17 an improved model
> russian.lmer1 = lmer(cbind(a, aj) ~ Paradigm +
+     Place + (1 + Paradigm | Verb), data = russian,
+     family = "binomial")
Random effects:
 Name        Variance Std.Dev. Corr
 (Intercept)
 Paradigmf
 Paradigmg
 Paradigmi
 Paradigmp
 Paradigms
> pairscor.fnc(ranef(russian.lmer1)$Verb)

18 visualization of the BLUPs
[Figure: pairwise scatterplot matrix of the BLUPs for Intercept and the Paradigm contrasts f, g, i, p, s, with Pearson (r) and Spearman (rs) correlations and p-values in the panels]

19 are 15 additional parameters justified?
[Figure: two panels, one per model, plotting observed proportion against expected proportion]

20 model comparison with likelihood ratio test
russian.lmer = lmer(cbind(a, aj) ~ Paradigm + Place + (1 | Verb),
    data = russian, family = "binomial")
russian.lmer1 = lmer(cbind(a, aj) ~ Paradigm + Place + (1 + Paradigm | Verb),
    data = russian, family = "binomial")
anova(russian.lmer, russian.lmer1)
...
              Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
russian.lmer
russian.lmer1                                < 2.2e-16
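
The statistic reported by such a comparison is twice the difference in log-likelihood, referred to a chi-squared distribution with degrees of freedom equal to the number of additional parameters. A minimal sketch in base R; the helper `lrt` and the log-likelihood values are hypothetical and are not taken from the models above.

```r
# Likelihood ratio test for two nested models fitted by maximum likelihood.
# loglik0: log-likelihood of the simpler model
# loglik1: log-likelihood of the more complex model
# df_diff: number of additional parameters in the more complex model
lrt <- function(loglik0, loglik1, df_diff) {
  chisq <- 2 * (loglik1 - loglik0)
  p <- pchisq(chisq, df = df_diff, lower.tail = FALSE)
  c(Chisq = chisq, Df = df_diff, p = p)
}

# Made-up log-likelihoods, purely for illustration.
res <- lrt(loglik0 = -520, loglik1 = -450, df_diff = 2)
print(res)
```

For variance and correlation parameters the chi-squared reference distribution is only approximate, so p-values close to the threshold deserve some caution.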

21 Summary
models become more precise if you take non-independence seriously
the coding used for factor levels determines your interpretation of the random-effects correlational structure
for contrast coding, the default in R, pairwise correlations change sign for pairs involving the reference level

22 Outline

23 self-paced reading experiment
87 poems, in all 2315 different word forms
326 subjects
self-paced reading latencies
three random-effect factors
  Poem
  Word
  Subject

24 Words
random intercepts
possibly, additional random slopes/contrasts for properties of the subjects
  the subject's age (reading latencies for a given word might depend specifically on whether you are a younger or an older subject)
  RT in questionnaire (the subject's response latency in an on-line questionnaire requesting from the subject an estimate of the number of poems read annually): reading latencies for a given word might depend on whether you are a slow, careful evaluator or a fast, superficial responder
  the subject's sex: a given word might be read more quickly by females (or males) ("female words" versus "male words")
note: it is important to center predictors
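
Centering a covariate means subtracting its mean, so that the intercept refers to a subject with an average value rather than to a value of zero. A minimal sketch in base R with made-up ages (`age`, `age_c`, and `age_c2` are illustrative names):

```r
# Hypothetical subject ages (illustrative values only).
age <- c(19, 23, 35, 48, 62, 71)

# Centering subtracts the mean, so 0 corresponds to a subject of average age.
age_c <- age - mean(age)

# Equivalent with scale(); as.numeric() drops the matrix attributes.
age_c2 <- as.numeric(scale(age, center = TRUE, scale = FALSE))

print(age_c)
```

With an uncentered covariate the intercept is an extrapolation to age zero, which tends to produce strong and hard-to-interpret correlations between random intercepts and random slopes.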

25 spoken British English (BNC)
females: she, her, said, n't, I, and, to, cos, oh, Christmas, thought, lovely, nice, mm, had, did, going, yes, really
males: fucking, er, the, yeah, aye, right, hundred, fuck, is, of, two, three, a, four, ah, no
rlh97.html

30 the Word: exploration with lmList
> items.lmlist = lmList(readingtime ~ Age + RTquestionaire +
+     Sex | Word, data = dat)
> items = data.frame(coef(items.lmlist))
> pairscor.fnc(items, cex = 0.5)
for each word, we fit a separate model to the reading times of the subjects reading that word, with as predictors the age, questionnaire RT, and sex of those subjects
for each word, we thus obtain an intercept, slopes for Age and RTquestionaire, and a contrast coefficient for Sex
we plot these coefficients using a pairwise scatterplot matrix
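
The idea of fitting one regression per word can also be sketched without any mixed-model package. The example below uses base R and simulated stand-in data; the data frame `dat`, its values, and the reduced predictor set are assumptions for illustration, not the experiment's data.

```r
set.seed(7)
# Simulated stand-in data: 5 words, 40 observations each (all values made up).
dat <- data.frame(
  Word = rep(paste0("w", 1:5), each = 40),
  Age  = rnorm(200),
  Sex  = factor(sample(c("f", "m"), 200, replace = TRUE))
)
dat$readingtime <- 6 + 0.1 * dat$Age + rnorm(200, sd = 0.3)

# Fit one ordinary regression per word and collect the coefficients,
# one row per word.
fits  <- lapply(split(dat, dat$Word),
                function(d) lm(readingtime ~ Age + Sex, data = d))
items <- as.data.frame(do.call(rbind, lapply(fits, coef)))

# One row per word: an intercept, an Age slope, and a Sex contrast.
print(items)
```

Plotting the columns of `items` against each other, for instance with `pairs(items)`, gives the kind of pairwise scatterplot matrix used for exploration here.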

31 the Word: exploration with lmList
[Figure: pairwise scatterplot matrix of the per-word coefficients (Intercept, Age, RTquestionaire, Sex), with Pearson (r) and Spearman (rs) correlations and p-values in the panels]
each point represents a verb type
lines are nonparametric scatterplot smoothers

32 the Word: Age and Intercept
[Figure: items$intercept (vertical) against items$age (horizontal)]
to the left in the graph: here we see words that are read faster by older subjects (a regression of Reading Time on Age has negative slope for these words)
to the right in the graph: here we see words that are read slower by older subjects (a regression of Reading Time on Age has positive slope for these words)

33 the Word: Age and Intercept
[Figure: items$intercept (vertical) against items$age (horizontal)]
in the center of the graph: zero slope, so no Age effect: here we see words that are processed the same irrespective of Age
these words also have the smallest intercepts, so overall, these words elicit the shortest mean reading latencies

34 the Word: Sex and Intercept
[Figure: two scatterplots; left: items$intercept against items$sex; right: items$intercept against items$sex + items$intercept]
the left panel shows the intercepts for females (vertical) and the contrast for males (horizontal)
  there are relatively few words to the right of X=0 (words for which the intercept for males has to be adjusted upwards compared to the intercept for females)
  as we move to the left, we meet words for which the intercept (appropriate for females) has to be adjusted downward for males
the right panel shows the intercepts for females (vertical) and the reconstructed intercepts for males (horizontal)
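
The reconstructed intercept for males is the female intercept plus the male contrast. A small sketch in base R with made-up per-word coefficients (the column name `intercept_m` and all numbers are hypothetical):

```r
# Hypothetical per-word coefficients (illustrative numbers only):
# intercept = estimate for females (reference level), sex = contrast for males.
items <- data.frame(
  word      = c("w1", "w2", "w3"),
  intercept = c(6.10, 6.25, 6.40),
  sex       = c(0.05, -0.10, -0.20)
)

# Reconstructed intercept for males = female intercept + male contrast.
items$intercept_m <- items$intercept + items$sex

print(items)
```

A negative contrast means the word is read faster by males than by females; a positive contrast means the opposite.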

35 the Word: Sex and Intercept what makes words more or less easy to process by males or females? to answer this question, we consider Subject as random-effect factor

36 the Subject
random intercepts
possibly, random slopes for properties of the Words, e.g.,
  the word's frequency
  the word's number of constituent morphemes
background: Ullman's hypothesis that females have superior verbal memory and hence have a stronger frequency effect than males

37 the Subject: Nmorphs and Frequency
> subjects.lmlist = lmList(leestijd ~ Nmorphs +
+     SurfFreq | Subject, data = dat)
> subjects = data.frame(coef(subjects.lmlist))
> pairs(subjects[, 1:3])
> t.test(SurfFreq ~ Sex, data = subjects)
t = , df = , p-value =
[Figure: pairwise scatterplot matrix of the per-subject coefficients (Intercept, Nmorphs, SurfFreq)]

38 joint analysis: the model specification
dat.lmer = lmer(readingtime ~ Trial + NumberOfWordsIntoLine + SentenceLength +
    Sex*SurfFreq + I(SurfFreq^2) + RTquestionaire + Nmorphs*Sex + Age +
    (1 | Poem) + (1 + Nmorphs + SurfFreq | Subject) +
    (1 + RTquestionaire + Age | Word), data = dat)

39 joint analysis: random effects
Random effects:
 Groups   Name           Std.Dev. Corr
 Word     (Intercept)
          RTquestionaire
          Age
 Subject  (Intercept)
          Nmorphs
          SurfFreq
 Poem     (Intercept)
 Residual
Number of obs: , groups: Word, 2315; Subject, 326; Gedicht, 87

40 visualization subject BLUPs
[Figure: pairwise scatterplot matrix of the subject BLUPs for (Intercept), Nmorphs, and SurfFreq, with Pearson (r) and Spearman (rs) correlations in the panels]

41 visualization word BLUPs
[Figure: pairwise scatterplot matrix of the word BLUPs for (Intercept), ChoiceRT, and Leeftijd, with Pearson (r) and Spearman (rs) correlations in the panels]

42 joint analysis: random effects
likelihood ratio tests support each additional parameter in the model
for instance, comparing a model with only random intercepts for subject with a model with additional structure for Nmorphs and SurfFreq:
anova(dat.lmer0, dat.lmer1)
          Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
dat.lmer0
dat.lmer1                                < 2.2e-16

43 modeling strategy
explore with visualization where random slopes and correlations might be required
add additional parameters incrementally: complex random effects structure can be difficult to fit

44 joint analysis: fixed effects
Fixed effects:
               Estimate Std. Error t value
...
Nmorphs
RTquestionaire
Sexm
Age
SurfFreq
I(SurfFreq^2)
SurfFreq:Sexm
Sexm:Nmorphs

45 model criticism 1
> pdf("qqplot.pdf", height = 5, width = 5)
> plot(qnorm(p = seq(0.001, 0.999, length.out = 20)),
+     quantile(resid(dat.lmer2), seq(0.001, 0.999,
+     length.out = 20)))
> dev.off()
[Figure: quantiles of resid(dat.lmer2) plotted against the corresponding standard normal quantiles]

46 model criticism 2
[Figure: plot with axes Residual and Frequency]

47 Outline

48 we have validated the Sex by Frequency interaction in the fixed-effect part of the model by bringing into the model all potential other sources that might explain this interaction:
  a potential confound with other available subject-specific properties: (1 + Age + RTquestionaire | Word)
  a potential confound with individual differences in sensitivity to frequency: (1 + Frequency + Nmorphs | Subject)

49 Outline

50 we obtain better models when we pay careful attention to the modeling of the correlational structure for the random-effect factors
we have a better tool for understanding subject (item) variability than traditional methods such as median splits with separate subanalyses


More information
