Introduction to Multilevel Modelling

1 Introduction to Multilevel Modelling. Leonardo Trujillo, PhD. Luis Guillermo Díaz, PhD (c). Departamento de Estadística, Universidad Nacional de Colombia. XXI Simposio de Estadística, 19-23 July 2011, Bogotá, Colombia.

2 Course Outline. Review. Why multilevel models? Fixed and Random Effects Models. Review. Random Intercepts and Random Slopes Models. Multilevel Models for Binary Response Data and Proportions (Multilevel Logistic Regression).

3 The Independence Assumption. Why does clustered data matter? Standard analyses assume independence, and the standard errors of model parameters are estimated under this assumption. If observations within clusters are positively correlated, these standard errors will be underestimated. Other names for multilevel models: hierarchical models, random-effects or random-coefficient models, mixed-effects models.

4 The Importance of Data Structures. Data in the real world have structure that tends to violate the assumptions of independence and of homogeneity of residual variance (continuous data). That is why we need additional modelling techniques such as multilevel models.

5 The Independence Assumption. Survey data do not always (in fact, rarely) come from Simple Random Sampling (SRS). The data collection process can generate outcomes that cannot be considered independent. This is particularly true of social surveys, as these often have multi-stage designs.

6 The Independence Assumption. Traditional approaches treat this clustering as a nuisance that must be accounted for. Parameters are estimated in the usual manner. Standard error estimates need to be adjusted for the impact of the clustering.

7 The Independence Assumption. Natural clustering may still be present in the population even if we have collected our data in an unclustered way. Model-based approach: the population is regarded as generated under a model, and the data are selected from that population. Two particular interests: the impact of variables that describe the context, as well as variables that relate to the individual response.

8 Examples. Pupils within classes within schools: a pupil's performance will depend not only on their own characteristics but also on the class they are in and the school from which that class is drawn (Goldstein, 1995). Individuals within households within communities. Patients within wards within hospitals. Longitudinal data: multiple observations over time are nested within units, typically subjects.

9 General Framework. $Y_{ij}$ will represent the outcome of individual $i$ from area $j$. $Y_{ij}$ will be expressed as a function of the individual variables $x_{0ij}, x_{1ij}, \ldots$ The index $i$ will represent level one and $j$ level two. The level-one units within the $j$th level-two unit are indexed by $i = 1, \ldots, n_j$ and the level-two units by $j = 1, \ldots, J$.

10 Hierarchical Structures. We are only interested in simple population structures: structures that are strictly hierarchical, with level one within level two within level three. This is NOT appropriate for all scenarios: pupils are nested within classes within schools; however, pupils are also nested within communities.

11 Hierarchical Structures. [Figure: diagram relating communities (Comm), schools (School) and pupils.]

12 Different Approaches. Aggregate analysis. Disaggregate analysis. Fixed effects. Standard models with robust standard errors (STATA; SAS). MULTILEVEL MODELS (MLwiN, R, STATA (xtreg, xtmixed, gllamm)). A free 30-day evaluation version of the MLwiN software is available.

13 Different Approaches. Alternative 1 - Ignore the problem. This causes problems with the standard error estimates, and the problem gets worse depending on the nature of the clustering and the variable being analysed. In any case, how would we account for the impact of higher-level variables in our analysis? Alternative 2 - Standard models with robust standard errors. This solves the problem with the standard error estimates but NOT the question of how to account for higher-level variables.

14 Different Approaches. Example: students nested in schools. Data layout: one row per student, with columns ID, School (UNC, UNC, UNC, PUJ, PUJ, PUJ), AP 2011, Income, PS and Type (Public for UNC, Private for PUJ). An individual-level analysis of these rows ignores the variation at the school level. The performance of individuals belonging to the same school, e.g. UNC, could be highly correlated.

15 Different Approaches. Example: students nested in schools. Aggregated data layout: one row per school (UNC, PUJ, UV), with columns ID, School, AAP, Average Income, Average PS??? and Type (Public, Private, Public). It is impossible to predict individual outcomes.

16 Different Approaches. Example: students nested in schools. Aggregated data layout: one row per school (UNC, PUJ, UV), with columns ID, School, AAP, Income, PS and Type (Public, Private, Public). It is impossible to predict individual outcomes.

17 Aggregate Analysis. We have the data for each individual $i$ in each level-two unit $j$. We can still calculate $\bar{y}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} y_{ij}$, $\bar{x}_{0j} = \frac{1}{n_j} \sum_{i=1}^{n_j} x_{0ij}$, $\bar{x}_{1j} = \frac{1}{n_j} \sum_{i=1}^{n_j} x_{1ij}, \ldots$ Then $\bar{y}_j$ is modelled as a linear function of $\bar{x}_{0j}, \bar{x}_{1j}, \ldots$

18 Aggregate Analysis. Three problems: 1. The analysis has much less power, as we only have $j = 1, \ldots, J$ observations. 2. Ecological fallacy: the relationship between aggregate-level variables is NOT necessarily the same as the relationship between the individual-level variables. 3. In fact, when you aggregate levels, correlations tend to increase.
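The ecological fallacy can be made concrete with a small made-up dataset. The sketch below uses hypothetical numbers (they are not from the course data): within every group the least-squares slope of y on x is negative, yet the slope fitted to the group means is positive. The function name `ols_slope` and the group labels are illustrative assumptions.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: three groups, each a list of (x, y) pairs per individual.
groups = {
    "A": [(1, 5), (2, 4), (3, 3)],
    "B": [(4, 8), (5, 7), (6, 6)],
    "C": [(7, 11), (8, 10), (9, 9)],
}

# Individual-level (within-group) slopes.
within_slopes = {
    g: ols_slope([p[0] for p in pts], [p[1] for p in pts])
    for g, pts in groups.items()
}

# Aggregate analysis: one observation (the pair of means) per group.
group_mean_x = [mean(p[0] for p in pts) for pts in groups.values()]
group_mean_y = [mean(p[1] for p in pts) for pts in groups.values()]
aggregate_slope = ols_slope(group_mean_x, group_mean_y)

print(within_slopes, aggregate_slope)
```

Here the aggregate analysis uses only J = 3 observations and gives a slope of opposite sign to the individual-level slopes, which illustrates problems 1 and 2 at once.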

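19 Two Stage (Multistage) Approach. Example: students nested in schools. Stage one: a separate regression is fitted within each school to the student-level rows (columns ID, School, AP 2011, Income, PS; e.g. three students from UNC). Stage two: the resulting school-level table (columns ID, School, b0, bInc, bPS, Type; rows UNC, PUJ, UV with types Public, Private, Public) is analysed. Problems: small sample sizes in particular groups; the approach does not account for individual-group interactions.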

20 Disaggregate Analysis - Fixed Effects Model. One-way ANOVA model for the response $y_{ij}$: $y_{ij} = \mu + u_j + e_{ij}$ (a very simple model). Fixed parameters: $\mu$, the overall mean, and $u_j$, the effect of group $j$, with the constraint $\sum_j n_j u_j = 0$. Random part: $e_{ij}$, the individual-level residual, with $e_{ij} \sim N(0, \sigma_e^2)$ and $\mathrm{Cov}(e_{ij}, e_{kl}) = 0$.

21 Fixed Effects Model. [Figure: group means $\mu + u_1$, $\mu + u_2$ and $\mu + u_3$ shown relative to the overall mean $\mu$.]

22 Fixed Effects Model: Advantages. Simple calculations. The distribution of the $u_j$ is not specified. Ideal for coping with large and extreme between-group differences.

23 Fixed Effects Model: Disadvantages. If $J$ (the number of level-two units) is large, there will be a large number of model parameters. If $n_j$ (the number of level-one units within level-two unit $j$) is small, each $u_j$ will be poorly estimated. If the $J$ level-two units are a sample from a population and we want to make inferences about that population, the model does not make sense.

24 Disaggregate Analysis - Random Effects Model. Extension of the fixed effects model for the response $y_{ij}$: $y_{ij} = \mu + u_j + e_{ij}$ (one-way random effects model). Fixed parameter: $\mu$, the overall mean. Random parameters: $u_j$, the group-level residual, with $u_j \sim N(0, \sigma_u^2)$ and $\mathrm{Cov}(u_j, u_l) = 0$ for $j \neq l$; $e_{ij}$, the individual-level residual, with $e_{ij} \sim N(0, \sigma_e^2)$ and $\mathrm{Cov}(e_{ij}, e_{kj}) = 0$ for $i \neq k$. Also, $\mathrm{Cov}(e_{ij}, u_j) = \mathrm{Cov}(e_{ij}, u_l) = 0$.

25 Random Effects Model. [Figure: group means $\mu + u_1$, $\mu + u_2$ and $\mu + u_3$ shown relative to the overall mean $\mu$.] $u_j$ and $e_{ij}$ are random error terms after controlling for the fixed components in the model. The model works well when: the units are a random sample from the population and $J$ is large; the $u_j$ have no extreme values and are all relatively small.

26 Multilevel Model Example. Bland and Altman (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet I (the most cited paper in the Lancet). Peak expiratory flow rate (PEFR) measurements: a person's maximum speed of expiration. PEFR was measured twice (in litres per minute) using the Wright peak flow meter and twice using the Mini Wright peak flow meter (more portable, lower cost). Q: How do we assess the quality of the two instruments?

27 Multilevel Model Example. Data layout: one row per subject, with columns for the first and second measurements from the Wright peak flow meter and the first and second measurements from the Mini Wright peak flow meter. If the new method agrees sufficiently well with the old one, the old one may be replaced. The four measurements per subject are clustered within methods, and the two methods are clustered within each subject (three-level model): $y_{ijk} = \mu + u_j + u_k + e_{ijk}$.

28 Multilevel Model - Example. First, we consider the problem of analysing the differences between the two sets of measurements taken with the Mini Wright peak flow meter: $y_{ij} = \mu + u_j + e_{ij}$. Two repeated measures are nested in 17 clusters (individuals). Random or fixed effects approach? The answer depends on the target of inference: the population of clusters (random) or the particular clusters in the dataset (fixed).

29 Multilevel Model - Example. It is often said that the random-effects approach should only be used if there are more than 20 clusters in the sample. This is true if the variance components are of interest, since otherwise $\sigma_u^2$ will be poorly estimated. However, if a random-effects approach is used merely to make appropriate inferences regarding $\beta$, this is not strictly required. The one-way fixed-effects ANOVA model has 19 parameters ($\beta, \alpha_1, \ldots, \alpha_{17}, \sigma_e^2$) and one constraint. The one-way random-effects ANOVA model has only three parameters ($\beta, \sigma_u^2, \sigma_e^2$) (parsimony).

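30 Intraclass Correlation. $\rho = \mathrm{Corr}(y_{ij}, y_{kj}) = \frac{\mathrm{Cov}(y_{ij}, y_{kj})}{\sqrt{\mathrm{Var}(y_{ij})\,\mathrm{Var}(y_{kj})}} = \frac{\sigma_u^2}{\sqrt{(\sigma_u^2 + \sigma_e^2)(\sigma_u^2 + \sigma_e^2)}} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}$. This is the correlation of individual-level units within level-two units, also referred to as the intra-cluster correlation.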
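The numerator and denominator of the intraclass correlation follow directly from the assumptions of the one-way random effects model $y_{ij} = \mu + u_j + e_{ij}$. A short derivation, written as a sketch for two distinct units $i \neq k$ in the same cluster $j$:

```latex
\begin{align*}
\mathrm{Cov}(y_{ij}, y_{kj})
  &= \mathrm{Cov}(u_j + e_{ij},\; u_j + e_{kj}) \\
  &= \mathrm{Var}(u_j) + \mathrm{Cov}(u_j, e_{kj})
     + \mathrm{Cov}(e_{ij}, u_j) + \mathrm{Cov}(e_{ij}, e_{kj}) \\
  &= \sigma_u^2 + 0 + 0 + 0 = \sigma_u^2, \\
\mathrm{Var}(y_{ij})
  &= \mathrm{Var}(u_j) + \mathrm{Var}(e_{ij}) + 2\,\mathrm{Cov}(u_j, e_{ij})
   = \sigma_u^2 + \sigma_e^2 .
\end{align*}
```

Two units from different clusters share no random term, so their covariance is zero; all of the within-cluster dependence comes from the shared $u_j$.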

31 Intraclass Correlation. The within-cluster variance has increased, giving a much smaller intraclass correlation. In contrast, the Pearson correlation seems to be similar. The Pearson correlation is only defined for pairs of variables, whereas the intraclass correlation summarizes dependence for clusters of size larger than 2.

32 Multilevel Model - Example. Stata command: . xtmixed wm || id:, mle. The output lists the fixed-part estimate _cons (Coef., Std. Err., z, P>|z|, 95% Conf. Int.), the random-effects parameters sd(_cons) and sd(Residual) (Estimate, Std. Err., 95% Conf. Int.) and the log likelihood. Intracluster correlation = 0.97. The Mini Wright peak flow meter is very reliable.

33 Multilevel Models. Advantages: only a few parameters are needed to estimate the structure (efficient estimation); when there are few observations per higher-level unit, the model can still estimate overall effects; inference from the sample to the population is possible. Disadvantages: the assumptions about the distributions of the errors and their independence need to be checked.

34 Example (Brown, 2004). We will be studying fertility in Bangladesh (measured by the number of children ever born to a particular woman, CEB) using data from the Bangladesh Fertility Survey (1988). As well as looking at the effect of a woman's individual characteristics, we are also interested in community-level and district-level differences in fertility. Variables: DISTRICT, COMM, WOMAN, CEB, AGE, EDUC (1 to 4), HINDU (0, 1) and FIND (0 to 8). Eight variables, concerning 401 women in 68 communities (villages) within 60 districts.

35 Example (Brown, 2004). The data are the first three villages in the dataset. $y_{ij}$ is the number of children ever born to woman $i$ from Bangladesh village $j$ ($\mathrm{CEB}_{ij}$). [Table: for each village $j = 1, 2, 3$ and for all villages combined, the observed $y_{ij}$, the sample size $n_j$ and the mean $\bar{y}_j$; followed by the ANOVA table with columns Source, df, SS, MS and rows Between, Within, Total.]

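36 Example: Fixed Effects Model. $y_{ij} = \mu + u_j + e_{ij}$, $\sum_j n_j u_j = 0$, $e_{ij} \sim N(0, \sigma_e^2)$. Estimates: $\hat\mu = 4.08$, together with $\hat\sigma_e^2$ and $\hat{u}_j$ for $j = 1, 2, 3$. Test of $H_0: u_1 = u_2 = u_3$: the 5% critical value of the F distribution is 3.44, and $F = 4.45/10.81 = 0.41$. $H_0$ is NOT rejected at the 5% level.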

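37 Example: Random Effects Model. $y_{ij} = \mu + u_j + e_{ij}$, $u_j \sim N(0, \sigma_u^2)$, $e_{ij} \sim N(0, \sigma_e^2)$. Estimates: $\hat\mu = 4.08$, together with $\hat\sigma_e^2$ and $\hat\sigma_u^2$. The test of $H_0$ is as for the fixed effects model (but it does not really make sense when the estimated variance of the random effect is negative).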

38 Random Intercepts Model. Extension of the last model in the fixed part. Data: $y_{ij}, x_{0ij}, x_{1ij}$, for $j = 1, \ldots, J$ and $i = 1, \ldots, n_j$. Model: $y_{ij} = \beta_0 x_{0ij} + \beta_1 x_{1ij} + u_j x_{0ij} + e_{ij}$. Fixed part: $\beta_0, \beta_1$. Random part: $u_j, e_{ij}$. Same assumptions as before, plus covariates.

39 RIM Interpretation. [Figure: parallel lines of $y$ against $x$, one per group.] The slope $\beta_1$ is constant. The intercept varies (hence the name) across $j$ but is constant within $j$.

40 Classic Example. $y_{ij}$ = attainment at age 16, $x_{1ij}$ = attainment at age 11, $i$ = pupil and $j$ = school. For a given pupil, $u_j$ is the school effect. This is constant for all pupils within school $j$. For a given school, the effect of a one-unit increase in $x_{1ij}$ for a pupil is an increase of $\beta_1$ units in $E(y_{ij})$. Note: there are no interactions between school and $x_{1ij}$, since the school lines are parallel.

41 Alternative Formulation of the Model. Within-school model: $y_{ij} = \beta_{0j} x_{0ij} + \beta_1 x_{1ij} + e_{ij}$. Between-school model: $\beta_{0j} = \beta_0 + u_j$. Idea: between schools, the intercept varies; between and within schools, the slope is the same.

42 Intra-Cluster Correlation. $\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}$ is the residual correlation of level-one units within a level-two unit after controlling for the effect of $x_{1ij}$. In the example above, it is the residual school homogeneity after controlling for pupils' attainment at age 11.

43 Estimation: Fixed Effects Model. $y_{ij} = \mu + u_j + e_{ij}$. The parameters of the model are $\mu$, $u_j$ and $\sigma_e^2$. Use a standard ANOVA approach for the estimation. Let $\bar{y}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} y_{ij}$, with $n = \sum_{j=1}^{J} n_j$ and a constraint on the $u_j$: $\sum_j n_j u_j = 0$ versus $\sum_j u_j = 0$.

44 Estimation: Fixed Effects Model. Under this model, $\bar{y} = \frac{1}{n} \sum_{j=1}^{J} n_j \bar{y}_j = \frac{1}{n} \sum_{j=1}^{J} \sum_{i=1}^{n_j} y_{ij}$ is an unbiased estimator of $\mu$, and $\bar{y}_j - \bar{y}$ is an unbiased estimator of $u_j$. ANOVA table:
Source            df     SS     MS
Between clusters  J-1    SSB    MSB = SSB/(J-1)
Within clusters   n-J    SSW    MSW = SSW/(n-J)
Total             n-1    SST
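The ANOVA quantities for the fixed effects model can be computed in a few lines. The following is a minimal sketch on hypothetical data (the three clusters below are invented for illustration and are not the Bangladesh villages); the function name `one_way_anova` is an assumption of this example.

```python
def one_way_anova(groups):
    """One-way ANOVA for clustered data.

    groups: list of lists, one list of observations y_ij per cluster j.
    Returns the estimates and sums of squares from the ANOVA table.
    """
    n = sum(len(g) for g in groups)          # total sample size
    J = len(groups)                          # number of clusters
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between-cluster and within-cluster sums of squares.
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    ssw = sum((y - m) ** 2
              for g, m in zip(groups, group_means) for y in g)

    msb = ssb / (J - 1)
    msw = ssw / (n - J)
    return {
        "mu_hat": grand_mean,                               # estimate of mu
        "u_hat": [m - grand_mean for m in group_means],     # estimates of u_j
        "SSB": ssb, "SSW": ssw,
        "MSB": msb, "MSW": msw,
        "F": msb / msw,                                     # test of equal u_j
    }


# Hypothetical clusters (not from the source data).
clusters = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
table = one_way_anova(clusters)
print(table)
```

The estimated group effects returned in `u_hat` satisfy the constraint $\sum_j n_j \hat{u}_j = 0$ by construction, since they are deviations of the group means from the weighted grand mean.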

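45 Estimation: Random Effects Model. $y_{ij} = \mu + u_j + e_{ij}$. The parameters of the model are $\mu$, $\sigma_e^2$ and $\mathrm{Var}(u_j) = \sigma_u^2$. Since $E(\bar{y}) = \mu$, we take $\hat\mu = \bar{y}$. Since $E(\mathrm{MSW}) = \sigma_e^2$, we take $\hat\sigma_e^2 = \mathrm{MSW}$.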

46 Estimation: Random Effects Model. Also, $\mathrm{Var}(\bar{y}_j) = \mathrm{Var}(E(\bar{y}_j \mid u_j)) + E(\mathrm{Var}(\bar{y}_j \mid u_j)) = \sigma_u^2 + \frac{\sigma_e^2}{n_j}$. To obtain an estimator of $\sigma_u^2$ we need to combine across the groups, taking into account that the within-group sample sizes are unequal: $\hat\sigma_u^2 = \frac{\mathrm{MSB} - \mathrm{MSW}}{\left(n - \sum_{j=1}^{J} n_j^2 / n\right) / (J - 1)}$.

47 Estimation: Random Effects Model. Problem: $\hat\sigma_u^2 < 0$ if MSB < MSW. This happens in practice if $n_j$ is small (not uncommon). Usually, negative estimates are set to zero, but the result is then no longer an unbiased estimator of $\sigma_u^2$. These are referred to as ANOVA estimators. There are many alternative estimators, and under normality assumptions we can get slightly more efficient ML estimators.
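The ANOVA estimator of $\sigma_u^2$ and its truncation at zero can be sketched as follows. The mean squares 4.45 and 10.81 are the ones quoted in the fixed effects example; the village sizes `[7, 8, 9]` are a hypothetical assumption, since the actual sizes are not recoverable here, and the function name is illustrative.

```python
def anova_sigma_u2(msb, msw, group_sizes):
    """ANOVA estimator of the between-cluster variance.

    Returns (raw, truncated): the raw estimator, which can be negative,
    and the usual version truncated at zero.
    """
    n = sum(group_sizes)
    J = len(group_sizes)
    # Effective cluster size for unequal n_j; equals the common size when balanced.
    n0 = (n - sum(s * s for s in group_sizes) / n) / (J - 1)
    raw = (msb - msw) / n0
    return raw, max(raw, 0.0)


# MSB and MSW as quoted in the example; group sizes are assumed for illustration.
raw_estimate, truncated_estimate = anova_sigma_u2(4.45, 10.81, [7, 8, 9])
print(raw_estimate, truncated_estimate)

# Balanced check on made-up values: with n_j = 3 the divisor is exactly 3.
balanced_raw, _ = anova_sigma_u2(21.0, 1.0, [3, 3, 3])
```

Because MSB < MSW in the example, the raw estimate is negative whatever the group sizes are, and the truncated estimate is zero.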

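48 Estimation: RIM. Iterative Generalised Least Squares (IGLS): (1) estimate $\hat\beta$ by GLS, assuming some initial values for the variance parameters $\hat\sigma_u^2, \hat\sigma_e^2$; (2) estimate the variance parameters by GLS, based on the current values of $\hat\beta$; (3) re-estimate $\hat\beta$ based on the current values of the variance parameters; (4) re-estimate the variance parameters based on the current values of $\hat\beta$; repeat the process until some convergence criterion is satisfied.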
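For the random intercepts model, the two alternating steps can be written out explicitly. The following is a sketch in standard notation, where $X_j$ and $y_j$ (the design matrix and response vector of cluster $j$), $\mathbf{1}_{n_j}$ (a vector of ones) and $I_{n_j}$ (the identity matrix) are symbols introduced here for illustration:

```latex
\begin{align*}
V_j &= \sigma_u^2\, \mathbf{1}_{n_j} \mathbf{1}_{n_j}^{\top} + \sigma_e^2\, I_{n_j}
  && \text{(covariance matrix of } y_j \text{)} \\
\hat\beta &= \Bigl( \sum_{j=1}^{J} X_j^{\top} V_j^{-1} X_j \Bigr)^{-1}
             \sum_{j=1}^{J} X_j^{\top} V_j^{-1} y_j
  && \text{(GLS step for the fixed part)} \\
E\bigl[ (y_j - X_j \beta)(y_j - X_j \beta)^{\top} \bigr]
  &= \sigma_u^2\, \mathbf{1}_{n_j} \mathbf{1}_{n_j}^{\top} + \sigma_e^2\, I_{n_j}
  && \text{(linear in } \sigma_u^2, \sigma_e^2 \text{)}
\end{align*}
```

Because the expected residual cross-product matrix is linear in $\sigma_u^2$ and $\sigma_e^2$, the variance step can itself be treated as a (generalised) least squares regression of the observed residual cross-products on the known matrices $\mathbf{1}_{n_j} \mathbf{1}_{n_j}^{\top}$ and $I_{n_j}$.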

49 Properties of Estimators. If the $u_j$ and $e_{ij}$ have a normal distribution, then IGLS = Maximum Likelihood and all estimators are asymptotically efficient ($J \to \infty$). If not, then IGLS gives consistent estimators; the estimator $\hat\mu$ is asymptotically efficient; the estimators $\hat\sigma_u^2$ and $\hat\sigma_e^2$ are NOT asymptotically efficient. PROBLEM: $\hat\sigma_u^2$ (IGLS) can be negative.

50 Conclusions. We can easily write down the analytic form of the ANOVA-type estimators for a simple model, BUT ANOVA-type estimators are NOT fully efficient. IGLS estimators are ML estimators under certain assumptions, with associated estimated standard errors. IGLS provides a framework within which a wide class of models can be incorporated.

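51 Example: Linear Regression (OLS). $\mathrm{ceb}_{ij} \sim N(XB, \Omega)$; $\mathrm{ceb}_{ij} = \beta_{0i}\,\mathrm{const} + \hat\beta_1\,\mathrm{age}_{ij}$ (standard error of the age coefficient: 0.004); $\beta_{0i} = 3.957\,(0.039) + e_{0ij}$; $[e_{0ij}] \sim N(0, \Omega_e)$, $\Omega_e = [3.575\,(0.103)]$. The output also reports -2*loglikelihood (IGLS), with 401 of 401 cases in use. This is a base model to build from.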

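52 Example: Random Intercept Model. $\mathrm{ceb}_{ij} \sim N(XB, \Omega)$; $\mathrm{ceb}_{ij} = \beta_{0ij}\,\mathrm{const} + \hat\beta_1\,\mathrm{age}_{ij}$ (standard error of the age coefficient: 0.004); $\beta_{0ij} = 3.943\,(0.050) + u_{0j} + e_{0ij}$; $[u_{0j}] \sim N(0, \Omega_u)$, $\Omega_u = [0.257\,(0.056)]$; $[e_{0ij}] \sim N(0, \Omega_e)$, $\Omega_e = [3.314\,(0.101)]$. The output also reports -2*loglikelihood (IGLS), with 401 of 401 cases in use.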

53 Interpretation (fixed part and random part). $\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} = 0.072$. The residual correlation of women within a community is 0.072 or, alternatively, 7.2% of the total residual variation is due to the community. Within the population of communities, about 95% would have an expected number of children ever born to a woman of average age between 3 and 5.
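Both interpretive statements can be reproduced from the random intercept estimates. This is a sketch under one stated assumption: the community-level variance is taken to be 0.257, the value consistent with the reported 7.2% share of variation; the intercept 3.943 and the individual-level variance 3.314 are the values quoted in the random intercept example.

```python
import math

beta0_hat = 3.943   # estimated intercept (woman of average age)
sigma_e2 = 3.314    # individual-level residual variance
sigma_u2 = 0.257    # community-level variance (assumed; see lead-in)

# Share of the residual variation attributable to the community.
rho = sigma_u2 / (sigma_u2 + sigma_e2)

# Range covering about 95% of community-specific intercepts,
# using the normality assumption for the random intercepts u_0j.
half_width = 1.96 * math.sqrt(sigma_u2)
lower = beta0_hat - half_width
upper = beta0_hat + half_width

print(round(rho, 3), round(lower, 2), round(upper, 2))
```

The interval is a range for the community-level expected values $\beta_0 + u_{0j}$, not a confidence interval for $\beta_0$; it uses the standard deviation of the random intercepts rather than the standard error of the estimate.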

54 Interpretation. [Figure: fitted lines of ceb against age, $\mathrm{ceb}_{ij} = \beta_{0ij} + \beta_1\,\mathrm{age}_{ij}$, one line per community.] Variation in the intercept is generated by the random effect term. Slopes are constant.

55 Random Slopes Model. Is the assumption of a constant slope across villages true? Is there an interaction between age and village? We extend our random intercepts model to also allow the slope on age to vary. Normality assumptions are as before; the $u$'s and $e$'s are independent.

56 Random Slopes Model. [Figure: fitted lines of ceb against age, one per village, with varying intercepts and slopes.] Within a village, the slope and intercept are NOT independent. In the graph, the correlation is positive: clusters with a higher than average $y$ also appear to have a faster increase in $y$ as age increases.

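57 Example: Random Slopes Model. $\mathrm{ceb}_{ij} \sim N(XB, \Omega)$; $\mathrm{ceb}_{ij} = \beta_{0ij}\,\mathrm{const} + \beta_{1j}\,\mathrm{age}_{ij}$; $\beta_{0ij} = 3.978\,(0.05) + u_{0j} + e_{0ij}$; $\beta_{1j} = 0.41\,(0.005) + u_{1j}$; $(u_{0j}, u_{1j})^{\top} \sim N(0, \Omega_u)$, where $\Omega_u$ has intercept variance $0.33\,(0.061)$, an intercept-slope covariance with standard error $0.005$, and slope variance $0.00\,(0.001)$; $[e_{0ij}] \sim N(0, \Omega_e)$, $\Omega_e = [3.11\,(0.098)]$. The output also reports -2*loglikelihood (IGLS), with 401 of 401 cases in use.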

58 Further Developments. We can expand the model to include additional explanatory variables: EDUC, HINDU, FIND. We can also expand the model to include additional levels: women within villages within districts, for example. Suggested strategy: carry out a good initial analysis outside MLwiN; get the basic multilevel structure (fit the model with age and see whether we need two or three levels); develop the basic model, adding variables based on the initial modelling; finally, look for random slopes based on substantive ideas.

59 Further Developments: Multilevel Logistic Regression. With non-linear multilevel models, estimation is not quite so simple. Numerical integration: difficult to implement and only really feasible for models with a simple hierarchical structure. Markov chain Monte Carlo (MCMC) methods: computationally intensive methods based on Bayesian inference. Iterative bootstrap: a computationally intensive method for bias correction.

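60 Example: Random Slopes Model (three levels). $\mathrm{ceb}_{ijk} \sim N(XB, \Omega)$; $\mathrm{ceb}_{ijk} = \beta_{0ijk}\,\mathrm{const} + \beta_{1j}\,\mathrm{age}_{ijk} + \beta_2\,\mathrm{lower}_{ijk} + \beta_3\,\mathrm{upper}_{ijk} + \beta_4\,\mathrm{sec}_{ijk} + \beta_{5j}\,\mathrm{hindu}_{ijk}$, where the estimates of $\beta_2$, $\beta_3$ and $\beta_4$ have standard errors 0.107, 0.18 and 0.108; $\beta_{0ijk} = 4.084\,(0.068) + v_{0k} + u_{0jk} + e_{0ijk}$; $\beta_{1j} = 0.40\,(0.005) + u_{1jk}$; $\beta_{5j} = -0.44\,(0.14) + u_{5jk}$. The output also reports -2*loglikelihood (IGLS), with 401 of 401 cases in use.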

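61 Multilevel Model: PEF. Model 1: $y_{ijk} = \mu + u_{jk} + u_k + e_{ijk}$. Model 2: $y_{ijk} = \mu + \beta x_j + u_{jk} + u_k + e_{ijk}$. Stata commands: . xtmixed wm || id:, mle (b4) and . xtmixed w method || id: || method:, mle. The output for the second command lists the fixed-part coefficients for method and _cons (Coef., Std. Err., z, P>|z|, 95% Conf. Interval) and the random-effects parameters sd(_cons) for id, sd(_cons) for method and sd(Residual) (Estimate, Std. Err., 95% Conf. Interval).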

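62 Multilevel Model: PEF. Model 2: $y_{ijk} = \mu + \beta x_j + u_{jk} + u_k + e_{ijk}$. Model 3: $y_{ijk} = \mu + \beta x_j + u_k + e_{ijk}$. The likelihood-ratio test . lrtest model3 model2 gives LR chibar2(01) = 8.68, with the corresponding Prob > chibar2. Model 4: $y_{ijk} = \mu + u_{jk} + u_k + e_{ijk}$. The output for Model 4 lists the fixed-part coefficient _cons (Coef., Std. Err., z, P>|z|, 95% Conf. Interval) and the random-effects parameters sd(_cons) for id, sd(_cons) for method and sd(Residual) (Estimate, Std. Err., 95% Conf. Interval).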

63 Multilevel Model: PEF. Corr(method, subject) = 0.97. Corr(subject) = 0.94. There is no evidence of systematic bias between the methods. The methods appear to have good test-retest reliability.

64 Binary Response Data. $y_{ij} = 0$ or $1$. Examples include being dead or alive, agreeing or disagreeing with a statement, and succeeding or failing to accomplish something. Assume $y_{ij} \sim \mathrm{Bin}(n_{ij}, \pi_{ij})$, with $n_{ij} = 1$ for binary data, where $\pi_{ij} = \Pr(y_{ij} = 1) = E(y_{ij})$ and $\mathrm{Var}(y_{ij}) = \pi_{ij}(1 - \pi_{ij})/n_{ij}$.

65 Binomial Response Data. $y_{ij}$ is a proportion, e.g. the unemployment rate in neighbourhood $i$ in region $j$. $n_{ij}$ is the denominator for the proportion, e.g. the number eligible to work in neighbourhood $i$ in region $j$. Models for binary data may be applied to proportions.

66 Binomial Response Data. We are interested in the expectation (mean) of the response as a function of the covariate: $E(y_i \mid x_i) = \Pr(y_i = 1 \mid x_i)$. In linear regression, the conditional expectation of the response is modelled as a linear function of the covariate: $E(y_i \mid x_i) = \beta_1 + \beta_2 x_i$. For dichotomous responses, this approach may be problematic because the probability must lie between 0 and 1, whereas a regression line increases (or decreases) indefinitely as the covariate increases (or decreases).

67 Binomial Response Data. Instead, a nonlinear function is specified in one of two equivalent ways: $\Pr(y_i = 1 \mid x_i) = h(\beta_1 + \beta_2 x_i)$ or $g\{\Pr(y_i = 1 \mid x_i)\} = \beta_1 + \beta_2 x_i$, where $h(\cdot)$ is the inverse of the function $g(\cdot)$. Here $g(\cdot)$ is known as the link function and $h(\cdot)$ as the inverse link function. The three components of a generalized linear model are the linear predictor, the link function and the distribution of the response given the covariates. For dichotomous responses, the distribution is specified as Bernoulli($\pi_i$).

68 Binomial Response Data. Typical choices of link function are the logit or probit links. For the logit link, $\Pr(y_i = 1 \mid x_i) = h(\beta_1 + \beta_2 x_i) = \frac{\exp(\beta_1 + \beta_2 x_i)}{1 + \exp(\beta_1 + \beta_2 x_i)}$, or equivalently $\mathrm{logit}\{\Pr(y_i = 1 \mid x_i)\} = \ln\left\{\frac{\Pr(y_i = 1 \mid x_i)}{1 - \Pr(y_i = 1 \mid x_i)}\right\} = \beta_1 + \beta_2 x_i$. The term in curly braces represents the odds that $y_i = 1$ given $x_i$, the expected number of 1 responses for each 0 response.
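The logit link and its inverse can be checked numerically. A minimal sketch follows; the coefficient values `beta1` and `beta2` are hypothetical and chosen only for illustration, and the function names are assumptions of this example.

```python
import math


def logit(p):
    """Link function g: log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse link h: maps a linear predictor to a probability.

    The two branches avoid overflow in exp() for large |eta|.
    """
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Hypothetical coefficients for the linear predictor beta1 + beta2 * x.
beta1, beta2 = -1.0, 0.5

xs = [-10, 0, 2, 10]
probabilities = [inv_logit(beta1 + beta2 * x) for x in xs]

# The odds of y = 1 are exp(linear predictor).
odds = [p / (1.0 - p) for p in probabilities]

print(probabilities)
```

Unlike a straight regression line, the fitted probabilities stay between 0 and 1 however far the covariate moves, and applying `logit` to them recovers the linear predictor.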

69 Guatemala Immunization Data. Pebley, Goldman and Rodriguez (1996). Prenatal and delivery care and childhood immunization in Guatemala: Do family and community matter? Demography, 33. Data from the National Survey of Maternal and Child Health (ENSMI) conducted in Guatemala. A nationally representative sample of 5,160 women aged between 15 and 44 was interviewed.

70 Guatemala Immunization Data. Beginning in 1986, the Guatemalan government undertook a series of campaigns to immunize the population against major childhood diseases. The data considered are 2,159 children aged 1-4 years for whom we have community data on health services and who received at least one immunization during the campaign. Response variable: whether the child received the full set of immunizations. Children $i$ are nested in mothers $j$, nested in communities $k$.

71 Guatemala Immunization Data. Level 1 (child): $Y_{ijk}$, indicator variable for the child receiving the full set of immunizations; $X_{2ijk}$, dummy variable for the child being at least 2 years old and hence eligible for the full set of immunizations. Level 2 (mother): Mom, identifier for mothers ($j$). Ethnicity (dummy variables with baseline Latino): $X_{3jk}$, mother is indigenous, not Spanish speaking; $X_{4jk}$, mother is indigenous, Spanish speaking. Mother's education (dummy variables with baseline no education): $X_{5jk}$, mother has primary education; $X_{6jk}$, mother has secondary education.

72 Guatemala Immunization Data. Husband's education (dummy variables with baseline no education): $X_{7jk}$, husband has primary education; $X_{8jk}$, husband has secondary education; $X_{9jk}$, husband's education not known. Level 3 (community): Cluster, identifier for communities ($k$); $X_{10k}$, dummy variable for the community being rural; $X_{11k}$, percentage of the population that was indigenous (variable pcind81).

73 Guatemala Immunization Data. Three-level random intercept logit model: $\mathrm{logit}\{\Pr(y_{ijk} = 1)\} = \beta_1 + \beta_2 x_{2ijk} + \cdots + \beta_{11} x_{11k} + u_{jk} + u_k$, where $u_{jk} \sim N(0, \sigma_m^2)$ and $u_k \sim N(0, \sigma_c^2)$. As usual, the random effects are assumed independent of each other and across clusters; $u_{jk}$ is independent across units as well. In STATA, we use gllamm. gllamm uses numerical integration and is recommended for generalized linear multilevel models rather than for simple linear models. xtreg and xtmixed exploit the closed form of the likelihood for random effects models with normally distributed continuous responses.

74 Guatemala Immunization Data. Stata command: gllamm immun kid2p indnospa indspa momedpri momedsec husedpri husedsec huseddk rural pcind81, family(binomial) link(logit) i(mom cluster) nip(5). The output for the response immun lists Coef., Std. Err., z, P>|z| and the 95% confidence interval for each of kid2p, indnospa, indspa, momedpri, momedsec, husedpri, husedsec, huseddk, rural, pcind81 and _cons.

75 Guatemala Immunization Data.

76 Final Remarks. Data in the real world have structure that tends to violate the assumptions of independence and homogeneity of the residual variance. The structures with which these data were generated depend on the data collection mechanism and the natural structures within the population. Standard analyses assume independence and estimate the standard errors of model parameters accordingly. If observations within the clusters are positively correlated, the standard errors will be underestimated.

77 Final Remarks. The problem with aggregate analysis, also known as the ecological fallacy, is that the relationship between aggregate-level variables is NOT necessarily the same as the relationship between the individual-level variables, because correlations tend to increase as you aggregate levels.

78 References.
Goldstein, H. (2003). Multilevel Statistical Models. 3rd edn. London: Hodder Arnold.
Leyland, A.H. and Goldstein, H. (Eds.) (2001). Multilevel Modelling of Health Statistics. Wiley.
Pfeffermann, D., Skinner, C.J., Holmes, D.J., Goldstein, H. and Rasbash, J. (1998). Weighting for unequal selection probabilities in multilevel models. Journal of the Royal Statistical Society B, 60.
Rabe-Hesketh, S. and Skrondal, A. (2008). Multilevel and Longitudinal Modeling Using Stata. Stata Press.
Snijders, T.A. and Bosker, R.J. (1999). Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modelling. Thousand Oaks: Sage Publications.

79 Thank you!


More information

Random Intercept Models

Random Intercept Models Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept

More information

Title. Description. Special-interest postestimation commands. xtmelogit postestimation Postestimation tools for xtmelogit

Title. Description. Special-interest postestimation commands. xtmelogit postestimation Postestimation tools for xtmelogit Title xtmelogit postestimation Postestimation tools for xtmelogit Description The following postestimation commands are of special interest after xtmelogit: Command Description estat group summarize the

More information

Discrete Response Multilevel Models for Repeated Measures: An Application to Voting Intentions Data

Discrete Response Multilevel Models for Repeated Measures: An Application to Voting Intentions Data Quality & Quantity 34: 323 330, 2000. 2000 Kluwer Academic Publishers. Printed in the Netherlands. 323 Note Discrete Response Multilevel Models for Repeated Measures: An Application to Voting Intentions

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011 INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA Belfast 9 th June to 10 th June, 2011 Dr James J Brown Southampton Statistical Sciences Research Institute (UoS) ADMIN Research Centre (IoE

More information

Introducing Generalized Linear Models: Logistic Regression

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

More information

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011) Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

More information

Multilevel Modeling: A Second Course

Multilevel Modeling: A Second Course Multilevel Modeling: A Second Course Kristopher Preacher, Ph.D. Upcoming Seminar: February 2-3, 2017, Ft. Myers, Florida What this workshop will accomplish I will review the basics of multilevel modeling

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models Optimum Design for Mixed Effects Non-Linear and generalized Linear Models Cambridge, August 9-12, 2011 Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

More information

Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts

Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 217, Boston, Massachusetts Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen January 23-24, 2012 Page 1 Part I The Single Level Logit Model: A Review Motivating Example Imagine we are interested in voting

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Variance component models

Variance component models Faculty of Health Sciences Variance component models Analysis of repeated measurements, NFA 2016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen Topics for

More information

Review of CLDP 944: Multilevel Models for Longitudinal Data

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

Analysis of variance and regression. December 4, 2007

Analysis of variance and regression. December 4, 2007 Analysis of variance and regression December 4, 2007 Variance component models Variance components One-way anova with random variation estimation interpretations Two-way anova with random variation Crossed

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Monday 7 th Febraury 2005

Monday 7 th Febraury 2005 Monday 7 th Febraury 2 Analysis of Pigs data Data: Body weights of 48 pigs at 9 successive follow-up visits. This is an equally spaced data. It is always a good habit to reshape the data, so we can easily

More information

A Non-parametric bootstrap for multilevel models

A Non-parametric bootstrap for multilevel models A Non-parametric bootstrap for multilevel models By James Carpenter London School of Hygiene and ropical Medicine Harvey Goldstein and Jon asbash Institute of Education 1. Introduction Bootstrapping is

More information

Sampling. February 24, Multilevel analysis and multistage samples

Sampling. February 24, Multilevel analysis and multistage samples Sampling Tom A.B. Snijders ICS Department of Statistics and Measurement Theory University of Groningen Grote Kruisstraat 2/1 9712 TS Groningen, The Netherlands email t.a.b.snijders@ppsw.rug.nl February

More information

Immigration attitudes (opposes immigration or supports it) it may seriously misestimate the magnitude of the effects of IVs

Immigration attitudes (opposes immigration or supports it) it may seriously misestimate the magnitude of the effects of IVs Logistic Regression, Part I: Problems with the Linear Probability Model (LPM) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 22, 2015 This handout steals

More information

Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering

Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering John J. Dziak The Pennsylvania State University Inbal Nahum-Shani The University of Michigan Copyright 016, Penn State.

More information

Introduction to Within-Person Analysis and RM ANOVA

Introduction to Within-Person Analysis and RM ANOVA Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides

More information

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data Today s Class: Review of concepts in multivariate data Introduction to random intercepts Crossed random effects models

More information

Determining Sample Sizes for Surveys with Data Analyzed by Hierarchical Linear Models

Determining Sample Sizes for Surveys with Data Analyzed by Hierarchical Linear Models Journal of Of cial Statistics, Vol. 14, No. 3, 1998, pp. 267±275 Determining Sample Sizes for Surveys with Data Analyzed by Hierarchical Linear Models Michael P. ohen 1 Behavioral and social data commonly

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 An Introduction to Multilevel Models PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 Today s Class Concepts in Longitudinal Modeling Between-Person vs. +Within-Person

More information

Variance component models part I

Variance component models part I Faculty of Health Sciences Variance component models part I Analysis of repeated measurements, 30th November 2012 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010 1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Soc 589 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline Notation NELS88 data Fixed Effects ANOVA

More information

Lecture 10: Introduction to Logistic Regression

Lecture 10: Introduction to Logistic Regression Lecture 10: Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 2007 Logistic Regression Regression for a response variable that follows a binomial distribution Recall the binomial

More information

Applied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid

Applied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid Applied Economics Regression with a Binary Dependent Variable Department of Economics Universidad Carlos III de Madrid See Stock and Watson (chapter 11) 1 / 28 Binary Dependent Variables: What is Different?

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Stat 587 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Notation NELS88 data Fixed Effects ANOVA

More information

Mixed Models for Longitudinal Binary Outcomes. Don Hedeker Department of Public Health Sciences University of Chicago.

Mixed Models for Longitudinal Binary Outcomes. Don Hedeker Department of Public Health Sciences University of Chicago. Mixed Models for Longitudinal Binary Outcomes Don Hedeker Department of Public Health Sciences University of Chicago hedeker@uchicago.edu https://hedeker-sites.uchicago.edu/ Hedeker, D. (2005). Generalized

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Measurement Error. Often a data set will contain imperfect measures of the data we would ideally like.

Measurement Error. Often a data set will contain imperfect measures of the data we would ideally like. Measurement Error Often a data set will contain imperfect measures of the data we would ideally like. Aggregate Data: (GDP, Consumption, Investment are only best guesses of theoretical counterparts and

More information

Correlation and Regression Bangkok, 14-18, Sept. 2015

Correlation and Regression Bangkok, 14-18, Sept. 2015 Analysing and Understanding Learning Assessment for Evidence-based Policy Making Correlation and Regression Bangkok, 14-18, Sept. 2015 Australian Council for Educational Research Correlation The strength

More information

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

More information

More Statistics tutorial at Logistic Regression and the new:

More Statistics tutorial at  Logistic Regression and the new: Logistic Regression and the new: Residual Logistic Regression 1 Outline 1. Logistic Regression 2. Confounding Variables 3. Controlling for Confounding Variables 4. Residual Linear Regression 5. Residual

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

STA6938-Logistic Regression Model

STA6938-Logistic Regression Model Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

ROBUSTNESS OF MULTILEVEL PARAMETER ESTIMATES AGAINST SMALL SAMPLE SIZES

ROBUSTNESS OF MULTILEVEL PARAMETER ESTIMATES AGAINST SMALL SAMPLE SIZES ROBUSTNESS OF MULTILEVEL PARAMETER ESTIMATES AGAINST SMALL SAMPLE SIZES Cora J.M. Maas 1 Utrecht University, The Netherlands Joop J. Hox Utrecht University, The Netherlands In social sciences, research

More information

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests ECON4150 - Introductory Econometrics Lecture 7: OLS with Multiple Regressors Hypotheses tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 7 Lecture outline 2 Hypothesis test for single

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical

More information

Homework Solutions Applied Logistic Regression

Homework Solutions Applied Logistic Regression Homework Solutions Applied Logistic Regression WEEK 6 Exercise 1 From the ICU data, use as the outcome variable vital status (STA) and CPR prior to ICU admission (CPR) as a covariate. (a) Demonstrate that

More information

Longitudinal and Multilevel Methods for Multinomial Logit David K. Guilkey

Longitudinal and Multilevel Methods for Multinomial Logit David K. Guilkey Longitudinal and Multilevel Methods for Multinomial Logit David K. Guilkey Focus of this talk: Unordered categorical dependent variables Models will be logit based Empirical example uses data from the

More information

Spring RMC Professional Development Series January 14, Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations

Spring RMC Professional Development Series January 14, Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations Spring RMC Professional Development Series January 14, 2016 Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations Ann A. O Connell, Ed.D. Professor, Educational Studies (QREM) Director,

More information

Binary Dependent Variable. Regression with a

Binary Dependent Variable. Regression with a Beykent University Faculty of Business and Economics Department of Economics Econometrics II Yrd.Doç.Dr. Özgür Ömer Ersin Regression with a Binary Dependent Variable (SW Chapter 11) SW Ch. 11 1/59 Regression

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do

More information

ECO375 Tutorial 8 Instrumental Variables

ECO375 Tutorial 8 Instrumental Variables ECO375 Tutorial 8 Instrumental Variables Matt Tudball University of Toronto Mississauga November 16, 2017 Matt Tudball (University of Toronto) ECO375H5 November 16, 2017 1 / 22 Review: Endogeneity Instrumental

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

Binary Logistic Regression

Binary Logistic Regression The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b

More information

The Use of Survey Weights in Regression Modelling

The Use of Survey Weights in Regression Modelling The Use of Survey Weights in Regression Modelling Chris Skinner London School of Economics and Political Science (with Jae-Kwang Kim, Iowa State University) Colorado State University, June 2013 1 Weighting

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL Intesar N. El-Saeiti Department of Statistics, Faculty of Science, University of Bengahzi-Libya. entesar.el-saeiti@uob.edu.ly

More information

Designing Multilevel Models Using SPSS 11.5 Mixed Model. John Painter, Ph.D.

Designing Multilevel Models Using SPSS 11.5 Mixed Model. John Painter, Ph.D. Designing Multilevel Models Using SPSS 11.5 Mixed Model John Painter, Ph.D. Jordan Institute for Families School of Social Work University of North Carolina at Chapel Hill 1 Creating Multilevel Models

More information

Lecture (chapter 13): Association between variables measured at the interval-ratio level

Lecture (chapter 13): Association between variables measured at the interval-ratio level Lecture (chapter 13): Association between variables measured at the interval-ratio level Ernesto F. L. Amaral April 9 11, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015.

More information

A Re-Introduction to General Linear Models (GLM)

A Re-Introduction to General Linear Models (GLM) A Re-Introduction to General Linear Models (GLM) Today s Class: You do know the GLM Estimation (where the numbers in the output come from): From least squares to restricted maximum likelihood (REML) Reviewing

More information

Introduction to Random Effects of Time and Model Estimation

Introduction to Random Effects of Time and Model Estimation Introduction to Random Effects of Time and Model Estimation Today s Class: The Big Picture Multilevel model notation Fixed vs. random effects of time Random intercept vs. random slope models How MLM =

More information

Varians- og regressionsanalyse

Varians- og regressionsanalyse Faculty of Health Sciences Varians- og regressionsanalyse Variance component models Lene Theil Skovgaard Department of Biostatistics Variance component models Definitions and motivation One-way anova with

More information

LINEAR MULTILEVEL MODELS. Data are often hierarchical. By this we mean that data contain information

LINEAR MULTILEVEL MODELS. Data are often hierarchical. By this we mean that data contain information LINEAR MULTILEVEL MODELS JAN DE LEEUW ABSTRACT. This is an entry for The Encyclopedia of Statistics in Behavioral Science, to be published by Wiley in 2005. 1. HIERARCHICAL DATA Data are often hierarchical.

More information