Lecture 4: Generalized Linear Mixed Models

Dankmar Böhning, Southampton Statistical Sciences Research Institute, University of Southampton, UK
S3RI, 11-12 December 2014

- An example with one random effect
- An example with two nested random effects

An example with one random effect

An example: health awareness study
- three states in the US participated in a health awareness study
- each state independently devised a health awareness program
- three cities within each state were selected for participation, and five households within each city were randomly selected to evaluate the effectiveness of the program
- a composite index (a count) was formed (the larger the index, the greater the awareness)
- the data have the following hierarchical structure:

Data (health awareness index by state, city, and household):

                     household
    state  city    1   2   3   4   5
        1     1   42  56  35  40  28
        1     2   26  38  42  35  53
        1     3   34  51  60  29  44
        2     1   47  58  39  62  65
        2     2   56  43  65  70  59
        2     3   68  51  49  71  57
        3     1   19  36  24  12  33
        3     2   18  40  27  31  23
        3     3   16  28  45  30  21

Poisson model with random effect for state

For the health awareness index $Y_{ijk}$ of household $k$ in city $j$ and state $i$:

$$\log E[Y_{ijk} \mid \alpha_i] = \log \mu_{ijk} = \mu + \alpha_i$$

with a state random effect $\alpha_i \sim N(0, \sigma_S^2)$ and a Poisson error $Y_{ijk} \sim \mathrm{Po}(\mu_{ijk})$.

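Likelihood of the Poisson model with random effect for state

Let $P(Y_{ijk} = y) = \mathrm{Po}(y \mid \mu_{ijk}) = \mathrm{Po}(y \mid \mu + \alpha_i)$, where the second form indexes the Poisson distribution by its linear predictor $\mu + \alpha_i = \log \mu_{ijk}$. In the fixed-effect case the likelihood is

$$L = \prod_{i,j,k} \mathrm{Po}(y_{ijk} \mid \mu + \alpha_i)$$

but here $\alpha_i \sim N(0, \sigma_S^2)$, i.e. a normal random effect, so it has to be integrated out:

$$L = \prod_i \int_{\alpha_i} \prod_{j,k} \mathrm{Po}(y_{ijk} \mid \mu + \alpha_i)\, \phi(\alpha_i)\, d\alpha_i$$

where $\phi(\alpha_i)$ is a normal density with mean 0 and variance $\sigma_S^2$.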

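Stata output for the Poisson model with random effect for state:

    Mixed-effects Poisson regression                Number of obs      =        45
    Group variable: state                           Number of groups   =         3

                                                    Obs per group: min =        15
                                                                   avg =      15.0
                                                                   max =        15

    Integration points =   1                        Wald chi2(0)       =         .
    Log likelihood = -183.93181                     Prob > chi2        =         .

    ------------------------------------------------------------------------------
           index |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |   39.81456   7.124649    20.59   0.000     28.03645    56.54067
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    state: Identity              |
                       sd(_cons) |    .307096   .1278477      .1358028    .6944475
    ------------------------------------------------------------------------------
    LR test vs. Poisson regression: chibar2(01) =   154.39 Prob>=chibar2 = 0.0000

    Note: log-likelihood calculations are based on the Laplacian approximation.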
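The marginal likelihood of the random-intercept Poisson model can be checked numerically. The following is a minimal Python sketch, not the method Stata uses: it integrates over $\alpha_i$ on a fixed grid with the trapezoid rule, whereas the output above is based on the Laplacian approximation. The function names, the grid size, and the integration range of 8 standard deviations either side of zero are illustrative assumptions. The parameter values are the estimates reported in the output above, and the plain Poisson model is evaluated at the overall mean for comparison.

```python
import math

# Health awareness index: data[state] = three cities, each with five household counts.
data = {
    1: [[42, 56, 35, 40, 28], [26, 38, 42, 35, 53], [34, 51, 60, 29, 44]],
    2: [[47, 58, 39, 62, 65], [56, 43, 65, 70, 59], [68, 51, 49, 71, 57]],
    3: [[19, 36, 24, 12, 33], [18, 40, 27, 31, 23], [16, 28, 45, 30, 21]],
}


def log_poisson(y, log_mean):
    """Log Poisson probability of count y, indexed by the log of the mean."""
    return y * log_mean - math.exp(log_mean) - math.lgamma(y + 1)


def log_normal_density(a, sd):
    """Log density of N(0, sd^2) at a."""
    return -0.5 * (a / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def state_log_marginal(counts, mu, sd, n_grid=4001, width=8.0):
    """Log of the integral over alpha of prod Po(y | mu + alpha) * phi(alpha).

    Trapezoid rule on [-width*sd, width*sd]; grid settings are assumptions.
    """
    lo = -width * sd
    step = 2.0 * width * sd / (n_grid - 1)
    log_terms = []
    for g in range(n_grid):
        alpha = lo + g * step
        log_f = sum(log_poisson(y, mu + alpha) for y in counts)
        log_f += log_normal_density(alpha, sd)
        weight = 0.5 if g in (0, n_grid - 1) else 1.0
        log_terms.append(log_f + math.log(weight * step))
    peak = max(log_terms)  # log-sum-exp to avoid underflow
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def mixed_loglik(mu, sd):
    """Marginal log-likelihood: one integral per state, summed over states."""
    total = 0.0
    for cities in data.values():
        counts = [y for city in cities for y in city]
        total += state_log_marginal(counts, mu, sd)
    return total


def poisson_loglik(mu):
    """Log-likelihood of the plain Poisson model without random effects."""
    return sum(log_poisson(y, mu) for cities in data.values()
               for city in cities for y in city)


mu_hat = math.log(39.81456)  # log of the reported IRR for _cons
sd_hat = 0.307096            # reported sd(_cons) for state
all_counts = [y for cities in data.values() for city in cities for y in city]

ll_mixed = mixed_loglik(mu_hat, sd_hat)
ll_poisson = poisson_loglik(math.log(sum(all_counts) / len(all_counts)))
print("mixed log-likelihood:  ", round(ll_mixed, 3))
print("Poisson log-likelihood:", round(ll_poisson, 3))
print("LR statistic:          ", round(2.0 * (ll_mixed - ll_poisson), 2))
```

Because both the grid integration and the Laplacian approximation target the same integral, the printed values should be close to the reported log likelihood and LR statistic, though exact agreement is not expected.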

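An example with two nested random effects

Poisson model with random effect for state and random effect for city nested within state

Let $P(Y_{ijk} = y) = \mathrm{Po}(y \mid \mu_{ijk}) = \mathrm{Po}(y \mid \mu + \alpha_i + \beta_{j(i)})$, where $\beta_{j(i)} \sim N(0, \sigma_T^2)$, i.e. a normal random effect. The likelihood is

$$L = \prod_i \int_{\alpha_i} \left[ \prod_j \int_{\beta_{j(i)}} \prod_k \mathrm{Po}(y_{ijk} \mid \mu + \alpha_i + \beta_{j(i)})\, \phi(\beta_{j(i)})\, d\beta_{j(i)} \right] \phi(\alpha_i)\, d\alpha_i$$

where $\phi(\alpha_i)$ is a normal density with mean 0 and variance $\sigma_S^2$, and $\phi(\beta_{j(i)})$ is a normal density with mean 0 and variance $\sigma_T^2$.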

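Stata output for the Poisson model with nested random effects for state and city:

    Mixed-effects Poisson regression                Number of obs      =        45

    ---------------------------------------------------------------------------
                    |   No. of       Observations per Group       Integration
     Group Variable |   Groups    Minimum    Average    Maximum      Points
    ----------------+----------------------------------------------------------
              state |        3         15       15.0         15           1
               city |        9          5        5.0          5           1
    ---------------------------------------------------------------------------

                                                    Wald chi2(0)       =         .
    Log likelihood = -183.93181                     Prob > chi2        =         .

    ------------------------------------------------------------------------------
           index |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |   39.81457   7.124658    20.59   0.000     28.03644    56.54069
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    state: Identity              |
                       sd(_cons) |   .3070964    .127848      .1358029     .694449
    -----------------------------+------------------------------------------------
    city: Identity               |
                       sd(_cons) |   7.65e-12   .0492277             0           .
    ------------------------------------------------------------------------------
    LR test vs. Poisson regression:      chi2(2) =   154.39   Prob > chi2 = 0.0000

    Note: LR test is conservative and provided only for reference.
    Note: log-likelihood calculations are based on the Laplacian approximation.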

Meta-analysis of BCG vaccine against tuberculosis

Colditz et al. (1994, JAMA) provide a meta-analysis examining the efficacy of BCG vaccine against tuberculosis.

Data on the meta-analysis of BCG and TB

The data contain the following details:
- 13 studies
- each study contains:
  - TB cases for BCG intervention
  - number at risk for BCG intervention
  - TB cases for control
  - number at risk for control
- two covariates are also given: year of study and latitude, expressed in degrees from the equator

                                intervention           control
    study   year   latitude   TB cases   total    TB cases   total
        1   1933         55          6     306          29     303
        2   1935         52          4     123          11     139
        3   1935         52        180    1541         372    1451
        4   1937         42         17    1716          65    1665
        5   1941         42          3     231          11     220
        6   1947         33          5    2498           3    2341
        7   1949         18        186   50634         141   27338
        8   1950         53         62   13598         248   12867
        9   1950         13         33    5069          47    5808
       10   1950         33         27   16913          29   17854
       11   1965         18          8    2545          10     629
       12   1965         27         29    7499          45    7277
       13   1968         13        505   88391         499   88391

Data analysis on the meta-analysis of BCG and TB

This kind of data can be analyzed by taking
- TB case as the disease occurrence response
- intervention as the exposure (fixed effect)
- study as a random effect
- latitude and year as further fixed effects
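To fit such a model, the study-level table has to be arranged with one record per trial arm, so that each record carries a binomial response (cases out of total), the intervention indicator, the study identifier, and the covariates. A minimal Python sketch of that reshaping follows; the field names and the helper `to_long` are illustrative assumptions, not taken from the original analysis.

```python
# One tuple per study, copied from the data table:
# (study, year, latitude, cases_bcg, total_bcg, cases_control, total_control)
studies = [
    (1, 1933, 55, 6, 306, 29, 303),
    (2, 1935, 52, 4, 123, 11, 139),
    (3, 1935, 52, 180, 1541, 372, 1451),
    (4, 1937, 42, 17, 1716, 65, 1665),
    (5, 1941, 42, 3, 231, 11, 220),
    (6, 1947, 33, 5, 2498, 3, 2341),
    (7, 1949, 18, 186, 50634, 141, 27338),
    (8, 1950, 53, 62, 13598, 248, 12867),
    (9, 1950, 13, 33, 5069, 47, 5808),
    (10, 1950, 33, 27, 16913, 29, 17854),
    (11, 1965, 18, 8, 2545, 10, 629),
    (12, 1965, 27, 29, 7499, 45, 7277),
    (13, 1968, 13, 505, 88391, 499, 88391),
]


def to_long(rows):
    """Return one record per trial arm (hypothetical field names)."""
    arms = []
    for study, year, latitude, c_bcg, n_bcg, c_ctl, n_ctl in rows:
        shared = {"study": study, "year": year, "latitude": latitude}
        # intervention = 1 for the BCG arm, 0 for the control arm
        arms.append({**shared, "intervention": 1, "cases": c_bcg, "total": n_bcg})
        arms.append({**shared, "intervention": 0, "cases": c_ctl, "total": n_ctl})
    return arms


arms = to_long(studies)
for arm in arms[:4]:
    print(arm)
print("number of arm-level records:", len(arms))
```

Each study contributes two records that share the same study identifier, which is what lets a study-level random effect tie the two arms together.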

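Mixed Logistic Regression Model

$$\log \frac{p_{x_{ij}}}{1 - p_{x_{ij}}} = \mu + \alpha_i + \beta_{\mathrm{INTER}}\, \mathrm{INTER}_{ij} + \beta_{\mathrm{LAT}}\, \mathrm{LAT}_{ij}$$

where $\alpha_i \sim N(0, \sigma_S^2)$. Each trial arm within each study contributes a binomial likelihood

$$\binom{n_{ij}}{y_{ij}}\, p_{x_{ij}}^{y_{ij}}\, (1 - p_{x_{ij}})^{n_{ij} - y_{ij}}$$

where

$$p_{x_{ij}} = \frac{\exp(\mu + \alpha_i + \beta_{\mathrm{INTER}}\, \mathrm{INTER}_{ij} + \beta_{\mathrm{LAT}}\, \mathrm{LAT}_{ij})}{1 + \exp(\mu + \alpha_i + \beta_{\mathrm{INTER}}\, \mathrm{INTER}_{ij} + \beta_{\mathrm{LAT}}\, \mathrm{LAT}_{ij})}$$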

Mixed Logistic Likelihood

$$L = \prod_i \int_{\alpha_i} \prod_j \binom{n_{ij}}{y_{ij}}\, p_{x_{ij}}^{y_{ij}}\, (1 - p_{x_{ij}})^{n_{ij} - y_{ij}}\, \phi(\alpha_i)\, d\alpha_i$$

where $\phi(\alpha_i)$ is a normal density with mean 0 and variance $\sigma_S^2$.

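Stata output, model with intervention:

    Integration points =   1                        Wald chi2(1)       =    134.12
    Log likelihood =  -196.3842                     Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           cases | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    intervention |   .6203579   .0255761   -11.58   0.000     .5722016     .672567
           _cons |   .0149296    .005892   -10.65   0.000     .0068885    .0323576
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    study: Identity              |
                       sd(_cons) |   1.410135   .2813828       .953694    2.085029
    ------------------------------------------------------------------------------
    LR test vs. logistic regression: chibar2(01) =  3259.71 Prob>=chibar2 = 0.0000

    Note: log-likelihood calculations are based on the Laplacian approximation.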
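The z statistic and confidence interval in an odds-ratio table refer to the log odds ratio, not to the odds ratio itself. The sketch below, a minimal Python illustration, recovers them for the intervention effect from the reported odds ratio and its standard error. It assumes the standard error of the ratio comes from the delta method (standard error on the log scale times the ratio) and that the interval is the exponentiated Wald interval. The function name `ratio_inference` is hypothetical.

```python
import math
from statistics import NormalDist


def ratio_inference(ratio, se_ratio, level=0.95):
    """Return (z, lower, upper) for a ratio reported with its standard error.

    Assumes se_ratio = ratio * se(log ratio) (delta method) and a Wald
    interval built on the log scale, then exponentiated.
    """
    log_ratio = math.log(ratio)
    se_log = se_ratio / ratio
    z = log_ratio / se_log
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    lower = math.exp(log_ratio - crit * se_log)
    upper = math.exp(log_ratio + crit * se_log)
    return z, lower, upper


# Reported odds ratio and standard error for intervention
z, lower, upper = ratio_inference(0.6203579, 0.0255761)
print(f"z = {z:.2f}, 95% CI = [{lower:.7f}, {upper:.7f}]")
```

The same calculation applies to the IRR in the Poisson outputs, since both are exponentiated coefficients.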

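Stata output, model with intervention and latitude:

    Integration points =   1                        Wald chi2(2)       =    145.29
    Log likelihood = -192.38326                     Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           cases | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    intervention |   .6206475   .0255857   -11.57   0.000     .5724728    .6728762
        latitude |   1.064961   .0201333     3.33   0.001     1.026223    1.105162
           _cons |    .001693   .0012119    -8.91   0.000     .0004162    .0068863
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    study: Identity              |
                       sd(_cons) |   1.027901   .2080747      .6912662    1.528471
    ------------------------------------------------------------------------------
    LR test vs. logistic regression: chibar2(01) =  2180.88 Prob>=chibar2 = 0.0000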

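Stata output, model with intervention, latitude, and year:

    Integration points =   1                        Wald chi2(3)       =    148.12
    Log likelihood = -191.68717                     Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           cases | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    intervention |   .6207321   .0255885   -11.57   0.000     .5725521    .6729664
        latitude |   1.037569   .0289224     1.32   0.186     .9824025    1.095833
            year |   .9560516    .035339    -1.22   0.224     .8892379    1.027886
           _cons |   .0363575   .0948862    -1.27   0.204     .0002183    6.054397
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    study: Identity              |
                       sd(_cons) |   .9696404    .197926      .6499213     1.44664
    ------------------------------------------------------------------------------
    LR test vs. logistic regression: chibar2(01) =  2019.42 Prob>=chibar2 = 0.0000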

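Stata output, model with intervention and year:

    Integration points =   1                        Wald chi2(2)       =    144.88
    Log likelihood = -192.50774                     Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           cases | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    intervention |   .6206583    .025586   -11.57   0.000     .5724831    .6728877
            year |   .9208262   .0232455    -3.27   0.001     .8763747    .9675324
           _cons |   .7934227   .9916217    -0.19   0.853     .0684969    9.190487
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    study: Identity              |
                       sd(_cons) |   1.035618   .2104227      .6954207    1.542238
    ------------------------------------------------------------------------------
    LR test vs. logistic regression: chibar2(01) =  2346.37 Prob>=chibar2 = 0.0000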

Model evaluation

    model              log L         AIC         BIC
    intervention   -196.3842    398.7684    402.5427
    + latitude     -192.3833    392.7665    397.7989
    + year         -191.6872    393.3743    399.6648
    - latitude     -192.5077    393.0155    398.0479
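The AIC and BIC columns follow from the log-likelihoods through $\mathrm{AIC} = -2 \log L + 2k$ and $\mathrm{BIC} = -2 \log L + k \log n$. The minimal Python sketch below recomputes them from the log-likelihoods reported in the Stata outputs. The parameter count $k$ (fixed-effect covariates, plus the intercept, plus the random-effect standard deviation) and the sample size $n = 26$ (13 studies with 2 arms each) are not stated in the slides; they are assumptions inferred here because they reproduce the tabulated values.

```python
import math

N_OBS = 26  # assumption: 13 studies x 2 trial arms

# (label, log-likelihood from the Stata output, number of covariates)
models = [
    ("intervention", -196.3842, 1),
    ("+ latitude", -192.38326, 2),
    ("+ year", -191.68717, 3),
    ("- latitude", -192.50774, 2),
]


def n_params(n_covariates):
    """Covariates + intercept + random-effect sd (assumed parameter count)."""
    return n_covariates + 2


def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    return -2.0 * loglik + k * math.log(n)


results = {}
for label, loglik, n_cov in models:
    k = n_params(n_cov)
    results[label] = (aic(loglik, k), bic(loglik, k, N_OBS))
    print(f"{label:<14} {loglik:>11.4f} {results[label][0]:>10.4f} {results[label][1]:>10.4f}")
```

Smaller values indicate the preferred model under either criterion.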