Hierarchical Hurdle Models for Zero-In(De)flated Count Data of Complex Designs

1 Hierarchical Hurdle Models for Zero-In(De)flated Count Data of Complex Designs
Marek Molas (1), Emmanuel Lesaffre (1,2)
(1) Erasmus MC, Erasmus Universiteit - Rotterdam, The Netherlands
(2) L-Biostat, Katholieke Universiteit Leuven, Belgium
26th August 2009, International Society for Clinical Biostatistics 30, Prague, Czech Republic
ERASMUSMC - Biostatistics

2 Outline
- Challenges in the analysis of the motivating example
- Hurdle models
- H-likelihood for hurdle models
- Practical application

3 Motivating Example
- Randomized clinical trial (General Practice, Erasmus MC)
- Two physical exercise regimes: standard and Tai-Chi Chuan
- Outcome: number of days on which elderly patients experienced a fall
- Each patient is followed for about a year: 5 measurements
- Baseline covariates: age, gender, BMI, alcohol use
- Is there a reduction in the number of falls among the Tai-Chi Chuan patients?

4 Histogram - Tai-Chi Chuan
[Figure: density histograms of the number of falls in Periods 1 to 5, shown separately for the Control Group and the Tai Chi Chuan Group]

5 The Zero-Inflated Poisson and the Hurdle Model
The zero-inflated Poisson model is a mixture:
- Point mass at zero
- Standard Poisson distribution
The hurdle model has two components:
- Point mass at zero
- Truncated Poisson distribution

6 The Zero-Inflated Poisson and the Hurdle Model
Zero-Inflated Poisson Model:
$$P(X = 0) = p_Z + (1 - p_Z)\, e^{-\mu_Z}, \qquad P(X = k) = (1 - p_Z)\, \frac{e^{-\mu_Z} \mu_Z^{k}}{k!}, \quad k = 1, 2, \ldots$$
Hurdle Model:
$$P(X = 0) = p_H, \qquad P(X = k) = (1 - p_H)\, \frac{1}{1 - e^{-\mu_H}}\, \frac{e^{-\mu_H} \mu_H^{k}}{k!}, \quad k = 1, 2, \ldots$$
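
The two probability functions on this slide can be written out directly. The sketch below is only an illustration: the function names and the parameter values (0.3 and 2.0) are hypothetical, not taken from the talk. It checks that both distributions sum to one.

```python
import math


def zip_pmf(k, p_z, mu_z):
    """Zero-inflated Poisson: point mass p_z at zero mixed with Poisson(mu_z)."""
    poisson = math.exp(-mu_z) * mu_z**k / math.factorial(k)
    if k == 0:
        return p_z + (1 - p_z) * poisson
    return (1 - p_z) * poisson


def hurdle_pmf(k, p_h, mu_h):
    """Hurdle: P(X = 0) = p_h; positive counts follow a zero-truncated Poisson(mu_h)."""
    if k == 0:
        return p_h
    truncated = math.exp(-mu_h) * mu_h**k / (math.factorial(k) * (1 - math.exp(-mu_h)))
    return (1 - p_h) * truncated


# Hypothetical parameter values, chosen only to exercise the functions.
zip_total = sum(zip_pmf(k, 0.3, 2.0) for k in range(60))
hurdle_total = sum(hurdle_pmf(k, 0.3, 2.0) for k in range(60))
print(zip_total, hurdle_total)
```

In the hurdle model the zero probability is a free parameter, whereas in the ZIP model it can never fall below the Poisson zero probability.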

7 The Zero-Inflated Poisson and the Hurdle Model
Relation between zero-value probabilities:
$$p_H = p_Z + (1 - p_Z)\, e^{-\mu_Z} > 0$$
Inflation part probability:
$$p_Z = \frac{p_H - e^{-\mu_H}}{1 - e^{-\mu_H}}$$
Relation between means of the distributional parts:
$$\mu_Z = \mu_H$$
- Hurdle model allows for zero-deflation and zero-inflation
- ZIP model allows only zero-inflation
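
A small numerical sketch shows why only the hurdle model covers zero-deflation. A ZIP model needs p_Z >= 0, which by the relation above means p_H >= exp(-mu). When the hurdle zero probability lies below the Poisson zero probability, the implied p_Z is negative and no ZIP model matches. The function names and the values of mu and p_H are hypothetical.

```python
import math


def zip_zero_prob(p_z, mu):
    """Zero probability of a ZIP model with inflation probability p_z."""
    return p_z + (1 - p_z) * math.exp(-mu)


def implied_p_z(p_h, mu):
    """Inflation probability implied by a hurdle zero probability p_h (mu_Z = mu_H = mu)."""
    return (p_h - math.exp(-mu)) / (1 - math.exp(-mu))


mu = 1.5                              # hypothetical common mean parameter
poisson_zero = math.exp(-mu)          # zero probability of a plain Poisson(mu)
inflated = implied_p_z(0.5, mu)       # p_H above exp(-mu): a valid p_Z in (0, 1)
deflated = implied_p_z(0.1, mu)       # p_H below exp(-mu): p_Z < 0, so not a ZIP model
round_trip = zip_zero_prob(inflated, mu)
print(poisson_zero, inflated, deflated, round_trip)
```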

8 Hurdle Model for Clustered Data
Add covariates and random effects as follows:
$$\operatorname{logit}(p_H) = X_0^T \beta_0 + b_0, \qquad \log(\mu_H) = X_1^T \beta_1 + b_1, \qquad \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \sim G(\mu, \Sigma)$$
Likelihood factorizes if random effects are independent:
- Bernoulli model
- Truncated Poisson model
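
The factorization can be seen at the level of a single cluster, conditional on the random effects: the hurdle log-likelihood splits into a Bernoulli term (zero versus positive) and a zero-truncated Poisson term (positive counts only). The sketch below assumes given linear predictors eta0 and eta1 (fixed part plus random effect). The counts and predictor values are hypothetical.

```python
import math


def hurdle_pmf(y, p_h, mu):
    """Joint hurdle probability with P(Y = 0) = p_h and truncated Poisson mean parameter mu."""
    if y == 0:
        return p_h
    return (1 - p_h) * math.exp(-mu) * mu**y / (math.factorial(y) * (1 - math.exp(-mu)))


def bernoulli_loglik(y, eta0):
    """Zero part: logit(p_H) = eta0, with p_H = P(Y = 0)."""
    p_h = 1 / (1 + math.exp(-eta0))
    return math.log(p_h) if y == 0 else math.log(1 - p_h)


def truncated_poisson_loglik(y, eta1):
    """Positive part: log(mu_H) = eta1; zeros contribute nothing."""
    if y == 0:
        return 0.0
    mu = math.exp(eta1)
    return y * eta1 - mu - math.lgamma(y + 1) - math.log(1 - math.exp(-mu))


counts = [0, 0, 3, 1, 0, 2]   # hypothetical counts for one cluster
eta0, eta1 = -0.2, 0.7        # hypothetical predictors: X0'beta0 + b0 and X1'beta1 + b1

p_h = 1 / (1 + math.exp(-eta0))
joint = sum(math.log(hurdle_pmf(y, p_h, math.exp(eta1))) for y in counts)
split = sum(bernoulli_loglik(y, eta0) + truncated_poisson_loglik(y, eta1) for y in counts)
print(joint, split)
```

Because the two terms share no parameters, they can be maximized separately once the random effects b0 and b1 are independent.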

9 Hurdle Model for Clustered Data
Marginal Likelihood - SAS PROC NLMIXED (Min & Agresti 2005)
- 2-level data
- Normal random effects
- Different distribution for random effects (Liu & Yu 2007)
Marginal Likelihood - Non-parametric Maximum Likelihood (Min & Agresti 2005)

10 Hurdle Model for Clustered Data
What are we looking for?
- A method to handle multilevel or multi-membership data
- A method that allows dispersion parameters to depend on covariates
- Efficient estimation

11 Hurdle Model for Clustered Data
H-Likelihood (Lee & Nelder, 1996, 2001; Noh & Lee 2007)
What is it?
- A computational framework for estimating models that involve random effects
It offers:
- An efficient estimation algorithm
- REML-type inference for dispersion components
Implications:
- Random effects need not be normal
- Multilevel / multi-membership data are easily handled
- Dispersion parameters can depend on covariates
- Overdispersion can depend on covariates

12 H-likelihood
Three types of parameters:
- Fixed effects parameters: $\beta$
- Random effects: $v$
- Dispersion parameters: $\lambda$

13 H-likelihood
Extended likelihood:
$$L_E(\beta, \lambda, v \mid y, v) = \prod_{i=1}^{N} \left[ \prod_{j=1}^{n_i} f_{\beta,\lambda}(y_{ij} \mid v_i) \right] f_{\lambda}(v_i)$$
Extended log-likelihood:
$$h = \log\bigl(L_E(\beta, \lambda, v \mid y, v)\bigr)$$

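14 H-likelihood
Adjusted profile likelihood, estimation of $\beta$:
$$p_v(h) = \left. h(\beta, \lambda, v) \right|_{v = \hat{v}} - 0.5 \log \left| \frac{D(h, v)}{2\pi} \right|_{v = \hat{v}}$$
Laplace approximation to the integral:
$$L_M(\beta, \lambda \mid y) = \prod_{i=1}^{N} \int \left[ \prod_{j=1}^{n_i} f_{\beta,\lambda}(y_{ij} \mid v_i) \right] f_{\lambda}(v_i)\, dv_i$$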
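
To see what the adjusted profile likelihood does, the sketch below compares p_v(h) with a brute-force evaluation of the log marginal likelihood. It uses one cluster of an ordinary (untruncated) Poisson model with a normal random intercept, which is simpler than the hurdle model of the talk. The data, beta and lambda are hypothetical, and D(h, v) is taken as the negative second derivative of h with respect to v, as in the h-likelihood literature.

```python
import math

y = [2, 0, 3, 1, 4]      # hypothetical counts for one cluster
beta, lam = 0.5, 0.5     # hypothetical fixed intercept and random-effect variance


def h(v):
    """Extended log-likelihood: Poisson log-density of y given v plus N(0, lam) log-density of v."""
    eta = beta + v
    cond = sum(yj * eta - math.exp(eta) - math.lgamma(yj + 1) for yj in y)
    return cond - 0.5 * math.log(2 * math.pi * lam) - v * v / (2 * lam)


# Newton iterations for v_hat, the maximiser of h in v.
v_hat = 0.0
for _ in range(50):
    mu = math.exp(beta + v_hat)
    grad = sum(y) - len(y) * mu - v_hat / lam   # dh/dv
    curvature = len(y) * mu + 1 / lam           # D(h, v) = -d^2 h / dv^2
    v_hat += grad / curvature

D = len(y) * math.exp(beta + v_hat) + 1 / lam
p_v = h(v_hat) - 0.5 * math.log(D / (2 * math.pi))   # adjusted profile log-likelihood

# Reference value: trapezoidal rule for the log of the integral of exp(h(v)) over v.
step = 0.001
grid = [-6 + i * step for i in range(12001)]
vals = [math.exp(h(v)) for v in grid]
log_marginal = math.log(step * (sum(vals) - 0.5 * (vals[0] + vals[-1])))
print(p_v, log_marginal)
```

The two numbers agree closely but not exactly, since the Laplace approximation ignores higher-order terms of h around v_hat.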

15 H-likelihood
Adjusted profile likelihood, estimation of $\lambda$:
$$p_{\beta,v}(h) = \left. h(\beta, \lambda, v) \right|_{\beta = \hat{\beta},\, v = \hat{v}} - 0.5 \log \left| \frac{D[h, (\beta, v)]}{2\pi} \right|_{\beta = \hat{\beta},\, v = \hat{v}}$$

16 Application to the Hurdle Model - Truncated Poisson
Standard exponential family distribution:
$$f_{\beta}(y_{ij} \mid v_i) = \exp\bigl( y_{ij}\theta_{ij} - b(\theta_{ij}) + c(y_{ij}) \bigr), \qquad \theta_{ij} = x_{ij}^T \beta + z_{ij}^T v_i$$
Truncated exponential family distribution:
$$f_{\beta}(y_{ij} \mid v_i) = \exp\bigl( y_{ij}\theta_{ij} - b(\theta_{ij}) - \log(M(\theta_{ij})) + c(y_{ij}) \bigr), \qquad \theta_{ij} = x_{ij}^T \beta + z_{ij}^T v_i$$
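
For the zero-truncated Poisson case the ingredients of the truncated family can be written out explicitly. The block below is a worked instance under the usual Poisson parametrisation, with $M(\theta)$ read as the probability of a positive count under the untruncated distribution. The slide does not spell this out, so treat it as an assumption.

```latex
\begin{align*}
\theta &= \log \mu, \qquad b(\theta) = e^{\theta}, \qquad c(y) = -\log y!, \\
M(\theta) &= P(Y > 0) = 1 - \exp\bigl(-e^{\theta}\bigr), \\
f(y \mid \theta) &= \exp\Bigl( y\theta - e^{\theta}
  - \log\bigl(1 - \exp(-e^{\theta})\bigr) - \log y! \Bigr), \qquad y = 1, 2, \ldots
\end{align*}
```

Substituting $\mu = e^{\theta}$ gives $e^{-\mu}\mu^{y} / \{y!\,(1 - e^{-\mu})\}$, the positive-count factor of the hurdle model's probability function.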

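17 Truncated Poisson Distribution
Standard weight matrix:
$$W = \left( \frac{\partial \mu}{\partial \eta} \right)^{2} V^{-1}(\mu)$$
Modified weight matrix:
$$\tilde{W} = W + \left( \frac{M''(\theta)}{M(\theta)} - \left( \frac{M'(\theta)}{M(\theta)} \right)^{2} \right) V^{-1}(\mu)\, W$$
Standard adjusted dependent variable:
$$z = \eta + (y - \mu)\, \frac{\partial \eta}{\partial \mu}$$
Modified adjusted dependent variable:
$$\tilde{z} = \eta + \frac{W}{\tilde{W}} \left( y - \mu - \frac{M'(\theta)}{M(\theta)} \right) \frac{\partial \eta}{\partial \mu}$$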
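
The modified weight and adjusted dependent variable can be checked numerically for a single observation. The sketch below assumes a log link, so that eta = theta, V(mu) = mu and dmu/deta = mu, and takes M(theta) = 1 - exp(-exp(theta)). The function name and the value of eta are hypothetical. Under these assumptions the modified weight should equal the variance of the zero-truncated Poisson, which the code verifies by direct summation.

```python
import math


def truncated_poisson_working_quantities(y, eta):
    """IWLS weight and adjusted dependent variable for a zero-truncated Poisson, log link."""
    mu = math.exp(eta)                       # log link: eta = theta, V(mu) = mu, dmu/deta = mu
    M = 1 - math.exp(-mu)                    # M(theta): P(Y > 0) for the untruncated Poisson
    dM = mu * math.exp(-mu)                  # M'(theta)
    d2M = mu * (1 - mu) * math.exp(-mu)      # M''(theta)
    W = mu**2 / mu                           # standard weight (dmu/deta)^2 / V(mu)
    W_mod = W + (d2M / M - (dM / M) ** 2) * W / mu
    z_mod = eta + (W / W_mod) * (y - mu - dM / M) / mu   # deta/dmu = 1/mu
    return W, W_mod, z_mod


eta = 0.3                                    # hypothetical linear predictor
mu = math.exp(eta)
support = range(1, 80)
pmf = [math.exp(-mu) * mu**k / (math.factorial(k) * (1 - math.exp(-mu))) for k in support]
mean = sum(k * p for k, p in zip(support, pmf))
var = sum((k - mean) ** 2 * p for k, p in zip(support, pmf))

W, W_mod, z_mod = truncated_poisson_working_quantities(3, eta)
print(W, W_mod, var, z_mod)
```

When the observed count equals the truncated mean, the score term vanishes and the adjusted dependent variable reduces to eta.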

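18 Application - Hurdle Model
Bernoulli model:
$$\operatorname{logit}[P(y_{ij} > 0)] = x_{ij}^T \beta + v_i$$
$$u_i = \frac{\exp(v_i)}{1 + \exp(v_i)}, \qquad u_i \sim \mathrm{Beta}\!\left( \frac{1}{\lambda_{10}}, \frac{1}{\lambda_{10}} \right)$$
$$\log(\lambda_{10}) = \gamma_{100} + \gamma_{101}\, \mathrm{Female}_i$$
Truncated Poisson model:
$$\log\!\left( \frac{\mu_{ij}}{\mathrm{days}_{ij}} \right) = x_{ij}^T \beta + v_i + v_{ij}$$
$$v_i \sim N(0, \lambda_{20}), \qquad u_{ij} = \exp(v_{ij}), \qquad u_{ij} \sim \mathrm{Gamma}\!\left( \frac{1}{\lambda_{21}}, \lambda_{21} \right)$$
$$\log(\lambda_{21}) = \gamma_{210} + \gamma_{211}\, \mathrm{Female}_i$$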

19 Application - Bernoulli Model
Table columns: Effect, Estimate, P-value (numeric entries missing from the transcription)
Effects: Intercept, Female, Time, Time*Trt(Tai-Chi), Age
Dispersion parameters: $\gamma_{100}$, $\gamma_{101}$ (Female)

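20 Application - Truncated Poisson Model
Table columns: Effect, Estimate, P-value (numeric entries missing from the transcription, except as noted)
Effects: Intercept (P-value <0.001), Female, Time, Time*Trt(Tai-Chi)
Dispersion parameters: $\gamma_{200}$, $\gamma_{210}$, $\gamma_{211}$ (Female)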

21 Further Research
- Joint estimation of the binary and truncated Poisson models within the H-likelihood framework
- Correlated random effects
- Non-normal random components - copula approach
- How to make the correlation depend on covariates
- Number of random components greater than 2 - copula - covariates?

22 Thank you for your attention!

Package HGLMMM for Hierarchical Generalized Linear Models

Package HGLMMM for Hierarchical Generalized Linear Models Package HGLMMM for Hierarchical Generalized Linear Models Marek Molas Emmanuel Lesaffre Erasmus MC Erasmus Universiteit - Rotterdam The Netherlands ERASMUSMC - Biostatistics 20-04-2010 1 / 52 Outline General

More information

Including historical data in the analysis of clinical trials using the modified power priors: theoretical overview and sampling algorithms

Including historical data in the analysis of clinical trials using the modified power priors: theoretical overview and sampling algorithms Including historical data in the analysis of clinical trials using the modified power priors: theoretical overview and sampling algorithms Joost van Rosmalen 1, David Dejardin 2,3, and Emmanuel Lesaffre

More information

Practical considerations for survival models

Practical considerations for survival models Including historical data in the analysis of clinical trials using the modified power prior Practical considerations for survival models David Dejardin 1 2, Joost van Rosmalen 3 and Emmanuel Lesaffre 1

More information

Hierarchical Generalized Linear Model Approach For Estimating Of Working Population In Kepulauan Riau Province

Hierarchical Generalized Linear Model Approach For Estimating Of Working Population In Kepulauan Riau Province IOP Conference Series: Earth and Environmental Science PAPER OPEN ACCESS Hierarchical Generalized Linear Model Approach For Estimating Of Working Population In Kepulauan Riau Province To cite this article:

More information

High-Throughput Sequencing Course

High-Throughput Sequencing Course High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1 Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson

More information

A Joint Model with Marginal Interpretation for Longitudinal Continuous and Time-to-event Outcomes

A Joint Model with Marginal Interpretation for Longitudinal Continuous and Time-to-event Outcomes A Joint Model with Marginal Interpretation for Longitudinal Continuous and Time-to-event Outcomes Achmad Efendi 1, Geert Molenberghs 2,1, Edmund Njagi 1, Paul Dendale 3 1 I-BioStat, Katholieke Universiteit

More information

Research Projects. Hanxiang Peng. March 4, Department of Mathematical Sciences Indiana University-Purdue University at Indianapolis

Research Projects. Hanxiang Peng. March 4, Department of Mathematical Sciences Indiana University-Purdue University at Indianapolis Hanxiang Department of Mathematical Sciences Indiana University-Purdue University at Indianapolis March 4, 2009 Outline Project I: Free Knot Spline Cox Model Project I: Free Knot Spline Cox Model Consider

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

Discrete Choice Modeling

Discrete Choice Modeling [Part 6] 1/55 0 Introduction 1 Summary 2 Binary Choice 3 Panel Data 4 Bivariate Probit 5 Ordered Choice 6 7 Multinomial Choice 8 Nested Logit 9 Heterogeneity 10 Latent Class 11 Mixed Logit 12 Stated Preference

More information

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes

Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Generalized Linear Models for Count, Skewed, and If and How Much Outcomes Today s Class: Review of 3 parts of a generalized model Models for discrete count or continuous skewed outcomes Models for two-part

More information

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

STA216: Generalized Linear Models. Lecture 1. Review and Introduction STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

PQL Estimation Biases in Generalized Linear Mixed Models

PQL Estimation Biases in Generalized Linear Mixed Models PQL Estimation Biases in Generalized Linear Mixed Models Woncheol Jang Johan Lim March 18, 2006 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

Lecture 2: Categorical Variable. A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti

Lecture 2: Categorical Variable. A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti Lecture 2: Categorical Variable A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti 1 Categorical Variable Categorical variable is qualitative

More information

ABSTRACT. The First Order Autoregressive (AR(1)) Mixed Effects Zero Inflated Poisson (ZIP)

ABSTRACT. The First Order Autoregressive (AR(1)) Mixed Effects Zero Inflated Poisson (ZIP) ABSTRACT Title of Dissertation: FIRST ORDER AUTOREGRESSIVE MIXED EFFECTS ZERO INFLATED POISSON MODEL FOR LONGITUDINAL DATA A BAYESIAN APPROACH Chin-Fang Weng, Doctor of Philosophy, 2014 Dissertation Directed

More information

Introduction (Alex Dmitrienko, Lilly) Web-based training program

Introduction (Alex Dmitrienko, Lilly) Web-based training program Web-based training Introduction (Alex Dmitrienko, Lilly) Web-based training program http://www.amstat.org/sections/sbiop/webinarseries.html Four-part web-based training series Geert Verbeke (Katholieke

More information

CS Lecture 19. Exponential Families & Expectation Propagation

CS Lecture 19. Exponential Families & Expectation Propagation CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces

More information

Multilevel Methodology

Multilevel Methodology Multilevel Methodology Geert Molenberghs Interuniversity Institute for Biostatistics and statistical Bioinformatics Universiteit Hasselt, Belgium geert.molenberghs@uhasselt.be www.censtat.uhasselt.be Katholieke

More information

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,

More information

Overdispersion Workshop in generalized linear models Uppsala, June 11-12, Outline. Overdispersion

Overdispersion Workshop in generalized linear models Uppsala, June 11-12, Outline. Overdispersion Biostokastikum Overdispersion is not uncommon in practice. In fact, some would maintain that overdispersion is the norm in practice and nominal dispersion the exception McCullagh and Nelder (1989) Overdispersion

More information

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science.

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science. Texts in Statistical Science Generalized Linear Mixed Models Modern Concepts, Methods and Applications Walter W. Stroup CRC Press Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint

More information

Chapter 4: Generalized Linear Models-II

Chapter 4: Generalized Linear Models-II : Generalized Linear Models-II Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

More information

Generalized Linear Models. Kurt Hornik

Generalized Linear Models. Kurt Hornik Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Practical Considerations Surrounding Normality

Practical Considerations Surrounding Normality Practical Considerations Surrounding Normality Prof. Kevin E. Thorpe Dalla Lana School of Public Health University of Toronto KE Thorpe (U of T) Normality 1 / 16 Objectives Objectives 1. Understand the

More information

The combined model: A tool for simulating correlated counts with overdispersion. George KALEMA Samuel IDDI Geert MOLENBERGHS

The combined model: A tool for simulating correlated counts with overdispersion. George KALEMA Samuel IDDI Geert MOLENBERGHS The combined model: A tool for simulating correlated counts with overdispersion George KALEMA Samuel IDDI Geert MOLENBERGHS Interuniversity Institute for Biostatistics and statistical Bioinformatics George

More information

Lecture 6: Gaussian Mixture Models (GMM)

Lecture 6: Gaussian Mixture Models (GMM) Helsinki Institute for Information Technology Lecture 6: Gaussian Mixture Models (GMM) Pedram Daee 3.11.2015 Outline Gaussian Mixture Models (GMM) Models Model families and parameters Parameter learning

More information

Generalized Multilevel Models for Non-Normal Outcomes

Generalized Multilevel Models for Non-Normal Outcomes Generalized Multilevel Models for Non-Normal Outcomes Topics: 3 parts of a generalized (multilevel) model Models for binary, proportion, and categorical outcomes Complications for generalized multilevel

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Count and Duration Models

Count and Duration Models Count and Duration Models Stephen Pettigrew April 2, 2015 Stephen Pettigrew Count and Duration Models April 2, 2015 1 / 1 Outline Stephen Pettigrew Count and Duration Models April 2, 2015 2 / 1 Logistics

More information

Type I and type II error under random-effects misspecification in generalized linear mixed models Link Peer-reviewed author version

Type I and type II error under random-effects misspecification in generalized linear mixed models Link Peer-reviewed author version Type I and type II error under random-effects misspecification in generalized linear mixed models Link Peer-reviewed author version Made available by Hasselt University Library in Document Server@UHasselt

More information

A CASE STUDY IN HANDLING OVER-DISPERSION IN NEMATODE COUNT DATA SCOTT EDWIN DOUGLAS KREIDER. B.S., The College of William & Mary, 2008 A THESIS

A CASE STUDY IN HANDLING OVER-DISPERSION IN NEMATODE COUNT DATA SCOTT EDWIN DOUGLAS KREIDER. B.S., The College of William & Mary, 2008 A THESIS A CASE STUDY IN HANDLING OVER-DISPERSION IN NEMATODE COUNT DATA by SCOTT EDWIN DOUGLAS KREIDER B.S., The College of William & Mary, 2008 A THESIS submitted in partial fulfillment of the requirements for

More information

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,

More information

Web-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros

Web-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros Web-based Supplementary Material for A Two-Part Joint Model for the Analysis of Survival and Longitudinal Binary Data with excess Zeros Dimitris Rizopoulos, 1 Geert Verbeke, 1 Emmanuel Lesaffre 1 and Yves

More information

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random STA 216: GENERALIZED LINEAR MODELS Lecture 1. Review and Introduction Much of statistics is based on the assumption that random variables are continuous & normally distributed. Normal linear regression

More information

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite

More information

Count and Duration Models

Count and Duration Models Count and Duration Models Stephen Pettigrew April 2, 2014 Stephen Pettigrew Count and Duration Models April 2, 2014 1 / 61 Outline 1 Logistics 2 Last week s assessment question 3 Counts: Poisson Model

More information

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models:

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models: Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models: Marginal models: based on the consequences of dependence on estimating model parameters.

More information

A strategy for modelling count data which may have extra zeros

A strategy for modelling count data which may have extra zeros A strategy for modelling count data which may have extra zeros Alan Welsh Centre for Mathematics and its Applications Australian National University The Data Response is the number of Leadbeater s possum

More information

Generalized linear mixed models for dependent compound risk models

Generalized linear mixed models for dependent compound risk models Generalized linear mixed models for dependent compound risk models Emiliano A. Valdez joint work with H. Jeong, J. Ahn and S. Park University of Connecticut ASTIN/AFIR Colloquium 2017 Panama City, Panama

More information

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL Intesar N. El-Saeiti Department of Statistics, Faculty of Science, University of Bengahzi-Libya. entesar.el-saeiti@uob.edu.ly

More information

Generalized Linear Models 1

Generalized Linear Models 1 Generalized Linear Models 1 STA 2101/442: Fall 2012 1 See last slide for copyright information. 1 / 24 Suggested Reading: Davison s Statistical models Exponential families of distributions Sec. 5.2 Chapter

More information

General Regression Model

General Regression Model Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical

More information

Probability and Estimation. Alan Moses

Probability and Estimation. Alan Moses Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.

More information

Topic 12 Overview of Estimation

Topic 12 Overview of Estimation Topic 12 Overview of Estimation Classical Statistics 1 / 9 Outline Introduction Parameter Estimation Classical Statistics Densities and Likelihoods 2 / 9 Introduction In the simplest possible terms, the

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Dirichlet process Bayesian clustering with the R package PReMiuM

Dirichlet process Bayesian clustering with the R package PReMiuM Dirichlet process Bayesian clustering with the R package PReMiuM Dr Silvia Liverani Brunel University London July 2015 Silvia Liverani (Brunel University London) Profile Regression 1 / 18 Outline Motivation

More information

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect

More information

Two-Part and Related Regression Models for Longitudinal Data

Two-Part and Related Regression Models for Longitudinal Data ANNUAL REVIEWS Further Click here to view this article's online features: Download figures as PPT slides Navigate linked references Download citations Explore related articles Search keywords Annu. Rev.

More information

The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference

The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference An application to longitudinal modeling Brianna Heggeseth with Nicholas Jewell Department of Statistics

More information

Personalized Treatment Selection Based on Randomized Clinical Trials. Tianxi Cai Department of Biostatistics Harvard School of Public Health

Personalized Treatment Selection Based on Randomized Clinical Trials. Tianxi Cai Department of Biostatistics Harvard School of Public Health Personalized Treatment Selection Based on Randomized Clinical Trials Tianxi Cai Department of Biostatistics Harvard School of Public Health Outline Motivation A systematic approach to separating subpopulations

More information

Chapter 4 Multi-factor Treatment Designs with Multiple Error Terms 93

Chapter 4 Multi-factor Treatment Designs with Multiple Error Terms 93 Contents Preface ix Chapter 1 Introduction 1 1.1 Types of Models That Produce Data 1 1.2 Statistical Models 2 1.3 Fixed and Random Effects 4 1.4 Mixed Models 6 1.5 Typical Studies and the Modeling Issues

More information

Introduction to mtm: An R Package for Marginalized Transition Models

Introduction to mtm: An R Package for Marginalized Transition Models Introduction to mtm: An R Package for Marginalized Transition Models Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington 1 Introduction Marginalized transition

More information

Introduction to Generalized Linear Models

Introduction to Generalized Linear Models Introduction to Generalized Linear Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2018 Outline Introduction (motivation

More information

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two

More information

Generalized Linear Models Introduction

Generalized Linear Models Introduction Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,

More information

Mohammed. Research in Pharmacoepidemiology National School of Pharmacy, University of Otago

Mohammed. Research in Pharmacoepidemiology National School of Pharmacy, University of Otago Mohammed Research in Pharmacoepidemiology (RIPE) @ National School of Pharmacy, University of Otago What is zero inflation? Suppose you want to study hippos and the effect of habitat variables on their

More information

STAT 705 Generalized linear mixed models

STAT 705 Generalized linear mixed models STAT 705 Generalized linear mixed models Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 24 Generalized Linear Mixed Models We have considered random

More information

STAT 302 Introduction to Probability Learning Outcomes. Textbook: A First Course in Probability by Sheldon Ross, 8 th ed.

STAT 302 Introduction to Probability Learning Outcomes. Textbook: A First Course in Probability by Sheldon Ross, 8 th ed. STAT 302 Introduction to Probability Learning Outcomes Textbook: A First Course in Probability by Sheldon Ross, 8 th ed. Chapter 1: Combinatorial Analysis Demonstrate the ability to solve combinatorial

More information

Lecture 7 Time-dependent Covariates in Cox Regression

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the

More information
