Generalized Linear Models I


Statistics 203: Introduction to Regression and Analysis of Variance Generalized Linear Models I Jonathan Taylor - p. 1/16

Today's class: Poisson regression. Residuals for diagnostics. Exponential families. Fisher scoring. - p. 2/16

Poisson regression Response: Y_i ~ Poisson(λ_i), independent. Popular applications: 1. anything with count data; 2. contingency tables. Link functions: 1. g(x) = log(x); 2. g(x) = 1/x. Variance function: V(µ) = µ. - p. 3/16

Canonical link for Poisson In logistic regression, we identified the logit as the canonical link because g'(µ) = 1/V(µ). Here we have to solve g'(µ) = 1/µ. Therefore, in Poisson regression the canonical link is g(µ) = log µ. - p. 4/16
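As a quick numerical aside (not from the slides), a central difference confirms that g(µ) = log µ solves g'(µ) = 1/V(µ) = 1/µ; the test point µ = 2.5 is arbitrary:

```python
import math

# Numerical check that the canonical Poisson link g(mu) = log(mu)
# satisfies g'(mu) = 1/V(mu) = 1/mu, at an arbitrary test point.
mu, h = 2.5, 1e-6
g_prime = (math.log(mu + h) - math.log(mu - h)) / (2 * h)  # central difference
```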

Contingency tables An m × k contingency table is a table of counts categorized by two categorical variables A and B:

          B(1)  ...  B(k)
  A(1)    Y_11  ...  Y_1k  | Y_1.
   ...     ...        ...  |  ...
  A(m)    Y_m1  ...  Y_mk  | Y_m.
          Y_.1  ...  Y_.k  | Y_..

where Y_ij ~ Poisson(λ_ij). Think of categorizing a population of birds, say, by two variables: one being occurrence (or not) of West Nile virus, the other, which type of bird. Question: is there more West Nile virus in one type of bird? Or, if we sample a bird at random, are the events {has West Nile virus} and {is a bird of type j} independent? This is equivalent to

H_0: E(Y_ij) = λ_{i,A} λ_{j,B}  ⟺  log E(Y_ij) = η_i + γ_j. - p. 5/16

Test for independence Under H_0, the fitted counts are Ŷ_ij = Y_i. Y_.j / Y_.. and Pearson's X² is

X² = Σ_{i,j} (Y_ij − Ŷ_ij)² / Ŷ_ij.

Under H_0, if the λ_ij are not too small, X² is approximately χ² with mk − m − k + 1 = (m−1)(k−1) degrees of freedom. - p. 6/16
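The test above is easy to compute by hand; below is a small pure-Python sketch on a made-up 2×2 table (the counts are hypothetical, chosen only to exercise the formulas):

```python
# Pearson chi-squared test of independence on a toy 2x2 table of counts.
Y = [[30, 10],
     [20, 40]]

m, k = len(Y), len(Y[0])
row = [sum(Y[i]) for i in range(m)]                        # Y_{i.}
col = [sum(Y[i][j] for i in range(m)) for j in range(k)]   # Y_{.j}
total = sum(row)                                           # Y_{..}

# Fitted counts under H_0: Yhat_ij = Y_{i.} Y_{.j} / Y_{..}
Yhat = [[row[i] * col[j] / total for j in range(k)] for i in range(m)]

# Pearson's X^2 and its degrees of freedom
X2 = sum((Y[i][j] - Yhat[i][j]) ** 2 / Yhat[i][j]
         for i in range(m) for j in range(k))
df = (m - 1) * (k - 1)   # = mk - m - k + 1
```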

Poisson deviance It is easy to see that the Poisson deviance is

DEV(µ; Y) = 2 Σ [ Y log(Y/µ) − (Y − µ) ].

Because φ = 1 in a Poisson model, under H_0

2 Σ_{i,j} [ Y_ij log(Y_ij/Ŷ_ij) − (Y_ij − Ŷ_ij) ] ≈ χ²_{(m−1)(k−1)}. - p. 7/16
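The deviance formula can be checked numerically; here is a minimal sketch with made-up counts and fitted means (the convention 0·log 0 = 0 would need separate handling for zero counts):

```python
import math

# Poisson deviance DEV = 2 * sum( Y log(Y/mu) - (Y - mu) )
# on made-up positive counts Y and fitted means mu.
Y  = [3, 7, 2, 8]
mu = [4.0, 6.0, 3.0, 7.0]

dev = 2 * sum(y * math.log(y / m) - (y - m) for y, m in zip(Y, mu))
```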

Overdispersed Poisson If the data truly are Poisson, φ = 1, but not all count data are Poisson. An example that throws off Poisson-ness: clustering, i.e. counting disease occurrence within families and using many families. If the disease has a genetic component, the total number of occurrences will look more like a mixture of Poissons. How do we accommodate this in inference (or test for it)? Use the moment estimate

φ̂ = (1/(n−p)) Σ (Y_i − µ̂_i)² / V(µ̂_i) = X² / (n−p). - p. 8/16
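A short sketch of this moment estimate of φ, with hypothetical counts, fitted means, and parameter count p:

```python
# Moment estimate of the dispersion phi for a Poisson fit:
# phi_hat = X^2 / (n - p), with V(mu) = mu. All numbers are made up.
Y  = [0, 3, 1, 6, 2, 9, 4, 11]
mu = [1.0, 2.0, 2.5, 4.0, 3.0, 6.0, 5.0, 8.0]  # fitted means
p  = 2                                          # parameters in the fit

X2  = sum((y - m) ** 2 / m for y, m in zip(Y, mu))  # Pearson's X^2
phi = X2 / (len(Y) - p)
# phi substantially above 1 would suggest overdispersion
```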

Residuals Pearson's X² can be thought of as a sum of squared weighted residuals, with Pearson residual

r_{i,P} = (Y_i − µ̂_i) / √V(µ̂_i).

The deviance in a GLM can be expressed as a sum of n terms:

DEV(µ̂; Y) = Σ_i DEV(µ̂_i; Y_i).

This leads to the deviance residuals

r_{i,DEV} = sign(Y_i − µ̂_i) √DEV(µ̂_i; Y_i),

which are the default ones in R. Once you have residuals, you can do all the standard diagnostic plots. - p. 9/16
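Both kinds of residuals are straightforward to compute; the sketch below uses made-up counts and fitted means, and recovers Pearson's X² as the sum of squared Pearson residuals:

```python
import math

# Pearson and deviance residuals for a Poisson fit (toy numbers).
Y  = [3, 7, 2, 8]
mu = [4.0, 6.0, 3.0, 7.0]

def sign(x):
    return (x > 0) - (x < 0)

# r_P = (Y - mu) / sqrt(V(mu)), with V(mu) = mu for Poisson
r_pearson = [(y - m) / math.sqrt(m) for y, m in zip(Y, mu)]

# per-observation deviance, then r_dev = sign(Y - mu) * sqrt(DEV_i)
dev_i = [2 * (y * math.log(y / m) - (y - m)) for y, m in zip(Y, mu)]
r_dev = [sign(y - m) * math.sqrt(d) for (y, m), d in zip(zip(Y, mu), dev_i)]

# Pearson's X^2 is the sum of squared Pearson residuals
X2 = sum(r ** 2 for r in r_pearson)
```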

Multivariate Newton-Raphson Univariate root finding: suppose we want to find a root of f : R → R. Start at some x_0 and iterate

x_{n+1} = x_n − f(x_n)/f'(x_n),  n ≥ 0.

Multivariate root finding: suppose we want to find roots of f : R^p → R^p, i.e. {x ∈ R^p : f(x) = 0 ∈ R^p}. Start at some x_0 and iterate

x_{n+1} = x_n − Jf(x_n)^{−1} f(x_n),  n ≥ 0,

where Jf is the Jacobian of f. Convergence to a solution is guaranteed locally, at least assuming that every solution x* has det(Jf(x*)) ≠ 0. - p. 10/16
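The univariate iteration above can be tried on a toy problem, say finding √2 as a root of f(x) = x² − 2:

```python
# Univariate Newton-Raphson: x_{n+1} = x_n - f(x_n)/f'(x_n),
# here for f(x) = x^2 - 2 starting from x_0 = 1.
f      = lambda x: x * x - 2
fprime = lambda x: 2 * x

x = 1.0
for _ in range(20):
    x = x - f(x) / fprime(x)
# x is now sqrt(2) to machine precision
```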

Finding critical points Multivariate critical point finding: suppose we want to find critical points of g : R^p → R. Start at some x_0 and iterate

x_{n+1} = x_n − ∇²g(x_n)^{−1} ∇g(x_n),  n ≥ 0,

where ∇²g(x) is the Hessian of g. Convergence to a critical point is guaranteed locally, assuming det(∇²g(x*)) ≠ 0 for all critical points x*. If g is convex, this converges to the global minimizer; otherwise it can converge to other critical points. - p. 11/16
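For a quadratic g the Hessian is constant, so a single Newton step lands exactly on the critical point; a small sketch (the quadratic is made up for illustration):

```python
# One Newton step minimizes a quadratic exactly:
# g(x, y) = x^2 + x*y + 2*y^2 - 4*x - 5*y.
def grad(x, y):
    return (2 * x + y - 4, x + 4 * y - 5)

H = [[2.0, 1.0], [1.0, 4.0]]               # Hessian of g (constant)
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]

x, y = 0.0, 0.0                            # starting point
gx, gy = grad(x, y)
# (x, y) <- (x, y) - H^{-1} grad, with the 2x2 inverse written out
x -= ( H[1][1] * gx - H[0][1] * gy) / det
y -= (-H[1][0] * gx + H[0][0] * gy) / det
```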

GLM: Fisher scoring In the Poisson model,

∂/∂β_j DEV(µ̂; Y) = Σ_i ∂/∂β_j DEV(µ̂_i; Y_i)
                  = −2 Σ_i (Y_i − µ̂_i)/µ̂_i · ∂µ̂_i/∂β_j
                  = −2 Σ_i (Y_i − µ̂_i)/µ̂_i · X_ij / g'(µ̂_i). - p. 12/16

GLM: Fisher scoring Differentiating once more,

∂²/∂β_k ∂β_j DEV(µ̂; Y)
  = −2 Σ_i ∂/∂β_k [ (Y_i − µ̂_i) / (µ̂_i g'(µ̂_i)) ] X_ij
  = 2 Σ_i [ X_ij X_ik / (µ̂_i g'(µ̂_i)²) − (Y_i − µ̂_i) ∂/∂β_k ( 1/(µ̂_i g'(µ̂_i)) ) X_ij ].

Therefore Newton-Raphson needs extra terms that are not in our algorithm! Taking the expectation at β = β̂ kills the terms involving Y_i − µ̂_i:

E_{β=β̂} [ ∂²/∂β_k ∂β_j DEV(µ̂; Y) ] = 2 Σ_i X_ij X_ik / (µ̂_i g'(µ̂_i)²). - p. 13/16

Fisher scoring with the canonical link If g'(µ) = V(µ)^{−1} = µ^{−1}, as for the Poisson canonical link, then

∂/∂β_j DEV(µ̂; Y) = Σ_i ∂/∂β_j DEV(µ̂_i; Y_i) = −2 Σ_i (Y_i − µ̂_i) X_ij

and

∂²/∂β_k ∂β_j DEV(µ̂; Y) = 2 Σ_i X_ik X_ij / g'(µ̂_i) = 2 Σ_i V(µ̂_i) X_ik X_ij,

so the second derivative no longer involves Y: Newton-Raphson and Fisher scoring coincide. - p. 14/16
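With the canonical link the fitting loop becomes short enough to write out in full. The sketch below is a bare-bones Fisher scoring (equivalently, Newton-Raphson) fit of a Poisson regression with log link, on made-up data, for a single intercept-plus-slope model so the 2×2 information matrix can be inverted by hand:

```python
import math

# Fisher scoring for Poisson regression with the canonical log link.
# Data are hypothetical; model is log(mu_i) = b0 + b1 * x_i.
x = [0.0, 0.5, 1.0, 1.5, 2.0]
Y = [1, 1, 3, 5, 8]

b0, b1 = math.log(sum(Y) / len(Y)), 0.0   # start at the intercept-only fit
for _ in range(25):
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    # score (dropping the common factor -2): sum_i (Y_i - mu_i) X_ij
    s0 = sum(y - m for y, m in zip(Y, mu))
    s1 = sum((y - m) * xi for y, m, xi in zip(Y, mu, x))
    # expected information: sum_i V(mu_i) X_ij X_ik, with V(mu) = mu
    i00 = sum(mu)
    i01 = sum(m * xi for m, xi in zip(mu, x))
    i11 = sum(m * xi * xi for m, xi in zip(mu, x))
    det = i00 * i11 - i01 * i01
    # beta <- beta + (information)^{-1} score
    b0 += ( i11 * s0 - i01 * s1) / det
    b1 += (-i01 * s0 + i00 * s1) / det
```

At convergence the score is zero, i.e. Σ(Y_i − µ̂_i) = 0 and Σ(Y_i − µ̂_i)x_i = 0, the Poisson analogues of the normal equations.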

Exponential families We have seen logistic and Poisson regression: when else can we talk about GLMs? Suppose the density or mass function of Y has the form

f(y; θ, φ) = exp( (yθ − b(θ))/a(φ) + c(y, φ) ),

so that

ℓ(θ, φ; Y) = (Yθ − b(θ))/a(φ) + c(Y, φ).

Using the facts

E_{(θ,φ)}( ∂ℓ/∂θ ) = 0 and Var_{(θ,φ)}( ∂ℓ/∂θ ) = −E_{(θ,φ)}( ∂²ℓ/∂θ² ),

we see that

E(Y) = b'(θ) and Var(Y) = b''(θ) a(φ).

Poisson: b(θ) = e^θ, a(φ) = 1. - p. 15/16
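These moment identities can be checked directly for the Poisson case, where b(θ) = e^θ and a(φ) = 1 give E(Y) = Var(Y) = e^θ; the value θ = log 2.5 is arbitrary:

```python
import math

# Check E(Y) = b'(theta) and Var(Y) = b''(theta) a(phi) for Poisson,
# where b(theta) = e^theta, a(phi) = 1, so both moments equal e^theta.
theta = math.log(2.5)                     # so mu = e^theta = 2.5

# Poisson mass function in exponential-family form:
# exp(y*theta - e^theta) / y!  =  mu^y e^{-mu} / y!
pmf = lambda y: math.exp(y * theta - math.exp(theta)) / math.factorial(y)

mean = sum(y * pmf(y) for y in range(60))             # truncated sum; tail
var  = sum((y - mean) ** 2 * pmf(y) for y in range(60))  # mass is negligible
```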

Canonical link The variance function is V(µ) = b''( (b')^{−1}(µ) ). The canonical link is g = (b')^{−1}. Why? Because

g'(µ) = 1 / b''( (b')^{−1}(µ) ) = 1 / V(µ).

Poisson: g(x) = log x, as before. - p. 16/16