Advanced Quantitative Methods: limited dependent variables

Similar documents
Multiple regression: Categorical dependent variables

Data Analytics for Social Science

Single-level Models for Binary Responses

Linear Regression With Special Variables

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Homework 1 Solutions

STAT5044: Regression and Anova

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

Review of Multinomial Distribution If n trials are performed: in each trial there are J > 2 possible outcomes (categories) Multicategory Logit Models

Generalized Linear Models Introduction

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Logit Regression and Quantities of Interest

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

Linear Regression Models P8111

Logit Regression and Quantities of Interest

disc choice5.tex; April 11, ffl See: King - Unifying Political Methodology ffl See: King/Tomz/Wittenberg (1998, APSA Meeting). ffl See: Alvarez

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes 1

Generalized logit models for nominal multinomial responses. Local odds ratios

Goals. PSCI6000 Maximum Likelihood Estimation Multiple Response Model 1. Multinomial Dependent Variable. Random Utility Model

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

Generalized Linear Models I

Classification. Chapter Introduction. 6.2 The Bayes classifier

Econometrics II. Seppo Pynnönen. Spring Department of Mathematics and Statistics, University of Vaasa, Finland

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Advanced Quantitative Methods: ordinary least squares

Ch 6: Multicategory Logit Models

Linear Regression. Data Model. β, σ 2. Process Model. ,V β. ,s 2. s 1. Parameter Model

Can a Pseudo Panel be a Substitute for a Genuine Panel?

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Generalised linear models. Response variable can take a number of different formats

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

Post-Estimation Uncertainty

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

GOV 2001/ 1002/ E-200 Section 7 Zero-Inflated models and Intro to Multilevel Modeling 1

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

Models for Heterogeneous Choices

Logistic & Tobit Regression

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

Categorical data analysis Chapter 5

Lecture 14: Introduction to Poisson Regression

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46

Generalized linear models

Applied Health Economics (for B.Sc.)

Applied Econometrics Lecture 1

Poisson regression: Further topics

Maximum Likelihood and. Limited Dependent Variable Models

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

BIOS 625 Fall 2015 Homework Set 3 Solutions

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Regression with Qualitative Information. Part VI. Regression with Qualitative Information

Visualizing Inference

12 Modelling Binomial Response Data

Visualizing Inference

Semiparametric Generalized Linear Models

STA 216, GLM, Lecture 16. October 29, 2007

Lecture-19: Modeling Count Data II

GOV 2001/ 1002/ E-2001 Section 10 1 Duration II and Matching

High-Throughput Sequencing Course

Generalized Linear Models for Non-Normal Data

12 Generalized linear models

Lecture 41 Sections Mon, Apr 7, 2008

Practice Examination # 3

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Generalized Linear Modeling - Logistic Regression

Generalized Linear Models (GLZ)

Loglinear models. STAT 526 Professor Olga Vitek

STAT 7030: Categorical Data Analysis

Splitting a predictor at the upper quarter or third and the lower quarter or third

Generalized linear models

Gauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA

Chapter 11: Models for Matched Pairs

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

[Chapter 6. Functions of Random Variables]

Generalised Linear Mixed Models

Introduction To Logistic Regression

Contents. Part I: Fundamentals of Bayesian Inference 1

GOV 2001/ 1002/ Stat E-200 Section 8 Ordered Probit and Zero-Inflated Logit

Binary Dependent Variables

Generalized Linear Models

Consider Table 1 (Note connection to start-stop process).

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches

CHAPTER 5. Logistic regression

Week 7: Binary Outcomes (Scott Long Chapter 3 Part 2)

8 Nominal and Ordinal Logistic Regression

Count and Duration Models

Machine Learning. Lecture 3: Logistic Regression. Feng Li.

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Introduction to Statistical Analysis

Regression Discontinuity Designs

Estimated Precision for Predictions from Generalized Linear Models in Sociological Research

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

MULTIVARIATE DISTRIBUTIONS

Applied Linear Statistical Methods

Transcription:

Advanced Quantitative Methods: Limited Dependent Variables I University College Dublin 2 April 2013

1 2 3 4 5

Outline Model Measurement levels 1 2 3 4 5

Components Model Measurement levels Two components of the model: y N(µ,σ 2 ) Stochastic µ = Xβ Systematic

Components Model Measurement levels Two components of the model: y N(µ,σ 2 ) Stochastic µ = Xβ Systematic Generalised version (not necessarily linear): y f(µ,α) Stochastic µ = g(x,β) Systematic (King 1998, 8)

Components Model Measurement levels y f(µ,α) Stochastic µ = g(x,β) Systematic Stochastic component: varies over repeated (hypothetical) observations on the same unit. Systematic component: varies across units, but constant given X. (King 1998, 8)

Uncertainty Model Measurement levels y f(µ,α) Stochastic µ = g(x,β) Systematic Two types of uncertainty: Estimation uncertainty: lack of knowledge about α and β; can be reduced by increasing n.

Uncertainty Model Measurement levels y f(µ,α) Stochastic µ = g(x,β) Systematic Two types of uncertainty: Estimation uncertainty: lack of knowledge about α and β; can be reduced by increasing n. Fundamental uncertainty: represented by stochastic component and exists independent of researcher.

Levels of measurement Model Measurement levels Discreet Continuous party choice - - - Interval Ratio Categories in no particular order (examples in cells)

Levels of measurement Model Measurement levels Discreet Continuous party choice - education level - - Interval Ratio Categories in a specific order (examples in cells)

Levels of measurement Model Measurement levels Discreet Continuous party choice - education level - - Interval how likely to vote... temperature Ratio All values possible (examples in cells)

Levels of measurement Model Measurement levels Discreet Continuous party choice - education level - - Interval how likely to vote... temperature Ratio deaths in war ideological distance All values possible, with a meaningful zero point (examples in cells)

Levels of measurement Model Measurement levels Discreet Continuous party choice - education level - turnout - Interval how likely to vote... temperature Ratio deaths in war ideological distance Two categories, coded as 0 and 1 (examples in cells)

Limited dependent variables Model Measurement levels When a dependent variable is not continuous, or is truncated for some reason, a linear model would lead to implausible predictions.

Limited dependent variables Model Measurement levels When a dependent variable is not continuous, or is truncated for some reason, a linear model would lead to implausible predictions. E.g. regressing whether someone voted on a set of independent variables will give somewhat reasonable estimates (see previous homework), but using these estimates to calculate predictions leads to meaningless predictions.

Limited dependent variables Model Measurement levels When a dependent variable is not continuous, or is truncated for some reason, a linear model would lead to implausible predictions. E.g. regressing whether someone voted on a set of independent variables will give somewhat reasonable estimates (see previous homework), but using these estimates to calculate predictions leads to meaningless predictions. Furthermore, estimating limited dependent variable data with a linear model implies serious heteroscedasticity.

Outline Logistic regression Interpretation Probit regression 1 2 3 4 5

models Logistic regression Interpretation Probit regression models have a dependent variable consisting of two categories.

models Logistic regression Interpretation Probit regression models have a dependent variable consisting of two categories. For example, Vote on a particular law Turning out in an election Approval in a referendum Bankrupt or not

Logistic regression Logistic regression Interpretation Probit regression Stochastic: y i = Bernoulli(π i ) = π y i i (1 π i ) 1 y i

Logistic regression Logistic regression Interpretation Probit regression Stochastic: y i = Bernoulli(π i ) = π y i i (1 π i ) 1 y i Variance of stochastic component: V(y i ) = π i (1 π i )

Logistic regression Logistic regression Interpretation Probit regression Stochastic: y i = Bernoulli(π i ) = π y i i (1 π i ) 1 y i Variance of stochastic component: V(y i ) = π i (1 π i ) Systematic: π i = g(x i β) = 1 1+e x iβ

Logistic distribution Logistic regression Interpretation Probit regression 1/(1 + exp( x)) 0.0 0.2 0.4 0.6 0.8 1.0 x x 0.5x 6 4 2 0 2 4 6 x

Logistic regression Logistic regression Interpretation Probit regression Resulting predictions: are bounded between 0 and 1

Logistic regression Logistic regression Interpretation Probit regression Resulting predictions: are bounded between 0 and 1 show limited effect at extremes

Logistic regression Logistic regression Interpretation Probit regression Resulting predictions: are bounded between 0 and 1 show limited effect at extremes are a smooth, monotone translation from linear prediction X i β

Loglikelihood function Logistic regression Interpretation Probit regression Systematic part: π i = 1 1+e x iβ Stochastic part: P(y i = 1) = π y i i (1 π i ) 1 y i n f(y π i ) = π y i i (1 π i ) 1 y i i=1 l(y) = log = n i=1 π y i i (1 π i ) 1 y i n y i logπ i + i=1 n (1 y i )log(1 π i ) i=1

Logistic regression in R Logistic regression Interpretation Probit regression model <- glm(y ~ x1 + x2 + x3, data=data, family=binomial(link="logit")) summary(model)

Exercise Logistic regression Interpretation Probit regression Using the asiabaro.dta data set, estimate a logistic regression explaining abstention in elections by trust in the government, satisfaction with democracy, gender, urbanisation, and preference for democracy.

Graphical interpretation Logistic regression Interpretation Probit regression Useful for plotting relationship between one x and π, holding the other values of X constant

Graphical interpretation Logistic regression Interpretation Probit regression Useful for plotting relationship between one x and π, holding the other values of X constant (e.g. at the mean, median, etc.).

Graphical interpretation Logistic regression Interpretation Probit regression Useful for plotting relationship between one x and π, holding the other values of X constant (e.g. at the mean, median, etc.). Because the link function g(xβ) is not linear in this case (but instead g(xβ) = 1 1+e Xβ ), the effect of X on y depends on all X.

Graphical: example in R Logistic regression Interpretation Probit regression m <- glm(y ~ x1 + x2 + x3, family=binomial(link="logit")) curve(1/(1+exp(-(coef(m)[1] + coef(m)[2]*x + coef(m)[3]*mean(x2) + coef(m([4]*mean(x3)))), from=0, to=1)

Graphical: example Logistic regression Interpretation Probit regression jitter(vote) 0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0 20000 40000 60000 80000 100000 120000

Fitted values Logistic regression Interpretation Probit regression A second useful way of interpreting logit regression coefficients is by describing typical cases or interesting examples.

Fitted values Logistic regression Interpretation Probit regression A second useful way of interpreting logit regression coefficients is by describing typical cases or interesting examples. Party Amount P(Vote = 1) 95% C.I. Republican $10000.83 [.69,.95] Democrat $10000.44 [.27,.65] Republican $20000.93 [.84,.99] Democrat $20000.69 [.44,.95]

First differences Logistic regression Interpretation Probit regression Basic idea is to calculate and present: ˆπ = g(xβ) g(x β), whereby X differs only in one variable from X.

First differences Logistic regression Interpretation Probit regression Basic idea is to calculate and present: ˆπ = g(xβ) g(x β), whereby X differs only in one variable from X. Variable Values Diff 95% C.I. Party Rep, Dem -.34 [-.53,-.11] Amount $10000, $20000.11 [.03,.20]

Derivatives Logistic regression Interpretation Probit regression δˆπ X j = β jˆπ(1 ˆπ)

Derivatives Logistic regression Interpretation Probit regression δˆπ X j = β jˆπ(1 ˆπ) Hence a quick method to interpret logit coefficients is to divide them by 4 to get the slope at ˆπ =.5.

Presentation Logistic regression Interpretation Probit regression Bottom line: it is much better to present interpretable and understandable inferences, with an indication of the level of uncertainty, than to present simply estimated coefficients.

Presentation Logistic regression Interpretation Probit regression E.g. An increase in automobile support for a Republican senator from $10000 to $20000 in total increases his or her probability to vote for the Corporate Average Fuel Economy standard bill by 11%, give or take 7%, all else equal.

Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities

Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities, e.g. 0-10%, 10-20%, etc.

Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities, e.g. 0-10%, 10-20%, etc. Plot the bins against the number of ones found in this bin.

Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities, e.g. 0-10%, 10-20%, etc. Plot the bins against the number of ones found in this bin. The closer this is to the 45 degree line, the more accurate the predictions of the probabilities.

Predictions check Logistic regression Interpretation Probit regression One method of checking the predictions is to Group the cases in bins based on their predicted probabilities, e.g. 0-10%, 10-20%, etc. Plot the bins against the number of ones found in this bin. The closer this is to the 45 degree line, the more accurate the predictions of the probabilities. In R, you can use the custom function plot.predicted().

Logistic regression Interpretation Probit regression m <- glm(..., family = binomial(link = "logit")) phat <- predict(m, type = "response") pgroups <- cut(phat, seq(0, 1,.1)) hist(phat, xlim = c(0,1)) barplot(prop.table(table(pgroups)), ylim = c(0,1))

Predictions check Logistic regression Interpretation Probit regression Proportion of outcomes positive 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Predicted probability

Predictions check Logistic regression Interpretation Probit regression 0 s correctly predicted 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 1 s correctly predicted

Exercise Logistic regression Interpretation Probit regression Using the logistic previous logistic regression: 1 What is the slope of the regression line with respect to preference for democracy at ˆπ =.5? 2 Calculate predicted differences for urban vs rural respondent. 3 Calculate first differences for urban vs rural and female vs male. 4 Plot predicted probabilities as a function of preference for democracy.

Threshold models Logistic regression Interpretation Probit regression Imagine we have a latent, unobserved variable: y f(µ) µ = Xβ

Threshold models Logistic regression Interpretation Probit regression Imagine we have a latent, unobserved variable: y f(µ) µ = Xβ With the following observation mechanism: { 1 if yi > 0 y i = 0 if yi 0.

Threshold models Logistic regression Interpretation Probit regression If f(µ i ) is the standardised logistic distribution, we replicate the logit model.

Threshold models Logistic regression Interpretation Probit regression If f(µ i ) is the standardised logistic distribution, we replicate the logit model. f(x i β) = e y i µ i (1+e y i µ i ) 2

Threshold models Logistic regression Interpretation Probit regression Alternatively, if f(µ i ) is the cumulative standard normal distribution (σ 2 = 1), we have the probit model.

Threshold models Logistic regression Interpretation Probit regression Alternatively, if f(µ i ) is the cumulative standard normal distribution (σ 2 = 1), we have the probit model. Hence β can now be interpreted as the effect of an increase by 1 in x on y, whereby the unit of y is one standard deviation.

Threshold models Logistic regression Interpretation Probit regression Alternatively, if f(µ i ) is the cumulative standard normal distribution (σ 2 = 1), we have the probit model. Hence β can now be interpreted as the effect of an increase by 1 in x on y, whereby the unit of y is one standard deviation. P(y i = 1 β,x i ) = Φ(x i β), whereby Φ(x) is the cumulative standard normal distribution (i.e. the surface between and x under a normal distribution with µ = 0 and σ 2 = 1).

Threshold models Logistic regression Interpretation Probit regression Alternatively, if f(µ i ) is the cumulative standard normal distribution (σ 2 = 1), we have the probit model. Hence β can now be interpreted as the effect of an increase by 1 in x on y, whereby the unit of y is one standard deviation. P(y i = 1 β,x i ) = Φ(x i β), whereby Φ(x) is the cumulative standard normal distribution (i.e. the surface between and x under a normal distribution with µ = 0 and σ 2 = 1). pnorm(xb); predict(m, type="response")

Threshold models Logistic regression Interpretation Probit regression

Loglikelihood function Logistic regression Interpretation Probit regression Systematic part: π i = Φ(x i β), where Φ(x) is the cumulative standard normal distribution. Stochastic part: P(y i = 1) = π y i i (1 π i ) 1 y i n f(y π i ) = π y i i (1 π i ) 1 y i l(y) = i=1 n y i logπ i + i=1 n (1 y i )log(1 π i ) i=1

Exercise Logistic regression Interpretation Probit regression Repeat the previous estimation using probit instead of logit. 1 Plot predicted probabilities with respect to preference for democracy. 2 Evaluate the predictive performance of this model. 3 Add suitdemoc as a dependent variable to the model. Does the predictive performance improve?

Outline 1 2 3 4 5

Ordered data Models where the dependent variable is categorical, and the categories are in a particular order.

Ordered data Models where the dependent variable is categorical, and the categories are in a particular order. We can thus take a similar latent variable approach.

Ordered probit y i N(µ i,1) µ i = x i β

Ordered probit y i N(µ i,1) µ i = x i β With the following observation mechanism: y i = j if τ j 1 y i τ j

Ordered probit y i N(µ i,1) µ i = x i β With the following observation mechanism: y i = j if τ j 1 y i τ j Or, alternatively, when we use dummy variables for each category: { 1 if τ j 1 yi τ j y ij = 0 otherwise.

Ordered probit y ij = { 1 if τ j 1 y i τ j 0 otherwise.

Ordered probit y ij = { 1 if τ j 1 y i τ j 0 otherwise. P(y i = j β,x i ) = Φ(τ j x i β) Φ(τ j 1 x i β)

Ordered probit

Ordered probit To estimate in R: library(mass) m <- polr(y ~ x1 + x2, method="probit")

Ordered probit To estimate in R: library(mass) m <- polr(y ~ x1 + x2, method="probit") To get predicted probabilities: pnorm(m$zeta[2] - xb) - pnorm(m$zeta[1] - xb) predict(m, type="probs")

Loglikelihood function Assuming y i can take the values 0,1,2,...,p. P(y i = 0) = Φ(τ 1 x i β) P(y i = p) = Φ(x i β τ p ) P(y i = j 0 < y i < p) = Φ(τ j+1 x i β) Φ(τ j x i β)

Loglikelihood function Assuming y i can take the values 0,1,2,...,p. P(y i = 0) = Φ(τ 1 x i β) P(y i = p) = Φ(x i β τ p ) P(y i = j 0 < y i < p) = Φ(τ j+1 x i β) Φ(τ j x i β) l(y) = log(φ(τ 1 x i β))+ log(φ(x i β τ p )) y i =0 y i =p + 0<y i <p log(φ(τ j+1 x i β) Φ(τ j x i β))

Exercise Using the asiabaro.dta data set, estimate a model explaining interest in politics by gender, education, and reliance on fate. 1 Perform t-tests for each of the independent variables. Use: se <- sqrt(diag(vcov(m))) 2 Check the standard errors of the τ values are the categories clearly separated? 3 Calculate the predicted probability for males and females of the respondent being very interested in politics.

Outline 1 2 3 4 5

Multinomial model Here the dependent variable consists of multiple categories, without particular order.

Multinomial model Here the dependent variable consists of multiple categories, without particular order. A latent variable approach will thus not work.

Multinomial model Here the dependent variable consists of multiple categories, without particular order. A latent variable approach will thus not work. The probabilities for a particular case to be in any of those categories still has to be one.

Multinomial logit R function: multinom, in package nnet.

Multinomial logit R function: multinom, in package nnet. m <- multinom(y ~ x1 + x2) summary(m)

Multinomial logit R function: multinom, in package nnet. m <- multinom(y ~ x1 + x2) summary(m) predict(m, type="probs")

Multinomial logit 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k

Multinomial logit 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k To get predicted probabilities for a new dataset (X ) - e.g. one that is constant on all variables but one:

Multinomial logit 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k To get predicted probabilities for a new dataset (X ) - e.g. one that is constant on all variables but one: m <- multinom(y ~ x1 + x2 + x3 + x4) Xb <- X %*% t(coef(m)) denominator <- 1 - rowsums(exp(xb)) probs <- exp(xb) / denominator baseline <- 1 - rowsums(p) p <- cbind(baseline, p)

Loglikelihood function 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k

Loglikelihood function 1 p if j = 1 k=1 P(y i = j β,x i ) = ex i β k e x i β j p if j > 1 k=1 ex i β k l(y) = n p p I(y i = j)x i β j log i=1 j=0 j=0 e x iβ j

Exercise Using data set asiabaro.dta, selecting only data from Taiwan, explain party choice by urbanisation, gender, age, and trust in the police. Estimate a multinomial model and interpret the results.

Outline 1 2 3 4 5

models Here the dependent variable is a count (of events). For example, The number of conflicts in a particular period The number of coups d état in the 1980s The number of visits to a psychologist for a respondent

models Here the dependent variable is a count (of events). For example, The number of conflicts in a particular period The number of coups d état in the 1980s The number of visits to a psychologist for a respondent Note: Truncated at zero (no negative outcome possible)

models Here the dependent variable is a count (of events). For example, The number of conflicts in a particular period The number of coups d état in the 1980s The number of visits to a psychologist for a respondent Note: Truncated at zero (no negative outcome possible) No upper limit

models Here the dependent variable is a count (of events). For example, The number of conflicts in a particular period The number of coups d état in the 1980s The number of visits to a psychologist for a respondent Note: Truncated at zero (no negative outcome possible) No upper limit Limited by time, place, or both

Poisson regression Stochastic: y i Poisson(λ i )

Poisson regression Stochastic: y i Poisson(λ i ) Systematic: λ i = e x iβ

Poisson regression Stochastic: y i Poisson(λ i ) Systematic: λ i = e x iβ Poisson distribution: f(k,λ) = λk e λ k!

Poisson regression Stochastic: y i Poisson(λ i ) Systematic: λ i = e x iβ Poisson distribution: f(k,λ) = λk e λ k! Note that there is no parameter for the variance. This leads to regular underestimation of the variance (overdispersion, most common) or overestimation of the variance (underdispersion).

Poisson regression Stochastic: y i Poisson(λ i ) Systematic: λ i = e x iβ Poisson distribution: f(k,λ) = λk e λ k! Note that there is no parameter for the variance. This leads to regular underestimation of the variance (overdispersion, most common) or overestimation of the variance (underdispersion). glm(y ~ x1 + x2 + x3, family=poisson)

Poisson regression The exponential of the coefficients can be interpreted in a multiplicative sense. E.g. if the coefficient of x 1 is 0.012, then e 0.012 = 1.012 implies that an increase of x 1 by 1 increases y by 1.2%.

Estimating overdispersion The estimated overdispersion of a Poisson model is 1 n k n ( y i ŷ i sd(ŷ i ) )2 = 1 n k i n ( y i e X iβ e X i β )2, i which has a χ 2 n k distribution.

Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 )

Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 ) Systematic: φ = e x iβ = E(y i )

Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 ) Systematic: φ = e xiβ = E(y i ) Variance: V(y X) = φσ 2

Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 ) Systematic: φ = e x iβ = E(y i ) Variance: V(y X) = φσ 2 When σ 2 approaches 1, the negative binomial approaches the Poisson distribution.

Negative binomial The negative binomial model is useful for overdispersed data: Stochastic: y i NegBin(φ,σ 2 ) Systematic: φ = e x iβ = E(y i ) Variance: V(y X) = φσ 2 When σ 2 approaches 1, the negative binomial approaches the Poisson distribution. library(mass) glm.nb(y ~ x1 + x2 + x3)