Diagnostics in Poisson Regression. Models - Residual Analysis

Similar documents
Lecture 6: Introduction to Linear Regression

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Statistics for Business and Economics

Statistics for Economics & Business

Negative Binomial Regression

28. SIMPLE LINEAR REGRESSION III

18. SIMPLE LINEAR REGRESSION III

/ n ) are compared. The logic is: if the two

Chapter 13: Multiple Regression

Linear regression. Regression Models. Chapter 11 Student Lecture Notes Regression Analysis is the

Chapter 11: Simple Linear Regression and Correlation

STAT 3008 Applied Regression Analysis

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected.

Hydrological statistics. Hydrological statistics and extremes

[The following data appear in Wooldridge Q2.3.] The table below contains the ACT score and college GPA for eight college students.

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 14 Simple Linear Regression

Comparison of Regression Lines

Introduction to Generalized Linear Models

Basic Business Statistics, 10/e

x i1 =1 for all i (the constant ).

Correlation and Regression

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

F8: Heteroscedasticity

17 - LINEAR REGRESSION II

Economics 130. Lecture 4 Simple Linear Regression Continued

Learning Objectives for Chapter 11

January Examinations 2015

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression

Statistics II Final Exam 26/6/18

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

Statistics MINITAB - Lab 2

a. (All your answers should be in the letter!

Chapter 4: Regression With One Regressor

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε

Professor Chris Murray. Midterm Exam

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9

Reminder: Nested models. Lecture 9: Interactions, Quadratic terms and Splines. Effect Modification. Model 1

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

Chapter 15 Student Lecture Notes 15-1

8/25/17. Data Modeling. Data Modeling. Data Modeling. Patrice Koehl Department of Biological Sciences National University of Singapore

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management

Topic 7: Analysis of Variance

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

Two-factor model. Statistical Models. Least Squares estimation in LM two-factor model. Rats

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA APPLIED STATISTICS PAPER I

DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes

STATISTICS QUESTIONS. Step by Step Solutions.

ECON 351* -- Note 23: Tests for Coefficient Differences: Examples Introduction. Sample data: A random sample of 534 paid employees.

Polynomial Regression Models

Scatter Plot x

Linear Regression Analysis: Terminology and Notation

Tests of Single Linear Coefficient Restrictions: t-tests and F-tests. 1. Basic Rules. 2. Testing Single Linear Coefficient Restrictions

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

EXAMINATION. N0028N Econometrics. Luleå University of Technology. Date: (A1016) Time: Aid: Calculator and dictionary

Chapter 15 - Multiple Regression

An R implementation of bootstrap procedures for mixed models

STK4080/9080 Survival and event history analysis

Logistic regression with one predictor. STK4900/ Lecture 7. Program

Module Contact: Dr Susan Long, ECO Copyright of the University of East Anglia Version 1

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis

Basically, if you have a dummy dependent variable you will be estimating a probability.

Properties of Least Squares

BIO Lab 2: TWO-LEVEL NORMAL MODELS with school children popularity data

( )( ) [ ] [ ] ( ) 1 = [ ] = ( ) 1. H = X X X X is called the hat matrix ( it puts the hats on the Y s) and is of order n n H = X X X X.

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

Outline. Zero Conditional mean. I. Motivation. 3. Multiple Regression Analysis: Estimation. Read Wooldridge (2013), Chapter 3.

Unit 10: Simple Linear Regression and Correlation

Chapter 14: Logit and Probit Models for Categorical Response Variables

IV. Modeling a Mean: Simple Linear Regression

Question 1 carries a weight of 25%; question 2 carries 20%; question 3 carries 25%; and question 4 carries 30%.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

LOGIT ANALYSIS. A.K. VASISHT Indian Agricultural Statistics Research Institute, Library Avenue, New Delhi

Chapter 14 Simple Linear Regression Page 1. Introduction to regression analysis 14-2

Regression with limited dependent variables. Professor Bernard Fingleton

Problem of Estimation. Ordinary Least Squares (OLS) Ordinary Least Squares Method. Basic Econometrics in Transportation. Bivariate Regression Analysis

ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE)

Measuring the Strength of Association

Interpreting Slope Coefficients in Multiple Linear Regression Models: An Example

Introduction to Regression

The SAS program I used to obtain the analyses for my answers is given below.

Lecture 4 Hypothesis Testing

QUASI-LIKELIHOOD APPROACH TO RATER AGREEMENT PLUS LINEAR BY LINEAR ASSOCIATION MODEL FOR ORDINAL CONTINGENCY TABLES

Introduction to Analysis of Variance (ANOVA) Part 1

Designing a Pseudo R-Squared Goodness-of-Fit Measure in Generalized Linear Models

Composite Hypotheses testing

Financing Innovation: Evidence from R&D Grants

Reduced slides. Introduction to Analysis of Variance (ANOVA) Part 1. Single factor

[ ] λ λ λ. Multicollinearity. multicollinearity Ragnar Frisch (1934) perfect exact. collinearity. multicollinearity. exact

3/3/2014. CDS M Phil Econometrics. Vijayamohanan Pillai N. CDS Mphil Econometrics Vijayamohan. 3-Mar-14. CDS M Phil Econometrics.

Biostatistics. Chapter 11 Simple Linear Correlation and Regression. Jing Li

Transcription:

Dagnostcs n Posson Regresson Models - Resdual Analyss 1

Outlne Dagnostcs n Posson Regresson Models - Resdual Analyss Example 3: Recall of Stressful Events contnued 2

Resdual Analyss Resduals represent varaton n the data that cannot be explaned by the model. Resdual plots useful for dscoverng patterns, outlers or msspecfcatons of the model. Systematc patterns dscovered may suggest how to reformulate the model. If the resduals exhbt no pattern, then ths s a good ndcaton that the model s approprate for the partcular data. 3

Types of Resduals for Posson Regresson Models Raw resduals: O E Pearson resduals: (or standardsed) O E E Adjusted resduals O E (preferred): SD(O E ) S D ( O - E ) = E ( 1 - E / n ) 4

If H 0 s true, the adjusted resduals have a standard normal N(0,1) dstrbuton for large samples. Back to the Recall of Stressful Events example 5

Example 3: Recall of Stressful Events Data month adjusted resdual month adjusted resdual 1 2.46 10 0.66 2 1.02 11-0.42 3 2.10 12 0.30 4 3.18 13 1.02 5-1.14 14-1.86 6 1.02 15-0.78 7 0.66 16-2.58 8-1.50 17-2.58 9-0.06 18-1.50 6

Example 3: Recall of Stressful Events If the adjusted resduals follow N(0,1), we expect 18 0.05 = 0.9 1 adjusted resdual larger than 1.96 or smaller than -1.96 Months 1, 3, 4 Months 16, 17 postve adjusted resduals negatve adjusted resduals More lkely to report recent events (postve resdual: means observed s larger than expected,.e. more lkely to report a stressful event n a months mmedately pror to ntervew) 7

Example 3: Recall of Stressful Events Plot of adjusted resduals by month shows downward trend. 8

Normal Q-Q Plot Probablty plot, whch s a graphcal method for comparng two probablty dstrbutons by plottng ther quantles aganst each other (Q stands for Quantles,.e. quantle aganst quantle) Plots observed quantles aganst expected quantles, hence plot quantles of adjusted resduals aganst the quantles of the standard normal. Ponts should le close to y = x lne, f adjusted resduals are N(0,1). 9

10

Conclusons Dvergence n the tals from straght lne Strong evdence that equprobable model does not ft the data. More lkely to report recent events. Such a tendency would result f respondents were more lkely to remember recent events than dstant events. So, use another model. Whch one? Let us explore a Posson tme trend model (a Posson model wth a covarate) 11

Posson Regresson Model wth a Covarate Posson Tme Trend Model 12

Outlne Posson Tme Trend Model Posson regresson model wth a covarate Example 3: Recall of Stressful Events contnued 13

Posson Tme Trend Model ncludng a covarate y = number of events n month. The log of the expected count s a lnear functon of tme before ntervew: log( μ ) α β 1,, C where µ = E(y ) s the expected number of events n month. Meanng: model assumes response varable Y has a Posson dstrbuton, and assumes the logarthm of ts expected value can be modelled by a lnear combnaton of unknown parameters (here 2). 14

Posson Tme Trend Model ncludng a covarate For model log( μ ) α β 1,, C β = 0 equprobable model (no effect of the covarate). β > 0 the expected count µ ncreases as month () ncreases. β < 0 the expected count µ decreases as month () ncreases. 15

Posson Model wth one covarate In general, the form of a Posson model wth one explanatory varable (X) s: lo g ( m ) = a + b x Explanatory varable X can be categorcal or contnuous. In ths example month s the explanatory varable X. 16

Posson Model: Parameter Estmates Predcted Values and Resduals Parameters estmated va maxmum lkelhood estmaton and. αˆ These estmates are used to compute predcted values: βˆ log( μˆ ) αˆ βˆ μˆ exp( αˆ βˆ ) The adjusted resduals are μˆ y (1 μˆ μˆ /n) where n C 1 y 17

Posson Model: Goodness of Ft Statstcs The goodness-of-ft test statstcs are Pearson Ch-squared Test Statstc 2 C 1 y μˆ μˆ 2 Lkelhood Rato Test Statstcs (Devance) L 2 2 C 1 y log y μˆ 18

The null hypothess s H 0 : The model y ~ Posson(µ ) wth log(µ ) = α+β holds. Returnng to our example μˆ expected counts the model) are (under 19

Example 3: Recall of Stressful Events Month Count Obs Count Exp Month Count Obs Count Exp 1 15 15.1 10 10 7.1 2 11 13.9 11 7 6.5 3 14 12.8 12 9 6.0 4 17 11.8 13 11 5.5 5 5 10.8 14 3 5.1 6 11 9.9 15 6 4.7 7 10 9.1 16 1 4.3 8 4 8.4 17 1 4.0 9 8 7.7 18 4 3.6 Expected value calculated usng the model: mˆ = e x p ( aˆ + b ˆ ) 20

If H 0 s true X 2 ~ χ 2 df L 2 ~ χ 2 df where df = no. of cells no. of model parameters = C 2 = 18 2 = 16 X 2 = 22.71 wth 16 df. p-value = 0.1216 do not reject H 0. L 2 = 24.57 wth 16 df. p-value = 0.0778 do not reject H 0. 21

Concluson The Posson tme trend model s consstent wth the data. Partcpants are more lkely to report recent events. 22

Adjusted Resduals month adjusted resdual month adjusted resdual 1-0.05 10 1.11 2-0.89 11 0.18 3 0.36 12 1.26 4 1.61 13 2.42 5-1.86 14-0.98 6 0.34 15 0.64 7 0.28 16-1.70 8-1.58 17-1.60 9-0.09 18 0.20 23

Adjusted Resduals If the adjusted resduals follow N(0,1), we expect 18 0.05 = 0.9 1 adjusted resdual larger than 1.96 or smaller than -1.96. So, t s not surprsng that we see one resdual (month 13: 2.42) exceed 1.96 (.e. no evdence aganst H 0 ). Adjusted resduals show no obvous pattern based on plot by month. 24

Posson Regresson: Model Interpretaton Ftted values help wth nterpretaton Ftted values are obtaned by The coeffcent s nterpreted whether the explanatory varable s a categorcal or a contnuous varable. yˆ = mˆ = e x p ( aˆ + b ˆ ) bˆ 25

Posson Regresson: Model Interpretaton cont. For a bnary explanatory varable, where X=0 means absence and X=1 means presence, the rate rato (RR) for presence vs. absence RR E y P r e s e n c e E y x 1 E y A b s e n c e E y x 0 e x p ( ) For a contnuous explanatory varable a one unt ncrease n X wll result n a multplcatve effect of on μ e x p ( ) 26

Confdence Interval for the Slope Estmate of the slope s βˆ = -0.0838 wth estmated standard error = 0.0168. Therefore a 95% CI for β s: = -0.0838 ± 1.96 х 0.0168 = (-0.117 ; -0.051 ) (therefore negatve, CI does not nclude 0), the expected count µ decreases as month () ncreases se( βˆ ) 27

Estmated Change To add nterpretaton, the estmated change (from one month to the next) n proportonate terms s μˆ μˆ μˆ μˆ 1 1 Ê (Y ) exp( ) μˆ 1 exp( αˆ βˆ ( 1)) 1 exp( αˆ βˆ ) μˆ αˆ βˆ exp( βˆ ) 1 28

Estmated Change For our example Recall of Stressful Events ths means: mˆ - + 1 mˆ mˆ = e x p (- 0. 0 8 3 8 ) - 1 = 0.920-1 = - 0.08.e. the estmated change s an 8% decrease per month. 29

CI for the Estmated Change Lower Bound Upper Bound β 0.117 0.051 exp(β) exp(-0.117) = 0.89 exp(-0.051) = 0.95 exp(β) - 1 0.89-1 = -0.11 0.95-1 = -0.05-11% -5% Estmated change s -8% per month wth a 95% CI of (-11%, -5%). The number of stressful events decreases. 30

References Agrest, A (2013) Categorcal Data Analyss, 3 rd ed, New Jersey, Wley. Agrest, A. (2007) An Introducton to Categorcal Data Analyss. 2 nd ed, Wley. Haberman, S. J. (1978) Analyss of Qualtatve Data, Volume 1, Academc Press, New York. p. 3 for Recall of stressful events example. p. 87 for Sucdes example. 31

References Agrest, A (2013) Categorcal Data Analyss, 3 rd ed, New Jersey, Wley. Agrest, A. (2007) An Introducton to Categorcal Data Analyss. 2 nd ed, Wley. Haberman, S. J. (1978) Analyss of Qualtatve Data, Volume 1, Academc Press, New York. p. 3 for Recall of stressful events example. p. 87 for Sucdes example. 32