
EdPsych/Psych/Soc 589
C.J. Anderson

Homework 5: Answer Key

1. Problem 3.18 (page 96 of Agresti).

(a) Assume Y is a Poisson random variable.

Plausible model: E(Y) = µt. The expected number of arrests equals a constant times the number who attend the game. Equivalently, E(Y/t) = µ: the expected rate of arrests is constant over different teams. This model is plausible if those who attend the games of different teams are similar (the same kind of people attend different games).

The model can be written in two ways:

  E(Y) = µt: model with distribution=poisson and link=identity.
  E(Y/t) = µ with offset=log(t): model with distribution=poisson, link=log, and log(E(Y/t)) = log(µ) = α.

Estimated model: Ŷ = .0004t. The estimated expected rate of arrests equals .0004; that is, we estimate that for every 10 thousand people who attend a game, 4 will be arrested.

There are two (equivalent) ways in which you could have fit this model.

Method 1: E(Y) = α(number attend) = αt

In SAS,

  proc genmod data=fights order=data;
    model arrests = attend / dist=poisson link=identity noint;
  run;

Note that the option noint means no intercept.

In R,

  glm(arrests ~ -1 + attend, data=fights, family=poisson(link="identity"))

Note that -1 means no intercept.
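For the constant-rate model, the equivalence of the two parameterizations can be checked by hand: setting the derivative of the Poisson log-likelihood for E(Y_i) = µ t_i to zero gives Σy/µ − Σt = 0, so µ̂ = Σy/Σt, and the log-link-with-offset version simply reports α̂ = log(µ̂). The following is a minimal Python sketch of that check. The attendance and arrest numbers are made up for illustration (they are not the homework data), and the names attend, arrests, and poisson_loglik are hypothetical.

```python
import math

# HYPOTHETICAL data for illustration only; NOT the homework data set.
attend = [40000, 25000, 60000, 15000]   # number attending (people)
arrests = [18, 9, 25, 5]                # number arrested


def poisson_loglik(mu, t, y):
    """Poisson log-likelihood for the rate model E(Y_i) = mu * t_i."""
    return sum(
        yi * math.log(mu * ti) - mu * ti - math.lgamma(yi + 1)
        for ti, yi in zip(t, y)
    )


# Identity link, no intercept: the MLE of the rate is total arrests / total attendance.
mu_hat = sum(arrests) / sum(attend)

# Log link with offset log(t): the same fit, reported on the log scale.
alpha_hat = math.log(mu_hat)

print(f"mu_hat = {mu_hat:.6f}, alpha_hat = {alpha_hat:.4f}")
print("exp(alpha_hat) equals mu_hat:", math.isclose(math.exp(alpha_hat), mu_hat))
```

Whichever way the model is specified in SAS or R, the fitted rate should agree up to numerical tolerance, which is why both methods give the same interpretation.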

With Method 1, the estimated model is Ŷ = .0004t.

Method 2: log(E(Y)) = α + log(number attend)

In SAS (here logt is the log of the number attending),

  proc genmod data=fights order=data;
    model arrests = / dist=poisson link=log offset=logt;
  run;

and in R:

  glm(arrests ~ 1 + offset(log(attend)), data=fights, family=poisson)

The estimated model is Ê(Y) = exp([…])t = .0004t and has the same interpretation as the model from Method 1.

(b) See figure.

[Figure: "Soccer: Number of Arrests by Number Attending". Vertical axis: Number Arrested. Horizontal axis: Number Attending Game (in thousands). Points are labeled by team.]

(c) The negative binomial is fit by only changing dist=poisson to dist=negbin in SAS and by using glm.nb in R (regardless of whether you use Method 1 or 2). I'll show Method 2 here.

Results from SAS:

                     Poisson                              Negative Binomial
Parameter    Estimate   SE     Wald      p-value    Est.    se     Wald    p
α            […]        […]    […],556   < […]      […]     […]    […]     < […]
1/φ = D                                             […]     […]

Fit statistics
df           […]                                    […]
G²           […]                                    […]
X²           […]                                    […]
AIC          […]                                    […]

Note: 95% CI for 1/φ is (0.1355, 0.5023).

Note that the results from R are a bit different from SAS:

                     Poisson                              Negative Binomial
Parameter    Estimate   SE     z         p-value    Est.    se     z       p
α            […]        […]    […]       < […]      […]     […]    […]     < .01
1/φ = D                                             […]     […]

Fit statistics
df           […]                                    […]
G²           […]                                    […]
AIC          […]                                    […]

(d) The parameter estimates of α for the two models are very similar in value (for SAS, […] versus […], and for R, […] versus […]); however, their standard errors differ considerably (SAS: […] for Poisson and […] for negative binomial; R: […] versus […]). This is consistent with the need for the negative binomial due to overdispersion; that is, the se's from the Poisson are too small.

(e) There is evidence in support of the negative binomial model being the better one (you don't need all of these for full credit):

- The estimated standard errors for the Poisson and negative binomial are very different, and the one for the NB is much larger. This is consistent with there being overdispersion in the data that is a problem for the Poisson.
- The 95% CI for the dispersion parameter, (0.14, 0.50), does not include 0, which suggests there is overdispersion in the data. (You don't need this for R.)
- For the negative binomial, G² (and X²; you don't need this for R) indicates an acceptable fit of the model to the data; that is, comparing them to a χ² with ν = 22 would yield a large p-value. This is not the case for the Poisson.
- The various information criteria (only AIC is reported above) are all smaller for the NB than for the Poisson. The smaller the value, the better the model.
- The adjusted Pearson residuals for the negative binomial are much smaller than for the Poisson (all except one are < 1.96; the exception is Bradford City with 2.09).

It is reasonable to expect that the crowds over teams are heterogeneous, perhaps due to living in different cities with different SES, crowding, etc.

Before starting to analyze the data for the next three problems (i.e., 3.13, 3.14, and zero-inflated), do a little bit of exploratory data analysis:

1. Compute the mean and variance of the number of satellites. Compare.

The mean is less than the variance (i.e., 2.92 < (3.15)² ≈ 9.92), which suggests overdispersion.

2. Plot a histogram of the number of satellites. Comment.

Below is a graph of the distribution of satellites. Notice that there are a lot of crabs with 0. Other than at this end of the distribution, the Poisson may be OK. A lot of 0s could explain why we have overdispersion (model fitting will help us decide for sure). These data might be best fit using a zero-inflated Poisson.

Figure 1: The distribution of the counts.

3. Look at the relationship between the number of satellites (or the log of the number) and weight. Comment.

Also take a look at the number of satellites versus weight with a smooth curve (actually a cubic regression) in Figure 2.

Figure 2: Initial look at the data: counts versus explanatory variable with a cubic regression curve, and log(count) versus explanatory variable with a linear regression curve overlaid.

It appears that there is an outlier in terms of weight (i.e., weight > 5). I deleted it and recomputed the mean and variance, but this doesn't change the results much, so I left it in for the homework answers.

And now to do the problems...

Problem 3.13 on page 94 of Agresti (2007). The data, with SAS code to create a SAS data set, are on the course web site. Note that I re-scaled weight to kg.

1. The prediction equation is µ̂_i = exp([…] + 0.5893(weight)_i).

2. The estimated mean for a female weighing 2.44 kg is µ̂ = exp([…] + 0.5893(2.44)) = exp(1.0095) = […].

3. For a one kg increase in weight, the (mean) number of satellites is exp(.5893) = 1.80 times larger (i.e., 80% larger). Although a 95% confidence interval for β is given in the SAS output, it comes from

  β̂ ± 1.96(ŝe) = 0.5893 ± 1.96(0.0650) = 0.5893 ± 0.1274 → (0.4619, 0.7167)

The 95% confidence interval for the multiplicative effect (i.e., exp(β)) is found by taking exp of the end-points of the interval for β:

  (exp(0.4619), exp(0.7167)) → (1.59, 2.05)

4. A Wald test: H_o: β = 0 versus H_a: β ≠ 0. X² = (β̂/ŝe)² = (0.5893/0.0650)² = […]. Comparing this to a chi-square distribution with df = 1 yields a very small p-value; therefore, reject H_o and conclude that the data support the hypothesis that the number of satellites is related to the weight of the female crab.

5. A likelihood ratio test: 2([…] − […]) = 71.93, which has a very small p-value (compare to chi-square with df = 1). The conclusion is the same as in item 4.
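The arithmetic in Problem 3.13 can be reproduced from the three numbers that appear in the answers: the slope 0.5893, its standard error 0.0650, and the linear predictor 1.0095 at 2.44 kg. The sketch below is plain Python (standard library only); the variable names are hypothetical, and because the inputs are rounded, the Wald statistic it prints may differ slightly from the value SAS reports.

```python
import math

beta_hat = 0.5893   # slope for weight (kg), as given in the answers
se_beta = 0.0650    # standard error of the slope, as given in the answers
eta_244 = 1.0095    # linear predictor at weight = 2.44 kg, as given in the answers

# Item 2: estimated mean number of satellites at 2.44 kg.
mu_244 = math.exp(eta_244)

# Intercept implied by the rounded numbers above (an inference, not SAS output).
alpha_implied = eta_244 - beta_hat * 2.44

# Item 3: multiplicative effect of +1 kg and its Wald 95% confidence interval.
mult = math.exp(beta_hat)
half_width = 1.96 * se_beta
ci_beta = (beta_hat - half_width, beta_hat + half_width)
ci_mult = (math.exp(ci_beta[0]), math.exp(ci_beta[1]))

# Item 4: Wald chi-square statistic with df = 1.
wald = (beta_hat / se_beta) ** 2

print(f"mean at 2.44 kg: {mu_244:.3f}")
print(f"implied intercept: {alpha_implied:.4f}")
print(f"exp(beta): {mult:.2f}, 95% CI: ({ci_mult[0]:.2f}, {ci_mult[1]:.2f})")
print(f"CI for beta: ({ci_beta[0]:.4f}, {ci_beta[1]:.4f})")
print(f"Wald X^2: {wald:.1f}")
```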

Problem 3.14 on page 94 of Agresti (2007). Fitting a negative binomial model.

1. The prediction equation is µ̂ = exp([…] + […](weight)). The dispersion parameter is 1/φ = D (in Agresti's notation) = […]. The estimated standard error of the dispersion parameter is […].

There is evidence that the Negative Binomial gives a better fit than the Poisson:

- The 95% confidence interval for 1/φ = D is (0.6948, 1.4533). The value 0 is not in this interval, which suggests we need the scale parameter.
- All of the global fit statistics are much better for the Negative Binomial than for the Poisson:

                                     Poisson              Negative Binomial
Criterion                    DF    Value   Value/DF     Value   Value/DF
Deviance                     […]   […]     […]          […]     […]
Pearson Chi-Square           […]   […]     […]          […]     […]
AIC (smaller is better)            […]                  […]
AICC (smaller is better)           […]                  […]
BIC (smaller is better)            […]                  […]

  Since the Poisson is a special case of the negative binomial, we could do a likelihood ratio test (i.e., LR = […] − […] = […], df = 1, p is tiny). Also, according to the information criteria, the Negative Binomial has smaller values, which indicates it's better than the Poisson model.
- Graphics indicate that the Negative Binomial out-performs the Poisson (I didn't expect graphs, but if you did them, great!). The Negative Binomial includes more points within the 95% confidence bands and fits the distribution of counts better than the Poisson (however, there is room for improvement; the ZIP does the best).
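To see what the dispersion interval means in practice, recall that in Agresti's parameterization the negative binomial variance is µ + Dµ², while the Poisson variance is µ. The sketch below plugs the sample mean (2.92) and standard deviation (3.15) from the exploratory analysis into that formula at the two end-points of the 95% interval for D. It is only a rough check under the assumption that the marginal mean is a reasonable stand-in for µ; the fitted model's variance is conditional on weight. The function name nb_variance is hypothetical.

```python
# Values taken from the answer key.
mu = 2.92                         # sample mean number of satellites
sample_var = 3.15 ** 2            # sample variance (standard deviation 3.15)
d_low, d_high = 0.6948, 1.4533    # 95% CI end-points for the dispersion D


def nb_variance(mean, d):
    """Negative binomial variance, mean + D * mean^2 (Agresti's parameterization)."""
    return mean + d * mean ** 2


var_poisson = mu                       # Poisson forces variance = mean
var_nb_low = nb_variance(mu, d_low)    # NB variance at the lower end-point of D
var_nb_high = nb_variance(mu, d_high)  # NB variance at the upper end-point of D

print(f"sample variance:         {sample_var:.2f}")
print(f"Poisson variance:        {var_poisson:.2f}")
print(f"NB variance range for D: {var_nb_low:.2f} to {var_nb_high:.2f}")
```

The Poisson variance falls well short of the observed variance, whereas the range implied by the interval for D brackets it, which is in line with the fit statistics favoring the negative binomial.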

Figure 3: The models were fit to the data, and the data were then grouped to see how well the models are fitting.

Figure 4: To see how well the various models are doing in terms of fitting the distribution of the number of satellites. Neither the Poisson nor the Negative Binomial is really doing that well; however, the ZIP does pretty well.

2. A 95% confidence interval for β with the Negative Binomial is

  β̂ ± 1.96(se) = […] ± 1.96(0.1769) = […] ± […] → (0.4136, […])

versus the one from the Poisson regression, which was (0.4619, 0.7167) and has half-length equal to 0.1274. The interval from the Negative Binomial is wider than the one from the Poisson because the greater estimated variance with the Negative Binomial (i.e., µ̂_i + D̂µ̂_i²) results in a greater estimated standard error for β̂ (see page 82 of the text).

Fit a zero-inflated Poisson regression using weight as a predictor of the mean and width as a predictor in a logit model for the mixing probability.

I fit several ZIP models, but the one that seemed best, in terms of the fit of the model to the data and the significance of the parameter estimates, is the one with weight as a predictor in the Poisson regression and width as a predictor in a logit model for the mixing probability. The results are

Criteria For Assessing Goodness Of Fit

Criterion                    DF    Value   Value/DF
Deviance                     […]   […]     […]
Scaled Deviance              […]   […]     […]
Pearson Chi-Square           […]   […]     […]
Scaled Pearson X2            […]   […]     […]
Log Likelihood                     […]
Full Log Likelihood                […]
AIC (smaller is better)            […]
AICC (smaller is better)           […]
BIC (smaller is better)            […]

Algorithm converged.

Analysis Of Maximum Likelihood Parameter Estimates

                            Standard   Wald 95% Confidence   Wald
Parameter   DF   Estimate   Error      Limits                Chi-Square   Pr > ChiSq
Intercept   […]  […]        […]        […]      […]          […]          <.0001
weight      […]  0.1945     […]        […]      […]          […]          […]
Scale       […]  […]        […]        […]      […]

NOTE: The scale parameter was held fixed.

Analysis Of Maximum Likelihood Zero Inflation Parameter Estimates

                            Standard   Wald 95% Confidence   Wald
Parameter   DF   Estimate   Error      Limits                Chi-Square   Pr > ChiSq
Intercept   […]  […]        […]        […]      […]          […]          <.0001
width       […]  -0.5005    […]        […]      […]          […]          <.0001

So the estimated model for the probability of being in the zero class is

  π̂_i = exp([…] − 0.5005(width)_i) / (1 + exp([…] − 0.5005(width)_i)).

For a one unit increase in width, the odds of being in the zero class are multiplied by exp(−0.5005) = 0.61. In other words, the wider the crab, the less likely it is to be in the zero class.

The estimated mean of the Poisson part is µ̂_i = exp([…] + 0.1945(weight)_i), and the estimated probability of a count is

  P(Y_i = y) = π̂_i + (1 − π̂_i) exp(−µ̂_i)                for y = 0
  P(Y_i = y) = (1 − π̂_i) exp(−µ̂_i) µ̂_i^y / y!           for y > 0

Since exp(0.1945) = 1.21, the expected number of satellites is 1.21 times the mean number of satellites for a crab weighing one unit less.
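The ZIP probability function above can be written as a small function to see how the two pieces fit together. In the sketch below, the slopes are the ones used in the answers (0.1945 for weight, and −0.5005 for width, taking the negative sign implied by exp(·) = 0.61). The two intercepts are not recoverable from the text, so A_LOGIT and A_COUNT are HYPOTHETICAL placeholders chosen only so the code runs; the example weight and width are also made up. Substitute the estimates from the SAS output to get real fitted probabilities.

```python
import math

B_WEIGHT = 0.1945   # slope for weight in the Poisson part (from the answers)
B_WIDTH = -0.5005   # slope for width in the logit part (sign inferred from 0.61)
A_COUNT = 1.0       # HYPOTHETICAL intercept for the Poisson part
A_LOGIT = 12.0      # HYPOTHETICAL intercept for the logit part


def zip_pmf(y, weight, width):
    """P(Y = y) under a zero-inflated Poisson with a logit model for the zero class."""
    eta = A_LOGIT + B_WIDTH * width
    pi = 1.0 / (1.0 + math.exp(-eta))           # probability of the zero class
    mu = math.exp(A_COUNT + B_WEIGHT * weight)  # mean of the Poisson part
    poisson = math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
    if y == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Made-up crab: weight 2.44 kg, width 26 (same units as in the data set).
probs = [zip_pmf(y, weight=2.44, width=26.0) for y in range(0, 101)]
total = sum(probs)

print(f"P(Y = 0) = {probs[0]:.3f}")
print(f"sum of probabilities over y = 0..100: {total:.6f}")
print(f"odds multiplier per unit of width: {math.exp(B_WIDTH):.2f}")
print(f"mean multiplier per kg of weight:  {math.exp(B_WEIGHT):.2f}")
```

The extra mass π̂_i at zero is what lets the ZIP match the spike at 0 in the histogram of counts while the Poisson part handles the rest of the distribution.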

Figure 5: Observed and fitted from ZIP with logit model.
