
TRB Paper #11-2877

Examining the Crash Variances Estimated by the Poisson-Gamma and Conway-Maxwell-Poisson Models

Srinivas Reddy Geedipally (corresponding author)
Engineering Research Associate
Texas Transportation Institute
Texas A&M University
3135 TAMU
College Station, TX 77843-3135
Tel. (979) 862-1651
Fax. (979) 845-66
Email: srinivas-g@ttimail.tamu.edu

Dominique Lord
Associate Professor
Zachry Department of Civil Engineering
Texas A&M University
3136 TAMU
College Station, TX 77843-3136
Tel. (979) 458-3949
Fax. (979) 845-6481
Email: d-lord@tamu.edu

Word Count: 4,235 + 3,000 (4 tables + 8 figures) = 7,235 words

November 15, 2010

ABSTRACT

The Poisson-gamma (negative binomial or NB) distribution is still the most common probabilistic distribution used by transportation safety analysts for modeling motor vehicle crashes. Recent studies have shown that the Conway-Maxwell-Poisson (COM-Poisson) distribution is also a promising distribution for developing crash prediction models. The objectives of this study were to investigate and compare the crash variance predicted by the COM-Poisson generalized linear model (GLM) and by the traditional NB model. The comparison was carried out using the most common functional forms employed by transportation safety analysts, which link crashes to the entering flows and other explanatory variables at intersections or on segments. To accomplish the objectives of the study, several NB and COM-Poisson GLMs, including flow-only models and models with several covariates, were developed and compared using two datasets. The first dataset contained crash data collected at signalized 4-legged intersections in Toronto, Ont. The second dataset included data collected for rural 4-lane undivided highways in Texas. The results of this study show that the trend of the crash variance predicted by the COM-Poisson GLM is similar to that predicted by the NB model. The Spearman's rank correlation coefficients between the crash variances predicted by the COM-Poisson and NB models confirm a nearly perfect monotone increasing relationship, and the values are highly correlated. This means that a site characterized by a large variance will essentially be identified as such whether the NB or the COM-Poisson model is used.

INTRODUCTION

In highway safety, the traditional Poisson and mixed-Poisson models are the most common probabilistic models used for analyzing crash data. Crash data have often been found to exhibit over-dispersion (i.e., the variance is larger than the mean), and thus mixed-Poisson models (such as the Poisson-gamma or negative binomial) are generally preferred over the traditional Poisson model. The Conway-Maxwell-Poisson (COM-Poisson) distribution is another generalization of the Poisson distribution that can be used for analyzing crash data. It was originally developed in 1962 as a method for modeling both under-dispersed and over-dispersed count data (1). The COM-Poisson distribution was then revisited by Shmueli et al. (2) after a long period in which it was not widely used. The COM-Poisson model can also handle under-dispersed data (which the NB GLM cannot handle or for which it has difficulty converging; see below) and datasets that contain intermingled over- and under-dispersed counts (for dual-link models only, since the dispersion characteristic is captured using the covariate-dependent shape parameter).

Recent research in highway safety has shown that the dispersion parameter of the Poisson-gamma model can potentially depend upon the covariates of the model and could vary from one observation to another (3-7). This characteristic has been shown to be especially important when the mean function is mis-specified, as in models that only incorporate entering traffic flows (8). Furthermore, previous studies have reported that Poisson-gamma models with a varying dispersion parameter provide a better statistical fit (9-11). Similarly, the shape parameter of the COM-Poisson model provides a basis for using a link function to allow the amount of over-dispersion or under-dispersion to vary across measurements. It is also expected that COM-Poisson models with a varying shape parameter provide an improved statistical fit.
The primary objective of this research was to examine whether the Poisson-gamma model and the COM-Poisson model show a similar trend in estimating the crash variance. To accomplish the objectives of the study, NB and COM-Poisson GLMs were developed and compared using two datasets. The first dataset contained crash data collected at 4-legged signalized intersections in Toronto, Ont. The second dataset included data collected for rural 4-lane undivided highways in Texas. Flow-only models and models with covariates were evaluated.

This paper is organized as follows. The first section provides a brief overview of the characteristics of Poisson-gamma models and COM-Poisson models. The second section describes the methodology for estimating and comparing the models. The third section presents the summary statistics of the two datasets. The fourth section presents the results of the analysis. The last section provides a summary of the research and outlines avenues for further work.

BACKGROUND

This section provides a brief description of the characteristics of the Poisson-gamma and the COM-Poisson models, respectively.

POISSON-GAMMA MODEL

The Poisson-gamma (negative binomial or NB) distribution is the most common probabilistic distribution used by transportation safety analysts for modeling motor vehicle crashes (3, 4, 12, 13). The Poisson-gamma model has the following model structure (14): the number of crashes $Y_{it}$ for a particular $i$th site and time period $t$, when conditional on its mean $\mu_{it}$, is Poisson distributed and independent over all sites and time periods:

$$Y_{it} \mid \mu_{it} \sim \mathrm{Po}(\mu_{it}), \quad i = 1, 2, \ldots, I \ \text{and} \ t = 1, 2, \ldots, T \tag{1}$$

The mean of the Poisson is structured as:

$$\mu_{it} = f(X; \beta)\exp(e_{it}) \tag{2}$$

where $f(\cdot)$ is a function of the covariates ($X$); $\beta$ is a vector of unknown coefficients; and $e_{it}$ is the model error independent of all the covariates.

When $\exp(e_{it})$ is assumed to be independent and gamma distributed with a mean equal to 1 and a variance $1/\phi$, it can be shown that $Y_{it}$, conditional on $\mu_{it}$ and $\phi$, is distributed as a Poisson-gamma random variable with a mean $\mu_{it}$ and a variance $\mu_{it} + \mu_{it}^2/\phi$, respectively. (Note: other variance functions exist for the Poisson-gamma model, but they are not covered here since they are seldom used in highway safety studies. The reader is referred to 15 and 16 for a description of alternative variance functions.) The probability density function (PDF) of the Poisson-gamma structure described above is given by the following equation:

$$f(y_{it}; \phi, \mu_{it}) = \frac{\Gamma(y_{it} + \phi)}{\Gamma(\phi)\, y_{it}!} \left(\frac{\phi}{\mu_{it} + \phi}\right)^{\phi} \left(\frac{\mu_{it}}{\mu_{it} + \phi}\right)^{y_{it}} \tag{3}$$

where $y_{it}$ = response variable for observation $i$ and time period $t$; $\mu_{it}$ = mean response for observation $i$ and time period $t$; and $\phi$ = inverse dispersion parameter of the Poisson-gamma distribution.

Note that if $\phi \to \infty$, the crash variance equals the crash mean and this model reverts back to the standard Poisson regression model.
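As a numerical illustration of the Poisson-gamma structure described above, the following Python sketch evaluates the probability function in its standard inverse-dispersion form and recovers the mean and variance by direct summation. The function names, the truncation point `y_max`, and the values of `mu` and `phi` are illustrative assumptions and are not taken from the study.

```python
import math


def nb_pmf(y, mu, phi):
    """Poisson-gamma (NB) probability of count y, mean mu, inverse dispersion phi."""
    log_p = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (mu + phi))
        + y * math.log(mu / (mu + phi))
    )
    return math.exp(log_p)


def nb_moments(mu, phi, y_max=2000):
    """Mean and variance obtained by summing the probabilities over 0..y_max."""
    probs = [nb_pmf(y, mu, phi) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


# Hypothetical values chosen only for illustration.
mu, phi = 5.0, 2.0
mean, var = nb_moments(mu, phi)
expected_var = mu + mu ** 2 / phi  # variance function of the Poisson-gamma model
```

For these assumed values the summed variance agrees with `mu + mu**2 / phi`, and increasing `phi` moves the variance toward the mean, which is the Poisson limit noted in the text.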

The term $\phi$ is usually defined as the "inverse dispersion parameter" of the Poisson-gamma distribution. (Note: in the statistical and econometric literature, $\alpha = 1/\phi$ is usually defined as the dispersion parameter; in some published documents, the variable $\alpha$ has also been defined as the "over-dispersion parameter.") This term has traditionally been assumed to be fixed, with a unique value applied to the entire dataset in the study. As discussed above, recent research in highway safety has shown that the dispersion parameter can potentially depend upon the covariates of the model and could vary from one observation to another (3-7).

COM-POISSON MODEL

The COM-Poisson distribution has recently been used for modeling motor vehicle crashes (17, 18). Shmueli et al. (2) elucidated the statistical properties of the COM-Poisson distribution using the formulation given by Conway and Maxwell (1), and Kadane et al. (19) developed the conjugate distributions for the parameters of the COM-Poisson distribution. Its probability mass function (PMF) is given by Equations (4) and (5):

$$P(Y = y) = \frac{1}{Z(\lambda, \nu)} \frac{\lambda^{y}}{(y!)^{\nu}} \tag{4}$$

$$Z(\lambda, \nu) = \sum_{n=0}^{\infty} \frac{\lambda^{n}}{(n!)^{\nu}} \tag{5}$$

where $Y$ is a discrete count; $\lambda$ is a centering parameter that is approximately the mean of the observations in many cases; and $\nu$ is defined as the shape parameter of the COM-Poisson distribution.

The centering parameter $\lambda$ is approximately the mean when $\nu$ is close to one, but it differs substantially from the mean for small $\nu$. Given that $\nu$ would be expected to be small for over-dispersed data, this would make a COM-Poisson model based on the original COM-Poisson formulation difficult to interpret and use for over-dispersed data.

To circumvent this problem, Guikema and Coffelt (20) proposed a re-parameterization of the COM-Poisson distribution by substituting $\mu = \lambda^{1/\nu}$ to provide a clear centering parameter. This new formulation of the COM-Poisson is summarized in Equations (6) and (7) below:

$$P(Y = y) = \frac{1}{S(\mu, \nu)} \left(\frac{\mu^{y}}{y!}\right)^{\nu} \tag{6}$$

$$S(\mu, \nu) = \sum_{n=0}^{\infty} \left(\frac{\mu^{n}}{n!}\right)^{\nu} \tag{7}$$
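The re-parameterized COM-Poisson probability, in which each term of the Poisson series is raised to the power ν and then normalized, can be sketched in Python as follows. This is a minimal illustration that truncates the infinite normalizing sum at an assumed `n_max` and works in log space for numerical stability; the function names and the truncation point are hypothetical choices, not part of the original study.

```python
import math


def log_sum_exp(values):
    """Stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def com_poisson_log_s(mu, nu, n_max=500):
    """Log of the normalizing sum of (mu^n / n!)^nu, truncated at n_max."""
    terms = [nu * (n * math.log(mu) - math.lgamma(n + 1)) for n in range(n_max + 1)]
    return log_sum_exp(terms)


def com_poisson_pmf(y, mu, nu, n_max=500):
    """P(Y = y) under the centered COM-Poisson form with parameters mu and nu."""
    log_term = nu * (y * math.log(mu) - math.lgamma(y + 1))
    return math.exp(log_term - com_poisson_log_s(mu, nu, n_max))


# With nu = 1 the distribution reduces to the ordinary Poisson.
p_poisson_like = com_poisson_pmf(3, 2.0, 1.0)
# An assumed over-dispersed case (nu < 1), for illustration only.
p_overdispersed = com_poisson_pmf(3, 2.0, 0.5)
```

The truncation is adequate only when the terms beyond `n_max` are negligible, which should be checked for very small ν or very large μ.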
The mean and variance of $Y$ are given in terms of the new formulation as

$$E[Y] = \frac{1}{\nu} \frac{\partial \log S}{\partial \log \mu} \quad \text{and} \quad V[Y] = \frac{1}{\nu^{2}} \frac{\partial^{2} \log S}{\partial (\log \mu)^{2}}$$

with asymptotic approximations $E[Y] \approx \mu + \frac{1}{2\nu} - \frac{1}{2}$ and $\mathrm{Var}[Y] \approx \mu/\nu$, especially accurate once $\mu > 10$. With this new parameterization, the integral part of $\mu$ is now the mode, leaving $\mu$ as a reasonable approximation of the mean. The substitution $\mu = \lambda^{1/\nu}$ also allows $\nu$ to keep its role as a shape parameter. That is, if $\nu < 1$, the variance is greater than the mean, while $\nu > 1$ leads to under-dispersion.

Guikema and Coffelt (20) developed a COM-Poisson GLM framework for modeling discrete count data. The approach of Guikema and Coffelt (20) depended on MCMC for fitting a dual-link GLM based on the COM-Poisson distribution. It also used a reformulation of the COM-Poisson to provide a more direct centering parameter than the original COM-Poisson formulation. Sellers and Shmueli (21) developed an MLE for a single-link GLM based on the original COM-Poisson distribution. Equations (8) and (9) describe the dual-link modeling framework. The framework is in effect a dual-link GLM, in which both the mean and the variance depend on the covariates. In Equations (8) and (9), $x_i$ and $z_j$ are covariates, and there are assumed to be $p$ covariates used in the centering link function and $q$ covariates used in the shape link function (similar to the varying dispersion parameter of the Poisson-gamma model proposed by 3, 7, 9, 22). The sets of parameters used in the two link functions do not necessarily have to be identical.

$$\ln(\mu) = \beta_0 + \sum_{i=1}^{p} \beta_i x_i \tag{8}$$

$$\ln(\nu) = \gamma_0 + \sum_{j=1}^{q} \gamma_j z_j \tag{9}$$

The GLM framework can model under-dispersed datasets, over-dispersed datasets, and datasets that contain intermingled under-dispersed and over-dispersed counts (for dual-link models only, since the dispersion characteristic is captured using the covariate-dependent shape parameter). The variance is allowed to depend on the covariate values, which can be important if high (or low) values of some covariates tend to be variance-decreasing while high (or low) values of other covariates tend to be variance-increasing.
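To make the dual-link idea concrete, the sketch below maps covariates to a centering parameter and a shape parameter through two log links, then compares the mean and variance computed by direct summation with the asymptotic approximations μ + 1/(2ν) - 1/2 and μ/ν. It assumes additive log links for both parameters; the coefficient values, covariate values, and function names are hypothetical and chosen only for illustration.

```python
import math


def dual_link(x, z, beta, gamma):
    """Return (mu, nu) from two log links; beta[0] and gamma[0] are intercepts."""
    log_mu = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    log_nu = gamma[0] + sum(g * zj for g, zj in zip(gamma[1:], z))
    return math.exp(log_mu), math.exp(log_nu)


def exact_moments(mu, nu, n_max=2000):
    """Mean and variance of the centered COM-Poisson by truncated summation."""
    log_w = [nu * (n * math.log(mu) - math.lgamma(n + 1)) for n in range(n_max + 1)]
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    mean = sum(n * wn for n, wn in enumerate(w)) / total
    var = sum((n - mean) ** 2 * wn for n, wn in enumerate(w)) / total
    return mean, var


def approx_moments(mu, nu):
    """Asymptotic approximations of the mean and the variance."""
    return mu + 1.0 / (2.0 * nu) - 0.5, mu / nu


# Hypothetical coefficients: one covariate in each link function.
beta = [1.0, 0.5]
gamma = [-0.2, -0.3]
comparison = []
for x1 in (2.0, 4.0):
    mu, nu = dual_link([x1], [x1], beta, gamma)
    comparison.append((mu, nu, exact_moments(mu, nu), approx_moments(mu, nu)))
```

Because the shape parameter changes with the covariate in this example, the ratio of variance to mean differs between the two hypothetical observations, which is the behavior a single fixed shape parameter cannot represent.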
The parameters have a direct link to either the mean or the variance, providing insight into the behavior and driving factors in the problem, and the mean and variance of the predicted counts are readily approximated based on the covariate values and regression parameter estimates.

METHODOLOGY

This section describes the methodology used for estimating the different NB and COM-Poisson models. For each dataset, COM-Poisson GLMs and NB models were initially estimated using a fixed shape parameter and a fixed dispersion parameter, respectively. Then, the models were developed using different parameterizations for a varying shape parameter and a varying dispersion parameter. The functional forms used for the models were the following (Note: the centering parameter is the mean with the NB model, whereas it is approximately the mode with the COM-Poisson model):

Toronto intersection data:

Centering parameter:

$$\mu_i = \beta_0 F_{Maj\_i}^{\beta_1} F_{Min\_i}^{\beta_2} \tag{10}$$

Shape parameter (of COM-Poisson model):

$$\nu_i = \gamma_0 F_{Maj\_i}^{\gamma_1} F_{Min\_i}^{\gamma_2} \tag{11}$$

Dispersion parameter (of NB model):

$$\phi_i = \gamma_0 F_{Maj\_i}^{\gamma_1} F_{Min\_i}^{\gamma_2} \tag{12}$$

Texas segment data:

Centering parameter:

$$\mu_j = \beta_0 L_j F_j^{\beta_1} e^{\beta_2 LW_j + \beta_3 SW_j + \beta_4 CD_j} \tag{13}$$

Shape parameter (of COM-Poisson model):

$$\text{Model 1: } \nu_j = \gamma_0 L_j \tag{14}$$

$$\text{Model 2: } \nu_j = \gamma_0 / L_j \tag{15}$$

$$\text{Model 3: } \nu_j = \gamma_0 L_j^{\gamma_1} \tag{16}$$

Dispersion parameter (of NB model):

$$\text{Model 1: } \phi_j = \gamma_0 L_j \tag{17}$$

$$\text{Model 2: } \phi_j = \gamma_0 / L_j \tag{18}$$

$$\text{Model 3: } \phi_j = \gamma_0 L_j^{\gamma_1} \tag{19}$$

where $\mu_i$ = the mean number of crashes for intersection $i$; $\mu_j$ = the mean number of crashes per year for segment $j$; $F_{Maj\_i}$ = entering flow for the major approach (average annual daily traffic or AADT) for intersection $i$; $F_{Min\_i}$ = entering flow for the minor approach (average annual daily traffic or AADT) for intersection $i$; $F_j$ = flow traveling on segment $j$ (average annual daily traffic or AADT) and time period $t$; $L_j$ = length in miles for segment $j$; $LW_j$ = lane width in ft for segment $j$; $SW_j$ = total shoulder width in ft for segment $j$; $CD_j$ = curve density (curves per mile) for segment $j$; and $\beta$'s, $\gamma$'s = estimated coefficients.

The coefficients of the COM-Poisson GLMs and NB models were estimated using the software WinBUGS (23). Vague or non-informative hyper-priors were used for the COM-Poisson and NB GLMs. A total of 3 Markov chains were used in the model estimation process. The Gelman-Rubin (G-R) convergence statistic was used to verify that the simulation runs converged properly.

DATA DESCRIPTION

This section describes the characteristics of the two datasets. The first dataset contained crash data collected in 1995 at 4-legged signalized intersections located in Toronto, Ont. The data have previously been used for several research projects and have been found to be of relatively good quality (3, 17, 24-26). In total, 868 signalized intersections were used in this dataset.

The second dataset contained crash data collected from 1997 to 2001 at 4-lane rural undivided segments in Texas. The data were provided by the Texas Department of Public Safety (DPS) and the Texas Department of Transportation (TxDOT) and were used for the project NCHRP 17-29 (Methodology for Estimating the Safety Performance of Multilane Rural Highways) (27). The final database included 1,499 segments (≥ 0.1 mile). Table 1 presents the summary statistics for the two datasets used in this study.

Table 1. Summary Statistics for the Toronto and Texas Data

| Dataset | Variable | Min. | Max. | Average | Total |
|---|---|---|---|---|---|
| Toronto | Crashes | 0 | 54 | 11.56 (1.2) | 13 |
| Toronto | Major AADT | 5469 | 72178 | 2844.81 (166.4) | -- |
| Toronto | Minor AADT | 53 | 42644 | 111.18 (8599.4) | -- |
| Texas | Crashes | 0 | 97 | 2.84 (5.69) | 4,253 |
| Texas | Length (miles) | .1 | 6.275 | .55 (.67) | 83.5 |
| Texas | AADT | 42 | 24,8 | 6,613.61 (41.1) | -- |
| Texas | Lane Width (Feet) | 9.75 | 16.5 | 12.57 (1.59) | |
| Texas | Shoulder Width (Right + Left) (Feet) | 0 | 4 | 9.96 (8.2) | -- |
| Texas | Number of Horizontal Curves | 0 | 16 | .7 (1.32) | 152 |

RESULTS

This section presents the modeling results for the COM-Poisson GLMs as well as for the NB models and is divided into two parts. The first part explains the modeling results for the Toronto data. The second part provides details about the modeling results for the Texas data.

TORONTO DATA

Table 2 summarizes the results of the COM-Poisson and NB GLMs for the Toronto data. This table shows that the coefficients for the flow parameters are below one, which indicates that the crash risk increases at a decreasing rate as traffic flow increases. It should be pointed out that the 95% marginal posterior credible intervals for each of the coefficients did not include the origin. The Deviance Information Criterion (DIC) values show that there is no significant difference in fit among the various models (this result supports the finding of Lord et al. (17)). However, the NB model with a varying dispersion parameter showed a slightly better fit, as expected.

Table 2. Modeling Results for the COM-Poisson and NB GLMs using the Toronto Data

| Estimates | COM-Poisson: fixed shape parameter | COM-Poisson: varying shape parameter | NB: fixed dispersion parameter | NB: varying dispersion parameter |
|---|---|---|---|---|
| Ln($\beta_0$) | -11.53 (.4159) | -1.67 (.5249) | -1.11 (.4794) | -1.28 (.451) |
| $\beta_1$ | .635 (.4742) | .5498 (.4841) | .671 (.46) | .6161 (.4695) |
| $\beta_2$ | .795 (.311) | .7971 (.345) | .6852 (.21) | .6943 (.2295) |
| $\nu$ | .348 (.283) | -- | -- | -- |
| $\phi$ | -- | -- | 7.12 (.619) | -- |
| Ln($\gamma_0$) | -- | 4.882 (1.753) | -- | 2.764 (1.663) |
| $\gamma_1$ | -- | -.5945 (.1734) | -- | -.49 (.1856) |
| $\gamma_2$ | -- | .133 (.5463) | -- | .3671 (.149) |
| DIC | 4953.7 | 4937.48 | 4777.59 | 4762.38 |

Figure 1 illustrates the frequency distribution of the varying shape parameter of the COM-Poisson distribution across all the observations. It is interesting to note that the frequency distribution can be approximated by a normal- or lognormal-shaped distribution. This figure shows that the highest frequency (i.e., the mode) occurred in the range 0.3-0.4, and the average shape parameter value was found to be 0.36, which is slightly higher than the value found for the fixed shape parameter (i.e., 0.34).

Figure 1. Frequency Distribution of the Varying Shape Parameter of COM-Poisson Model (Mean = 0.36) [histogram: shape parameter range versus frequency]

Figure 2 illustrates the distribution of the varying (inverse) dispersion parameter across all the observations. As seen, the dispersion parameter varies widely among observations. The figure shows that the highest frequency (i.e., the mode) occurred in the range 9-10, whereas the average dispersion parameter value was found to be 7.0. However, the average value of the varying dispersion parameter is very close to the fixed dispersion parameter (i.e., 7.1).

Figure 2. Frequency Distribution of the Inverse Dispersion Parameter of NB Model (Mean = 7.0) [histogram: inverse dispersion parameter range versus frequency]

Figure 3 shows the comparison of the crash variance predicted by the COM-Poisson and NB models. It can be seen that for the sites with a crash mean less than 2, both models predict almost the same variance. However, for the sites with a higher mean, the NB model predicts a slightly higher variance than the COM-Poisson model, though the shapes are similar. The discrepancy between the fit and the smaller variance can be explained by the fact that the variance, via the dispersion parameter, is estimated directly from the data and independently from the mean (see 28 for additional information on this topic). The Spearman's rank correlation coefficient between the variances predicted by both models is found to be 0.999, which indicates a nearly perfect monotone increasing relationship and a high correlation between the two sets of predicted variances. For this dataset, the variance may be slightly better captured by the COM-Poisson than by the NB, especially at larger mean values.

Figure 3. Crash Variance versus Crash Mean for the Toronto Data

Figure 4 illustrates the frequency distribution of the crash variance predicted by the COM-Poisson and NB models. As discussed above, the frequency of sites with low variance is higher with the COM-Poisson model than with the NB model.

Figure 4. Frequency Distribution of Crash Variance for the Toronto Data [histogram: variance range versus frequency, COM-Poisson and NB]

TEXAS DATA

Table 3 presents the results of the COM-Poisson GLM for the Texas data. This table shows that the coefficient for the flow parameter is above one for all models except Model 1, which indicates that the crash risk increases at an increasing rate as traffic flow increases. It should be pointed out that the 95% marginal posterior credible intervals for each of the coefficients did not include the origin. The DIC values show that Model 3 fits the data better than all other models. Model 3 was also found to be the best model in Geedipally et al. (22).

Table 3. Modeling Results for the COM-Poisson GLM using the Texas Data

| Estimates | Fixed shape | Model 1 | Model 2 | Model 3 |
|---|---|---|---|---|
| Ln($\beta_0$) | -8.845 (.673) | -5.746 (.239) | -27.18 (3.741) | -13.59 (.947) |
| $\beta_1$ | 1.298 (.81) | .975 (.29) | 3.97 (.366) | 1.764 (.18) |
| $\beta_2$ | -.14 (.19) | -.17 (.17) | -.11 (.51) | -.99 (.26) |
| $\beta_3$ | -.18 (.4) | -.18 (.3) | -.19 (.12) | -.18 (.6) |
| $\beta_4$ | .94 (.13) | .139 (.9) | .168 (.39) | .96 (.18) |
| $\nu$ | .419 (.26) | -- | -- | -- |
| Ln($\gamma_0$) | -- | -.321 (.26) | -2.984 (.78) | -1.792 (.19) |
| $\gamma_1$ | -- | -- | -- | -.548 (.43) |
| DIC | 5159.3 | 6481.6 | 576.2 | 5.6 |

Table 4 presents the results of the NB GLM for the Texas data. This table shows that the coefficient for the flow parameter is below one for all models, which indicates that the crash risk increases at a decreasing rate as traffic flow increases. It should be pointed out that the 95% marginal posterior credible intervals for each of the coefficients did not include the origin. The DIC values show that Model 1 fits the data better than all other models, although Model 3 is a close second. In addition, the NB model fits the data slightly better than the COM-Poisson model.

Table 4. Modeling Results for the NB GLM using the Texas Data

| Estimates | Fixed dispersion | Model 1 | Model 2 | Model 3 |
|---|---|---|---|---|
| Ln($\beta_0$) | -6.384 (.412) | -5.597 (.285) | -6.752 (.35) | -5.977 (.388) |
| $\beta_1$ | .983 (.43) | .915 (.33) | 1.4 (.31) | .945 (.38) |
| $\beta_2$ | -.55 (.17) | -.71 (.14) | -.43 (.14) | -.61 (.14) |
| $\beta_3$ | -.1 (.3) | -.11 (.3) | -.9 (.3) | -.11 (.3) |
| $\beta_4$ | .67 (.12) | .95 (.14) | .62 (.1) | .78 (.13) |
| $\phi$ | 2.55 (.234) | -- | -- | -- |
| Ln($\gamma_0$) | -- | 1.485 (.81) | 1.52 (.134) | 1.144 (.99) |
| $\gamma_1$ | -- | -- | -- | .51 (.91) |
| DIC | 4784. | 479.6 | 4988.9 | 4732.6 |

Figure 5 shows the frequency distribution of the varying shape parameter of the COM-Poisson distribution across the observations for the Texas data. Similar to the Toronto data, the frequency distribution can be approximated by a normal- or lognormal-shaped distribution. This figure shows that the highest frequency (i.e., the mode) occurred in the range 0.2-0.3, and the average shape parameter value was found to be 0.313, which is slightly lower than the value found for the fixed shape parameter (i.e., 0.419).

Figure 5. Frequency Distribution of Shape Parameter of COM-Poisson (Mean = 0.313) [histogram: shape parameter range versus frequency]

Figure 6 illustrates the distribution of the varying (inverse) dispersion parameter of the NB model across the observations for the Texas data. This frequency distribution can be approximated by a skewed normal or log-normal distribution. The figure shows that the highest frequency (i.e., the mode) occurred in the range 1-2, whereas the average dispersion parameter value was found to be 2.45, which is close to the fixed dispersion parameter (i.e., 2.55).

Figure 6. Frequency Distribution of Inverse Dispersion Parameter of NB (Mean = 2.45) [histogram: inverse dispersion parameter range versus frequency]

Figure 7 shows the comparison of the crash variance predicted by the COM-Poisson and NB models. The figure illustrates that both models predict almost the same variance for a given mean. However, it should be noted that for higher crash means the NB model predicts a slightly higher variance than the COM-Poisson model. The Spearman's correlation coefficient between the variances is found to be 0.99, which confirms that the two sets of predicted variances move in the same direction and are strongly positively correlated.

Figure 7. Crash Variance versus Crash Mean for the Texas Data (Note: x-axis and y-axis are plotted on a logarithmic scale)

Figure 8 presents the frequency distribution of the crash variance predicted by the COM-Poisson and NB models. As opposed to the Toronto data, the frequency of sites in each variance range is almost the same for both models.
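The rank-based comparison of predicted variances reported above can be sketched as follows. This is a minimal, self-contained illustration rather than the procedure used in the study: the site means, the fixed parameters `phi` and `nu`, and the function names are hypothetical, the NB variance uses μ + μ²/φ, and the COM-Poisson variance uses the asymptotic approximation μ/ν with the same illustrative centering value for both models.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(va * vb)


# Hypothetical site means and fixed parameters, for illustration only.
means = [1.5, 4.0, 9.0, 15.0, 30.0]
phi, nu = 7.0, 0.35
nb_var = [m + m * m / phi for m in means]   # Poisson-gamma variance
com_var = [m / nu for m in means]           # approximate COM-Poisson variance
rho = spearman(nb_var, com_var)
```

With fixed parameters both variance functions increase with the mean, so the ranking of sites is identical under the two models; with varying shape and dispersion parameters the rankings can differ slightly, which is what a coefficient just below one reflects.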

Figure 8. Frequency Distribution of Crash Variance for Texas Data [histogram: variance range versus frequency, COM-Poisson and NB]

SUMMARY AND CONCLUSIONS

This paper has documented the difference in the estimation of crash variance using the COM-Poisson and NB models. The NB model is the most commonly used model for analyzing motor vehicle crashes. Recently, the COM-Poisson model was introduced for traffic crash data modeling. The COM-Poisson model introduces a covariate-dependent shape parameter which captures the dispersion in the data. Thus, the COM-Poisson model is able to handle datasets that contain intermingled over- and under-dispersed counts. The objectives of this study were to investigate and compare the crash variance predicted by the COM-Poisson and the NB models. To accomplish the study objectives, several NB and COM-Poisson GLMs were developed using two datasets. The first dataset contained crash data collected at 4-legged signalized intersections in Toronto, Ont. The second dataset included data collected for rural 4-lane undivided highways in Texas.

The results of this study show that the trend of the crash variance predicted by the COM-Poisson GLM is similar to that predicted by the NB model. The Spearman's rank correlation coefficients between the crash variances predicted by the COM-Poisson and NB models are 0.999 and 0.99 for the Toronto and Texas data, respectively. This means that a site characterized by a large variance will essentially be identified as such whether the NB or the COM-Poisson model is used. This characteristic was found for both flow-only models and models with covariates. For the latter, the results may indicate that the variation observed for the variance is data specific (see 22) rather than attributable to the model specification, as suggested by Mitra and Washington (8). Further work is needed on this topic.
It is recognized that constraining the parameters to be constant across various observations may lead to inconsistent and biased estimates (29). To overcome or minimize this important problem in count data models, Anastasopoulos and Mannering (30) suggested using random-parameter

models. Further work is thus needed to determine the difference in the estimation of crash variance between NB and COM-Poisson models when the random-parameter model is used in conjunction with a varying shape parameter. It should be pointed out that such code is not yet available for the COM-Poisson model. The next step consists of examining how these slight differences in observed variance for high means influence typical highway safety studies, such as evaluating the effects of interventions and the identification of hazardous sites, when a varying dispersion or shape parameter is used. The approach used by Geedipally and Lord (31) could be used for such an evaluation. It is also recommended to conduct an analysis with the varying shape parameter for an under-dispersed dataset.

REFERENCES

1. Conway, R.W., and W.L. Maxwell (1962) A queuing model with state dependent service rates. Journal of Industrial Engineering, Vol. 12, pp. 132-136.
2. Shmueli, G., T.P. Minka, J.B. Kadane, S. Borle, and P. Boatwright (2005) A useful distribution for fitting discrete data: revival of the Conway-Maxwell-Poisson distribution. Journal of the Royal Statistical Society, Part C, Vol. 54, pp. 127-142.
3. Miaou, S.-P., and D. Lord. Modeling Traffic Crash-Flow Relationships for Intersections: Dispersion Parameter, Functional Form, and Bayes Versus Empirical Bayes Methods. In Transportation Research Record: Journal of the Transportation Research Board, No. 1840, Transportation Research Board of the National Academies, Washington, D.C., 2003, pp. 31-40.
4. Geedipally, S.R., and D. Lord. Effects of the Varying Dispersion Parameter of Poisson-gamma Models on the Estimation of Confidence Intervals of Crash Prediction Models. In Transportation Research Record: Journal of the Transportation Research Board, No. 2061, Transportation Research Board of the National Academies, Washington, D.C., 2008, pp. 46-54.
5. Lord, D., S.P. Washington, and J.N. Ivan. Poisson, Poisson-Gamma and Zero Inflated Regression Models of Motor Vehicle Crashes: Balancing Statistical Fit and Theory. Accident Analysis & Prevention, Vol. 37, No. 1, 2005, pp. 35-46.
6. Hauer, E. (1997) Observational Before-After Studies in Road Safety: Estimating the Effect of Highway and Traffic Engineering Measures on Road Safety. Elsevier Science Ltd, Oxford.
7. Heydecker, B.G., and J. Wu. Identification of Sites for Road Accident Remedial Work by Bayesian Statistical Methods: An Example of Uncertain Inference. Advances in Engineering Software, Vol. 32, 2001, pp. 859-869.
8. Mitra, S., and S.P. Washington. On the Nature of Over-Dispersion in Motor Vehicle Crash Prediction Models. Accident Analysis & Prevention, Vol. 39, No. 3, 2007, pp. 459-468.
9. Hauer, E. Overdispersion in Modelling Accidents on Road Sections and in Empirical Bayes Estimation. Accident Analysis & Prevention, Vol. 33, No. 6, 2001, pp. 799-808.
10. Lord, D., and P.Y-J. Park. Investigating the Effects of the Fixed and Varying Dispersion Parameters of Poisson-Gamma Models on Empirical Bayes Estimates. Accident Analysis & Prevention, Vol. 40, No. 4, 2008, pp. 1441-1457.
11. El-Basyouny, K., and T. Sayed. Comparison of Two Negative Binomial Regression Techniques in Developing Accident Prediction Models. In Transportation Research Record: Journal of the Transportation Research Board, No. 1950, Transportation Research Board of the National Academies, Washington, D.C., 2006, pp. 9-16.
12. Poch, M., and F.L. Mannering. Negative Binomial Analysis of Intersection-Accident Frequencies. Journal of Transportation Engineering, ASCE, Vol. 122, No. 2, 1996, pp. 105-113.
13. Lord, D., and F. Mannering (2010) The Statistical Analysis of Crash-Frequency Data: A Review and Assessment of Methodological Alternatives. Transportation Research - Part A, Vol. 44, No. 5, pp. 291-305.
14. Lord, D. Modeling Motor Vehicle Crashes Using Poisson-Gamma Models: Examining the Effects of Low Sample Mean Values and Small Sample Size on the Estimation of the Fixed Dispersion Parameter. Accident Analysis & Prevention, Vol. 38, No. 4, 2006, pp. 751-766.
15. Cameron, A.C., and P.K. Trivedi. Regression Analysis of Count Data. Cambridge University Press, Cambridge, U.K., 1998.
16. Maher, M.J., and I. Summersgill. A Comprehensive Methodology for the Fitting of Predictive Accident Models. Accident Analysis & Prevention, Vol. 28, No. 3, 1996, pp. 281-296.
17. Lord, D., S.D. Guikema, and S. Geedipally (2008) Application of the Conway-Maxwell-Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes. Accident Analysis & Prevention, Vol. 40, No. 3, pp. 1123-1134.
18. Lord, D., S.R. Geedipally, and S. Guikema (2010) Extension of the Application of Conway-Maxwell-Poisson Models: Analyzing Traffic Crash Data Exhibiting Under-Dispersion. Risk Analysis, in press (http://dx.doi.org/10.1111/j.1539-6924.2010.01417.x).
19. Kadane, J.B., G. Shmueli, T.P. Minka, S. Borle, and P. Boatwright (2006) Conjugate analysis of the Conway-Maxwell-Poisson distribution. Bayesian Analysis, Vol. 1, pp. 363-374.
20. Guikema, S.D., and J.P. Coffelt (2008) A Flexible Count Data Regression Model for Risk Analysis. Risk Analysis, Vol. 28, pp. 213-223.
21. Sellers, K.F., and G. Shmueli (2010) A Flexible Regression Model for Count Data. Annals of Applied Statistics, in press.
22. Geedipally, S.R., D. Lord, and B.-J. Park (2009) Analyzing Different Parameterizations of the Varying Dispersion Parameter as a Function of Segment Length. Transportation Research Record 2103, pp. 108-118.
23. Spiegelhalter, D.J., A. Thomas, N.G. Best, and D. Lunn (2003) WinBUGS Version 1.4.1 User Manual. MRC Biostatistics Unit, Cambridge. Available from: <http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml>.
24. Lord, D. (2000) The Prediction of Accidents on Digital Networks: Characteristics and Issues Related to the Application of Accident Prediction Models. Ph.D. Dissertation. Department of Civil Engineering, University of Toronto, Toronto, Ontario.
25. Miaou, S.-P., and J.J. Song (2005) Bayesian ranking of sites for engineering safety improvements: Decision parameter, treatability concept, statistical criterion and spatial dependence. Accident Analysis and Prevention, Vol. 37, No. 4, pp. 699-720.
26. Miranda-Moreno, L.F., and L. Fu (2007) Traffic Safety Study: Empirical Bayes or Full Bayes? Paper 7-168. Presented at the 84th Annual Meeting of the Transportation Research Board, Washington, D.C.
27. Lord, D., S.R. Geedipally, B.N. Persaud, S.P. Washington, I. van Schalkwyk, J.N. Ivan, C. Lyon, and T. Jonsson (2008) Methodology for Estimating the Safety Performance of Multilane Rural Highways. NCHRP Web-Only Document 126, National Cooperative Highway Research Program, Washington, D.C. (http://onlinepubs.trb.org/onlinepubs/nchrp/nchrp_w126.pdf, accessed on June 24, 2010).
28. Heydecker, B.G., and J. Wu (2001) Identification of Sites for Road Accident Remedial Work by Bayesian Statistical Methods: An Example of Uncertain Inference. Advances in Engineering Software, Vol. 32, pp. 859-869.
29. Washington, S.P., M.G. Karlaftis, and F.L. Mannering (2010) Statistical and Econometric Methods for Transportation Data Analysis. Second Edition, Chapman Hall/CRC, Boca Raton, FL.
30. Anastasopoulos, P.C., and F.L. Mannering (2009) A note on modeling vehicle accident frequencies with random-parameters count models. Accident Analysis and Prevention, Vol. 41, No. 1, pp. 153-159.
31. Geedipally, S.R., and D. Lord (2010) Hot Spot Identification by Modeling Single-Vehicle and Multi-Vehicle Crash Separately. Transportation Research Record 2147, pp. 97-103.