Application of the Hyper-Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes

S. Hadi Khazraee (corresponding author), Graduate Research Assistant, Zachry Department of Civil Engineering, Texas A&M University. Tel. (979). hadikhazraee@tamu.edu

Antonio Jose Sáez-Castillo, Ph.D., Associate Professor, Department of Statistics and Operations Research, University of Jaén, Spain. ajsaez@ujaen.es

Srinivas Reddy Geedipally, Ph.D., P.E., Assistant Research Engineer, Texas A&M Transportation Institute, Texas A&M University System. Tel. (817). srinivas-g@ttimail.tamu.edu

Dominique Lord, Ph.D., P.Eng., Associate Professor, Zachry Department of Civil Engineering, Texas A&M University. Tel. (979). d-lord@tamu.edu

ABSTRACT

The hyper-Poisson distribution can handle both over- and under-dispersion, and its generalized linear model formulation allows the dispersion of the distribution to be observation-specific and dependent on model covariates. This study's objective is to examine the potential applicability of a newly proposed generalized linear model framework for the hyper-Poisson distribution in analyzing motor vehicle crash count data. The hyper-Poisson generalized linear model was first fitted to intersection crash data from Toronto, characterized by over-dispersion, and then to crash data from railway-highway crossings in Korea, characterized by under-dispersion.

The results of this study are promising. When fitted to the Toronto data set, the goodness-of-fit measures indicated that the hyper-Poisson model with a variable dispersion parameter provided a statistical fit as good as that of the traditional negative binomial model. The hyper-Poisson model was also successful in handling the under-dispersed data from Korea; the model performed as well as the gamma probability model and the Conway-Maxwell-Poisson model previously developed for the same data set.

The advantages of the hyper-Poisson model studied in this paper are noteworthy. Unlike the negative binomial model, which has difficulties in handling under-dispersed data, the hyper-Poisson model can handle both over- and under-dispersed crash data. Although this is not a major issue for the Conway-Maxwell-Poisson model, the effect of each variable on the expected mean of crashes is easily interpretable in the case of this new model.

Keywords: hyper-Poisson, under-dispersion, dispersion parameter

1. INTRODUCTION

Motor vehicle crash count data are often characterized by over-dispersion, meaning that the variance of crash counts on a roadway entity is greater than the mean. It is, however, possible, although rare, to find crash data sets with under-dispersion, i.e., variance lower than the mean (1), especially in crash data with low sample means (2). The most commonly used distribution in crash count data modeling, the negative binomial (NB)/Poisson-gamma, can only accommodate over-dispersion and will have convergence issues and produce incorrect parameter estimates when modeling under-dispersed data (1).

Researchers in various fields have proposed numerous alternative models to handle under-dispersed count data. For instance, the generalized Poisson (3), the weighted Poisson (4), and the Poisson polynomial (5) models are all extensions of the Poisson model that can handle both over- and under-dispersed count data.

Of all the models capable of handling both over- and under-dispersion, the Conway-Maxwell-Poisson distribution (COM-Poisson) has probably gained the most attention, especially in highway safety. The COM-Poisson distribution was first introduced by Conway and Maxwell (6) for modeling queues and service rates, and later explored by Shmueli et al. (7) for its statistical properties (1). The COM-Poisson generalized linear model (GLM) has been applied to crash data by Lord et al. (8; 9), and Geedipally and Lord (10). Several studies have found both the COM-Poisson distribution and its regression model to be very flexible in dealing with count data with a wide range of characteristics (e.g., 11; 12). Despite this flexibility for modeling count data, Francis et al. (13) have warned about the limitation of the COM-Poisson GLM in dealing with over-dispersed data sets with low sample mean values.

Another approach used to handle under-dispersion in crash count data modeling is the gamma probability distribution. This approach has been used with two different parameterizations. The first parameterization, proposed by Winkelmann (14) and first applied to crash data by Oh et al. (2), assumes that the time elapsed between two successive crashes (the waiting time) follows a gamma distribution. This approach implies that crash events are dependent in the sense that the occurrence of at least one event (in contrast to none) up to time t influences the probability of a further occurrence in t + Δt (14). Nonetheless, while crash counts can sometimes have a temporal correlation, they are often described as independent observations (15). Recently, Daniels et al. (16; 17) used a different parameterization of the gamma model in which they assumed that the crash frequency itself follows a continuous gamma density function. Two major theoretical shortcomings exist for this assumption: it implies that crash counts of zero are not possible, and that non-integer crash counts may be observed (15). Both implications are clearly false.

The final model worth mentioning is the double-Poisson distribution model proposed by Efron et al. (18). Although the model is not very popular among researchers, Zou et al. (15) applied the double-Poisson model to crash count data and found it to be flexible. Nonetheless, they noticed that the distribution does not handle under-dispersion as reliably as it does over-dispersion.

Very recently, Sáez-Castillo and Conde-Sánchez (19) formulated a generalized linear model (GLM) framework for a two-parameter generalization of the Poisson distribution, called the hyper-Poisson distribution (20). The primary objective of this study is to examine the potential application of the hyper-Poisson GLM in the field of highway safety to model crash count data. The hyper-Poisson distribution can handle both under- and over-dispersion.
In addition, the regression model examined in this study allows the dispersion of the distribution to vary among observations. Such an observation-specific dispersion structure for crash counts on roadway entities is consistent with the findings of recent research in highway safety. A handful of studies have addressed shortcomings in the assumption of fixed dispersion among all observations and have suggested that the model dispersion can potentially depend on the covariates (e.g., 21; 22; 23). Mitra and Washington (24) advised that the observation-specific structure can be especially important when the mean function is misspecified, such as in models where the mean depends only on the entering traffic flow. In the hyper-Poisson model, the covariates enter the mean function at the same time that they influence the dispersion of the distribution. The dual link structure of the hyper-Poisson GLM is similar to that suggested by Guikema and Coffelt (25) for the COM-Poisson regression model.

In this research study, the hyper-Poisson (hp) GLM is first fitted to crash data from signalized intersections in Toronto to examine the model performance in handling over-dispersed count data. The objective is to ensure that the model can provide an adequate fit to the majority of crash count data sets, which are characterized by over-dispersion. The modeling results for the Toronto data are compared to those obtained by the NB GLM. The hp model is also fitted to a data set from railway-highway crossings (RHXs) in Korea, which is characterized by under-dispersion. For this data set, the modeling results are compared to those for the gamma probability distribution from Oh et al. (2) and the COM-Poisson GLM from Lord et al. (9).

2. BACKGROUND

This section describes the characteristics of the hp distribution and the corresponding generalized linear regression model. The first part discusses the hp distribution and its characteristics, and the second part describes an extension of the distribution to model crash frequency data.

2.1. Hyper-Poisson Distribution

Bardwell and Crow (20) derived a two-parameter generalization of the Poisson distribution. They called the proposed distribution the hyper-Poisson (hp, hereafter) family because it turned out to be a subclass of the three-parameter hypergeometric series distribution and reduced to the Poisson distribution in a special case. Using the original notation, the probability mass function (pmf) of the hp distribution with parameters θ1 and θ2 is stated as follows:

$$ f(Y = y \mid \theta_1, \theta_2) = \frac{1}{{}_1F_1(1; \lambda; \theta_2)} \, \frac{\Gamma(\lambda)}{\Gamma(\lambda + y)} \, \theta_2^{\,y} \qquad (1) $$

[Equation (2) is not legible in the source.]

$$ {}_1F_1(1; \lambda; \theta_2) = \sum_{r=0}^{\infty} \frac{\Gamma(\lambda)}{\Gamma(\lambda + r)} \, \theta_2^{\,r} \qquad (3) $$

where Y is the response variable (the discrete crash count in this study), θ2 is the location parameter, λ is defined as the dispersion parameter, and 1F1(1; λ; θ2) is the confluent hypergeometric function with first argument equal to 1 (26). If λ = 1, the distribution reduces to the Poisson (with variance equal to the mean); λ > 1 results in an over-dispersed distribution (super-Poisson), whereas λ < 1 produces an under-dispersed distribution (sub-Poisson) (20).

It can be verified from Equation (1) that the hp distribution satisfies the following recurrence condition, where f_y denotes f(Y = y):

$$ (y + \lambda) \, f_{y+1} = \theta_2 \, f_y \qquad (4) $$

Summing Equation (4) over all values of y yields the following expression for the mean (µ):

$$ \mu = \theta_2 - (\lambda - 1)(1 - f_0) \qquad (5) $$

$$ \mu = \theta_2 - (\lambda - 1) \, \frac{{}_1F_1(1; \lambda; \theta_2) - 1}{{}_1F_1(1; \lambda; \theta_2)} \qquad (6) $$
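Equations (1), (3), (4), and (6) can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function names and the values of λ and θ2 are illustrative assumptions rather than anything taken from the study, and the confluent hypergeometric function is evaluated by directly summing the series in Equation (3), which is adequate for moderate θ2.

```python
import math


def hyp1f1_a1(lam, theta, tol=1e-15, max_terms=100_000):
    """Series of Equation (3): sum over r of Gamma(lam)/Gamma(lam + r) * theta**r."""
    term = 1.0   # r = 0 term
    total = 1.0
    for r in range(max_terms):
        term *= theta / (lam + r)   # ratio of term r+1 to term r
        total += term
        if term < tol * total:      # remaining terms are negligible
            break
    return total


def hp_log_pmf(y, lam, theta):
    """Logarithm of the pmf in Equation (1)."""
    return (math.lgamma(lam) - math.lgamma(lam + y)
            + y * math.log(theta) - math.log(hyp1f1_a1(lam, theta)))


def hp_pmf(y, lam, theta):
    return math.exp(hp_log_pmf(y, lam, theta))


def hp_mean(lam, theta):
    """Mean of the hp distribution from Equation (6)."""
    f = hyp1f1_a1(lam, theta)
    return theta - (lam - 1.0) * (f - 1.0) / f


if __name__ == "__main__":
    lam, theta = 2.5, 4.0   # illustrative values only
    probs = [hp_pmf(y, lam, theta) for y in range(200)]
    print("total probability:", sum(probs))
    print("mean by summation:", sum(y * p for y, p in enumerate(probs)))
    print("mean by Equation (6):", hp_mean(lam, theta))
```

The two printed means should agree to within rounding error, and setting `lam = 1.0` should reproduce the Poisson probabilities with mean `theta`.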

It is clear from Equation (6) that when λ = 1, the location parameter θ2 matches the mean. In this case, Equation (2) suggests θ1 = θ2 and Equation (3) yields 1F1(1; λ; θ2) = e^θ2, so the distribution in Equation (1) reduces to the Poisson with mean θ2. However, as indicated by Equation (6), θ2 is not equal to the mean in any other case. The mean and θ2 can become significantly different as λ deviates from 1.

Equation (6) provides an explicit expression of the mean in terms of θ2 and λ. Nonetheless, θ2 and λ cannot be directly expressed in terms of the mean and the other parameter because they appear as arguments of the hypergeometric series, which does not have an explicit inverse function. This gives rise to a major computational difficulty in regression modeling, as described later in this section.

From Equation (4) and using the method of moments, the following relationship between the distribution variance (σ²) and mean (µ) is obtained (19):

$$ \sigma^2 = \theta_2 + \left(\theta_2 - (\lambda - 1)\right)\mu - \mu^2 \qquad (7) $$

It is instructive to compare the hp distribution variance, as shown above, with that of the NB distribution. The relationship between the variance and the mean in the negative binomial distribution is stated below:

$$ \sigma^2 = \mu + \alpha \mu^2 \qquad (8) $$

where α is the over-dispersion parameter. A negative estimate of α is indicative of under-dispersion (σ² < µ). However, the NB model is inappropriate for modeling under-dispersed data because the estimated variance will be negative for observations with α < -1/µi (27). In this paper, for the sake of convenience in comparing the NB and hp distributions, α is referred to as the dispersion parameter of the NB distribution.

As Equation (8) indicates for the NB distribution, the coefficient of the second-degree term in the variance function is allowed to vary, whereas in the hp distribution variance function it is the coefficient of the first-degree term of the mean that can vary and the second-degree coefficient is always -1 (see Equation 7). This gives the NB distribution more flexibility than the hp distribution in dealing with highly over-dispersed data sets, as demonstrated later in the results section.

Furthermore, Sáez-Castillo and Conde-Sánchez (19) showed how the over-dispersed case of the hp distribution (i.e., when λ > 1) can be viewed as a compound Poisson distribution with a confluent hypergeometric distribution. The interested reader is referred to their work for the derivation. This finding provides an interpretational basis for applying the hp distribution and regression model to crash data: crash counts are Poisson distributed with a mean which itself follows a probability distribution (confluent hypergeometric in this case) to account for the heterogeneity among the individual entities (sites). Indeed, the confluent hypergeometric error term captures the variation in the mean caused by factors not accounted for by the model. Hence, in the over-dispersion context, the hp distribution is comparable to other compound Poisson distributions, such as the negative binomial/Poisson-gamma distribution.

2.2. Generalized Linear Model

Sáez-Castillo and Conde-Sánchez (19) developed an hp GLM framework to model discrete count data. In this approach, both the mean and the dispersion parameter of the hp distribution can depend on the covariates. Denoting Yi as the observed crash count at site i, the GLM assumes that Yi follows an hp distribution with the mean and dispersion parameter stated below:

$$ \ln(\mu_i) = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} \qquad (9) $$

$$ \ln(\lambda_i) = \delta_0 + \sum_{k=1}^{q} \delta_k z_{ik} \qquad (10) $$

where the xij and zik are the covariates used to estimate the mean and dispersion parameter of observation i, respectively, and the βj and δk are the regression parameters to be estimated by the model. The p covariates used to estimate the mean are not necessarily identical to the q covariates used to estimate the dispersion parameter. This study adopted the GLM as formulated above to model motor vehicle crashes.

The dual link structure of the hyper-Poisson GLM is similar to that suggested by Guikema and Coffelt (25) for the COM-Poisson regression model. The first link function, in Equation (9), describes the mean as a function of covariates. The covariate-dependent mean function allows for inference about the influence of changes in the covariates on the expected number of crashes (µ). The same would not be possible had the location parameter instead been modeled as a function of the covariates. Given the estimated values of µ and λ, the location parameter (θ2) can be determined from Equation (6). The variance of each observation can then be determined from Equation (7).

The second link function of the GLM, in Equation (10), is added to increase the flexibility of the distribution and enable analysis of data with potential over- or under-dispersion depending on the values of the covariates. As mentioned in the introduction, there are notable advantages in allowing the dispersion characteristic of the crash count distribution to depend on the covariates.

3. METHODOLOGY

This section describes the methodology used to fit the hp regression models to the crash data. The first part presents the functional form of each model, and the second part describes the procedure adopted to estimate the models.

3.1. Model Functional Form

Toronto data

For the Toronto intersection crash data, the following common and simple functional form was adopted:

$$ \mu_i = \beta_0 \, F_{Maj\_i}^{\beta_1} \, F_{Min\_i}^{\beta_2} \qquad (11) $$

where FMaj_i and FMin_i denote the average annual daily traffic (AADT) on the major and minor approaches to the intersection, respectively. Such a flow-only crash model for intersections is consistent with the base safety prediction models suggested by the Highway Safety Manual (28) and also with several other studies that have modeled the Toronto data set in the past (e.g., 23; 8).

The hp GLM was applied to the Toronto data in two steps: first, with a constant dispersion parameter (i.e., λi = δ0 for all i), and next, with an observation-specific dispersion parameter. The observation-specific structure is especially important here because the mean function is misspecified, since the mean is allowed to depend on entering traffic flows only. The dispersion parameter has the following form:

$$ \lambda_i = \delta_0 \, F_{Maj\_i}^{\delta_1} \, F_{Min\_i}^{\delta_2} \qquad (12) $$

This was done to evaluate the improvement in fit when the dispersion parameter is allowed to vary depending on the covariates. The hp model results were compared to those obtained by using the NB GLMs (with the maximum likelihood method for model estimation) with a fixed and a variable dispersion parameter. When variable, the dispersion parameter of the NB model (αi) followed a functional form similar to that in Equation (12):

$$ \alpha_i = \delta_0 \, F_{Maj\_i}^{\delta_1} \, F_{Min\_i}^{\delta_2} \qquad (13) $$

Korea RHX data

For the Korea RHX data, the objective was to compare the hp model fit mainly with that obtained by the gamma probability model, documented by Oh et al. (2), and the COM-Poisson GLM, documented by Lord et al. (9). The interested reader may refer to their work for background information on the COM-Poisson and gamma probability models. The same functional form for the expected number of crashes was therefore used here:

$$ \mu_i = \beta_0 \, F_i^{\beta_1} \exp\!\left( \sum_{j=2}^{n} \beta_j x_{ij} \right) \qquad (14) $$

where Fi is the average daily vehicle traffic (ADT) at site i, and xij is covariate j at site i. Various functional forms with different variables were evaluated to model the dispersion parameter, but none of them was found to be significant. This supports the previous finding that when the functional form describing the mean contains several covariates, a varying dispersion parameter is not needed (24).

3.2. Model Estimation

The GLMs in this study were estimated using the method of maximum likelihood. The goal was to find the set of βj and δk parameters that maximizes the joint likelihood (or, equivalently, the log-likelihood) of the observations y1, ..., yn. From Equation (1), the log-likelihood function is:

$$ \log L(y_1, \ldots, y_n) = \sum_{i=1}^{n} \log \Gamma(\lambda_i) + \sum_{i=1}^{n} y_i \log(\theta_{2i}) - \sum_{i=1}^{n} \log \Gamma(\lambda_i + y_i) - \sum_{i=1}^{n} \log\!\left( {}_1F_1(1; \lambda_i; \theta_{2i}) \right) \qquad (15) $$

The optimization was carried out using an iterative procedure that evaluates the log-likelihood function at different combinations of the βj and δk until the maximum log-likelihood is reached. Nevertheless, as Equation (15) indicates, the log-likelihood function depends on θ2i and λi, while µi and λi are the quantities modeled as functions of the covariates. θ2i in Equation (15) must therefore be replaced with its expression in terms of µi. As specified earlier, no closed-form expression exists for θ2i. Consequently, evaluation of the log-likelihood function at each iteration required solving the nonlinear Equation (6) to find the value of θ2 corresponding to the estimated µi and λi for each observation. The code developed by Sáez-Castillo and Conde-Sánchez (19) in the software R (29) was used in this study. The program uses the functions nlm and optim to maximize the log-likelihood, and optimize to solve Equation (6) numerically.

4. DATA DESCRIPTION

This section provides an overview of the two data sets used in this research. As discussed above, the data sets come from Toronto and Korea.

The Toronto data set contains crash count data collected in 1995 at 868 four-legged signalized intersections in Toronto. Several research studies (e.g., 30; 23; 31) have used this data set for crash count modeling and have found it to be of good quality. The Toronto intersection data are characterized by over-dispersion, as commonly seen in most crash data sets. Table I presents the summary statistics of the variables in this data set.

The Korea data set contains crash count data collected at 162 railway-highway crossings in Korea. This data set was first used by Oh et al. (2) to fit Poisson and gamma probability models, and later by Lord et al. (9) to fit a COM-Poisson model. Although the data show signs of slight over-dispersion (sample mean = 0.33, sample variance = 0.36), both studies observed under-dispersion when crashes were modeled conditional on the mean. Out of the many explanatory variables initially considered for model estimation in these studies, only a few were found to be statistically significant at the 10% level and were included in the final model. The hp model in this study was estimated using the variables (covariates) that were found to be significant in the Poisson, gamma, or COM-Poisson models.

Table I presents these variables and their characteristics.

5. RESULTS

This section presents the modeling results for the hp GLM. The first part presents the results for the model fitted to the Toronto intersection data, and the second part shows the results for the data from Korea railway-highway crossings.

5.1. Toronto Data

Table II summarizes the modeling results for the hp GLM with a fixed and a varying dispersion parameter and compares the results with those obtained from the NB model. The NB GLM with a fixed dispersion parameter was estimated with glm.nb in R, whereas the NB GLM with a variable dispersion parameter was estimated with PROC NLMIXED in SAS (32). All models were estimated using the maximum likelihood method. The values in parentheses indicate the standard errors of the parameter estimates.

As Table II indicates, there is no significant difference in the MPB, MAD, and MSPE of the models considered for the Toronto data. The only notable trend is the reduction in the bias (MPB) in both the hp and NB models when the dispersion parameter is allowed to vary. The MAD and MSPE measures of fit vary only slightly from one model to the other. This is due to the very similar estimates of the mean function parameters (the β's). Note that the MPB, MAD, and MSPE depend only on the mean function and not on the dispersion parameter. Similar β parameters have therefore resulted in similar values for these measures of fit.

On the other hand, the AIC measure depends not only on the mean function, but also on the dispersion parameter. The reason is that the AIC depends on the model likelihood function which, in both the hp and NB cases, has the dispersion parameter as an input. Thus, models with similar mean function parameters (β's) may have significantly different AICs (e.g., compare the hp models with fixed and varying dispersion parameters in Table II).

Table II indicates that, when the dispersion parameter is constant, the AIC of the NB model (5077.3) is considerably lower than that of the hp model (5157.3). The difference in AIC is large enough to infer that the NB model with a fixed dispersion parameter outperforms the hp model under the same condition. Nonetheless, when the dispersion parameter is allowed to vary depending on the covariates, the hp model's fit improves notably (the AIC decreases from 5157.3; see Table II). Conversely, the NB model with a variable dispersion parameter is not a significant improvement, as two of the dispersion parameter function coefficients (δ0 and δ1) are found to be statistically insignificant (at the 0.10 significance level) and the reduction in AIC from 5077.3 is also marginal (see Table II). As a rule of thumb, when the change in AIC is less than 10, the difference is usually deemed to be insignificant (9). Thus, with a variable dispersion parameter, the hp model performs almost as well as the NB model.

The variance-mean relationship structure of the hp and NB distributions is the key to explaining the findings above. In the variance-mean function of the NB distribution shown by Equation (8), the over-dispersion parameter is the coefficient of the second-degree term of the mean, whereas in the hp distribution variance-mean function shown by Equation (7), the dispersion parameter can only affect the first-degree coefficient of the mean. Thus, the variance of the NB distribution is more sensitive to changes in the dispersion parameter and can increase at a faster rate. Figure 1(a) shows the mean-variance relationship of the hp and NB models with fixed dispersion parameters for the Toronto data. Clearly, the NB model variance increases more rapidly, and so the NB model fits the over-dispersed Toronto data set better than the hp model with a fixed dispersion parameter.

Once the dispersion parameter of the hp distribution is allowed to vary, the variance-mean relationship becomes more flexible and the hp model becomes more capable of fitting over-dispersed crash counts. As illustrated in Figure 1(b) for models with variable dispersion, the hp model mean-variance relationship becomes more similar to that of the NB model. When the mean is less than 25 crashes, the variances of the two distributions closely resemble each other. As the mean gets larger, however, the variance of the NB model increases at a higher rate than that of the hp model and the difference between the variances becomes more significant.

Figure 2(a) illustrates the frequency distribution of the varying dispersion parameter of the hp distribution across all observations. It is important to note that even for such an over-dispersed data set, two of the observations have values of λ less than 1 and are therefore under-dispersed (conditional on the mean). Despite the very small number of under-dispersed observations in the Toronto data set, this finding illustrates how the hp model (with a variable dispersion parameter) can identify data points with under-dispersion, while the NB model fails to do so. Figure 2(b) shows the distribution of the varying dispersion parameter (α) of the NB model. The NB distribution is under-dispersed if α < 0, equi-dispersed if α = 0, and over-dispersed otherwise. As shown by Figure 2(b), the NB model did not identify any under-dispersed observations. It is probable that the NB model would not have performed as well if a large number of observations were under-dispersed (conditional on the mean).
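The variance-mean comparison between Equations (7) and (8) discussed in this section can be explored numerically. The sketch below is an illustration under stated assumptions, not the code used in the study: the function names and the values of λ and α are hypothetical, the confluent hypergeometric function is summed directly from Equation (3), and θ2 is recovered from µ and λ by bisection on Equation (6) instead of the R routine optimize mentioned in Section 3.2. Bisection relies on the mean increasing with θ2, which holds for power-series distributions such as the hp.

```python
import math


def hyp1f1_a1(lam, theta, tol=1e-15, max_terms=100_000):
    """Series of Equation (3) for 1F1(1; lam; theta)."""
    term = 1.0
    total = 1.0
    for r in range(max_terms):
        term *= theta / (lam + r)   # ratio of term r+1 to term r
        total += term
        if term < tol * total:
            break
    return total


def hp_mean(lam, theta):
    """Equation (6): mean as a function of theta2 and lambda."""
    f = hyp1f1_a1(lam, theta)
    return theta - (lam - 1.0) * (f - 1.0) / f


def solve_theta(mu, lam, iters=200):
    """Bisection for the theta2 whose mean, by Equation (6), equals mu."""
    lo = 1e-12
    hi = mu + max(lam - 1.0, 0.0) + 1.0   # Equation (5) keeps theta2 below this bound
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if hp_mean(lam, mid) < mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def hp_variance(mu, lam):
    """Equation (7), with theta2 obtained numerically from mu and lambda."""
    theta = solve_theta(mu, lam)
    return theta + (theta - (lam - 1.0)) * mu - mu ** 2


def nb_variance(mu, alpha):
    """Equation (8)."""
    return mu + alpha * mu ** 2


if __name__ == "__main__":
    lam, alpha = 20.0, 0.1   # hypothetical dispersion values, not estimates from the paper
    for mu in (1.0, 5.0, 10.0, 25.0, 50.0):
        print(mu, round(hp_variance(mu, lam), 2), round(nb_variance(mu, alpha), 2))
```

With a fixed λ, the hp variance printed by this sketch grows roughly linearly in the mean, while the NB variance grows quadratically, which mirrors the behavior described for Figure 1(a).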

It is also interesting to compare the hp model performance in fitting over-dispersed crash data with that obtained by using the COM-Poisson model. Geedipally and Lord (10) fitted the COM-Poisson GLM with a variable shape parameter to the Toronto data using a full Bayesian (FB) approach with non-informative (vague) prior distributions on the parameters. Figure 3 compares the mean-variance relationships of the hp and COM-Poisson models. The variances from the two models closely resemble each other over the entire range of the mean. The hp model can thus be expected to perform as well as the COM-Poisson.

5.2. Korea RHX Data

Both Oh et al. (2) and Lord et al. (9) examined the application of the NB model to the under-dispersed (conditional on the mean) data from Korea railway-highway crossings, and deemed it to be inappropriate. These two studies also considered the Poisson model and, despite the relatively good fit the model provided, the authors mentioned that the Poisson model should not be used because the data are under-dispersed. Lord et al. (9) also noted that fitting the Poisson GLM to such under-dispersed data can have a significant effect on standard errors. Therefore, the current study compared the hp model fit to the two models found successful by the aforementioned researchers, i.e., the gamma probability and COM-Poisson models.

The Poisson, gamma probability, and COM-Poisson models for the Korea RHX data (2; 9) were originally developed using 31 candidate explanatory variables. According to Lord et al. (9), eight of these variables were found significant in at least one of the three models. These eight variables constituted the pool of candidate explanatory variables for the hp model developed in this study (see Table I). Disregarding the remaining 23 variables, it can be assumed that all final models were estimated using a common set of candidate variables.

To obtain greater accuracy and prevent inclusion of correlated variables in the model, a forward stepwise procedure with the likelihood ratio test was adopted to identify the significant variables in this study. First, the dominant traffic flow (AADT) variable was introduced into the model (mean function) and resulted in a log-likelihood value equal to [value missing in source]. Then, the other covariates entered the model in the order in which they contributed to the increase in log-likelihood per parameter. A variable was added to the model only if the increase in the log-likelihood was significant according to the likelihood ratio test (LRT). A significance level of 0.1 was selected for the LRT for the sake of consistency with the other models developed for the Korea data, with which the hp model was to be compared. The final model obtained from this stepwise procedure includes the following six variables in its mean function: AADT, presence of speed hump, train detector distance, presence of commercial area, presence of track circuit controller, and presence of a guide. The log-likelihood of the final model is [value missing in source].

Table III presents the modeling results for the hp distribution model and the comparison with the other models. All models were estimated using the maximum likelihood method. The same set of variables as in the COM-Poisson model was found significant in the hp model. However, it is necessary to note that the coefficients estimated for the COM-Poisson model are for the centering parameter and not for the mean (E[Y]) as in the case of the other distributions in Table III (see (9) for more details on the COM-Poisson GLM).

The dispersion parameter of the hp model (0.298) confirms the finding of the previous studies that the Korea data are under-dispersed (conditional on the mean) (see also 15). Based on the AIC values, the hp model provides a fit as good as that of the COM-Poisson and gamma models.
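One step of the forward selection with the likelihood ratio test described above can be sketched as follows. This is an illustration only: it assumes that each candidate covariate adds exactly one parameter, the function and variable names are hypothetical, and the log-likelihood values are made up for the example and are not results from the study. For one degree of freedom, the chi-square upper-tail probability equals erfc(sqrt(stat / 2)), so only the Python standard library is required.

```python
import math


def lrt_one_df(loglik_restricted, loglik_full):
    """Likelihood ratio statistic and p-value for one added parameter (chi-square, 1 df)."""
    stat = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def forward_step(current_loglik, candidate_logliks, alpha=0.10):
    """Pick the candidate with the largest log-likelihood; keep it only if the LRT is significant."""
    name = max(candidate_logliks, key=candidate_logliks.get)
    stat, p_value = lrt_one_df(current_loglik, candidate_logliks[name])
    selected = name if p_value < alpha else None
    return selected, stat, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods of the current model and of the model
    # refitted with each candidate added; not values from the paper.
    base = -120.0
    candidates = {"speed_hump": -117.2, "commercial_area": -118.9, "guide": -119.6}
    print(forward_step(base, candidates))
```

In a full procedure, the step would be repeated with the refitted log-likelihoods of the remaining candidates until `forward_step` returns `None` for the selected variable.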

It is important to note that despite the similar quality of statistical fit, the three models compared in Table III each include a distinct set of variables. This comparison is still meaningful because all three models were estimated using a common pool of explanatory variables. The presence of a certain variable in one model and not in another is attributable to the correlation among variables, meaning that the inclusion of a certain set of variables eliminates the need for one or more other variables. The considerable differences between parameter estimates in different models are due to the distinct set of significant variables in each model.

As in the Toronto data application, the hp model for the Korea data performs very well in terms of bias; the MPB of the hp model is very close to zero, indicating that the model neither over-predicts nor under-predicts the crashes. The COM-Poisson model also has a relatively small bias, but the value of MPB for the gamma model indicates that this model over-predicts the crashes. The MAD and MSPE of the hp distribution are almost as low as those of the COM-Poisson, and better than those of the gamma model. Overall, the hp and COM-Poisson models performed almost equally well, slightly outperforming the gamma model.

6. CONCLUSIONS

The results of this study for the application of the hp GLM to crash data modeling are promising. The hp GLM with a covariate-dependent dispersion parameter fitted the over-dispersed data from Toronto almost as well as the popular NB model. When applied to the under-dispersed data from Korea, the hp model performed as well as the COM-Poisson and gamma probability models.

The hp model can handle under-dispersion, while the NB model is incapable of doing so properly. Lord et al. (9) showed that application of the NB model to under-dispersed data can result in unstable and unreliable parameter estimates, and hence in misspecified models. In modeling over-dispersed crash data, however, the authors acknowledge that the NB model is usually preferable to the hp model because the variance-mean relationship structure of the NB model offers more flexibility when the variance increases very rapidly with the mean. The NB model becomes especially useful when the data are highly over-dispersed. Nonetheless, this study showed that the hp GLM with covariate-dependent dispersion can perform satisfactorily even with an over-dispersed data set.

The GLM formulation of the hp model studied in this research has an advantage over the COM-Poisson GLM. In the hp model, the mean (E[Y]) is expressed in terms of the covariates, whereas in the COM-Poisson model, the centering parameter, which is approximately equal to the mode, is a function of the covariates. Thus, the hp GLM permits direct interpretation of the effect of each variable on the expected mean of crashes, while the COM-Poisson GLM permits interpretation of the effect on the approximate mode of the crash distribution. For instance, one might look at the sign of the variable coefficients in the hp model and directly quantify the effect on the expected mean of crashes of an increase in the value of each variable.

When compared to the gamma model, the hp model is preferred because it does not suffer from the theoretical issues involved in the gamma model formulation, as discussed in the first section of the paper.

This paper reports on the first steps of ongoing research on the application of the hp GLM in crash data modeling. Many aspects of the application need to be investigated further. For example, the hp model performance should be examined over a greater range of dispersion characteristics, likely through simulated data. It is also recommended to examine the hp model fit to crash frequency data from roadway segments and its use for identifying hazardous sites.


REFERENCES

1. Lord D, Mannering F. The Statistical Analysis of Crash-Frequency Data: A Review and Assessment of Methodological Alternatives. Transportation Research - Part A, 2010;44(5):
2. Oh J, Washington SP, Nam D. Accident Prediction Model for Railway Highway Interfaces. Accident Analysis & Prevention, 2006;38(2):
3. Consul P, Famoye F. Generalized Poisson Regression-Model. Communications in Statistics-Theory and Methods, 1992;21(1):
4. Castillo J, Pérez-Casany M. Overdispersed and Underdispersed Poisson Generalizations. Journal of Statistical Planning and Inference, 2005;134:
5. Cameron AC, Johansson P. Count Data Regression Using Series Expansions: with Applications. Journal of Applied Econometrics, 1997;12(3):
6. Conway RW, Maxwell WL. A Queuing Model with State Dependent Service Rates. Journal of Industrial Engineering, 1962;12:
7. Shmueli G, Minka T, Kadane JB, Borle S, Boatwright P. A Useful Distribution for Fitting Discrete Data: Revival of the Conway Maxwell Poisson Distribution. Journal of the Royal Statistical Society Series C, 2005;54(1):
8. Lord D, Guikema SD, Geedipally S. Application of the Conway-Maxwell-Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes. Accident Analysis & Prevention, 2008;40(3):
9. Lord D, Geedipally SR, Guikema SD. Extension of the Application of Conway Maxwell Poisson Models: Analyzing Traffic Crash Data Exhibiting Underdispersion. Risk Analysis, 2010;30(8):
10. Geedipally SR, Lord D. Examination of Crash Variances Estimated by Poisson-Gamma and Conway Maxwell Poisson Models. Transportation Research Record, 2011;2241:
11. Sellers KF, Shmueli G. A Flexible Regression Model for Count Data. Annals of Applied Statistics, 2010;4(2):
12. Sellers K, Borle S, Shmueli G. The COM Poisson Model for Count Data: A Survey of Methods and Application. Applied Stochastic Models in Business and Industry, 2012;28(2):
13. Francis RA, Geedipally SR, Guikema SD, Dhavala SS, Lord D, LaRocca S. Characterizing the Performance of the Conway Maxwell Poisson Generalized Linear Model. Risk Analysis, 2012;32(1):
14. Winkelmann R. Duration Dependence and Dispersion in Count-Data Models. Journal of Business & Economic Statistics, 1995;13(4):
15. Zou Y, Geedipally SR, Lord D. Evaluating the Double Poisson Generalized Linear Model. Accident Analysis & Prevention, 2013; forthcoming.
16. Daniels S, Brijs T, Nuyts E, Wets G. Explaining Variation in Safety Performance of Roundabouts. Accident Analysis & Prevention, 2010;42(2):
17. Daniels S, Brijs T, Nuyts E, Wets G. Extended Prediction Models for Crashes at Roundabouts. Safety Science, 2011;49(2):
18. Efron B. Double Exponential-Families and their Use in Generalized Linear-Regression. Journal of the American Statistical Association, 1986;81(395):
19. Sáez-Castillo AJ, Conde-Sánchez A. A Hyper-Poisson Regression Model for Overdispersed and Underdispersed Count Data. Computational Statistics and Data Analysis, 2012;61:
20. Bardwell GE, Crow EL. A Two-Parameter Family of Hyper-Poisson Distributions. Journal of the American Statistical Association, 1964;9(305):
21. Hauer E. Overdispersion in Modeling Accidents on Road Sections and in Empirical Bayes Estimation. Accident Analysis & Prevention, 2001;33(6):
22. Heydecker BG, Wu J. Identification of Sites for Road Accident Remedial Work by Bayesian Statistical Methods: An Example of Uncertain Inference. Advances in Engineering Software, 2001;32:
23. Miaou SP, Lord D. Modeling Traffic Flow Relationships at Signalized Intersections: Dispersion Parameter, Functional Form and Bayes vs Empirical Bayes. Transportation Research Record, 2003;1840:
24. Mitra S, Washington SP. On the Nature of Over Dispersion in Motor Vehicle Crash Prediction Models. Accident Analysis & Prevention, 2007;39(3):
25. Guikema SD, Coffelt JP. A Flexible Count Data Regression Model for Risk Analysis. Risk Analysis, 2008;28(1):
26. Johnson NL, Kotz S, Kemp AW. Univariate Discrete Distributions, 3rd ed. New York: Wiley;
27. Saha K, Paul S. Bias-Corrected Maximum Likelihood Estimator of the Negative Binomial Dispersion Parameter. Biometrics, 2005;61(1):
28. American Association of State Highway and Transportation Officials (AASHTO). Highway Safety Manual. 1st ed. AASHTO;
29. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna (Austria): R Foundation for Statistical Computing;
30. Lord D. The Prediction of Accidents on Digital Networks: Characteristics and Issues Related to the Application of Accident Prediction Models [dissertation]. [Toronto (ON)]: University of Toronto;
31. Miranda-Moreno LF, Fu L. Traffic Safety Study: Empirical Bayes or Full Bayes? 84th Annual Meeting of the Transportation Research Board, Washington, DC,
32. SAS Institute Inc. SAS System for Windows. 9th ver. Cary (NC);
33. Burnham KP, Anderson DR. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed. Springer-Verlag;
34. Oh J, Lyon C, Washington SP, Persaud BN, Bared J. Validation of the FHWA Crash Models for Rural Intersections: Lessons Learned. Transportation Research Record, 2003;1840:

TABLES AND FIGURES

Table I: Summary statistics of the data sets in this study

Variables    Min.    Max.    Average (SD)    Frequency
Toronto Data
Crashes    (10.02)    868
Major approach AADT    5,469    72,178    28, (10,660.4)    868
Minor approach AADT    53    42,644    11, (8,599.40)    868
Korea Data
Crashes    (0.60)    162
Highway AADT    10    61,    ( )    162
Average daily railway traffic (rail.trf)    (37.34)    162
Train detector distance (dist.trn.dtc)    0    1,    (328.38)    162
Time duration between activation of warning signals and gates (wrn.time)    (25.71)    162
Presence of commercial area (p.comm)    1 (yes): 149 (91.98%)    0 (no): 13 (8.02%)
Presence of a speed hump (p.hump)    1 (yes): 134 (82.72%)    0 (no): 28 (17.28%)
Presence of a track circuit controller (p.trck.cric.cont)    1 (yes): 113 (69.75%)    0 (no): 49 (30.25%)
Presence of a guide (p.guide)    1 (yes): 126 (77.78%)    0 (no): 36 (22.22%)
- = not applicable

Table II: Modeling results for the hp and NB GLMs with the Toronto data

Columns: Hyper-Poisson model (fixed dispersion parameter; varying dispersion parameter), Negative-Binomial model (fixed dispersion parameter; varying dispersion parameter)

Estimate
Ln(β 0 )    (0.4464)    (0.4325)    (0.465)    (0.4555)
β    (0.0462)    ( )    ( )    ( )
β    ( )    ( )    ( )    ( )
λ
α    (0.0122)
Ln(δ 0 )    (2.709)    (2.4381)
δ    (0.2677)    (0.2345)
δ    (0.1073)    (0.1002)
AIC
MPB
MAD
MSPE
1 Akaike information criterion (33); 2 Mean prediction bias (34); 3 Mean absolute deviance (34); 4 Mean squared predictive error (34); - = not applicable
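The three prediction measures named in the table footnotes are simple averages of prediction errors. The following is a minimal sketch, assuming the definitions commonly attributed to Oh et al. (34) and taking the error as predicted minus observed, so that a positive MPB corresponds to over-prediction. The function name `prediction_measures` and the sample numbers are hypothetical and are not data from the study.

```python
def prediction_measures(observed, predicted):
    """Return MPB, MAD and MSPE for paired observed and predicted crash counts.

    Assumed definitions (error = predicted - observed):
      MPB  = mean of the errors
      MAD  = mean of the absolute errors
      MSPE = mean of the squared errors
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    return {
        "MPB": sum(errors) / n,
        "MAD": sum(abs(e) for e in errors) / n,
        "MSPE": sum(e * e for e in errors) / n,
    }


if __name__ == "__main__":
    # Made-up counts and fitted means, for illustration only
    print(prediction_measures([0, 1, 2], [0.5, 1.0, 1.5]))
```

An MPB near zero can hide large errors of opposite sign, which is why it is read together with MAD and MSPE.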

Table III: Parameter Estimates and GOF Measures of Three Different Models for the Korea Data

Variables    COM-Poisson    Gamma    Hyper-Poisson
Constant    (1.206) a    (1.008) a    (0.756)
Ln(ADT)    0.648 (0.139)    0.230 (0.076)    0.472 (0.057)
Average daily railway traffic    (0024)    -
Presence of commercial area    1.474 (0.513)    0.651 (0.287)    0.965 (0.370)
Train detector distance    (0.0007)    0.001 (0.0004)    (0.0006)
Time duration between the activation of warning signals and gates    (0.002)    -
Presence of track circuit controller    (0.431)    (0.303)
Presence of guide    -88 (0.512)    (0.294)
Presence of speed hump    (0.531)    -1.58 (0.859)    (0.441)
Shape parameter    2.349 (0.634)    2.062 (0.758)    -
Dispersion parameter    (0.189)
AIC
MPB
MAD
MSPE
a Standard error; - = not applicable
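The hyper-Poisson column of Table III can be read directly as effects on the expected mean. The sketch below assumes a log link for the mean, log(E[Y]) = x'β, which is an assumption made here for illustration. The helper names `mean_ratio` and `mean_ratio_for_logged_covariate` are hypothetical. The two coefficients used, 0.965 for presence of a commercial area and 0.472 for Ln(ADT), are the hyper-Poisson estimates shown in the table.

```python
import math


def mean_ratio(coefficient, change=1.0):
    """Multiplicative change in E[Y] when a covariate increases by `change`.

    Under the assumed log link, E[Y] = exp(x'beta), so the ratio of means
    is exp(coefficient * change), all other covariates held fixed.
    """
    return math.exp(coefficient * change)


def mean_ratio_for_logged_covariate(coefficient, factor):
    """Multiplicative change in E[Y] when a covariate that enters the model
    in logged form (such as ADT) is multiplied by `factor`."""
    return factor ** coefficient


if __name__ == "__main__":
    # Indicator variable: crossing in a commercial area vs. not
    print(round(mean_ratio(0.965), 3))
    # Logged covariate: effect of doubling ADT
    print(round(mean_ratio_for_logged_covariate(0.472, 2.0), 3))
```

Under these assumptions, a crossing in a commercial area has an expected crash mean about 2.6 times that of an otherwise identical crossing, and doubling ADT raises the expected mean by a factor of about 1.39. The corresponding COM-Poisson coefficients would describe the centering parameter rather than the mean.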

Figure 1: Crash variance vs. mean for the Toronto data obtained by the models with (a) fixed, (b) variable dispersion parameter. Both panels plot variance against mean; the series are the hp and NB models with constant dispersion in (a) and with variable dispersion in (b).

Figure 2: Frequency distribution of (a) the varying dispersion parameter of the hp model for the Toronto data and (b) the varying dispersion parameter of the NB model for the Toronto data. The vertical axis in both panels is frequency.

Figure 3: Crash variance-mean relationship of the COM-Poisson vs. the hp model for the Toronto data. The plot shows variance against mean for the hp model (variable dispersion) and the COM-Poisson model (variable shape parameter).


More information

TRAFFIC FLOW MODELING AND FORECASTING THROUGH VECTOR AUTOREGRESSIVE AND DYNAMIC SPACE TIME MODELS

TRAFFIC FLOW MODELING AND FORECASTING THROUGH VECTOR AUTOREGRESSIVE AND DYNAMIC SPACE TIME MODELS TRAFFIC FLOW MODELING AND FORECASTING THROUGH VECTOR AUTOREGRESSIVE AND DYNAMIC SPACE TIME MODELS Kamarianakis Ioannis*, Prastacos Poulicos Foundation for Research and Technology, Institute of Applied

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Risk Assessment of Highway Bridges: A Reliability-based Approach

Risk Assessment of Highway Bridges: A Reliability-based Approach Risk Assessment of Highway Bridges: A Reliability-based Approach by Reynaldo M. Jr., PhD Indiana University-Purdue University Fort Wayne pablor@ipfw.edu Abstract: Many countries are currently experiencing

More information

Bootstrap Simulation Procedure Applied to the Selection of the Multiple Linear Regressions

Bootstrap Simulation Procedure Applied to the Selection of the Multiple Linear Regressions JKAU: Sci., Vol. 21 No. 2, pp: 197-212 (2009 A.D. / 1430 A.H.); DOI: 10.4197 / Sci. 21-2.2 Bootstrap Simulation Procedure Applied to the Selection of the Multiple Linear Regressions Ali Hussein Al-Marshadi

More information

DAYLIGHT, TWILIGHT, AND NIGHT VARIATION IN ROAD ENVIRONMENT-RELATED FREEWAY TRAFFIC CRASHES IN KOREA

DAYLIGHT, TWILIGHT, AND NIGHT VARIATION IN ROAD ENVIRONMENT-RELATED FREEWAY TRAFFIC CRASHES IN KOREA DAYLIGHT, TWILIGHT, AND NIGHT VARIATION IN ROAD ENVIRONMENT-RELATED FREEWAY TRAFFIC CRASHES IN KOREA Sungmin Hong, Ph.D. Korea Transportation Safety Authority 17, Hyeoksin 6-ro, Gimcheon-si, Gyeongsangbuk-do,

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 2016 MODULE 1 : Probability distributions Time allowed: Three hours Candidates should answer FIVE questions. All questions carry equal marks.

More information

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier

More information

Local Calibration Factors for Implementing the Highway Safety Manual in Maine

Local Calibration Factors for Implementing the Highway Safety Manual in Maine Local Calibration Factors for Implementing the Highway Safety Manual in Maine 2017 Northeast Transportation Safety Conference Cromwell, Connecticut October 24-25, 2017 MAINE Darryl Belz, P.E. Maine Department

More information

Unconditional Distributions Obtained from Conditional Specification Models with Applications in Risk Theory

Unconditional Distributions Obtained from Conditional Specification Models with Applications in Risk Theory Unconditional Distributions Obtained from Conditional Specification Models with Applications in Risk Theory E. Gómez-Déniz a and E. Calderín Ojeda b Abstract Bivariate distributions, specified in terms

More information

Unobserved Heterogeneity and the Statistical Analysis of Highway Accident Data. Fred Mannering University of South Florida

Unobserved Heterogeneity and the Statistical Analysis of Highway Accident Data. Fred Mannering University of South Florida Unobserved Heterogeneity and the Statistical Analysis of Highway Accident Data Fred Mannering University of South Florida Highway Accidents Cost the lives of 1.25 million people per year Leading cause

More information

On Discrete Distributions Generated through Mittag-Leffler Function and their Properties

On Discrete Distributions Generated through Mittag-Leffler Function and their Properties On Discrete Distributions Generated through Mittag-Leffler Function and their Properties Mariamma Antony Department of Statistics Little Flower College, Guruvayur Kerala, India Abstract The Poisson distribution

More information

PLANNING TRAFFIC SAFETY IN URBAN TRANSPORTATION NETWORKS: A SIMULATION-BASED EVALUATION PROCEDURE

PLANNING TRAFFIC SAFETY IN URBAN TRANSPORTATION NETWORKS: A SIMULATION-BASED EVALUATION PROCEDURE PLANNING TRAFFIC SAFETY IN URBAN TRANSPORTATION NETWORKS: A SIMULATION-BASED EVALUATION PROCEDURE Michele Ottomanelli and Domenico Sassanelli Polytechnic of Bari Dept. of Highways and Transportation EU

More information

Compound COM-Poisson Distribution with Binomial Compounding Distribution

Compound COM-Poisson Distribution with Binomial Compounding Distribution Compound COM-oisson Distribution with Binomial Compounding Distribution V.Saavithri Department of Mathematics Nehru Memorial College Trichy. saavithriramani@gmail.com J.riyadharshini Department of Mathematics

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction ReCap. Parts I IV. The General Linear Model Part V. The Generalized Linear Model 16 Introduction 16.1 Analysis

More information

ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS. Myongsik Oh. 1. Introduction

ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS. Myongsik Oh. 1. Introduction J. Appl. Math & Computing Vol. 13(2003), No. 1-2, pp. 457-470 ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS Myongsik Oh Abstract. The comparison of two or more Lorenz

More information

Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty

Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty Journal of Data Science 9(2011), 549-564 Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty Masaru Kanba and Kanta Naito Shimane University Abstract: This paper discusses the

More information

The relationship between urban accidents, traffic and geometric design in Tehran

The relationship between urban accidents, traffic and geometric design in Tehran Urban Transport XVIII 575 The relationship between urban accidents, traffic and geometric design in Tehran S. Aftabi Hossein 1 & M. Arabani 2 1 Bandar Anzali Branch, Islamic Azad University, Iran 2 Department

More information

Freeway rear-end collision risk for Italian freeways. An extreme value theory approach

Freeway rear-end collision risk for Italian freeways. An extreme value theory approach XXII SIDT National Scientific Seminar Politecnico di Bari 14 15 SETTEMBRE 2017 Freeway rear-end collision risk for Italian freeways. An extreme value theory approach Gregorio Gecchele Federico Orsini University

More information

MODELING OF 85 TH PERCENTILE SPEED FOR RURAL HIGHWAYS FOR ENHANCED TRAFFIC SAFETY ANNUAL REPORT FOR FY 2009 (ODOT SPR ITEM No.

MODELING OF 85 TH PERCENTILE SPEED FOR RURAL HIGHWAYS FOR ENHANCED TRAFFIC SAFETY ANNUAL REPORT FOR FY 2009 (ODOT SPR ITEM No. MODELING OF 85 TH PERCENTILE SPEED FOR RURAL HIGHWAYS FOR ENHANCED TRAFFIC SAFETY ANNUAL REPORT FOR FY 2009 (ODOT SPR ITEM No. 2211) Submitted to: Ginger McGovern, P.E. Planning and Research Division Engineer

More information

Hot Spot Analysis: Improving a Local Indicator of Spatial Association for Application in Traffic Safety

Hot Spot Analysis: Improving a Local Indicator of Spatial Association for Application in Traffic Safety Hot Spot Analysis: Improving a Local Indicator of Spatial Association for Application in Traffic Safety Elke Moons, Tom Brijs and Geert Wets Transportation Research Institute, Hasselt University, Science

More information

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III)

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III) Title: Spatial Statistics for Point Processes and Lattice Data (Part III) Lattice Data Tonglin Zhang Outline Description Research Problems Global Clustering and Local Clusters Permutation Test Spatial

More information

Optimization of Short-Term Traffic Count Plan to Improve AADT Estimation Error

Optimization of Short-Term Traffic Count Plan to Improve AADT Estimation Error International Journal Of Engineering Research And Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 13, Issue 10 (October 2017), PP.71-79 Optimization of Short-Term Traffic Count Plan

More information

Even Simpler Standard Errors for Two-Stage Optimization Estimators: Mata Implementation via the DERIV Command

Even Simpler Standard Errors for Two-Stage Optimization Estimators: Mata Implementation via the DERIV Command Even Simpler Standard Errors for Two-Stage Optimization Estimators: Mata Implementation via the DERIV Command by Joseph V. Terza Department of Economics Indiana University Purdue University Indianapolis

More information

Penalized Loss functions for Bayesian Model Choice

Penalized Loss functions for Bayesian Model Choice Penalized Loss functions for Bayesian Model Choice Martyn International Agency for Research on Cancer Lyon, France 13 November 2009 The pure approach For a Bayesian purist, all uncertainty is represented

More information

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator As described in the manuscript, the Dimick-Staiger (DS) estimator

More information

HOTSPOTS FOR VESSEL-TO-VESSEL AND VESSEL-TO-FIX OBJECT ACCIDENTS ALONG THE GREAT LAKES SEAWAY

HOTSPOTS FOR VESSEL-TO-VESSEL AND VESSEL-TO-FIX OBJECT ACCIDENTS ALONG THE GREAT LAKES SEAWAY 0 0 HOTSPOTS FOR VESSEL-TO-VESSEL AND VESSEL-TO-FIX OBJECT ACCIDENTS ALONG THE GREAT LAKES SEAWAY Bircan Arslannur* MASc. Candidate Department of Civil and Environmental Engineering, University of Waterloo

More information

Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in the Niger Delta

Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in the Niger Delta International Journal of Science and Engineering Investigations vol. 7, issue 77, June 2018 ISSN: 2251-8843 Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in

More information

ABSTRACT (218 WORDS) Prepared for Publication in Transportation Research Record Words: 5,449+1*250 (table) + 6*250 (figures) = 7,199 TRB

ABSTRACT (218 WORDS) Prepared for Publication in Transportation Research Record Words: 5,449+1*250 (table) + 6*250 (figures) = 7,199 TRB TRB 2003-3363 MODELING TRAFFIC CRASH-FLOW RELATIONSHIPS FOR INTERSECTIONS: DISPERSION PARAMETER, FUNCTIONAL FORM, AND BAYES VERSUS EMPIRICAL BAYES Shaw-Pin Miaou Research Scientist Texas Transportation

More information

A Cautionary Note on Estimating the Reliability of a Mastery Test with the Beta-Binomial Model

A Cautionary Note on Estimating the Reliability of a Mastery Test with the Beta-Binomial Model A Cautionary Note on Estimating the Reliability of a Mastery Test with the Beta-Binomial Model Rand R. Wilcox University of Southern California Based on recently published papers, it might be tempting

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

MECHANISTIC-EMPIRICAL LOAD EQUIVALENCIES USING WEIGH IN MOTION

MECHANISTIC-EMPIRICAL LOAD EQUIVALENCIES USING WEIGH IN MOTION MECHANISTIC-EMPIRICAL LOAD EQUIVALENCIES USING WEIGH IN MOTION Prepared By: Curtis Berthelot Ph.D., P.Eng. Dept. of Civil Engineering University of Saskatchewan Tanya Loewen Dept. of Civil Engineering

More information

Confidence and prediction intervals for. generalised linear accident models

Confidence and prediction intervals for. generalised linear accident models Confidence and prediction intervals for generalised linear accident models G.R. Wood September 8, 2004 Department of Statistics, Macquarie University, NSW 2109, Australia E-mail address: gwood@efs.mq.edu.au

More information