Investigating the effects of the fixed and varying dispersion parameters of Poisson-gamma models on empirical Bayes estimates

Dominique Lord, Ph.D., P.Eng.*
Assistant Professor
Department of Civil Engineering
Texas A&M University
3136 TAMU
College Station, TX
Tel. (979) Fax. (979)
d-lord@tamu.edu

Peter Young-Jin Park, Ph.D., P.Eng.
Transportation Engineer
itrans Consulting Inc.
1 York Boulevard, Suite 3
Richmond Hill, ON, Canada L4B 1J8
Tel: (95) ext.5264 Fax: (95)
ppark@ransconsulting.com

Paper submitted for publication
* Corresponding author
March 1, 28

ABSTRACT
Traditionally, transportation safety analysts have used the empirical Bayes (EB) method to improve the estimate of the long-term mean of individual sites; to correct for the regression-to-the-mean (RTM) bias in before-after studies; and to identify hotspots or high-risk locations. The EB method combines two different sources of information: 1) the expected number of crashes estimated via crash prediction models, and 2) the observed number of crashes at individual sites. Crash prediction models have traditionally been estimated using a negative binomial (NB) (or Poisson-gamma) modeling framework due to the over-dispersion commonly found in crash data. A weight factor is used to assign the relative influence of each source of information on the EB estimate. This factor is estimated using the mean and variance functions of the NB model. Recent studies have shown that the dispersion parameter can depend on the covariates of NB models, especially for traffic flow-only models, and can vary across time periods, so there is a need to determine how such models affect EB estimates. The objectives of this study are to examine how commonly used functional forms as well as fixed and time-varying dispersion parameters affect the EB estimates. To accomplish the study objectives, several traffic flow-only crash prediction models were estimated using a sample of rural three-legged intersections located in California. Two types of aggregated and time-specific models were produced: 1) the traditional NB model with a fixed dispersion parameter and 2) the generalized NB model (GNB) with a time-varying dispersion parameter, which is also dependent upon the covariates of the model. Several statistical methods were used to compare the fitting performance of the various functional forms.
The results of the study show that the selection of the functional form of NB models has an important effect on EB estimates in terms of estimated values, weight factors, and dispersion parameters. Time-specific models with a varying dispersion parameter provide better statistical performance in terms of goodness-of-fit (GOF) than aggregated multi-year models. Furthermore, the identification of hazardous sites using the EB method can be significantly affected when a GNB model with a time-varying dispersion parameter is used. Thus, erroneously selecting a functional form may lead to selecting the wrong sites for treatment. The study concludes that transportation safety analysts should not automatically use an existing functional form for modeling motor vehicle crashes without conducting rigorous analyses to estimate the most appropriate functional form linking crashes with traffic flow.

Keywords: crash prediction models, dispersion parameter, empirical Bayes estimates, negative binomial, rural intersections

INTRODUCTION
Statistical models or crash prediction models have been a very popular method for estimating the safety performance of various transportation elements. The most common statistical models used by transportation safety analysts are the Poisson and Negative Binomial (NB) (or Poisson-gamma) regression models (Miaou, 1994; Poch and Mannering, 1996; Lord et al., 2005a). NB models are usually the model of choice and have been applied extensively in various types of highway safety studies, from the identification of hotspots or hazardous sites and the prediction of motor vehicle collisions to the development of accident modification factors via the coefficients of the model (Harwood et al., 2000; Miaou, 1996; Vogt, 1999; Lord and Bonneson, 2006). [Note: more recently, Lord et al. (2008) have proposed the Conway-Maxwell-Poisson model as a substitute for the NB model.]

There are two main reasons why NB models are favored over Poisson models for modeling motor vehicle collisions. First, the variance of the response variable (i.e., crashes per unit of time) commonly exceeds the mean value of the variable, which violates the main assumption associated with the Poisson model (this is the over-dispersion phenomenon). As a result, if a Poisson distribution is assumed in estimating the expected number of crashes, larger discrepancies between the observed and the predicted crashes may be observed (Hauer, 2001). Second, a mis-specified Poisson model may lead to the inclusion of covariates that have been erroneously identified as being significant when, in fact, they are not (Park and Lord, 2007). It has been reported that the over-dispersion is caused by some unmeasured uncertainties associated with unobserved or unobservable variables, resulting in the omitted variable problem.
However, although the latter problem can contribute to the over-dispersion, it is mainly attributed to the nature of the crash process, namely the fact that crashes are the product of Bernoulli trials with unequal probability of events (this is also known as Poisson trials). Lord et al. (2005a) have reported that as the number of trials increases and becomes very large, the distribution may be approximated by a Poisson process (hence the use of Poisson-based or mixed-Poisson models), where the magnitude of the over-dispersion is dependent upon the characteristics of the Poisson trials. (Note: the over-dispersion can be minimized using appropriate mean structures of statistical models, as discussed in Miaou and Song, 2005 and Mitra and Washington, 2007.) In short, the NB model can efficiently reduce these unmeasured uncertainties by allowing an error term to capture the unmeasured heterogeneity in a study dataset (Miaou and Lord, 2003). Therefore, in order to take into account the over-dispersion problem in a given study dataset, transportation safety analysts normally adopt the NB modeling framework for developing crash prediction models.

In highway safety, the dispersion parameter of NB models (note: some researchers use the term over-dispersion parameter instead of dispersion parameter) plays a central role in calculating empirical Bayes (EB) estimates. These estimates are used to smooth the random fluctuation of crash counts and generate a more accurate estimate of the long-term mean at a given site. Inasmuch as the EB estimates are one of the main inputs for a sound EB before-after study, the accuracy of the EB estimates will directly affect the precision of the analysis output. The EB estimates can also be used to identify hotspots (see Saccomanno et al., 2001) or sites with promise (see Hauer, 1996) by ranking crash-prone locations by order of magnitude or by computing the difference between the output of predictive models and the EB estimate. As a result, rigorous statistical models based on an appropriate NB modeling framework must be developed to obtain reliable EB estimates and to maximize the safety benefit per dollar spent.

As discussed by Hauer (1997), the long-term mean for a site i over a period t can be estimated using the EB method:

$$\hat{\hat{\mu}}_{it} = \gamma_{it}\,\hat{\mu}_{it} + (1 - \gamma_{it})\,y_{it} \qquad (1)$$

where,
$\hat{\hat{\mu}}_{it}$ = EB estimate in crashes per year for given site i and year t;
$\gamma_{it}$ = weight factor for given site i and year t;
$y_{it}$ = observed number of crashes for given site i and year t;
$\hat{\mu}_{it}$ = the number of crashes estimated by crash prediction models for given site i and year t (usually estimated using a NB model).

The weight factor $\gamma_{it}$ is given as follows:

$$\gamma_{it} = \frac{1}{1 + \alpha\,\hat{\mu}_{it}} \qquad (2)$$

where,
$\alpha$ = the dispersion parameter for the given dataset [note: in the safety literature, analysts sometimes report the inverse dispersion parameter $\phi = 1/\alpha$].

Up until very recently, researchers did not compute time-varying EB estimates (as currently defined in equations (1) and (2)); instead, transportation safety analysts produced an average EB estimate for the sites under study for the entire period by relying on a traditional NB model (Harwood et al., 2000; Persaud et al., 2001; Vogt, 1999). The traditional NB model uses a fixed dispersion parameter that is applied to the entire dataset in the study (Miaou, 1996). However, as pointed out by Hauer (2001), there is no tangible rationale for assuming that all sites in the dataset have a constant dispersion parameter over a given study period. Several other researchers have also questioned the hypothesis that the dispersion parameter has a fixed value over different sites and time periods (Heydecker and Wu, 2001; Miaou and Lord, 2003; Lord et al., 2005b; Miranda-Moreno et al., 2005; El-Basyouny and Sayed, 2006).
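As a rough numerical illustration of equations (1) and (2), the sketch below computes the weight factor and the EB estimate for a single site-year. It assumes the usual form of Hauer's estimator, in which the weight multiplies the model estimate and its complement multiplies the observed count. The function names and the input values are hypothetical and are not taken from the study data.

```python
def eb_weight(mu_hat, alpha):
    """Weight factor of equation (2): gamma = 1 / (1 + alpha * mu_hat)."""
    return 1.0 / (1.0 + alpha * mu_hat)


def eb_estimate(mu_hat, y, alpha):
    """EB estimate of equation (1): gamma * mu_hat + (1 - gamma) * y."""
    gamma = eb_weight(mu_hat, alpha)
    return gamma * mu_hat + (1.0 - gamma) * y


# Hypothetical site-year: model estimate 0.5 crashes, 2 observed crashes,
# dispersion parameter alpha = 1.0 (illustrative values only).
example_weight = eb_weight(0.5, 1.0)
example_eb = eb_estimate(0.5, 2, 1.0)
print(example_weight, example_eb)
```

In this example the product of the dispersion parameter and the model estimate is 0.5, so the weight is 2/3 and the EB estimate lies one third of the way from the model estimate toward the observed count. A larger dispersion parameter would move it further toward the observed count.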
Heydecker and Wu (2001) attempted to estimate varying dispersion parameters as a function of site covariates, such as minor and major volumes (AADT) at intersections and vertical and horizontal curvature, among others. They asserted that the NB model with a varying dispersion parameter (henceforth defined as the generalized NB model or GNB) can better represent the nature of crash datasets than the traditional NB model with a fixed dispersion parameter. The approach proposed by Heydecker and Wu (2001) was also used by Lord et al. (2005b) for modeling the safety performance of freeways as a function of traffic flow characteristics. An exception is Lyon et al. (2005), who introduced a time-varying dispersion parameter using the traditional NB model. The dispersion parameter for each year was estimated outside the model estimating process using the maximum likelihood method.

Many different functional forms have been proposed to link crashes to the explanatory variables of traditional regression models for segments (Martin, 2002; Abbas, 2004; Lord et al., 2005b; Fitzpatrick et al., 2008) and intersections (Nicholson and Turner, 1996; Turner and Nicholson, 1998; Mountain and Fawaz, 1998; Miaou and Lord, 2003). In the past, these functional forms were adopted to determine the model that provided the best statistical fit without considering the relationship between different functional forms and the dispersion parameter (with the exception of Miaou and Lord, 2003). However, if the selection of a functional form can influence the precision of the model estimates [i.e., the estimated number of crashes, $\hat{\mu}$ in equation (1)], then it may also influence the precision of the estimated dispersion parameters [i.e., $\alpha$ in equation (2)]. In evaluating the safety effects of treatments (e.g., Persaud et al., 2001; Powers and Carson, 2004) and in hotspot identification (e.g., Miranda-Moreno et al., 2005), the impact of the dispersion parameters on the EB estimates calculated using the output of different functional forms merits further investigation. This study therefore addresses the following issues:
1) Develop a series of traffic flow-only crash prediction models which can estimate a site- and time-specific number of crashes using both traditional NB and GNB models; estimate fixed and varying dispersion parameters across site i and time t; and investigate the characteristics of the dispersion in the data as a function of the selected covariates (i.e., traffic volumes).
2) Evaluate the statistical performance of crash prediction models using commonly used goodness-of-fit (GOF) statistics as well as Cumulative Residual (CURE) plots.
3) Examine the relationship between the functional forms of crash prediction models and the estimated dispersion parameters (i.e., both fixed and varying dispersion parameters).
4) Examine the differences in the EB estimates between the NB model with a fixed dispersion parameter and the GNB model with a varying dispersion parameter, and compare both estimates in hotspot identification.

To accomplish the objectives of this study, several crash prediction models were produced using a sample of three-legged rural intersections located in California. Crash, traffic flow and geometric design data (to confirm the intersection geometry) were obtained from the Highway Safety Information System (HSIS) managed by the University of North Carolina in Chapel Hill, NC. For the five-year (1997 to 2001) study period, a total of 5,752 three-legged rural intersections were included in the database. Intersections that contained missing and questionable values were discarded from the dataset, resulting in a sample of 5,588 three-legged rural intersections for the same five-year period with a total of 5,996 reported crashes (all crash severities, i.e., the total number of crashes). Given the large sample size, the inverse dispersion parameters in this study are assumed to be properly estimated (see Lord, 2006 and Park and Lord, 2008, about this assumption). Table 1 contains a brief summary of the study dataset.

MODEL DEVELOPMENT
This section describes the characteristics of the traditional NB and GNB models.

TRADITIONAL AND GENERALIZED NEGATIVE BINOMIAL MODELS
Properties of the traditional NB model have been illustrated by Cameron and Trivedi (1998). The probability density function (pdf) of the NB distribution can be defined as

$$P(Y = y_{it} \mid \mu_{it}, \alpha) = \frac{\Gamma(y_{it} + \alpha^{-1})}{y_{it}!\;\Gamma(\alpha^{-1})} \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu_{it}}\right)^{\alpha^{-1}} \left(\frac{\mu_{it}}{\alpha^{-1} + \mu_{it}}\right)^{y_{it}} \qquad (3)$$

In contrast to the Poisson distribution, the NB distribution allows for over-dispersion, and thus the mean (i.e., $\mu_{it} = E\{Y_{it}\} = \exp(X_{it}\beta)$; $X_{it}$ = a vector of covariates, and $\beta$ = regression coefficients corresponding to the covariates) can be smaller than the variance (i.e., $Var\{Y_{it}\} = \mu_{it} + \alpha\mu_{it}^{2}$). (Note: the model's output can also show signs of under-dispersion.) When the dispersion parameter $\alpha = 1/\phi$ is equal to zero (that is, $\phi \to \infty$), the NB distribution reverts back to the Poisson distribution. Larger values of $\alpha$ signify a greater amount of over-dispersion.

An important characteristic of the traditional NB model is that the dispersion parameter $\alpha$ (or its inverse $\phi$) does not vary from site to site. This type of model has a single fixed value over all observations and does not consider potential dependency on the covariates.

The GNB uses the same pdf shown in equation (3) and estimates the number of crashes at each site, like the traditional NB model. However, instead of estimating a fixed dispersion parameter, the model estimates varying dispersion parameters by using the following expression (Hardin and Hilbe, 2001):

$$\alpha_{it} = \exp(Z_{it}\,\delta_t) \qquad (4)$$

where,
$Z_{it}$ = a vector of secondary covariates (not necessarily the same as the covariates used for estimating the mean function $\hat{\mu}_{it}$);
$\delta_t$ = a vector of regression coefficients corresponding to the covariates $Z_{it}$.

With equation (4), the GNB model can be used for estimating a different over-dispersion parameter according to the sites' attributes (i.e., covariates).
If there are no significant secondary covariates for explaining the systematic dispersion structure, the dispersion function contains only a constant term (i.e., a fixed value), and the model reduces to a traditional NB regression model.
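The pdf in equation (3) and the dispersion function in equation (4) can be checked numerically with a short sketch. The code below is only an illustration using the Python standard library: the function names, the covariate vector `z`, and the coefficient vector `delta` are hypothetical, and none of the values come from the fitted models.

```python
import math


def nb_pmf(y, mu, alpha):
    """NB probability of y crashes for mean mu and dispersion alpha (equation 3)."""
    inv = 1.0 / alpha
    log_p = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             + inv * math.log(inv / (inv + mu))
             + y * math.log(mu / (inv + mu)))
    return math.exp(log_p)


def gnb_alpha(z, delta):
    """Varying dispersion parameter alpha = exp(Z * delta) (equation 4)."""
    return math.exp(sum(zi * di for zi, di in zip(z, delta)))


def nb_moments(mu, alpha, y_max=400):
    """Total probability, mean and variance obtained by direct summation."""
    probs = [nb_pmf(y, mu, alpha) for y in range(y_max)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


# Illustrative values only: the variance should match mu + alpha * mu**2.
total, mean, var = nb_moments(mu=0.5, alpha=1.2)
print(total, mean, var, 0.5 + 1.2 * 0.5 ** 2)
```

When `delta` contains only a constant term, `gnb_alpha` returns the same value for every site, which corresponds to the fixed-dispersion case described above.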

In this study, STATA (version 8; Stata, 2003) was used to estimate all the coefficients of the traditional NB and GNB models, including the fixed and varying dispersion parameters. In order to simplify the analysis, the serial correlation associated with time-trend models was not included in this study. It should be pointed out that since the data did not contain any missing values and because the model type is defined as a marginal model, the coefficients of generalized linear models (GLM) are the same as (or very similar to) the values produced by Generalized Estimating Equations (GEE), no matter which working correlation matrix is used (or whether the correlation matrix is mis-specified). The only difference is related to the standard errors of the coefficients. The standard errors are usually underestimated when temporal effects are not included in the modeling process (see Lord and Persaud, 2000 and Hardin and Hilbe, 2003 for additional information).

CRASH PREDICTION MODELS
Given the specific objectives of this study, instead of developing the models with the best statistical fit considering every possible combination of covariates, only entering traffic volumes (from major and minor intersecting roads) were used as covariates. Traffic flow-only models are the most common type of model used by transportation safety analysts (Mountain and Fawaz, 1996; Hauer, 1997; Hughes et al., 2005). In addition, this kind of model has been shown to exhibit a structured variance function (Miaou and Lord, 2003; Mitra and Washington, 2007; Geedipally and Lord, 2008). In other words, the variance function of the model (i.e., a function of the dispersion parameter) can be dependent upon the specific characteristics of each site (Miaou and Lord, 2003). Thus, investigating the effects of a varying dispersion parameter on the EB estimates for traffic flow-only models is important in the context described above.
Miaou and Lord (2003) listed the most popular functional forms (referred to in this study as Models 1 to 5) from previously published studies:
1) Model 1: $\ln \mu_{it} = \beta_t + \beta_1 \ln(F_1 + F_2)$
2) Model 2: $\ln \mu_{it} = \beta_t + \beta_1 \ln F_1 + \beta_2 \ln F_2$
3) Model 3: $\ln \mu_{it} = \beta_t + \beta_1 \ln(F_1 F_2)$
4) Model 4: $\ln \mu_{it} = \beta_t + \beta_1 \ln(F_1 + F_2) + \beta_2 \ln(F_2 / F_1)$
5) Model 5: $\ln \mu_{it} = \beta_t + \beta_1 \ln F_1 + \beta_2 \ln F_2 + \beta_3 F_2$
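A compact way to compare the five functional forms is to code the linear predictor directly. The sketch below does this under the assumption that F1 and F2 denote the major and minor road entering AADT; the function name and the coefficient values are placeholders chosen for illustration, not estimates from Tables 2 or 3.

```python
import math


def ln_mu(model, f1, f2, beta_t, b1, b2=0.0, b3=0.0):
    """Linear predictor ln(mu) for Models 1 to 5 (hypothetical helper)."""
    if model == 1:
        return beta_t + b1 * math.log(f1 + f2)
    if model == 2:
        return beta_t + b1 * math.log(f1) + b2 * math.log(f2)
    if model == 3:
        return beta_t + b1 * math.log(f1 * f2)
    if model == 4:
        return beta_t + b1 * math.log(f1 + f2) + b2 * math.log(f2 / f1)
    if model == 5:
        return beta_t + b1 * math.log(f1) + b2 * math.log(f2) + b3 * f2
    raise ValueError("model must be an integer from 1 to 5")


# Two hypothetical intersections with the same total entering flow
# but a different split between the major and minor roads.
site_a = (1500.0, 1500.0)
site_b = (2000.0, 1000.0)
for model, coefs in [(1, (0.8,)), (2, (0.6, 0.3)), (4, (0.8, 0.3))]:
    mu_a = math.exp(ln_mu(model, *site_a, -8.0, *coefs))
    mu_b = math.exp(ln_mu(model, *site_b, -8.0, *coefs))
    print(model, mu_a, mu_b)
```

Model 1 depends on the flows only through their sum, so the two hypothetical intersections receive the same predicted mean, whereas Models 2 and 4 distinguish between them.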

In these functional forms,
$\mu_{it}$ = the expected number of crashes at intersection i in year t (note: $E\{Y_{it}\} = \mu_{it}$);
$F_1$ = AADT entering from the major road at intersection i in year t; and,
$F_2$ = AADT entering from the minor road at intersection i in year t.

As reported by Miaou and Lord (2003), the functional forms described above are not the most adequate for describing the relationship between crashes and exposure since the forms do not appropriately fit the data near the boundary conditions. Nonetheless, they are still relevant for this study, as they are considered established functional forms in the highway safety literature. In addition, the most adequate functional form proposed by Miaou and Lord (2003), a model with two distinct mean functions, cannot be estimated via a generalized linear modeling (GLM) framework, as was done in this study.

In this analysis, the model parameters ($\beta_t$, $\beta_1$, $\beta_2$, and $\beta_3$) were estimated with a two-step calibration approach that was introduced by Lord and Persaud (2000) and applied by Lyon et al. (2005). Those authors developed a series of traditional NB models assuming an intercept term ($\beta_t$) that varies by year, but constant model parameters for the covariates ($\beta_1$, $\beta_2$, and $\beta_3$).
Step 1) Develop the All-Year model using total crashes for five years and average AADT over the same five years to estimate the model parameters for the covariates ($\beta_1$, $\beta_2$, and $\beta_3$) (the model output is the number of crashes per 5 years).
Step 2) Calibrate the time-specific intercept terms ($\beta_t$) using each year's AADT and each year's crash data while holding the model parameters for the covariates ($\beta_1$, $\beta_2$, and $\beta_3$) at the values estimated in the first step.

In this study, only the intercept term ($\beta_t$) varies by year for both the traditional NB and GNB crash prediction models. The output of All-Year models represents the total number of crashes over the five-year study period.
Alternatively, the total number of crashes for the same five years can be obtained by aggregating the number of crashes estimated for each year using the time-specific crash prediction models.

Similar to Lord et al. (2005b), the same covariates were used for both the mean and dispersion functions of each functional form. Furthermore, in order to compare the impact of different functional forms on dispersion parameters, we attempted to maintain every coefficient in the model. Consequently, several coefficients in the dispersion functions of the GNB models that failed the significance test at the 5% level are still reported.

Tables 2 and 3 summarize the modeling results for the traditional NB and GNB models, respectively. Five different functional forms are used for each model, and six time-specific (1997, 1998, 1999, 2000, 2001, and All-Year) crash prediction models are developed for each functional form. To illustrate the application of the models, the expected number of crashes per year is estimated using the time-specific GNB Model 1. If we assume, for example, that the Major and Minor AADTs for a particular year are 3, vpd and 3 vpd, respectively, by employing the time-specific GNB Model 1 in Table 3, one obtains the expected number of crashes in 1997 as 99 [i.e., exp( ln(3+3))]. Similarly, the expected number of crashes for 1998, 1999, 2000, and 2001 is estimated as 75, .12, .13, and .15, respectively. By summing up all these estimates, the expected number of crashes over the five-year period by this disaggregate time-specific GNB Model 1 is estimated to be .485. On the other hand, by employing the aggregate All-Year GNB Model 1 in Table 3, the estimate of the crash frequency at the same intersection over the same 5-year period equals .484 [i.e., exp( ln(3+3)) .484].

Using the same example traffic volumes above, the time-specific inverse dispersion parameters are estimated as [i.e., exp( ln(3+3))], 6.137, 2.747, 2.894, and 3.129, respectively, for each corresponding year. As opposed to the mean crashes, these values cannot be aggregated; note that the average value equals . Using the All-Year model, the overall dispersion parameter over the five years equals . We found that the aggregate All-Year model always produces a smaller dispersion parameter than the average of the dispersion parameters based on the disaggregate time-specific models. It appears that the aggregate All-Year model using the average AADT does not adequately take into account the time variation in the dispersion.

Four different GOF statistical tests were employed for comparing the series of crash prediction models. The tests, described in Hardin and Hilbe (2001) and Washington et al. (2003), are as follows:

1) Akaike's Information Criterion (AIC):

$$AIC = \frac{-2 \ln L(M_k) + 2P}{N} \qquad (5)$$

where,
$\ln L(M_k)$ = log likelihood of model k;
$P$ = the number of parameters; and,
$N$ = the number of observations (in our exercise = 5,588).

2) Bayesian Information Criterion (BIC):

$$BIC = D(M_k) - d.f. \cdot \ln N \qquad (6)$$

where,
$D(M_k)$ = deviance of model k; and,
$d.f.$ = degrees of freedom.

3) Sum of Model Deviances ($G^2$):

$$G^2 = 2 \sum_{i=1}^{n} y_i \ln\!\left(\frac{y_i}{\hat{\mu}_i}\right) \qquad (7)$$

where,
$y_i$ = the observed number of crashes at site i;
$\hat{\mu}_i$ = the expected number of crashes at site i.

4) $R^2$-like measure-of-fit (MOF) based on Standardized Residuals:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left[ (y_i - \hat{\mu}_i) / \sqrt{\hat{\mu}_i} \right]^2}{\sum_{i=1}^{n} \left[ (y_i - \bar{y}) / \sqrt{\bar{y}} \right]^2} \qquad (8)$$

where,
$\bar{y}$ = average number of observed crashes.

The model with the lowest value of AIC, BIC, and $G^2$ is considered the model with the best statistical fit. On the other hand, the model with the largest $R^2$-like MOF value indicates a superior fitted model.

As shown in Table 4, in general Model 4 provides the best fit for both the traditional NB and GNB models according to three of the test statistics (i.e., AIC, BIC, and $R^2$-like MOF). Model 5 is selected as the best fitted model based on the $G^2$-statistics, but is selected as the second worst model based on the $R^2$-like MOF. On the other hand, all the test statistics selected Model 1 as the model with the worst statistical fit regardless of the model type (i.e., NB or GNB).

As pointed out by Miranda-Moreno et al. (2005), in general, GNB models fit the data better than the traditional NB models on the basis of three of the test statistics (i.e., AIC, BIC, and $G^2$-statistics), with the exception of Model 1 for the $G^2$-statistics. Karim and Sayed (2006) reported the same conclusion in their study. [Note: Using the Deviance statistics, Miaou and Lord (2003) did not find a significant difference between NB and GNB models in terms of GOF.] However, the $R^2$-like MOF does not produce results consistent with those of the other test statistics.

Examining the results of the four different test statistics, the following findings are worth noting (refer to Table 4):
1) Model 4 can be considered the best fitted model amongst the five alternative models regardless of the model type (i.e., NB or GNB model) based on the test results of AIC, BIC, and $R^2$-like MOF.
2) Overall, GNB models show a better fit than traditional NB models regardless of the functional form based on three of the test statistics (i.e., AIC, BIC, and $G^2$-statistics). However, since the $R^2$-like MOF produced inconsistent test results, this test statistic may not be suitable for determining the best fitted model or the most suitable functional form.
3) As a result, determining the model with the best statistical fit as well as the most suitable functional form using a single test statistic (or a couple of them) may be unreliable, and thus should be avoided.
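The four statistics in equations (5) to (8) are simple to compute once a model has been fitted. The following sketch shows one possible implementation. The function names are hypothetical, the log likelihood, deviance, and degrees of freedom are assumed to come from the estimation software, and terms with a zero observed count in the sum of model deviances are taken as zero (their limiting value).

```python
import math


def aic(log_lik, n_params, n_obs):
    """Equation (5): (-2 ln L + 2P) / N."""
    return (-2.0 * log_lik + 2.0 * n_params) / n_obs


def bic(deviance, dof, n_obs):
    """Equation (6): D - d.f. * ln(N)."""
    return deviance - dof * math.log(n_obs)


def g2(y, mu):
    """Equation (7): 2 * sum of y_i * ln(y_i / mu_i), skipping y_i = 0."""
    return 2.0 * sum(yi * math.log(yi / mi) for yi, mi in zip(y, mu) if yi > 0)


def r2_like(y, mu):
    """Equation (8): 1 - ratio of sums of squared standardized residuals."""
    y_bar = sum(y) / len(y)
    num = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    den = sum((yi - y_bar) ** 2 / y_bar for yi in y)
    return 1.0 - num / den


# Hypothetical observed counts and fitted means for five sites.
observed = [0, 1, 0, 3, 2]
fitted = [0.4, 0.9, 0.6, 2.1, 1.7]
print(g2(observed, fitted), r2_like(observed, fitted))
```

Because the four statistics penalize different aspects of a fit, it is possible for them to rank the same set of models differently, which is the situation reported in Table 4.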

GOODNESS OF FIT EVALUATION BASED ON RESIDUAL ANALYSIS
In the previous section, four different statistical tests were utilized for measuring the GOF of competing functional forms or models. Another evaluation method, initially proposed by Hauer and Bamfo (1997), can also be used for evaluating the fit of models. This method is known as the CURE method and has been used extensively by many transportation safety analysts (e.g., Lord and Persaud, 2000; Washington et al., 2005; Wang and Abdel-Aty, 2007). The method requires scrutinizing a graph (i.e., CURE plot) in which the cumulative residuals are plotted in increasing order of each explanatory variable (i.e., major and minor road AADT in our case) separately. The residuals ($e_{it}$) represent the difference between the observed ($y_{it}$) and the estimated number of crashes ($\hat{\mu}_{it}$) at a given study site i in year t. The closer the curve oscillates around the zero-residual line, the better the model fits the data. The reader is referred to the references listed above for additional details about the CURE method.

Figures 1 through 5 show a total of ten different CURE plots using five different functional forms with the two model types based upon the All-Year crash data. As discussed by Hauer and Bamfo (1997), the CURE plot reveals how well the functional forms fit the data with respect to each individual explanatory variable and shows systematic deviations of the cumulative residuals from the zero-residual line. Two different CURE plots (i.e., Figure (a) and (b)) are generated for each model to compare the model performance between the two model types (i.e., NB and GNB models). To shorten the illustration, the major road AADT (F1) has been used as a representative explanatory variable for this analysis.
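The construction of a CURE plot described above can be sketched in a few lines. The code below orders the sites by one explanatory variable, accumulates the residuals, and computes a two-standard-deviation band. The band formula used here, which scales the cumulative sum of squared residuals by one minus its share of the total, is the form commonly attributed to Hauer and Bamfo (1997); it is not stated in this paper, so it should be read as an assumption, as should the function name and the example data.

```python
import math


def cure_points(covariate, observed, fitted):
    """Return sorted covariate values, cumulative residuals and a 2-sigma band."""
    order = sorted(range(len(covariate)), key=lambda i: covariate[i])
    residuals = [observed[i] - fitted[i] for i in order]
    total_sq = sum(e * e for e in residuals)
    xs, cumulative, band = [], [], []
    cum, cum_sq = 0.0, 0.0
    for i, e in zip(order, residuals):
        cum += e
        cum_sq += e * e
        share = cum_sq / total_sq if total_sq > 0 else 0.0
        xs.append(covariate[i])
        cumulative.append(cum)
        # Assumed band: 2 * sqrt(cum_sq * (1 - cum_sq / total_sq)).
        band.append(2.0 * math.sqrt(max(cum_sq * (1.0 - share), 0.0)))
    return xs, cumulative, band


# Hypothetical major road AADT, observed crashes and fitted means.
aadt = [300.0, 100.0, 200.0]
crashes = [1, 0, 2]
mu_hat = [0.5, 0.5, 1.0]
xs, cumulative, band = cure_points(aadt, crashes, mu_hat)
print(xs, cumulative, band)
```

A curve that stays inside the band and ends near zero suggests an adequate fit along that variable; long runs above or below zero point to systematic under- or overestimation over that range of the variable.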
Looking at Figures 1 to 5, several characteristics of the CURE plots can be noticed:
1) Model 1 underestimates the expected number of crashes (i.e., $y > \hat{\mu}$) for the range between 1 and 65, major road AADT and slightly overestimates the expected number (i.e., $y < \hat{\mu}$) when the major road AADT (F1) is higher than 65, (refer to Figure 1) regardless of the model type (NB or GNB). Moreover, for the range of major road AADT (F1) between 5, and 12,, Model 1 produces cumulative residual values larger than the $+2\sigma$ confidence interval boundary. No practical difference in the CURE plots between the two model types (i.e., NB and GNB model) is found, and the final cumulative residual curve is relatively close to 0.
2) NB Model 2 (refer to Figure 2 (a)) and NB Model 4 (refer to Figure 4 (a)) are similar in that both models underestimate the expected number of crashes over the entire range of the variable (i.e., F1). The cumulative residual values are larger than the $+2\sigma$ boundary for AADT values above 5,. In addition, the cumulative residual curves for these two NB models do not end near the zero-residual line.
3) Among the different CURE plots, the Model 5 CURE plots (i.e., Figure 5-(a), (b)) showed the most unexpected results, including a catastrophic drop in the cumulative residuals at the major road AADT around 42,, regardless of the model type. In the previous section, this model was selected as the one with the best statistical fit according to the $G^2$-statistics and chosen as the second best based on the AIC and BIC criteria. A closer look at the CURE plots as well as the raw dataset revealed that this sudden drop is caused by the unusually high minor road AADT (i.e., F2 = 23,111) at that specific intersection. The value is at least twice as high as the AADT of the other minor roads, and it produced a dramatically larger overestimation (i.e., e = in NB Model 5, e = in GNB Model 5). In fact, this sudden drop reveals that Model 5 is very sensitive to large values of minor road AADT (F2), as captured by the third variable in the functional form (i.e., $\beta_3 F_2$). It should be noted that the third variable in Model 5 actually influenced the fit of the model, since Model 2 does not show the sudden change in the cumulative residual plot. This site or observation should be investigated further to determine whether it is in fact an outlier (e.g., error in reported flows, etc.) or an influence point. Statistical tests (not used here), such as R-Student, DFFITS and Cook's D, can be used to identify potential outliers and influence points (Myers, 2000).
4) GNB Models 2, 3, and 4 yielded improved CURE plots compared to the CURE plots produced from the NB models in that the amount of bias in the estimated values has been much reduced. The final cumulative residual curves for these three GNB models end close to the zero-residual line. Although there are only slight differences in the CURE plots among these three GNB models, GNB Model 4 shows better fitting properties (it was already selected as the best fitted model based on the test statistics analysis in the previous section) since the cumulative residual curve oscillates reasonably around the zero-residual line over the entire range of explanatory values.

Table 5 contains the summary of the cumulative residuals for a total of 60 different models [6 years (including All-Year) × 5 functional forms × 2 model types] to show the difference between the time-specific (yearly based) models and the All-Year model as well as the difference between the NB (fixed dispersion) and GNB (varying dispersion) models. Notable characteristics are:
1) The sums of residuals of time-specific models fluctuate considerably from year to year regardless of the functional form employed. For Models 2 through 4, the absolute values of the total residuals from time-specific models (i.e., Sum 97-01) are lower than those of the aggregated All-Year models. This indicates that the time-specific models produce more accurate results than the aggregated All-Year models regardless of the model type (i.e., NB or GNB), especially for Models 2, 3 and 4. On the other hand, Models 1 and 5 do not show this characteristic.

13 2) In general, GNB model shows eher better fting performances wh much smaller residuals than the tradional NB model (i.e., Model 2, 3, and 4) or almost same performance wh negligible difference in residuals (i.e., Model 1, and 5). 3) Model 5 produced very different results in cumulative residuals compared to those produced by the other four models because of the abnormally high volume for the minor approach at the previously identified intersection. This intersection should be investigated further to ensure is not an outlier and removed, if necessary. However, is of interest to note that Model 5 was originally selected as the best functional form in terms of the G 2 -statistics and the second best functional form according to the AIC and BIC output (refer to Table 4). The fact that the statistical tests in Equations (5) to (8) are frequently used to determine and justify the best model performance by transportation safety analysts whout further looking into raw data set using a model diagnosis tool (e.g., CURE plot) is a cause for concern. DISPERSION PARAMETERS AMONG DIFFERENT CRASH PREDICTION MODELS Table 6 and Figure 6 summarize the estimated dispersion parameters obtained from different models. A few notable features include: 1) In general, tradional NB models underestimate the inverse dispersion parameters compared to GNB models, wh the exception of Model 3. In terms of the varying dispersion parameters, comparing the minimum/maximum values as well as the average values, Model 5 again shows the largest discrepancy amongst all the models evaluated (as seen in Table 6). This may be caused by the observation wh an abnormally large minor road AADT or may just be the peculiar characteristic of Model 5 since a similar discrepancy has been noted by 2 Miaou and Lord (23). Since Var{ Y } = μ + α μ, Model 5 estimates will produce a larger variance than that of the other models, implying a higher level of uncertainty associated wh this model. 
As a result, compared to the other four models, Model 5 puts more emphasis on the observed number of crashes than on the model estimates in obtaining the EB estimates.

2) Inasmuch as a traditional NB model produces a fixed (i.e., constant) dispersion parameter for each model, the weight factor is inversely related to the model estimates (i.e., $w \propto 1/\mu$; refer to Figure 7). The higher the model estimate, the smaller the weight factor, and vice versa.

3) GNB Models 2, 4, and 5 [refer to Figures 8-(b), 8-(d), and 8-(e), respectively] allow different weight factors for intersections with the same model estimate ($\hat{\mu}$). The tendency is stronger for intersections with higher model estimates. On the other hand, GNB Models 1 and 3 [refer to Figures 8-(a) and 8-(c)] show the same patterns as the corresponding NB models, and do not allow different dispersion parameters unless the intersections have different model estimates. For Models 1 and 3, traffic volumes entering from the major and minor roads (i.e., F1 and F2, respectively) are not treated as distinct traffic volumes, but rather as a single traffic volume unit (i.e., F1 + F2 or F1 × F2). Even with exactly the same traffic volume unit (e.g., F1 + F2 = 3,000/day), intersections could have different entering traffic volumes from the major and minor approaches (e.g., F1 = 1,500 and F2 = 1,500; F1 = 2,000 and F2 = 1,000; etc.). Since the same functional form is used to explain the dispersion and the mean values in all GNB models, GNB Models 1 and 3 should have the same value for the weight factor if the model estimates are the same. On the other hand, GNB Models 2, 4, and 5 could have different dispersion parameters and model estimates even for intersections with exactly the same aggregated traffic volumes. Even though the estimated dispersion parameters can differ between the traditional NB and GNB models, if one only uses the aggregated traffic volumes (i.e., F1 + F2 or F1 × F2) to explain the varying dispersion parameters, there is no practical merit in using the GNB model over the traditional NB model for this case. Figures 9-(a) and 9-(c) clearly show that GNB Models 1 and 3 produce virtually the same weight factors as those calculated from the traditional NB model. It should be noted that GNB models produce a slightly lower value for the weight factor (i.e., the coefficients in Figure 9 are less than 1) than traditional NB models. Hence, for the same estimated value, GNB models will give slightly more weight to the observed number of crashes than to the model estimates when calculating the EB estimates, compared to the traditional NB models.

Figure 10 shows which covariates are more heavily associated with the degree of dispersion among the different GNB models.
In this illustration, GNB Models 1 and 3 were disregarded since these models only contain a single aggregated traffic volume. GNB Models 2, 4, and 5 show that the major road AADT contributes more to the variation in the dispersion parameters than the minor road AADT. For a given major road AADT, a number of different dispersion parameters can be estimated. Since the weight factors are a function of the dispersion parameters, the variation in the weight factor in GNB Models 2, 4, and 5 is mainly caused by the heterogeneity associated with the major road covariate. Since the AADT has been used as a surrogate measure for explaining the possible structure of un-modeled heterogeneities, as documented in Miaou and Lord (2003) and Mitra and Washington (2007), the inclusion of other covariates describing characteristics of the major approaches may help reduce the heterogeneity observed in the models.

The final objective of this study was to investigate the impact of varying dispersion parameters on the identification of hotspots. Figure 11 illustrates the relationship between the hotspot identification lists ranked by the traditional NB models and by the GNB models. Smaller ranking values imply more hazardous intersections in terms of EB estimates. A preliminary examination of the figure seems to indicate a positive association between the NB ranking and the GNB ranking. This is supported by the large $r^2$ values and the results of the Spearman rank-order correlation test ($\rho_s$) as well as, to a lesser degree, Kendall's ranking test ($\tau$). Using some of these tests, El-Basyouny and Sayed (2006) reported that using GNB models did not influence the identification of hazardous sites. Looking more closely at the graphs, however, one can observe a greater variation in the ranking below the 45-degree line (note: more than 2/3 of all observations lie above the line), which means that, for the same observation, some sites ranked based on the GNB model tend to be ranked as more dangerous than those ranked based on the NB model; the sites below the line are usually associated with larger weight factors, which imply that the variation at time t is relatively small. Furthermore, a detailed analysis of the ranking, as seen in Table 7, shows that the rankings are actually poles apart. In fact, for some models, about a fifth of the observations had a 2+-position difference in ranking between the models with fixed and varying dispersion parameters. This table clearly shows that, despite the outcome of traditional ranking tests, the functional form of GNB models strongly influences the identification of hazardous sites. The results also show that these tests may not be appropriate for this kind of analysis. Given this characteristic, further work is needed on this topic.

SUMMARY AND CONCLUSIONS

In this paper, a number of important issues regarding the impact of the dispersion parameters on EB estimates were presented. Several key conclusions can be reported:

1) In most cases, GNB models provide better statistical properties in terms of fitted values than NB models with a fixed dispersion parameter, for both time-specific and aggregated modeling frameworks. For traffic-flow-only models, GNB models should be preferred over traditional NB models, since the variance function is most likely structured. The varying dispersion parameter can be used to characterize the structure of the variance.
2) Time-specific models with a varying dispersion parameter will have a significant impact on the EB estimate, as documented in Table 6 and Figure 6. In addition, the selection of the functional form will affect the value of the weight factor used for estimating the EB output. This means that for the same observation, different weight factors can be estimated, and they will depend on the functional form and the variance function of the model selected, as detailed in Figures 8 to 10.

3) Similar to point 2), the identification of hazardous sites using the EB method can be significantly affected when a GNB model is used. For the same observation, some sites ranked based on the GNB model tend to be ranked as more dangerous than those ranked based on the NB model. The weight factor seems to play a role in the observed differences in ranking. Even though the traditional Spearman and Kendall's ranking tests showed a positive correlation, a detailed analysis of the ranking showed opposite results, suggesting that these tests may not be appropriate for this kind of analysis.

4) Thus, automatically adopting a functional form from previous studies, especially for flow-only models, should be avoided, as should selecting models solely on the basis of a single GOF test statistic. Transportation safety analysts should evaluate different functional forms (when a varying dispersion parameter is employed) using a combination of GOF test statistics, including CURE plots, since the selection of the functional form could affect the methods in which the EB output is used.

In conclusion, as pointed out by Miaou (2005) and to some degree in Miaou and Lord (2003), statistics is just one of many sciences. The "science part" of a statistical model is related to the mean function. What the transportation safety analyst needs most is a better understanding of the functional structure of the mean function. While this study focused on the structure of the variance function, especially the part related to the dispersion parameter (or its inverse), transportation safety analysts should never lose sight of the most important part of a statistical model (i.e., the structure of the mean function). In theory, any modification to the structure of the mean function (via the inclusion or exclusion of covariates in crash prediction models) will affect the structure of the variance function. Ideally, with any type of model, a good structure and the proper selection of covariates for the mean function would make the structure of the variance function vanish, or at least significantly reduce the magnitude of the variance (e.g., see Miaou and Song, 2005 and Mitra and Washington, 2007). However, since it is practically impossible to obtain a perfect mean function (see Xie et al., 2007), transportation safety analysts will continue to work with the variance function for the following three reasons: 1) looking for clues to improve the deficiency of the mean function, 2) reducing the bias in the mean function, and 3) hopefully providing more accurate statistical inferences for decision-making purposes. These are important issues that need to be addressed in future research projects.
Finally, although interesting results were found in this study, further research should be conducted using different datasets (urban intersections, rural and urban segments, etc.). This may reveal whether the issues raised in this paper are a common problem, an isolated problem, or specific to rural intersections or to this dataset.
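The role of the weight factor discussed above can be illustrated with a minimal sketch. It assumes the commonly used EB form $w = 1/(1 + \alpha\mu)$ and $EB = w\mu + (1 - w)y$, which is consistent with the variance function $\mathrm{Var}\{Y\} = \mu + \alpha\mu^{2}$ but is not spelled out in this section. The function names and all numerical values are hypothetical and are not taken from the paper's data.

```python
def eb_weight(mu, alpha):
    """Weight factor w = 1 / (1 + alpha * mu), assuming Var{Y} = mu + alpha * mu**2."""
    return 1.0 / (1.0 + alpha * mu)


def eb_estimate(mu, y, alpha):
    """EB estimate as a weighted average of the model estimate and the observed count."""
    w = eb_weight(mu, alpha)
    return w * mu + (1.0 - w) * y


def rank_descending(values):
    """Rank 1 goes to the largest value; ties keep their input order (sorted is stable)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, site in enumerate(order, start=1):
        ranks[site] = position
    return ranks


# Hypothetical data for three sites (illustrative values only).
mu_hat = [2.0, 2.0, 4.0]          # model estimates
y_obs = [5.0, 5.0, 1.0]           # observed crash counts
alpha_fixed = 0.5                 # NB: one dispersion parameter for every site
alpha_varying = [0.25, 1.0, 0.5]  # GNB: dispersion parameter differs by site

# Fixed dispersion: sites with the same mu always receive the same weight.
eb_nb = [eb_estimate(m, y, alpha_fixed) for m, y in zip(mu_hat, y_obs)]

# Varying dispersion: sites with the same mu can receive different weights.
eb_gnb = [eb_estimate(m, y, a) for m, y, a in zip(mu_hat, y_obs, alpha_varying)]

rank_nb = rank_descending(eb_nb)
rank_gnb = rank_descending(eb_gnb)
rank_shift = [abs(a - b) for a, b in zip(rank_nb, rank_gnb)]

print("EB (fixed dispersion):  ", eb_nb, "ranks:", rank_nb)
print("EB (varying dispersion):", eb_gnb, "ranks:", rank_gnb)
print("absolute rank shift:    ", rank_shift)
```

In this toy example the first two sites share the same model estimate and observed count, so a fixed dispersion parameter gives them identical EB estimates, whereas site-specific dispersion parameters separate them and change their order in the ranking. Counting sites by the size of the rank shift is one simple way to look beyond an overall rank correlation coefficient.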

REFERENCES
Abbas, K.A. Traffic safety assessment and development of predictive models for accidents on rural roads in Egypt. Accident Analysis & Prevention, Vol. 36, No. 2, 2004.
Cameron, A.C., and Trivedi, P.K. Regression Analysis of Count Data. Econometric Society Monograph No. 3, Cambridge University Press.
El-Basyouny, K., and Sayed, T. Comparison of Two Negative Binomial Regression Techniques in Developing Accident Prediction Models. Presented at the Transportation Research Board 85th Annual Meeting, 2006.
Fitzpatrick, K., Lord, D., and Park, B.-J. Accident Modification Factors for Medians on Freeways and Multilane Highways. Transportation Research Record, 2008, in press. (Presented at the 87th Annual Meeting of the Transportation Research Board.)
Geedipally, S.R., and Lord, D. Effects of the Varying Dispersion Parameter of Poisson-gamma Models on the Estimation of Confidence Intervals of Crash Prediction Models. Presented at the 87th Annual Meeting of the Transportation Board, Washington, D.C., 2007.
Hardin, J.J., and Hilbe, J.M. Generalized Linear Models and Extensions. Stata Press, College Station, Texas, 2001.
Hardin, J.J., and Hilbe, J.M. Generalized Estimating Equations. Chapman & Hall/CRC, Boca Raton, FL, 2003.
Harwood, D.W., Council, F.M., Hauer, E., Hughes, W.E., and Vogt, A. Prediction of the expected safety performance of rural two-lane highways. Federal Highway Administration, Final Report, FHWA-RD-99-27, 2000.
Hauer, E. Observational before-after studies in road safety. Pergamon Press, Elsevier Science Ltd., Oxford, England.
Hauer, E. Identification of Sites with Promise. Transportation Research Record 1542, 1996.
Hauer, E. Overdispersion in modelling accidents on road sections and in Empirical Bayes estimation. Accident Analysis & Prevention, Vol. 33, No. 6, 2001.
Hauer, E., and Bamfo, J. Two tools for finding what function links the dependent variable to the explanatory variables. In Proceedings of the ICTCT 1997 Conference, Lund, Sweden.
Heydecker, B.G., and Wu, J. Identification of sites for road accident remedial work by Bayesian statistical methods: An example of uncertain inference. Advances in Engineering Software, Vol. 32, 2001.
Hughes, W., Eccles, K., Harwood, D., Potts, I., and Hauer, E. Development of a Highway Safety Manual. Appendix C: Highway Safety Manual Prototype Chapter: Two-Lane Highways. NCHRP Web Document 62 (Project 17-18(4)). Washington, D.C., 2005. (Accessed October 2007.)
Karim, L.-B., and Sayed, T. Comparison of two negative binomial regression techniques in developing accident prediction models. Transportation Research Record 195, 2006.
Lord, D. Modeling motor vehicle crashes using Poisson-gamma models: Examining the effects of low sample mean values and small sample size on the estimation of the fixed dispersion parameter. Accident Analysis & Prevention, Vol. 38, No. 4, 2006.
Lord, D., and Bonneson, J.A. Development of Accident Modification Factors for Rural Frontage Road Segments in Texas. Zachry Department of Civil Engineering, Texas A&M University, College Station, TX, 2006.
Lord, D., Guikema, S.D., and Geedipally, S. Application of the Conway-Maxwell-Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes. Accident Analysis & Prevention, 2008, in press.
Lord, D., Manar, A., and Vizioli, A. Modeling crash-flow-density and crash-flow-v/c ratio relationships for rural and urban freeway segments. Accident Analysis & Prevention, Vol. 37, No. 1, 2005b.
Lord, D., and Persaud, B.N. Accident prediction models with and without trend: Application of the Generalized Estimating Equations (GEE) procedure. Transportation Research Record 1717, 2000.
Lord, D., and Persaud, B.N. Estimating the safety performance of urban transportation networks. Accident Analysis & Prevention, Vol. 36, No. 2, 2004.
Lord, D., Washington, S.P., and Ivan, J.N. Poisson, Poisson-Gamma and Zero Inflated Regression Models of Motor Vehicle Crashes: Balancing Statistical Fit and Theory. Accident Analysis & Prevention, Vol. 37, No. 1, 2005a.
Lyon, C., Haq, A., Persaud, B.N., and Kodama, S.T. Development of safety performance functions for signalized intersections in a large urban area and application to evaluation of left turn priority treatment. Presented at the 84th Annual Meeting of the Transportation Research Board, Washington, D.C., 2005.
Martin, J.-L. Relationship between crash rate and hourly traffic flow on interurban motorways. Accident Analysis & Prevention, Vol. 34, 2002.
Miaou, S.-P. The relationship between truck accidents and geometric design of road sections: Poisson versus negative binomial regressions. Accident Analysis & Prevention, Vol. 26, No. 4, 1994.
Miaou, S.-P. Measuring the goodness-of-fit of accident prediction models. Federal Highway Administration, Final Report, FHWA-RD-96-4.
Miaou, S.-P., and Lord, D. Modeling traffic crash-flow relationships for intersections: Dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transportation Research Record 184, 2003.
Miaou, S.-P., and Song, J.J. Bayesian ranking of sites for engineering safety improvements: Decision parameter, treatability concept, statistical criterion, and spatial dependence. Accident Analysis & Prevention, Vol. 37, No. 4, 2005.
Miaou, S.-P. Personal communication, April 22, 2005.
Miranda-Moreno, L.F., Fu, L., Saccomanno, F.F., and Labbe, A. Alternative risk models for ranking locations for safety improvement. Transportation Research Record 198, 2005, pp. 1-8.
Mitra, S., and Washington, S. On the nature of over-dispersion in motor vehicle crash prediction models. Accident Analysis & Prevention, Vol. 39, No. 3, 2007.
Mountain, L., and Fawaz, B. Estimating accidents at junctions using routinely-available input data. Traffic Engineering and Control, Vol. 11, 1996.
Mountain, L., Maher, M.J., and Fawaz, B. The influence of trend on estimates of accidents at junctions. Accident Analysis & Prevention, Vol. 3, No. 5, 1998.
Myers, R.H. Classical and Modern Regression with Applications, 2nd ed. Duxbury Press, Pacific Grove, CA, 2000.
Nicholson, A., and Turner, S. Estimating accidents in a road network. In Proceedings of Roads 96 Conference, Part 5, New Zealand, 1996.
Park, B.-J., and Lord, D. Adjustment for the Maximum Likelihood Estimate of the Negative Binomial Dispersion Parameter. Transportation Research Record, 2008, in press. (Presented at the 87th Annual Meeting of the Transportation Research Board.)
Park, E.S., and Lord, D. Multivariate Poisson-Lognormal Models for Jointly Modeling Crash Frequency by Severity. Transportation Research Record 219, 2007.
Persaud, B.N. Statistical methods in highway safety analysis: A Synthesis of Highway Practice. National Cooperative Highway Research Program Synthesis 295, TRB, National Research Council, National Academy Press, Washington, D.C., 2001.
Persaud, B.N., Retting, R.A., Garder, P.E., and Lord, D. Safety effect of roundabout conversions in the United States: Empirical Bayes observational before-after study. Transportation Research Record 1751, 2001.
Poch, M., and Mannering, F.L. Negative binomial analysis of intersection-accident frequencies. Journal of Transportation Engineering, Vol. 122, No. 2, 1996.
Powers, M., and Carson, J. Before-after crash analysis: A primer for using the empirical Bayes method. Tutorial. U.S. Department of Transportation, Final Report, FHWA/MT, 2004.
Saccomanno, F.F., Grossi, R., Greco, D., and Mehmood, A. Identifying black spots along Highway SS17 in southern Italy using two models. Journal of Transportation Engineering, Vol. 127, No. 6, 2001.
Stata. Reference Manual, Release 8. Stata Press, 2003.
Turner, S., and Nicholson, A. Intersection accident estimation: The role of intersection location and non-collision flows. Accident Analysis & Prevention, Vol. 3, No. 4, 1998.
Vogt, A. Crash models for rural intersections: Four-lane by two-lane stop-controlled and two-lane by two-lane signalized. Federal Highway Administration, Final Report, FHWA-RD.
Washington, S., Persaud, B., Lyon, C., and Oh, J. Validation of Accident Models for Intersections. Federal Highway Administration, Final Report, FHWA-RD-3-37, 2005.
Wang, X., and Abdel-Aty, M.A. Investigation of Signalized Intersection Right-Angle Crash Occurrence at Intersection, Roadway, and Approach Levels. Paper presented at the 84th Annual Meeting of the TRB, Washington, D.C., 2007.
Xie, Y., Lord, D., and Zhang, Y. Predicting Motor Vehicle Collisions using Bayesian Neural Networks: An Empirical Analysis. Accident Analysis & Prevention, Vol. 39, No. 5, 2007.


More information

Safety Effectiveness of Variable Speed Limit System in Adverse Weather Conditions on Challenging Roadway Geometry

Safety Effectiveness of Variable Speed Limit System in Adverse Weather Conditions on Challenging Roadway Geometry Safety Effectiveness of Variable Speed Limit System in Adverse Weather Conditions on Challenging Roadway Geometry Promothes Saha, Mohamed M. Ahmed, and Rhonda Kae Young This paper examined the interaction

More information

Hot Spot Analysis: Improving a Local Indicator of Spatial Association for Application in Traffic Safety

Hot Spot Analysis: Improving a Local Indicator of Spatial Association for Application in Traffic Safety Hot Spot Analysis: Improving a Local Indicator of Spatial Association for Application in Traffic Safety Elke Moons, Tom Brijs and Geert Wets Transportation Research Institute, Hasselt University, Science

More information

Lecture-19: Modeling Count Data II

Lecture-19: Modeling Count Data II Lecture-19: Modeling Count Data II 1 In Today s Class Recap of Count data models Truncated count data models Zero-inflated models Panel count data models R-implementation 2 Count Data In many a phenomena

More information

Bayesian multiple testing procedures for hotspot identification

Bayesian multiple testing procedures for hotspot identification Accident Analysis and Prevention 39 (2007) 1192 1201 Bayesian multiple testing procedures for hotspot identification Luis F. Miranda-Moreno a,b,, Aurélie Labbe c,1, Liping Fu d,2 a Centre for Data and

More information

Modeling Simple and Combination Effects of Road Geometry and Cross Section Variables on Traffic Accidents

Modeling Simple and Combination Effects of Road Geometry and Cross Section Variables on Traffic Accidents Modeling Simple and Combination Effects of Road Geometry and Cross Section Variables on Traffic Accidents Terrance M. RENGARASU MS., Doctoral Degree candidate Graduate School of Engineering, Hokkaido University

More information

Planning Level Regression Models for Crash Prediction on Interchange and Non-Interchange Segments of Urban Freeways

Planning Level Regression Models for Crash Prediction on Interchange and Non-Interchange Segments of Urban Freeways Planning Level Regression Models for Crash Prediction on Interchange and Non-Interchange Segments of Urban Freeways Arun Chatterjee, Professor Department of Civil and Environmental Engineering The University

More information

ISSUES RELATED TO THE APPLICATION OF ACCIDENT PREDICTION MODELS FOR THE COMPUTATION OF ACCIDENT RISK ON TRANSPORTATION NETWORKS.

ISSUES RELATED TO THE APPLICATION OF ACCIDENT PREDICTION MODELS FOR THE COMPUTATION OF ACCIDENT RISK ON TRANSPORTATION NETWORKS. ISSUES RELATED TO THE APPLICATION OF ACCIDENT PREDICTION MODELS FOR THE COMPUTATION OF ACCIDENT RISK ON TRANSPORTATION NETWORKS By Dominique Lord March 25 th 2001 Center for Transportation Safety* Texas

More information

Safety Performance Functions for Partial Cloverleaf On-Ramp Loops for Michigan

Safety Performance Functions for Partial Cloverleaf On-Ramp Loops for Michigan 1 1 1 1 1 1 1 1 0 1 0 1 0 Safety Performance Functions for Partial Cloverleaf On-Ramp Loops for Michigan Elisha Jackson Wankogere Department of Civil and Construction Engineering Western Michigan University

More information

EXAMINATION OF THE SAFETY IMPACTS OF VARYING FOG DENSITIES: A CASE STUDY OF I-77 IN VIRGINIA

EXAMINATION OF THE SAFETY IMPACTS OF VARYING FOG DENSITIES: A CASE STUDY OF I-77 IN VIRGINIA 0 0 0 EXAMINATION OF THE SAFETY IMPACTS OF VARYING FOG DENSITIES: A CASE STUDY OF I- IN VIRGINIA Katie McCann Graduate Research Assistant University of Virginia 0 Edgemont Road Charlottesville, VA 0 --

More information

A CASE STUDY IN HANDLING OVER-DISPERSION IN NEMATODE COUNT DATA SCOTT EDWIN DOUGLAS KREIDER. B.S., The College of William & Mary, 2008 A THESIS

A CASE STUDY IN HANDLING OVER-DISPERSION IN NEMATODE COUNT DATA SCOTT EDWIN DOUGLAS KREIDER. B.S., The College of William & Mary, 2008 A THESIS A CASE STUDY IN HANDLING OVER-DISPERSION IN NEMATODE COUNT DATA by SCOTT EDWIN DOUGLAS KREIDER B.S., The College of William & Mary, 2008 A THESIS submitted in partial fulfillment of the requirements for

More information

Bayesian Model Diagnostics and Checking

Bayesian Model Diagnostics and Checking Earvin Balderama Quantitative Ecology Lab Department of Forestry and Environmental Resources North Carolina State University April 12, 2013 1 / 34 Introduction MCMCMC 2 / 34 Introduction MCMCMC Steps in

More information

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error

More information

arxiv: v1 [cs.cv] 28 Nov 2017

arxiv: v1 [cs.cv] 28 Nov 2017 A fatal point concept and a low-sensitivity quantitative measure for traffic safety analytics arxiv:1711.10131v1 [cs.cv] 28 Nov 2017 Shan Suthaharan Department of Computer Science University of North Carolina

More information

PLANNING TRAFFIC SAFETY IN URBAN TRANSPORTATION NETWORKS: A SIMULATION-BASED EVALUATION PROCEDURE

PLANNING TRAFFIC SAFETY IN URBAN TRANSPORTATION NETWORKS: A SIMULATION-BASED EVALUATION PROCEDURE PLANNING TRAFFIC SAFETY IN URBAN TRANSPORTATION NETWORKS: A SIMULATION-BASED EVALUATION PROCEDURE Michele Ottomanelli and Domenico Sassanelli Polytechnic of Bari Dept. of Highways and Transportation EU

More information

Confidence and prediction intervals for. generalised linear accident models

Confidence and prediction intervals for. generalised linear accident models Confidence and prediction intervals for generalised linear accident models G.R. Wood September 8, 2004 Department of Statistics, Macquarie University, NSW 2109, Australia E-mail address: gwood@efs.mq.edu.au

More information

Comparison of spatial methods for measuring road accident hotspots : a case study of London

Comparison of spatial methods for measuring road accident hotspots : a case study of London Journal of Maps ISSN: (Print) 1744-5647 (Online) Journal homepage: http://www.tandfonline.com/loi/tjom20 Comparison of spatial methods for measuring road accident hotspots : a case study of London Tessa

More information

AN ARTIFICIAL NEURAL NETWORK MODEL FOR ROAD ACCIDENT PREDICTION: A CASE STUDY OF KHULNA METROPOLITAN CITY

AN ARTIFICIAL NEURAL NETWORK MODEL FOR ROAD ACCIDENT PREDICTION: A CASE STUDY OF KHULNA METROPOLITAN CITY Proceedings of the 4 th International Conference on Civil Engineering for Sustainable Development (ICCESD 2018), 9~11 February 2018, KUET, Khulna, Bangladesh (ISBN-978-984-34-3502-6) AN ARTIFICIAL NEURAL

More information

NCHRP. Web-Only Document 126: Methodology to Predict the Safety Performance of Rural Multilane Highways

NCHRP. Web-Only Document 126: Methodology to Predict the Safety Performance of Rural Multilane Highways NCHRP Web-Only Document 126: Methodology to Predict the Safety Performance of Rural Multilane Highways Dominique Lord Srinivas R. Geedipally Texas Transportation Institute & Texas A&M University College

More information

EFFECT OF HIGHWAY GEOMETRICS ON ACCIDENT MODELING

EFFECT OF HIGHWAY GEOMETRICS ON ACCIDENT MODELING Sustainable Solutions in Structural Engineering and Construction Edited by Saha, S., Lloyd, N., Yazdani, S., and Singh, A. Copyright 2015 ISEC Press ISBN: 978-0-9960437-1-7 EFFECT OF HIGHWAY GEOMETRICS

More information

ANALYSIS OF INTRINSIC FACTORS CONTRIBUTING TO URBAN ROAD CRASHES

ANALYSIS OF INTRINSIC FACTORS CONTRIBUTING TO URBAN ROAD CRASHES S. Raicu, et al., Int. J. of Safety and Security Eng., Vol. 7, No. 1 (2017) 1 9 ANALYSIS OF INTRINSIC FACTORS CONTRIBUTING TO URBAN ROAD CRASHES S. RAICU, D. COSTESCU & S. BURCIU Politehnica University

More information

GeoTAIS: An Application of Spatial Analysis for Traffic Safety Improvements on Provincial Highways in Saskatchewan

GeoTAIS: An Application of Spatial Analysis for Traffic Safety Improvements on Provincial Highways in Saskatchewan GeoTAIS: An Application of Spatial Analysis for Traffic Safety Improvements on Provincial Highways in Saskatchewan By: Brandt Denham, B.Sc. Head, GeoTAIS Project and Data Analysis Traffic Safety Program

More information

DAYLIGHT, TWILIGHT, AND NIGHT VARIATION IN ROAD ENVIRONMENT-RELATED FREEWAY TRAFFIC CRASHES IN KOREA

DAYLIGHT, TWILIGHT, AND NIGHT VARIATION IN ROAD ENVIRONMENT-RELATED FREEWAY TRAFFIC CRASHES IN KOREA DAYLIGHT, TWILIGHT, AND NIGHT VARIATION IN ROAD ENVIRONMENT-RELATED FREEWAY TRAFFIC CRASHES IN KOREA Sungmin Hong, Ph.D. Korea Transportation Safety Authority 17, Hyeoksin 6-ro, Gimcheon-si, Gyeongsangbuk-do,

More information

Texas A&M University

Texas A&M University Texas A&M University CVEN 658 Civil Engineering Applications of GIS Hotspot Analysis of Highway Accident Spatial Pattern Based on Network Spatial Weights Instructor: Dr. Francisco Olivera Author: Zachry

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Use of Crash Report Data for Safety Engineering in Small- and Mediumsized

Use of Crash Report Data for Safety Engineering in Small- and Mediumsized Use of Crash Report Data for Safety Engineering in Small- and Mediumsized MPOs 2015 AMPO Annual Conference Sina Kahrobaei, Transportation Planner Doray Hill, Jr., Director October 21, 2015 San Angelo MPO,

More information

Prediction of Bike Rental using Model Reuse Strategy

Prediction of Bike Rental using Model Reuse Strategy Prediction of Bike Rental using Model Reuse Strategy Arun Bala Subramaniyan and Rong Pan School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, USA. {bsarun, rong.pan}@asu.edu

More information

DEVELOPING DECISION SUPPORT TOOLS FOR THE IMPLEMENTATION OF BICYCLE AND PEDESTRIAN SAFETY STRATEGIES

DEVELOPING DECISION SUPPORT TOOLS FOR THE IMPLEMENTATION OF BICYCLE AND PEDESTRIAN SAFETY STRATEGIES DEVELOPING DECISION SUPPORT TOOLS FOR THE IMPLEMENTATION OF BICYCLE AND PEDESTRIAN SAFETY STRATEGIES Deo Chimba, PhD., P.E., PTOE Associate Professor Civil Engineering Department Tennessee State University

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Traffic Surveillance from a Safety Perspective: An ITS Data Application

Traffic Surveillance from a Safety Perspective: An ITS Data Application Proceedings of the 8th International IEEE Conference on Intelligent Transportation Systems Vienna, Austria, September 13-16, 2005 WB4.2 Traffic Surveillance from a Safety Perspective: An ITS Data Application

More information

Phd Program in Transportation. Transport Demand Modeling. Session 8

Phd Program in Transportation. Transport Demand Modeling. Session 8 Phd Program in Transportation Transport Demand Modeling Luis Martínez (based on the Lessons of Anabela Ribeiro TDM2010) Session 8 Generalized Linear Models Phd in Transportation / Transport Demand Modelling

More information

Risk Assessment of Highway Bridges: A Reliability-based Approach

Risk Assessment of Highway Bridges: A Reliability-based Approach Risk Assessment of Highway Bridges: A Reliability-based Approach by Reynaldo M. Jr., PhD Indiana University-Purdue University Fort Wayne pablor@ipfw.edu Abstract: Many countries are currently experiencing

More information

Varieties of Count Data

Varieties of Count Data CHAPTER 1 Varieties of Count Data SOME POINTS OF DISCUSSION What are counts? What are count data? What is a linear statistical model? What is the relationship between a probability distribution function

More information

Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in the Niger Delta

Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in the Niger Delta International Journal of Science and Engineering Investigations vol. 7, issue 77, June 2018 ISSN: 2251-8843 Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in

More information

Macro-level Pedestrian and Bicycle Crash Analysis: Incorporating Spatial Spillover Effects in Dual State Count Models

Macro-level Pedestrian and Bicycle Crash Analysis: Incorporating Spatial Spillover Effects in Dual State Count Models Macro-level Pedestrian and Bicycle Crash Analysis: Incorporating Spatial Spillover Effects in Dual State Count Models Qing Cai Jaeyoung Lee* Naveen Eluru Mohamed Abdel-Aty Department of Civil, Environment

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Characterizing the Performance of the Conway-Maxwell Poisson Generalized Linear Model

Characterizing the Performance of the Conway-Maxwell Poisson Generalized Linear Model Characterizing the Performance of the Conway-Maxwell Poisson Generalized Linear Model Royce A. Francis 1,2, Srinivas Reddy Geedipally 3, Seth D. Guikema 2, Soma Sekhar Dhavala 5, Dominique Lord 4, Sarah

More information

Unobserved Heterogeneity and the Statistical Analysis of Highway Accident Data. Fred Mannering University of South Florida

Unobserved Heterogeneity and the Statistical Analysis of Highway Accident Data. Fred Mannering University of South Florida Unobserved Heterogeneity and the Statistical Analysis of Highway Accident Data Fred Mannering University of South Florida Highway Accidents Cost the lives of 1.25 million people per year Leading cause

More information

Rate-Quality Control Method of Identifying Hazardous Road Locations

Rate-Quality Control Method of Identifying Hazardous Road Locations 44 TRANSPORTATION RESEARCH RECORD 1542 Rate-Quality Control Method of Identifying Hazardous Road Locations ROBERT W. STOKES AND MADANIYO I. MUTABAZI A brief historical perspective on the development of

More information

Linear Regression Models

Linear Regression Models Linear Regression Models Model Description and Model Parameters Modelling is a central theme in these notes. The idea is to develop and continuously improve a library of predictive models for hazards,

More information

EVALUATION OF SAFETY PERFORMANCES ON FREEWAY DIVERGE AREA AND FREEWAY EXIT RAMPS. Transportation Seminar February 16 th, 2009

EVALUATION OF SAFETY PERFORMANCES ON FREEWAY DIVERGE AREA AND FREEWAY EXIT RAMPS. Transportation Seminar February 16 th, 2009 EVALUATION OF SAFETY PERFORMANCES ON FREEWAY DIVERGE AREA AND FREEWAY EXIT RAMPS Transportation Seminar February 16 th, 2009 By: Hongyun Chen Graduate Research Assistant 1 Outline Introduction Problem

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) Authored by: Sarah Burke, PhD Version 1: 31 July 2017 Version 1.1: 24 October 2017 The goal of the STAT T&E COE

More information

STATISTICAL ANALYSIS OF LAW ENFORCEMENT SURVEILLANCE IMPACT ON SAMPLE CONSTRUCTION ZONES IN MISSISSIPPI (Part 1: DESCRIPTIVE)

STATISTICAL ANALYSIS OF LAW ENFORCEMENT SURVEILLANCE IMPACT ON SAMPLE CONSTRUCTION ZONES IN MISSISSIPPI (Part 1: DESCRIPTIVE) STATISTICAL ANALYSIS OF LAW ENFORCEMENT SURVEILLANCE IMPACT ON SAMPLE CONSTRUCTION ZONES IN MISSISSIPPI (Part 1: DESCRIPTIVE) Tulio Sulbaran, Ph.D 1, David Marchman 2 Abstract It is estimated that every

More information

IDAHO TRANSPORTATION DEPARTMENT

IDAHO TRANSPORTATION DEPARTMENT RESEARCH REPORT IDAHO TRANSPORTATION DEPARTMENT RP 191A Potential Crash Reduction Benefits of Safety Improvement Projects Part A: Shoulder Rumble Strips By Ahmed Abdel-Rahim Mubassira Khan University of

More information

Effect of Environmental Factors on Free-Flow Speed

Effect of Environmental Factors on Free-Flow Speed Effect of Environmental Factors on Free-Flow Speed MICHAEL KYTE ZAHER KHATIB University of Idaho, USA PATRICK SHANNON Boise State University, USA FRED KITCHENER Meyer Mohaddes Associates, USA ABSTRACT

More information

Specification testing in panel data models estimated by fixed effects with instrumental variables

Specification testing in panel data models estimated by fixed effects with instrumental variables Specification testing in panel data models estimated by fixed effects wh instrumental variables Carrie Falls Department of Economics Michigan State Universy Abstract I show that a handful of the regressions

More information

9 Correlation and Regression

9 Correlation and Regression 9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

Safety Effects of Icy-Curve Warning Systems

Safety Effects of Icy-Curve Warning Systems Safety Effects of Icy-Curve Warning Systems Zhirui Ye, David Veneziano, and Ian Turnbull The California Department of Transportation (Caltrans) deployed an icy-curve warning system (ICWS) on a 5-mi section

More information

Parameters Estimation Methods for the Negative Binomial-Crack Distribution and Its Application

Parameters Estimation Methods for the Negative Binomial-Crack Distribution and Its Application Original Parameters Estimation Methods for the Negative Binomial-Crack Distribution and Its Application Pornpop Saengthong 1*, Winai Bodhisuwan 2 Received: 29 March 2013 Accepted: 15 May 2013 Abstract

More information

Model comparison. Patrick Breheny. March 28. Introduction Measures of predictive power Model selection

Model comparison. Patrick Breheny. March 28. Introduction Measures of predictive power Model selection Model comparison Patrick Breheny March 28 Patrick Breheny BST 760: Advanced Regression 1/25 Wells in Bangladesh In this lecture and the next, we will consider a data set involving modeling the decisions

More information

GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March Presented by: Tanya D. Havlicek, ACAS, MAAA ANTITRUST Notice The Casualty Actuarial Society is committed

More information

Implication of GIS Technology in Accident Research in Bangladesh

Implication of GIS Technology in Accident Research in Bangladesh Journal of Bangladesh Institute of Planners ISSN 2075-9363 Vol. 8, 2015 (Printed in December 2016), pp. 159-166, Bangladesh Institute of Planners Implication of GIS Technology in Accident Research in Bangladesh

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

$QDO\]LQJ$UWHULDO6WUHHWVLQ1HDU&DSDFLW\ RU2YHUIORZ&RQGLWLRQV

$QDO\]LQJ$UWHULDO6WUHHWVLQ1HDU&DSDFLW\ RU2YHUIORZ&RQGLWLRQV Paper No. 001636 $QDO\]LQJ$UWHULDO6WUHHWVLQ1HDU&DSDFLW\ RU2YHUIORZ&RQGLWLRQV Duplication for publication or sale is strictly prohibited without prior written permission of the Transportation Research Board

More information

EVALUATION OF HOTSPOTS IDENTIFICATION USING KERNEL DENSITY ESTIMATION (K) AND GETIS-ORD (G i *) ON I-630

EVALUATION OF HOTSPOTS IDENTIFICATION USING KERNEL DENSITY ESTIMATION (K) AND GETIS-ORD (G i *) ON I-630 EVALUATION OF HOTSPOTS IDENTIFICATION USING KERNEL DENSITY ESTIMATION (K) AND GETIS-ORD (G i *) ON I-630 Uday R. R. Manepalli Graduate Student, Civil, Architectural and Environmental Engineering, Missouri

More information

Geospatial Big Data Analytics for Road Network Safety Management

Geospatial Big Data Analytics for Road Network Safety Management Proceedings of the 2018 World Transport Convention Beijing, China, June 18-21, 2018 Geospatial Big Data Analytics for Road Network Safety Management ABSTRACT Wei Liu GHD Level 1, 103 Tristram Street, Hamilton,

More information

Modeling Crash Frequency of Heavy Vehicles in Rural Freeways

Modeling Crash Frequency of Heavy Vehicles in Rural Freeways Journal of Traffic and Logistics Engineering Vol. 4, No. 2, December 2016 Modeling Crash Frequency of Heavy Vehicles in Rural Freeways Reza Imaninasab School of Civil Engineering, Iran University of Science

More information

How to Detect and Remove Temporal. Autocorrelation in Vehicular Crash Data.

How to Detect and Remove Temporal. Autocorrelation in Vehicular Crash Data. Journal of Transportation Technologies, 2017, 7, 133-147 http://www.scirp.org/journal/jtts ISSN Online: 2160-0481 ISSN Print: 2160-0473 How to Detect and Remove Temporal Autocorrelation in Vehicular Crash

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice

The Model Building Process Part I: Checking Model Assumptions Best Practice The Model Building Process Part I: Checking Model Assumptions Best Practice Authored by: Sarah Burke, PhD 31 July 2017 The goal of the STAT T&E COE is to assist in developing rigorous, defensible test

More information

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation

More information

MODELING ACCIDENT FREQUENCIES AS ZERO-ALTERED PROBABILITY PROCESSES: AN EMPIRICAL INQUIRY

MODELING ACCIDENT FREQUENCIES AS ZERO-ALTERED PROBABILITY PROCESSES: AN EMPIRICAL INQUIRY Pergamon PII: SOOOl-4575(97)00052-3 Accid. Anal. and Prev., Vol. 29, No. 6, pp. 829-837, 1997 0 1997 Elsevier Science Ltd All rights reserved. Printed in Great Britain OOOI-4575/97 $17.00 + 0.00 MODELING

More information