Traffic Volume Time-Series Analysis According to the Type of Road Use

Computer-Aided Civil and Infrastructure Engineering 15 (2000) 365-373

Pawan Lingras*
Department of Mathematics and Computing Science, Saint Mary's University, Halifax, Nova Scotia, B3H 3C3, Canada

&

Satish C. Sharma, Phil Osborne & Iftekhar Kalyar
Faculty of Engineering, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada

Abstract: Problems related to highway traffic operation and congestion management can be alleviated with the use of modern intelligent transportation systems (ITSs). Advanced Traveler Information Systems (ATIS) are among the emerging technologies that will help travelers plan the routes and schedules of their trips so as to redistribute traffic over the highway network. Such redistribution will try to maximize the use of available highway capacity. Collection of real-time data and short-term prediction of traffic volumes are among the critical needs of an ATIS. This article studies the characteristics of different traffic volume time series. In particular, time-series analysis is applied to the prediction of daily traffic volumes. The daily traffic volume is estimated by using the previous 13 daily traffic volumes. The study involves a comparison of statistical and neural network techniques for time-series analysis. The analysis is applied to different types of road groups according to the trip purpose and trip length distribution. It is hoped that this study will provide a better understanding of the various issues involved in the short-term prediction of traffic volumes on different types of highways.

* To whom correspondence should be addressed. E-mail: Pawan.Lingras@stmarys.ca.

1 INTRODUCTION

Different types of highway sections exhibit different traffic flow characteristics. Urban freeways and arterial highways in metropolitan areas carry a large volume of traffic on a daily basis.
In jurisdictions across North America, many regional and rural highways also experience very high volumes. Even though some recreational highway sections carry relatively low average daily traffic, they can experience severe congestion during peak recreational seasons. The aim of advanced traffic management and information systems is to provide improved and coordinated traffic control, incident management, and vehicle routing within the network. Short-term predictions of traffic volumes on a highway or road section can help highway agencies alleviate congestion through appropriate warnings and control of highway access. The objectives of traffic volume time-series analysis may vary depending on the type of highway. Nonetheless, time-series analysis will play an important role in future traffic management systems on all types of highway sections. This article provides a comprehensive comparison of traffic volume time series for five different road groups. The analysis includes visual comparisons of the time series as well as comparisons of errors from short-term traffic volume predictions.

© 2000 Computer-Aided Civil and Infrastructure Engineering. Published by Blackwell Publishers, 350 Main Street, Malden, MA 02148, USA, and 108 Cowley Road, Oxford OX4 1JF, UK.

In past studies, different methodologies and techniques have been used to forecast traffic volumes. A number of forecasting methods are related to time-series analysis, in which the prediction of the future is based on past values of variables. The time-series models identify the pattern in the past data and extrapolate that pattern into the future. Researchers are taking significant interest in the short-term prediction of traffic volumes [1,7,8]. Smith and Demetsky [7] developed and tested such forecasting models as historical average, time-series, neural network, and nonparametric regression models for the purpose of forecasting traffic flow 15 minutes into the future. Ulbricht [8] developed multirecurrent neural network models to forecast the number of cars passing a highway checkpoint between 7 and 8 a.m. and compared the results with various statistical methods. In this study, autoregression and time-delay neural networks (TDNNs) are used to predict the next day's traffic volume based on the previous 13 daily traffic volumes. The objective is to compare the statistical and neural network approaches for a variety of highway sections.

Highway agencies use permanent traffic counters (PTCs) to record traffic volumes at various highway sections. These PTCs record traffic volumes continuously throughout the year. The data obtained from these counters aid in the calculation of other parameters such as annual average daily traffic (AADT), average daily traffic, and design hourly volume, which are used for planning, design, and management of highway networks. PTC sites from the province of Alberta were used in this study. The hierarchical grouping technique and Scheffé's S-method [6] were used to classify different road groups. The classification yielded five different categories of highways: highly recreational, regional recreational, long distance, urban commuter, and regional commuter.
A visual inspection of the time series for these highways showed that the highly recreational traffic flow patterns were less regular than those for the commuter highways. It was expected that the recreational traffic volume time-series analysis would result in higher prediction errors than the commuter traffic volume time-series analysis. Since the commuter highway categories contained more highways, the analysis was carried out for multiple highway sections from the commuter categories. The use of multiple highway sections from the same group also made it possible to verify the consistency of results across a given highway category.

Linear autoregression and TDNNs were used to predict traffic volumes for PTC sites in different groups. Linear autoregression analysis finds the linear model that minimizes the sum of squares of errors between actual and predicted values in the training set. In general, the mathematical form of a neural network model is nonlinear. The neural network attempts to find an appropriate function such that the sum of squares of differences between actual and predicted values of traffic volume is minimized. The search space for the neural network training process is larger than that of the linear autoregression approach. However, the regression approach finds the optimal model in its search space, whereas neural networks are not guaranteed to find the optimal model in theirs. The objectives of this study are to compare the two methods, to examine differences between groups, and to check the similarity of results within groups. The models for PTC sites from each group were trained on 4 years of traffic data (1989-1992) and later tested on 1 year of data (1993).

Section 2 reviews the literature on the classification of roads and the hierarchical grouping method. The essential concepts of time-series analysis using autoregression and time-delay neural networks are described in Section 3.
The problem statement and details of the study data are followed by the description of the models in Section 4. Section 5 reports the results of the analysis. A summary and the conclusions of the study are provided in Section 6.

2 CLASSIFICATION OF ROADS

Highway sections with similar traffic characteristics can be grouped together to simplify traffic analysis. Every province or state has a different number of permanent traffic counter (PTC) sites, depending on the traffic volume or highway system in that province or state. These PTC sites are located throughout the provincial or state highway system so that continuous data on the traffic patterns and characteristics of classes of highways are collected [4]. The PTC sites are grouped together to establish various types of road classes. Roads are classified on the basis of trip purpose and trip length characteristics. Some examples of classes are commuter, business, long distance, and recreational. Such a classification simplifies the analysis because, instead of analyzing individual highway sections, it is possible to consider a smaller number of classes.

The hierarchical grouping method is used mostly in behavioral research. The purpose of this method is to compare a set of N objects (e.g., road sites), each measured on K different variables (e.g., 12 monthly traffic factors), and to group them in such a manner that the members of a group are similar in their values of the K variables. The procedure used in this article for grouping the road sites is adopted from an article by Sharma and Werner [6]. The road groups that resulted from the hierarchical grouping were further analyzed in terms of their ability to represent a more specific categorization of the highway system. The classes of roads are derived from discernible and consistent patterns of seasonal, daily, and hourly traffic flow. The temporal variation patterns are objectively and systematically related to different types of road use [5].
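The grouping idea described above can be illustrated with a small agglomerative procedure. The sketch below is not the procedure of Sharma and Werner; it assumes a Ward-style merge criterion (smallest increase in within-group sum of squares), and the site names, the factor values, and the use of 3 factors instead of 12 are all hypothetical, chosen only to keep the example short.

```python
from itertools import combinations


def centroid(members):
    """Mean vector of a list of equal-length vectors."""
    n = len(members)
    return [sum(col) / n for col in zip(*members)]


def merge_cost(a, b):
    """Increase in within-group sum of squares if groups a and b are merged
    (Ward-style criterion; an assumption, not taken from the article)."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def hierarchical_groups(sites, k):
    """Merge sites (name -> vector of K factors) until k groups remain."""
    groups = [[name] for name in sites]
    while len(groups) > k:
        # Pick the pair of groups whose merge is cheapest.
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda p: merge_cost(
                [sites[n] for n in groups[p[0]]],
                [sites[n] for n in groups[p[1]]],
            ),
        )
        groups[i] = groups[i] + groups[j]
        del groups[j]  # safe because i < j
    return [sorted(g) for g in groups]


# Hypothetical monthly traffic factors (3 months shown instead of 12).
sites = {
    "commuter_a": [0.95, 1.00, 1.05],
    "commuter_b": [0.96, 1.01, 1.03],
    "recreational_a": [0.40, 1.00, 2.10],
    "recreational_b": [0.45, 0.95, 2.00],
}
groups = hierarchical_groups(sites, 2)
```

With these made-up factors the two flat (commuter-like) profiles end up in one group and the two strongly seasonal profiles in the other.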

3 TIME-SERIES ANALYSIS

Human desire to predict the future and understand the past is well known. This desire drives the search for laws that explain the behavior of observed phenomena. Such phenomena could be currency exchange, sunspot activity, financial markets, or traffic flow. The underlying deterministic equations of such problems are not always known. In this case, the rules that govern the system evolution and the actual state of the system must be inferred from regularities in the past [9].

Time-series analysis may have three different goals: forecasting, modeling, and characterization. Forecasting means accurately predicting the short-term evolution of the system. The second goal, modeling, means finding a description that accurately captures features of the long-term behavior of the system. These two goals are not necessarily similar; the long-term behavior of a system may not be the most reliable way to determine parameters for good short-term predictions, and a model suited to short-term forecasting may have incorrect long-term behavior. The third goal, system characterization, attempts to determine the fundamental properties of the system. It can differ from the first two goals in that the complexity of a model useful for predicting may not be related to the actual complexity of the system. This study predicts short-term traffic volumes using autoregression analysis and time-delay neural networks.

Fig. 1. Time-delay neural network.

3.1 Time-delay neural networks

Neural networks are considered clever and intuitive because they learn by example rather than by following programmed rules. They are good at pattern recognition. They learn the trends from the data and develop the ability to categorize, imitate, and generalize [3]. Their other characteristics are adaptability, plasticity, self-organization, dynamic stability, convergence, fault tolerance, and normalization.
These concepts can be applied to almost all neural networks. Researchers have proposed different types of neural networks for solving a variety of problems [2]. This study used a neural network based on the time-delay neural network (TDNN) design. Figure 1 shows the details of a TDNN. Some of the important concepts in the TDNN shown in Figure 1 are as follows:

Delay node. This is a node that contains a copy of either a hidden, input, or delay node from a previous time period. Input to a delay node is not a summation of all the nodes of the previous layer. It receives one input, from one node, which is not modified.

Delay length. This applies to the number of copies of the layer present in the model. A delay length of 1 represents no delay nodes in that layer. In Figure 1, the input layer has a delay length of 2, the hidden layer has a delay length of 4, and the output layer has a delay length of 1.

Delay connection. Unlike regular connections, these do not have a weight associated with them. They simply transfer the value from one node to the next. In Figure 1, they are represented by broken lines.

Fig. 2. Time-delay neural network model for traffic analysis.

The network used in this study consists of three layers: input, hidden, and output. Figure 2 shows the network used in this study. The input layer receives data from the outside world. The input layer neurons send information to the hidden layer neurons. The hidden layer neurons are all the neurons between the input and output layers. They are part of the internal abstract pattern, which represents the neural network's solution to the problem. The hidden layer neurons feed their output to the output layer neurons, which provide the neural network's response to the input data. Neurons process input and produce output. Each neuron takes in the output from many other neurons. Actual output from a neuron is calculated using a transfer function.
A sigmoid transfer function is chosen because it produces a continuous value in the range [0, 1]. It is necessary to train a neural network model on a set of examples called the training set so that it adapts to the system it is trying

to simulate. Supervised learning is the most common form of adaptation. In supervised learning, the correct output for the output layer is known. Output neurons are told what the ideal response to input signals should be. In the training phase, the network constructs an internal representation that captures the regularities of the data in a distributed and generalized way. The network attempts to adjust the weights of connections between neurons to produce the desired output. The backpropagation method is used to adjust the weights: errors from the output are fed back through the network, altering weights as they go, to prevent the same error from happening again [3].

3.2 Regression analysis

Regression analysis is a popular statistical tool for developing models to study systems. It can be used to develop a mathematical model of the relationship between the dependent and independent variables of the system. Equation (1) shows the linear regression equation for a system with N independent variables (x_1, x_2, ..., x_N), which is also known as the least-squares method of finding a hyperplane of best fit. The symbol e represents an external input.

$$y = \sum_{i=1}^{N} a_i x_i + e \qquad (1)$$

Autoregression analysis provides a more suitable approach for time-series analysis. In this analysis, the independent variables are the previous values of the dependent variable. The previous N values of x are the independent variables, whereas the next value of x in the time series is the dependent variable. Equation (2) shows the autoregression equation.

$$x_t = \sum_{i=1}^{N} a_i x_{t-i} + e_t \qquad (2)$$

4 STUDY DATA AND THE MODEL

In this study, the data from years 1989 to 1992 were used to train the neural networks, and 1993 data were used for the testing process. The data consist of hourly traffic volumes for each day and were used to model daily traffic volume time series.
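As a concrete illustration of Equation (2), the coefficients a_i can be estimated by ordinary least squares on rows of lagged values. The following is a minimal sketch, assuming no intercept term (as in Equation (2)) and solving the normal equations directly; the function names and the synthetic series are hypothetical and are not taken from the study, which does not describe its fitting software.

```python
def lagged_rows(series, n_lags):
    """Build (X, y): each row of X holds the n_lags previous values,
    most recent first, and y holds the value that followed them."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append([series[t - i] for i in range(1, n_lags + 1)])
        y.append(series[t])
    return X, y


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def fit_ar(series, n_lags):
    """Least-squares estimate of a_1..a_N via the normal equations X'X a = X'y."""
    X, y = lagged_rows(series, n_lags)
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(n_lags)]
           for i in range(n_lags)]
    Xty = [sum(row[i] * target for row, target in zip(X, y))
           for i in range(n_lags)]
    return solve(XtX, Xty)


def predict_next(series, coeffs):
    """One-step-ahead prediction from the most recent len(coeffs) values."""
    return sum(a * series[-i] for i, a in enumerate(coeffs, start=1))


# Synthetic noise-free series with known coefficients (illustration only).
series = [1.0, 2.0]
for _ in range(30):
    series.append(0.5 * series[-1] + 0.3 * series[-2])
coeffs = fit_ar(series, 2)
```

On this noise-free series the fit recovers coefficients close to 0.5 and 0.3. For the setup described in this article, the call would be along the lines of `fit_ar(daily_volumes, 13)` on the 1989 to 1992 data, followed by `predict_next` on the 1993 data; the sketch does not guard against a singular system.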
4.1 Study data

The data for this study were obtained from the permanent traffic counter (PTC) sites located on the provincial highway system of Alberta. These PTCs record hourly traffic volume for the entire year. The PTC sites cover a wide range of traffic patterns and road types. Data for 1993 from a total of 78 PTC sites were used to determine different groups of road classifications. The PTC sites were grouped on the basis of their monthly traffic factors, which were obtained by dividing the average monthly traffic by the annual average daily traffic. The method of hierarchical grouping for areawide PTCs as suggested by Sharma and Werner [6] was used for this purpose. Further analysis was carried out on the five groups resulting from the hierarchical grouping method in terms of their ability to represent a more specific categorization of the provincial highway system. Table 1 shows the road groups obtained by the hierarchical grouping method. From now on, these five road groups will be referred to as group 1, group 2, group 3, group 4, and group 5.

After determining the road types, one PTC site each from groups 1, 2, and 3 and three PTC sites each from groups 4 and 5 were selected for autoregression and neural network analysis. Three PTCs were used for groups 4 and 5 because these groups contain more PTCs than the other groups. Selecting more than one PTC from the same group also makes it possible to test the consistency of the results within that group. Only PTC sites with 5 years of continuous traffic data were selected. The data span the years 1989 to 1993.

4.2 The models

For all the models in this study, the daily traffic volumes of the previous 13 days are used as the independent variables or input variables to predict the traffic volume for the following day. Thus the model uses 2 weeks of time series. The autoregression (AR) model and time-delay neural network (TDNN) model are based on the same design.
The AR model uses a 14-dimensional space. The previous 13 days' traffic volumes were the independent variables, and the traffic volume to be predicted was the dependent variable. The TDNN had 13 input nodes corresponding to the previous 13 daily traffic volumes and 1 output for the traffic volume to be predicted. As shown in Figure 2, the input layer consists of 1 input, which corresponds to the traffic volume of the previous day, and 12 delays of that input.

Table 1
List of PTCs and road groups

Group     Type of road            PTC      Location
Group 1   Highly recreational     016021   Jasper Park Gates
Group 2   Regional recreational   001061   1.5 km W 1 & 22 Cochrane
Group 3   Long distance           002081   7.6 km N 2 & Claresholm
Group 4   Urban commuter          004061   11 km E 4 & 5 Wilson
                                  015061   6.7 km W 15 & 45 Scotford
                                  044021   7 km N 18 & 44 Westlock
Group 5   Regional commuter       043221   0.8 km S 16 & 43 Carvel Cor
                                  016142   0.2 km E 16 & 16X Beach Cor
                                  16X401   2.1 km W 16X & 779 Stony Pl

Seven hidden layer nodes were used, based on the general rule of thumb

that the number of hidden nodes should be the average of the number of input and output nodes (here, (13 + 1)/2 = 7). There are no delays in the hidden layer of this model.

5 RESULTS

This section includes a visual analysis of the traffic patterns for the various road groups. The visual inspection makes it possible to establish certain hypotheses regarding the time-series analysis. The results of the time-series predictions are then analyzed based on these hypotheses.

5.1 Traffic patterns for different groups

Figures 3 to 7 show variations in daily traffic volumes over 4 years, from 1989 to 1992, for group 1 to group 5, respectively. The traffic volume is expressed as a ratio of AADT. This normalization ensures that the comparisons between road groups are based on the variation in traffic volumes as opposed to the absolute values of traffic volumes. Later on, for prediction purposes, output values were multiplied by a factor (less than 1) to ensure that the values were in the range [0, 1].

Figures 3 to 7 clearly show the variation of traffic volume from one group to another. Figures 3 and 4, representing the highly recreational and regional recreational groups, show high volume during the summer months. However, the regional recreational group shows significantly higher traffic volume during the late winter months, i.e., January to March. The PTC site for the regional recreational highway is located on the Trans-Canada Highway, between Calgary and Banff National Park. Ski trips are the main reason for the high volume in winter on this recreational road. Figures 5 and 7 show similar patterns. Figure 5 represents the long-distance group, and Figure 7 represents the regional commuter group. The regional commuter road has less seasonal variation in traffic volume than the long-distance road. Figure 6 represents the time-series pattern of the urban commuter road and shows more stable traffic volumes throughout the 4 years.
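The AADT normalization used for Figures 3 to 7, and the later rescaling of values into [0, 1], can be sketched as follows. This is an illustration under stated assumptions: AADT is taken here as the plain mean of the supplied daily volumes, and the scaling factor of 0.25 is a hypothetical choice, since the article states only that the factor is less than 1.

```python
def normalize_by_aadt(daily_volumes):
    """Express each daily volume as a ratio of AADT.
    Assumption: AADT is the mean of the supplied daily volumes."""
    aadt = sum(daily_volumes) / len(daily_volumes)
    return [v / aadt for v in daily_volumes], aadt


def scale_to_unit_interval(ratios, factor):
    """Multiply ratios by a factor below 1 so they fit in [0, 1],
    the output range of a sigmoid transfer function."""
    if not 0 < factor < 1:
        raise ValueError("factor must lie strictly between 0 and 1")
    scaled = [r * factor for r in ratios]
    if max(scaled) > 1:
        raise ValueError("factor too large: scaled values exceed 1")
    return scaled


# Hypothetical daily volumes for a very short "year".
ratios, aadt = normalize_by_aadt([500, 1000, 1500])
scaled = scale_to_unit_interval(ratios, 0.25)  # 0.25 is an assumed factor
```

Predictions made on the scaled values would be divided by the same factor, and multiplied by AADT, to recover volumes in vehicles per day.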
Figures 3 to 7 clearly indicate that every road group is different due to seasonal variation, trip purpose, and trip length distributions of road users. The time-series patterns of the other two PTCs in groups 4 and 5, which are included in this study, were compared with the time-series patterns of Figures 6 and 7, respectively. The patterns for PTCs from the same group were found to be similar. It is expected that the time-series predictions for roads with low variations in daily traffic volumes will be more accurate than those for roads with high variations. Based on this assumption, the five groups can be ranked as follows:

1. Urban commuter (daily traffic volume/AADT ratio in the range of 0.5 to 1.5)
2. Regional commuter (daily traffic volume/AADT ratio in the range of 0.5 to 2.0)
3. Long distance (daily traffic volume/AADT ratio in the range of 0.5 to 2.25)
4. Regional recreational (daily traffic volume/AADT ratio in the range of 0.5 to 2.8)
5. Highly recreational (daily traffic volume/AADT ratio in the range of 0.5 to 3.2)

The accuracy of predictions was hypothesized to be in the order shown above. For all groups, the traffic volume time series (Figures 3 to 7) clearly show an unusually low traffic volume on one day, January 31, 1989. The possibility of equipment failure on that day is rejected, because it is unlikely that all these PTCs failed on the same day. Bad weather may be the cause of such low volumes.

Fig. 3. Traffic volume time series for a highly recreational highway.

5.2 Time-series predictions

The neural network models were trained using 4 years of data. Once the models were trained, they were evaluated on 1993 data to assess the feasibility of the regression and neural network approaches.

5.2.1 Error measures used for comparison. Percentage error was calculated as:

$$\text{Error} = \frac{|\text{actual} - \text{predicted}|}{\text{actual}} \times 100 \qquad (3)$$

Average and maximum errors were used to measure the accuracy of predictions.
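The error measures used in this section can be sketched in a few lines. The sketch assumes that Equation (3) uses the absolute difference and that percentiles are taken by the nearest-rank rule; the article does not state which percentile convention was used, and the sample volumes are hypothetical.

```python
import math


def percentage_errors(actual, predicted):
    """Absolute percentage error per prediction, following Equation (3)."""
    return [abs(a - p) * 100 / a for a, p in zip(actual, predicted)]


def percentile(values, q):
    """q-th percentile by the nearest-rank rule (an assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(errors):
    """Average, maximum, and 50th/85th/95th percentile errors."""
    return {
        "average": sum(errors) / len(errors),
        "maximum": max(errors),
        "p50": percentile(errors, 50),
        "p85": percentile(errors, 85),
        "p95": percentile(errors, 95),
    }


# Hypothetical actual and predicted daily volumes.
actual = [100, 200, 400, 50]
predicted = [90, 220, 400, 100]
errors = percentage_errors(actual, predicted)
summary = summarize(errors)
```

In this toy sample a single bad prediction (100 against an actual volume of 50) sets the maximum error and pulls up the average, while the median stays at the level of the typical prediction.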
Special circumstances such as equipment failure, bad weather, and highway incidents may result in unusually high errors. A small number of very high errors will have a significant impact on the maximum error and to some extent on the average error. The cumulative frequency distributions of errors provide a different perspective on the accuracy of predictions. The study used 50th, 85th, and 95th percentile errors from the cumulative frequency distributions for further analysis. The 50th percentile error, i.e., the median error, was used as an alternative to the average error. Half the predictions have better

Fig. 4. Traffic volume time series for a regional recreational highway.
Fig. 5. Traffic volume time series for a long-distance highway.
Fig. 6. Traffic volume time series for a regional commuter highway.
Fig. 7. Traffic volume time series for an urban commuter highway.

accuracy than the 50th percentile error. The 95th percentile error provides a better measure of worst-case predictions than the maximum error by ignoring extreme anomalies. Nineteen of twenty predictions have errors lower than the 95th percentile error. The 85th percentile error value can be used as a measure of accuracy for a significant majority of predictions.

5.2.2 Error analysis for training sets. Table 2 shows the average and maximum errors for the training sets from the five road groups. Table 3 shows the 50th, 85th, and 95th percentile error values. The maximum errors for all the models are very high. Most of these maximum values came from the same day of traffic data, i.e., January 31, 1989. This abnormally low daily volume was discounted from our conclusions as noise. The earlier discussion indicates that the 95th percentile errors are probably better indicators of the worst-case errors, and they will be used for detailed analysis.

Differences in errors between groups can be seen easily in Tables 2 and 3. The average errors of groups 4 and 5 are lower than those of the other groups. One of the reasons is that these groups, which carry commuter traffic, have more stable traffic patterns throughout the year than groups 1 and 2. The difference in errors between groups shows a definite trend, with group 4 having the lowest errors and group 2 having the highest errors. Within a group, the errors for different PTC sites vary by approximately 1 percent, whereas the errors for recreational sites are twice as high as those for commuter sites. This observation indicates that our conclusions can be extended to an entire group instead of a single site. The average errors for urban commuter roads vary from 7 to 8 percent for regression analysis. The neural network errors are slightly smaller, in the range of 6 to 7 percent.
The urban commuter traffic pattern shown in Figure 6 is the most stable among all five groups. This is reflected in the lowest errors. The 50th percentile errors are slightly smaller than the average errors. However, there is a significant difference between the maximum errors (171 to 327 percent) and the 95th percentile errors (16 to 18 percent). As mentioned before, the high maximum errors are from the same time period and hence should be ignored as noise in favor of the 95th percentile errors. The 85th percentile errors of 10 to 11 percent for TDNN models indicate that errors for the majority of predictions will be under 10 percent. Since the traffic patterns for regional commuter roads are more unstable than those for urban commuter roads, the average, 50th, 85th, and 95th percentile errors are higher by approximately 1 to 2 percent. The rest of the conclusions are essentially similar to those for the urban commuter roads. Prediction errors for the long-distance highway are higher than for both the urban and regional commuter roads.
This difference can be explained by the high variation in the long-distance traffic pattern (Figure 5) compared with the traffic patterns for commuter highways (Figures 6 and 7). The difference between the average and 50th percentile errors is more pronounced. Similarly, the difference in accuracy between the TDNN and AR models is also larger than that for the commuter highways. The 95th percentile error for the TDNN model is 20 percent as opposed to 27 percent for the AR model. For the urban commuter roads, the 95th percentile errors for the TDNN model were about 4 percent lower than those for the AR model. This observation seems to suggest that as the variation in traffic volume time series increases, the TDNN models tend to outperform the AR models. This conclusion is further confirmed by the analysis of recreational roads.

Table 2 Average and maximum errors for road groups of training data

Group    Counter    Average error (%)    Maximum error (%)
                    AR       TDNN        AR        TDNN
Group 1  c016021    14.34    13.03       725.82    734.91
Group 2  c001061    16.66    12.99       226.39    174.72
Group 3  c002081     9.15     7.79       327.45    314.52
Group 4  c004061     6.87     6.36       223.68    198.15
         c015061     7.07     6.26       175.52    171.36
         c044021     7.68     6.68       331.49    326.98
Group 5  c043221     8.07     6.85       313.18    321.54
         c016142     7.24     6.60       327.33    342.55
         c16x401     8.06     7.01       241.17    252.94

Table 3 50th, 85th, and 95th percentile errors for road groups of training data

Group    Counter    50th percentile (%)    85th percentile (%)    95th percentile (%)
                    AR       TDNN          AR       TDNN          AR       TDNN
Group 1  c016021     8.95     8.86         24.88    21.38         40.92    33.29
Group 2  c001061    11.45     9.54         29.67    20.93         50.09    36.70
Group 3  c002081     6.19     5.76         16.09    12.74         27.66    20.31
Group 4  c004061     4.50     4.36         11.43    10.54         20.37    17.76
         c015061     4.73     4.31         12.20    11.05         20.69    16.44
         c044021     5.38     4.79         12.77    11.05         20.84    16.60
Group 5  c043221     5.58     5.05         14.07    11.75         22.65    17.36
         c016142     4.82     4.50         12.41    11.31         19.83    17.07
         c16x401     5.72     5.18         14.15    12.13         23.40    17.38
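
The error measures reported in the tables (average, maximum, and 50th, 85th, and 95th percentile errors) can be illustrated with a short sketch. It assumes that each error is an absolute percentage error relative to the observed volume and that percentiles are taken by the nearest-rank rule; neither choice is confirmed by this section, and the volumes below are hypothetical values chosen only to show how one extreme day inflates the maximum while leaving the 95th percentile untouched.

```python
import math


def absolute_percent_errors(actual, predicted):
    """Absolute percentage error of each prediction (assumed error definition)."""
    return [100.0 * abs(a - p) / a for a, p in zip(actual, predicted)]


def percentile(values, q):
    """Nearest-rank percentile (an assumed convention): the smallest value
    such that at least q percent of the values are at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(errors):
    """Collect the five summary measures used in the error tables."""
    return {
        "average": sum(errors) / len(errors),
        "maximum": max(errors),
        "p50": percentile(errors, 50),
        "p85": percentile(errors, 85),
        "p95": percentile(errors, 95),
    }


# Hypothetical data: 19 ordinary days with errors of 1 to 19 percent and
# one anomalous day with a 300 percent error.
actual = [1000.0] * 20
predicted = [1000.0 + 10.0 * k for k in range(1, 20)] + [4000.0]

summary = summarize(absolute_percent_errors(actual, predicted))
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the single anomalous day sets the maximum and pulls the average well above the median, while the 95th percentile is determined by the nineteen ordinary days, which is the behavior the discussion relies on when it prefers the 95th percentile over the maximum.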

The high variation in recreational roads from groups 1 and 2 resulted in higher prediction errors. The 50th percentile error of 11.45 percent for the regional recreational road was significantly higher than the 4 to 6 percent errors for the commuter roads. Surprisingly, the errors for the highly recreational roads were lower than for the regional recreational roads. One possible explanation is that even though highly recreational roads have high seasonal variations, the regional recreational roads tend to be more susceptible to short-term weather effects. The 95th percentile error of 50 percent using the AR model for the regional recreational road is substantially higher than the 37 percent obtained using the TDNN model. This observation confirms the superior performance of TDNN models for more unstable time series.

In summary, it is possible to rank the accuracy of predictions as follows:
1. Urban commuter (50th percentile errors: 4.5 to 5.4 percent)
2. Regional commuter (50th percentile errors: 4.8 to 5.7 percent)
3. Long distance (50th percentile error: 6.19 percent)
4. Highly recreational (50th percentile error: 8.95 percent)
5. Regional recreational (50th percentile error: 11.45 percent)

Note that the ranking based on actual errors is slightly different from the one hypothesized based on traffic volume variations. In particular, the predictions for the highly recreational road, despite higher variations in traffic volume, were more accurate than those for the regional recreational road.

5.2.3 Error analysis for test sets. Tables 4 and 5 provide the corresponding results for the test data of 1993. Average and maximum errors are listed in Table 4. Table 5 reports 50th, 85th, and 95th percentile errors. It may be noted that the maximum errors for the test set were significantly lower than those for the training set. This supports our earlier conclusion that the traffic data for January 31, 1989 can be ignored as an abnormality.
Again, neural networks predicted better than autoregression. The average, maximum, and percentile errors are in most cases slightly lower for neural networks (exceptions include the 50th percentile errors for groups 1 and 3). The errors are higher for recreational groups 1 and 2 compared with the long-distance and commuter groups 3, 4, and 5. These recreational roads tend to have more variation in traffic volumes than other road groups, which can be seen from Figures 3 to 7.

The errors for the urban commuter highways were the lowest among all the groups. These errors closely resembled the errors obtained for the training set. The 50th percentile errors for the regional commuter roads are higher than those for the urban commuter roads. However, the 85th and 95th percentile errors show slightly anomalous behavior. Some of the 85th and 95th percentile errors for regional commuter roads are smaller than those for the urban commuter roads.

Table 4 Average and maximum errors for road groups of test data

Group    Counter    Average error (%)    Maximum error (%)
                    AR       TDNN        AR        TDNN
Group 1  c016021    13.14    12.18       228.01    164.46
Group 2  c001061    15.62    13.53       108.48     86.34
Group 3  c002081     8.01     7.44        62.62     58.44
Group 4  c004061     6.39     6.29        53.31     54.77
         c015061     6.44     6.14        53.52     78.94
         c044021     6.85     6.35        65.50     63.62
Group 5  c043221     6.66     5.96        56.76     59.25
         c016142     6.39     5.87        85.92     79.81
         c16x401     6.76     6.13        44.47     41.64

Table 5 50th, 85th, and 95th percentile errors for road groups of test data

Group    Counter    50th percentile (%)    85th percentile (%)    95th percentile (%)
                    AR       TDNN          AR       TDNN          AR       TDNN
Group 1  c016021     8.64     9.00         23.99    20.58         39.98    38.08
Group 2  c001061    11.52     9.97         27.49    22.85         45.05    36.63
Group 3  c002081     5.55     5.65         14.55    13.08         25.13    20.94
Group 4  c004061     4.56     4.28         11.18    11.65         19.13    19.00
         c015061     4.63     4.21         11.16    10.65         18.71    18.85
         c044021     4.88     4.75         12.03    11.10         18.92    16.98
Group 5  c043221     5.05     4.60         11.83    10.77         18.40    15.33
         c016142     4.30     4.21         11.55    10.51         18.60    16.35
         c16x401     4.98     4.88         12.28    10.70         17.84    15.60
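
The comparison between the two models on the test set can be checked directly against the percentile errors of Table 5. The sketch below transcribes those values and lists every counter and percentile at which the TDNN error exceeds the AR error; the dictionary layout and the function name `tdnn_exceptions` are illustrative choices, not part of the study.

```python
# Test-set percentile errors transcribed from Table 5:
# counter -> {percentile: (AR error, TDNN error)}, all in percent.
TABLE_5 = {
    "c016021": {50: (8.64, 9.00), 85: (23.99, 20.58), 95: (39.98, 38.08)},
    "c001061": {50: (11.52, 9.97), 85: (27.49, 22.85), 95: (45.05, 36.63)},
    "c002081": {50: (5.55, 5.65), 85: (14.55, 13.08), 95: (25.13, 20.94)},
    "c004061": {50: (4.56, 4.28), 85: (11.18, 11.65), 95: (19.13, 19.00)},
    "c015061": {50: (4.63, 4.21), 85: (11.16, 10.65), 95: (18.71, 18.85)},
    "c044021": {50: (4.88, 4.75), 85: (12.03, 11.10), 95: (18.92, 16.98)},
    "c043221": {50: (5.05, 4.60), 85: (11.83, 10.77), 95: (18.40, 15.33)},
    "c016142": {50: (4.30, 4.21), 85: (11.55, 10.51), 95: (18.60, 16.35)},
    "c16x401": {50: (4.98, 4.88), 85: (12.28, 10.70), 95: (17.84, 15.60)},
}


def tdnn_exceptions(table):
    """Return (counter, percentile) pairs where the TDNN error exceeds the AR error."""
    return [
        (counter, pct)
        for counter, row in table.items()
        for pct, (ar, tdnn) in sorted(row.items())
        if tdnn > ar
    ]


exceptions = tdnn_exceptions(TABLE_5)
total = sum(len(row) for row in TABLE_5.values())
print(f"TDNN error exceeds AR error in {len(exceptions)} of {total} cases:")
for counter, pct in exceptions:
    print(f"  {counter}, {pct}th percentile")
```

Besides the 50th percentile errors for groups 1 and 3 (counters c016021 and c002081), the check also flags the 85th percentile error for c004061 and the 95th percentile error for c015061, both in group 4.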
For the long-distance roads, the errors in the test set are slightly lower than in the training set. However, the difference is not significant enough to draw any conclusions. As expected, these errors are higher than those for the commuter roads. Both recreational highways tested in this study have prediction errors similar to those for the training data. There are a few anomalies where errors using the AR model declined faster than those for the TDNN model. Further testing may shed more light on these anomalies.

Overall, the observations for the test sets are similar to those for the training sets. However, it should be noted that the TDNN models are designed for continuous adaptation to the changing time series. Therefore, the performance of the TDNN models in real-world implementations may be even better than that for the test sets used in this study.

6 SUMMARY AND CONCLUSIONS

Forecasting of hourly and daily traffic volume is important for short-term scheduling of highway facilities [8]. Intelligent transportation systems also require projected traffic conditions in the short term for vehicle guidance [1, 7]. Researchers have developed a variety of models for special purposes such as predicting traffic on urban arterial roads or at a border crossing. With the advent of new technologies, short-term traffic forecasting will be an important issue for all types of highways ranging from commuter to recreational. A comprehensive time-series analysis for different types of highways may provide useful insights and directions for further research.

In this study, time-series analysis was applied to different road groups to predict the next day's traffic volume by using the previous 13 daily traffic volumes. PTC sites with similar traffic characteristics were grouped together to simplify the analysis. Hierarchical grouping techniques were used to obtain different road groups. Autoregression analysis and time-delay neural networks were used to compare the results. Neural network models were better predictors than autoregression models. The average, maximum, 50th, 85th, and 95th percentile errors were lower with the neural network approach for all road groups as compared with autoregression.
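
The prediction setup summarized above, in which the previous 13 daily volumes are used to predict the next day's volume, amounts to a sliding window over the daily series. A minimal sketch of that windowing step follows; the function name `make_windows` and the sample volumes are hypothetical, and the sketch covers only how input and target pairs could be formed, not how the AR or TDNN models in the study were fitted.

```python
def make_windows(volumes, lags=13):
    """Split a daily volume series into (previous `lags` days, next day) pairs."""
    if len(volumes) <= lags:
        raise ValueError("need more observations than lags")
    return [
        (list(volumes[i:i + lags]), volumes[i + lags])
        for i in range(len(volumes) - lags)
    ]


# Hypothetical daily volumes for 20 consecutive days.
daily = list(range(100, 120))
windows = make_windows(daily)

inputs, target = windows[0]
print(len(windows), "input/target pairs")
print("first inputs:", inputs)
print("first target:", target)
```

Each pair could then be passed to either type of model, so that both are compared on exactly the same inputs.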
The prediction errors for predominantly recreational roads were higher than those for predominantly commuter and long-distance roads. This conclusion is supported by the fact that commuter and long-distance traffic patterns are relatively more stable than recreational traffic patterns. The errors for different PTCs within the same group were similar. This observation implies that the conclusions drawn in this study are applicable to the highway categories instead of specific highway sections used in the study.

ACKNOWLEDGMENTS

We would like to thank NSERC, Canada, for the financial support and Alberta Transportation for the data used in this study.

REFERENCES

1. Florio, L. & Mussone, L., Neural network models for classification and forecasting of freeway traffic flow stability, Transportation System: Theory and Application of Advanced Technology, 2 (1995), 773-84.
2. Hecht-Nielsen, R., Neurocomputing, Addison-Wesley, Don Mills, Ontario, Canada, 1990.
3. Lawrence, J., Introduction to Neural Network: Design, Theory and Application, California Scientific Software Press, Nevada City, CA, 1993.
4. Sharma, S. C. & Allipuram, R. R., Duration and frequency of seasonal traffic counts, Journal of Transportation Engineering, American Society of Civil Engineers, 116 (3), (1993), 344-59.
5. Sharma, S. C., Lingras, P. J., Hassan, M. U. & Murthy, A. S., Road classification and driver population, Transportation Research Record No. 1090, Transportation Research Board, Washington, 1986, pp. 61-9.
6. Sharma, S. C. & Werner, A., Improved method of grouping province-wide permanent traffic counters, Transportation Research Record No. 815, Transportation Research Board, Washington, 1981, pp. 13-18.
7. Smith, B. L. & Demetsky, M. J., Traffic flow forecasting: Comparison of modeling approaches, Journal of Transportation Engineering, American Society of Civil Engineers, 123 (4), (1997), 261-6.
8. Ulbricht, C., Multi-recurrent networks for traffic forecasting, in Proceedings of the Twelfth National Conference on Artificial Intelligence, Vol. II, AAAI 94, AAAI Press/MIT Press, Cambridge, MA, 1994, pp. 883-8.
9. Weigend, A. S. & Gershenfeld, N. A., Time Series Prediction: Forecasting the Future and Understanding the Past, Addison-Wesley, Don Mills, Ontario, Canada, 1994.