Examining Travelers Who Pay to Drive Slower in the Katy Managed Lanes


Farinoush Sharifi
Zachry Department of Civil Engineering, Texas A&M University
TAMU, College Station, TX
Tel: --
Email: farinoushsharifi@tamu.edu

Mark W. Burris, corresponding author
Zachry Department of Civil Engineering, Texas A&M University
TAMU, College Station, TX
Tel: --
Email: mburris@tamu.edu

Submission Date: March, 0

Sharifi, Burris

Table of Contents
Abstract
Introduction
Background
Data
    Variables of Study
Data Analysis
    Uneconomical Managed Lane Trip Identification
    Sampling
    Machine Learning Techniques
    Initial Models
    Final Models
Conclusion
Acknowledgment
References

ABSTRACT

Many people believe that paying a toll to use a managed (tolled) lane will result in a shorter travel time than using the toll-free general-purpose lanes. However, at times users pay to travel on the toll lane yet travel slower than traffic on the toll-free lanes. This research examined these uneconomical trips on managed lanes to discover potential similarities among them and to help understand lane choice behavior. Potential factors considered included toll rate, traffic flow, and past trip experience. Random forest and logistic regression methods were implemented to examine the impact and importance of variables on the probability of a user making an uneconomical managed lane trip. The random forest model provided the most accurate results. The study also showed that toll rate, traffic flow, travel time variability, and trip route (start and end points) are the key factors in predicting uneconomical managed lane trips. These results help to better understand uneconomical managed lane trips and to identify factors, such as time and location, that increase the likelihood of these trips. This study therefore provides a first step towards predicting these trips and adds to the understanding of travel on managed lanes.

INTRODUCTION

As defined by the Federal Highway Administration (FHWA), managed lanes (MLs) are designated lanes or roadways within highway rights-of-way where the flow of traffic is managed by restricting vehicle eligibility, limiting facility access, or, in some cases, collecting variably priced tolls (). Generally, the MLs require less travel time than the general-purpose lanes (GPLs), so it is advantageous to travel on them. However, the MLs are not always the fastest or the most economical route choice. In fact, approximately % (.% in 0,.% in 0, and 0.% in 0) of the paid ML trips on the Katy Freeway were 'uneconomical', meaning the drivers paid but experienced a longer travel time ().

The main focus of this study was to gain insight into these uneconomical managed lane (U-ML) trips. To understand the U-ML trips better, this study explored ML trips, specifically U-ML trips; searched for commonalities among U-ML trips; identified the most critical variables affecting U-ML trips; investigated the way these variables impact U-ML trips; and estimated a model to predict U-ML trips. This information should help in predicting ML travel behavior and lead to improved transportation planning models.

Previous studies identified several variables that might affect travelers' lane choice behavior (): the time of day, trip length, travel time variability, trip history, and ML trip frequency. To extend the previous analyses, trip route, crashes, rain, and traffic flow were included in this study. Pattern recognition methods, specifically random forest and logistic regression, were applied to find patterns between these variables and U-ML trips and to rank the most important factors in predicting U-ML trips.
For this study, nearly three years of Katy Freeway data were obtained from Texas Department of Transportation (TxDOT) automated vehicle identification (AVI) sensors and Harris County Toll Road Authority (HCTRA) sensors. This dataset includes trip information for all vehicles with transponders on the Katy Freeway. Precipitation records from the National Oceanic and Atmospheric Administration (NOAA) were added to include rain as a factor in the main dataset. Finally, this study identified the most important factors and the best model for predicting U-ML trips. These findings help to explain U-ML trips and to predict them from the corresponding key factors, which supports better ML travel prediction and a better understanding of travel behavior.

BACKGROUND

The FHWA publication states that one of the primary goals of priced MLs is congestion reduction, enabling vehicles to travel at higher speeds and save travel time (). That is, one of the main benefits of MLs is travel time savings, and people pay for these savings. Nevertheless, MLs may not always have a shorter travel time than the GPLs. Burris et al. previously studied trips with a higher travel time on the MLs. They examined travelers' lane choice behavior on the Katy Freeway and found the travel time saving for paid ML trips ranged from -. to over 0 minutes with an average of . minutes. They found approximately % of the paid ML trips had a negative travel time saving (trips were slower on the MLs than on the GPLs). They also observed that travelers were not willing to change their typical lane after a previous bad (slow) trip on that lane (a trip with a speed much slower than the average speed of all other vehicles) (). In fact, travelers usually overestimate their travel time savings. An examination of the perceived travel time saving along SR- reported that % (in the AM peak hour) and % (in the PM peak hour) of respondents believed they had a travel time saving of more than minutes.
Also, almost all of the respondents overestimated their travel time saving (). Several studies also identified travel time saving as the dominant incentive to choose MLs over GPLs (-). In addition, current models assume that travelers do not take the MLs if they do not save travel time. A study conducted on I- concluded travelers are willing to pay up to $0 to reduce one hour of travel time (). The reported value of time for SR- in Orange County was $. per hour in 00. This study acknowledged the small travel time savings. However, there were no further studies on travelers paying the toll and going slower ().

Small or negative travel time savings and speed differences were also observed in other articles. An examination of the willingness to pay to use I- MLs in Minnesota indicated a small difference between the ML speed and the GPL speed. Moreover, % of travelers on I- MLs paid for a travel time saving of less than one minute. It was concluded that the small travel time savings obtained from this small speed increase could not be the primary incentive for choosing MLs over GPLs. Other reasons might be avoiding a bottleneck or the higher reliability of MLs (). However, no studies have been undertaken to confirm why travelers choose a negative travel time saving on tolled MLs. The current study focused on the travel time loss of paid ML users and implemented two pattern recognition techniques to identify the relationship between the key variables and U-ML trip probability on the Katy Freeway.

Logistic regression has been used in several studies to model lane and mode choice behavior (; -). Its simplicity and its ability to quantify the magnitude of each attribute's impact on U-ML trip likelihood make logistic regression a favorable technique for studying U-ML trips. A study conducted in 0 examined mode choice classifiers using the Dutch National Travel Survey from 00 to 0. Seven pattern recognition techniques, including random forest and the multinomial logit model, were evaluated as classifiers. The random forest model was the most accurate, and the multinomial logit model had the lowest accuracy (0). Hence, the current study also implemented the random forest technique. Random forests are efficient and functional in dealing with large datasets.
This flexibility and the ability to rank variables by their importance are the two main reasons random forest models were used in this study.

DATA

In this study, a unique dataset obtained from the Katy Freeway was investigated. Interstate 10 (I-10) is a major east-west interstate highway, and the Katy Freeway is a -mile section of I-10 connecting the City of Katy to Downtown Houston. The Katy Freeway was converted to an ML facility in 00. It consists of up to six GPLs and two MLs in each direction. Some drivers on the MLs are required to pay a toll, depending on the time of day, day of the week, and number of passengers. From Monday to Friday, am to am and pm to pm, high-occupancy vehicles (HOVs) with two or more occupants and motorcycles can use the MLs for free. However, HOVs at all other times, and single-occupancy vehicles (SOVs), have to pay a toll that varies by time of day. HCTRA is responsible for toll rates and toll collection at three toll plazas. The toll rate schedule is available on the HCTRA website () and is shown in Table 1. Tolls are collected electronically via EZ Tag or TxTag.

TABLE 1 Katy Managed Lanes Toll Rate Schedule

Columns: Dates; Direction; Time of Day; Toll at Eldridge (See Figure 1); Toll at Wilcrest and Wirt (See Figure 1)

Opening day (Apr. 00) to Sept., 0
    Westbound
        Peak: -pm weekdays                    $.0     $.0
        Shoulder: - & - pm weekdays           $0.0    $0.0
        Off-peak: all other times             $0.0    $0.0
    Eastbound
        Peak: -am weekdays                    $.0     $.0
        Shoulder: - & -0 am weekdays          $0.0    $0.0
        Off-peak: all other times             $0.0    $0.0

Sept., 0 to Sept., 0
    Westbound
        Peak: - pm weekdays                   $.0     $.0
        Shoulder: - & - pm weekdays           $.0     $0.0
        Off-peak: all other times             $0.0    $0.0
    Eastbound
        Peak: - am weekdays                   $.0     $.0
        Shoulder: - & -0 am weekdays          $.0     $0.0
        Off-peak: all other times             $0.0    $0.0

Sept., 0 to Today
    Westbound
        Peak: - pm weekdays                   $.0     $.0
        Shoulder: - & - pm weekdays           $.0     $.0
        Off-peak: all other times             $0.0    $0.0
    Eastbound
        High Peak: - am weekdays              $.0     $.0
        Low Peak: - am weekdays               $.0     $.0
        High Shoulder: - am weekdays          $.0     $.0
        Low Shoulder: -0 am weekdays          $.0     $.00
        Off-peak: all other times             $0.0    $0.0

Much of the data was acquired from AVI sensors operated by TxDOT. There are AVI sensors with unique sensor numbers located along the GPLs and MLs (Figure 1). When a sensor detects a vehicle, it records the vehicle's transponder ID, the sensor ID, and the detection time. All vehicles using the MLs are required to have a transponder. The data included most of the trip records from 0, 0, and 0 with transponder ID, AVI number, and detection time.

FIGURE 1 AVI Sensors along Katy Freeway ().

Another large set of data was obtained from HCTRA, which collects data from AVI sensors at three toll plazas and uses them to charge vehicles the appropriate toll rates. These AVI sensors are shown in Figure 1. The HCTRA data also record each vehicle's transponder ID, toll plaza ID, lane ID, and the detection time as the vehicle passes each sensor.

The third dataset was obtained from the National Oceanic and Atmospheric Administration (NOAA) to include the effect of daily precipitation in the study. The website of NOAA's National Centers for Environmental Information (NCEI) provides daily weather summaries for several stations in the Houston area (). This study used daily rain measurements from the station closest to the Katy Freeway, named HOUSTON. WNW TX, US. To align with the two other datasets, the precipitation data covered the period from January 0 to September 0.

TxDOT AVI sensor data and HCTRA toll data were combined to form the vehicle travel information data, which included the trip route, the trip time, and the toll paid. Daily rain data and lane closure data obtained from TxDOT were then added to form the main dataset. The main dataset included data from January 0 to November 0 and January 0 to September 0 and contained,0, ML trips.

Obtaining these million-plus trips required a series of data cleaning and processing steps. First, the original transponder ID of each traveler was replaced with a random ID to preserve the anonymity of travelers. Then, the toll-free HOV trips were excluded from the main dataset to focus on paid ML trips. An alternate GPL trip had to be defined for each ML trip in order to compare the ML trip travel time with a toll-free trip travel time and compute the travel time saving or loss for the ML trip. To generate the alternate GPL trip for each actual ML trip, the start time and the starting and ending points on the Katy Freeway were set to be the same as those of the ML trip.
The speed of travelers on the GPLs on the exact day and time of the ML trip was then used to find the GPL travel time. In the absence of GPL trip data, the average speed of actual trips on the GPLs over one month in the same -minute period of the day was used to generate the alternate GPL travel time. Additional details regarding data cleaning, quality control, and trip development from sensor readings can be found in Lee's dissertation ().

Variables of Study

Variables of interest were selected from all available and independent attributes. They represent time of trip (including day of the week, peak hour, and shoulder hour), route of the trip (including start sensor, end sensor, direction, and trip length), rain, lane blockages (shoulder lane, main lane, frontage lane, ramp lane, and HOV lane), toll rate, number of vehicles with transponders on the GPLs/MLs (termed GPL/ML traffic flow), travel time variability, and travel behavior (including ML trip frequency and percentage of ML trips).

DATA ANALYSIS

Uneconomical Managed Lane Trip Identification

Two features together indicate a U-ML trip: paying a toll to use the MLs and spending a longer travel time than the alternate GPL trip. There are two approaches to classifying ML trips. The first is a binary parameter termed tripclass_b:

$$\text{tripclass}_b = \begin{cases} 1 & \text{if } TT_{ML} \geq TT_{GPL} \quad \text{(Uneconomical ML trip)} \\ 0 & \text{if } TT_{ML} < TT_{GPL} \quad \text{(Economical ML trip)} \end{cases} \tag{1}$$

Where:
TT_ML = Actual ML trip travel time
TT_GPL = Alternate GPL trip travel time

This classification approach does not distinguish losing one minute in a two-minute trip from losing 0 seconds in a ten-minute trip. Hence, another classification for ML trips was established to differentiate the relative quantity of travel time saving or loss. This new classification is defined with a variable termed tripclass_m, which divides ML trips into three groups: economical ML (E-ML) trips, U-ML trips, and closely similar or middle ML trips. The final class was developed for the ML trips with a small relative travel time saving or loss. To equitably adjust the interval for these ML trips, the relative travel time difference is defined as:

$$\text{RTTD} = \frac{TT_{ML} - TT_{GPL}}{TT_{ML}} \tag{2}$$

To classify ML trips, tripclass_m is defined in Equation (3), where $\delta$ denotes the small relative threshold that bounds the middle class:

$$\text{tripclass}_m = \begin{cases} \text{Uneconomical ML trip} & \text{if } \text{RTTD} > \delta \\ \text{Middle ML trip} & \text{if } -\delta \leq \text{RTTD} \leq \delta \\ \text{Economical ML trip} & \text{if } \text{RTTD} < -\delta \end{cases} \tag{3}$$

The middle class includes ML trips with a small travel time loss or saving, which is likely hard for travelers to observe. This study used both of these classifications to examine ML trips.

Sampling

To obtain the sample set, ,00, trips (one-seventh of all trips) were randomly selected from all trips. This sample size is large compared with other travel behavior studies (; ). The data were then split into two sets, one to train the model and one to assess its predictive power. A comprehensive review of the optimal split of a dataset stated that the training set should be 0% to 0% of the total data size (). Consequently, 0% of the data was used as the training set, which included 0, ML trips. The number of U-ML trips in the training set was 0,0 based on the multiclass classification and increased to, trips in the binary classification. The second subset of the data, the test set, included 00, trips (0% of the sample trips).

Machine Learning Techniques

Logistic regression and random forest were proposed to predict U-ML trips and find the most relevant variables. Logistic regression is a regression model for predicting discrete choice or categorical dependent variables ().
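Before turning to the model form, the categorical dependent variable that these models predict can be made concrete. The following is a minimal Python sketch of the binary and three-way trip classes described in the Uneconomical Managed Lane Trip Identification subsection. The function names, the text labels, the symmetric `band` parameter, and the value 0.1 used for it are all illustrative assumptions, not the study's actual settings.

```python
def rttd(tt_ml: float, tt_gpl: float) -> float:
    """Relative travel time difference: (TT_ML - TT_GPL) / TT_ML."""
    if tt_ml <= 0:
        raise ValueError("ML travel time must be positive")
    return (tt_ml - tt_gpl) / tt_ml


def tripclass_b(tt_ml: float, tt_gpl: float) -> int:
    """Binary class: 1 = uneconomical (ML trip not faster), 0 = economical."""
    return 1 if tt_ml >= tt_gpl else 0


def tripclass_m(tt_ml: float, tt_gpl: float, band: float) -> str:
    """Three-way class from RTTD; `band` is an assumed symmetric threshold."""
    d = rttd(tt_ml, tt_gpl)
    if d > band:
        return "uneconomical"
    if d < -band:
        return "economical"
    return "middle"


# Hypothetical travel times in minutes; BAND = 0.1 is an illustrative choice only.
BAND = 0.1
short_trip = (2.0, 1.0)    # loses one minute on a two-minute ML trip
long_trip = (10.0, 9.5)    # loses half a minute on a ten-minute ML trip
for tt_ml, tt_gpl in (short_trip, long_trip):
    print(tripclass_b(tt_ml, tt_gpl), tripclass_m(tt_ml, tt_gpl, BAND))
```

Both hypothetical trips are uneconomical under the binary rule, but only the short trip remains uneconomical under the three-way rule, which is the distinction the relative measure is meant to capture.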
The main form of the logistic regression model is shown in Equation (4):

$$\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta X \tag{4}$$

Where:
βX = Regression coefficients multiplied by the independent variables
β0 = Intercept of the linear equation
p(X) = Probability of X happening (between 0 and 1).

Random forest is an ensemble of decision trees (). The central concept of random forest is to create a strong classifier by combining many small decision trees. Each tree in the random forest receives a set of input observations and produces a set of outputs, or votes, for the random forest output. The output of the random forest model is the mode or mean of the outputs of all decision trees.

The models were compared based on their area under the Receiver Operating Characteristic (ROC) curve (AUC). The ROC curve expresses the tradeoff between the benefit (true positive rate) and the cost (false positive rate) of the model. A higher AUC indicates a more accurate model.

Initial Models

The initial models were developed to find the most important variables for the final models (Equation 5).

tripclass(b, m) = F(Direction, Weekday, Peak hour, Shoulder hour, Main lane blockage, Frontage lane blockage, Ramp lane blockage, HOV lane blockage, Shoulder lanes blockage, Rain, ML traffic flow, GPL traffic flow, Trip length, Start sensor, End sensor, Travel time variability, ML trip frequency, Percentage of ML trips, Toll rate) (5)

The variables were ranked by their impact on the models when the variable is removed, using the mean decrease in accuracy parameter. The most critical variables are those that cause the largest decrease in accuracy. The six leading factors in the models were travel time variability, GPL traffic flow, ML traffic flow, start sensor, toll rate, and end sensor. These six variables were selected for the final models, noting that they represent the time of the trip, route of the trip, cost of the trip, and traffic flow.

Final Models

The final binary random forest, multiclass random forest, and binary logistic regression models were generated using the six most important variables identified in the initial analysis (see Equation 6).

tripclass(b, m) ~ F(ML traffic flow, Start sensor, End sensor, Travel time variability, GPL traffic flow, Toll rate) (6)

Comparing the AUC of the initial and final models shows that excluding all but the six key variables did not decrease the AUC of the final models very much (Table 2). In return, the models are more straightforward and practical for future use.

TABLE 2 AUC Summary

Model                 Binary Random Forest (BRF)    Multiclass Random Forest (MRF)    Binary Logistic Regression (BLR)
Initial Full Model    0.                            0.                                0.0
Final Model           0.                            0.                                0.
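As an illustration of the AUC comparison summarized above, the sketch below computes AUC directly from its rank interpretation: the probability that a randomly chosen positive case (here, a U-ML trip) receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and the two sets of model scores are made-up values for demonstration, and the function name `auc` is a local helper, not part of the study's software.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = U-ML trip) and predicted scores from two models.
labels = [1, 1, 0, 0, 1, 0]
scores_model_a = [0.9, 0.8, 0.3, 0.4, 0.35, 0.1]   # one positive ranked below a negative
scores_model_b = [0.9, 0.8, 0.3, 0.4, 0.70, 0.1]   # all positives ranked above all negatives

print(auc(labels, scores_model_a), auc(labels, scores_model_b))
```

The pairwise form is quadratic in the number of trips, so for datasets of the size used in this study a sort-based implementation from a statistics library would be the practical choice.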
Table 3 shows the variable importance ranking and the significance of the variables in the final models. An analysis of variance (ANOVA) was performed for the BLR model. This analysis sequentially compares the smaller model with the next more complex model; it is conducted for each variable by comparing the full model with the model without the variable listed in the first column of the table. The Wald chi-squared test evaluates this comparison by generating p-values, which show the significance of the variable in the model. A large p-value indicates that the BLR model without the corresponding variable is not significantly different from the full model. As indicated, all p-values are small, and all variables have a significant impact on ML trip classification.
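The mean decrease in accuracy used to rank variables in the random forest models can be illustrated with a simplified, model-agnostic stand-in: shuffle one variable at a time and measure how much the prediction accuracy falls. This permutation-based version is an assumption made for illustration; random forest software typically computes the measure per tree on out-of-bag observations. The toy classifier, the feature layout, and the data below are all hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows for which the classifier output matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Accuracy drop after shuffling each feature column in turn."""
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    drops = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        drops.append(base - accuracy(predict, shuffled, labels))
    return drops


# Hypothetical data: feature 0 drives the outcome, feature 1 is pure noise.
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]


def toy_classifier(row):
    """Stand-in for a fitted model; it only looks at feature 0."""
    return 1 if row[0] > 0.5 else 0


drops = permutation_importance(toy_classifier, rows, labels, n_features=2)
print(drops)
```

A variable the classifier relies on shows a large drop, while a variable it ignores shows none, which is the ordering logic behind an importance ranking.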

TABLE 3 Final Variables' Impacts

Variable                   Final BRF Mean Decrease Accuracy    Final MRF Mean Decrease Accuracy    Final BLR Chi-Square Test P-value
GPL traffic flow           0.                                  0.                                  < 0.000
Travel time variability    .                                   .                                   < 0.000
ML traffic flow            .                                   .                                   < 0.000
Start sensor               .                                   .0                                  < 0.000
End sensor                 .                                   .                                   < 0.000
Toll rate                  .                                   .                                   < 0.000

As indicated by the random forest models, the most important variables for ML trip classification are GPL traffic flow, travel time variability, ML traffic flow, start sensor, end sensor, and toll rate. However, the direction of their impact on the U-ML trip rate is unclear in the random forest models. The BLR model provides more information on a variable's impact on the U-ML trip rate through the estimated coefficients. Table 4 lists the coefficient estimates for each variable in the final BLR model. A positive value indicates that an increase in the associated variable increases the likelihood of a U-ML trip. The lowest p-values (the strongest association with U-ML trip likelihood) are for GPL traffic flow, toll rate, start sensor 0, and travel time variability.

TABLE 4 Parameter Estimates for Final BLR Model

Model: $$\ln\left(\frac{P(\text{tripclass}_b = 1)}{P(\text{tripclass}_b = 0)}\right) = \text{Intercept} + \sum_i \left(\text{Coefficient}_i \times \text{Variable}_i\right)$$ *

Variable    Coefficient    Std. Error    Z value    P-Value
Intercept -0.0 0. -0..E-0
ML traffic flow 0.0 0.00..E-0
Start sensor # 0. 0.0..E-
Start sensor #0 -.0 0. -..E-0
Start sensor #0-0. 0. -..E-0
Start sensor #0-0. 0. -..E-0
Start sensor #0-0. 0. -..E-0
Start sensor #0 0.0 0. 0..E-0
Start sensor #0-0. 0. -..E-0
Start sensor #0 0. 0.0.0.E-0
Start sensor #0.0 0.0..E-
Start sensor #0. 0.0 0. 0.00E+00
Start sensor #0.0 0.0..E-
Start sensor #. 0.0. 0.00E+00
Start sensor # -0.0 0.0-0..E-0
Start sensor # 0. 0.0 0..E-
Start sensor # 0. 0.0.0.E-
Start sensor # -0. 0. -..0E-0
Start sensor # -0. 0. -..E-0
Start sensor # -0. 0. -..E-0
Start sensor # -0. 0. -..0E-0
Start sensor # 0. 0...E-0
Start sensor # 0.0 0. 0..E-0
Start sensor # 0. 0...E-0
Start sensor #0 0. 0.0..E-0
Start sensor # -0.0 0. -..E-0
End sensor # 0. 0..00.E-0
End sensor #0 0. 0...E-
End sensor #0 0. 0..0.E-0
End sensor #0 0. 0...0E-0
End sensor #0 0. 0...E-0
End sensor #0 0. 0...E-0
End sensor #0 0. 0...E-0
End sensor #0-0. 0. -..E-0
End sensor #0-0. 0. -..E-0
End sensor #0-0. 0. -..E-0
End sensor #0-0.0 0. -..E-0
End sensor # -0. 0. -..E-0
End sensor # -0. 0. -..E-0
End sensor # 0. 0...E-
End sensor # -. 0. -..E-
End sensor # -.0 0. -..0E-0
End sensor # -0.0 0. -..E-0
End sensor # -0. 0. -..E-0
End sensor # 0. 0. 0..0E-0
End sensor # -0. 0. -0..E-0
Travel time variability. 0.. 0.00E+00
GPL traffic flow -0.0 0.00 -.0 0.00E+00
Toll rate -.0 0.0 -. 0.00E+00

*For categorical variables, the associated class is assigned a value of 1.

GPL traffic flow is the most important variable in the BRF model. The BLR model also shows that GPL traffic flow is an influential variable in predicting the trips. An increase in GPL traffic flow (that is, more congestion on the GPLs) results in a decrease in U-ML trip likelihood.
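To show how coefficients of a binary logistic regression translate into a U-ML trip probability, the sketch below evaluates the log-odds as an intercept plus a weighted sum of variables and then applies the logistic function. Every coefficient, variable name, and input value here is a made-up placeholder chosen only to mirror the signs discussed in the text; none of them are the estimates reported in the table. Categorical variables such as a start sensor are represented as 0/1 indicators.

```python
import math


def uml_probability(intercept, coefficients, values):
    """P(U-ML trip) from log-odds = intercept + sum(coefficient_i * variable_i)."""
    log_odds = intercept + sum(
        coefficients[name] * values.get(name, 0.0) for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Placeholder coefficients (assumed, for illustration only).
COEFFICIENTS = {
    "ml_flow": 0.02,          # more ML traffic -> higher U-ML likelihood
    "gpl_flow": -0.01,        # more GPL traffic -> lower U-ML likelihood
    "tt_variability": 1.5,    # more variability -> higher U-ML likelihood
    "toll_rate": -1.0,        # higher toll -> lower U-ML likelihood
    "start_sensor_A": 0.4,    # 0/1 indicator for a hypothetical start sensor
}
INTERCEPT = -0.5

trip = {"ml_flow": 50.0, "gpl_flow": 300.0, "tt_variability": 0.8,
        "toll_rate": 1.2, "start_sensor_A": 1.0}
busier_gpl = dict(trip, gpl_flow=500.0)   # same trip with heavier GPL traffic

p_base = uml_probability(INTERCEPT, COEFFICIENTS, trip)
p_busier = uml_probability(INTERCEPT, COEFFICIENTS, busier_gpl)
print(p_base, p_busier)
```

With a negative GPL traffic flow coefficient, the same trip under heavier GPL traffic receives a lower predicted probability of being uneconomical, which matches the direction of the effect described above.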

The next most important variable in the BRF model is travel time variability. This variable is also significant in the BLR model (considering the associated p-value). An increase in travel time variability increases U-ML trips because of the higher variance in the expected travel time.

The ML traffic flow rate is the third most important variable in the BRF model and a significant variable in the BLR model. An increase in ML traffic flow indicates congestion on the MLs and a higher likelihood of a U-ML trip.

The next two variables of importance are the start sensor and end sensor. These are categorical variables, and each of their classes has a specific coefficient in the BLR model. The highest and lowest coefficients for the start point are for sensors 0 and 0. The sensors with the highest and lowest coefficients for the end point are and. To verify the BLR model, the U-ML trip likelihood for each sensor pair (start and end sensor) with more than 000 U-ML trips was computed using the main dataset. Table 5 and Figure 2 show the most and least likely routes for U-ML trips. All five of the sensor pairs that were least likely to have U-ML trips, and four of the five sensor pairs that were most likely to have U-ML trips, were in the eastbound direction, possibly because eastbound (AM) peak period traffic on the GPLs is generally not as congested as westbound peak period traffic. The sensors in both the least and most likely U-ML trip pairs were often far apart, possibly indicating travelers' willingness to risk a U-ML trip since they are traveling a long way and cannot be sure of traffic conditions far away from them. Many of the trips that were least likely to be U-ML trips were very common trips, while the trips more likely to be U-ML trips generally had far fewer total trips between those sensor pairs.
TABLE 5 Most and Least Likely Routes for U-ML Trips

Columns: U-ML trip Likelihood; Sensor Pair; Total Trips; Economical Trips; Uneconomical Trips; U-ML trip Percentage

Least likely
    0-0 0 %
    0-0 0 0 %
    -0 0 0 0 %
    0- %
    -0 0 %
    Average 0 %

Most likely
    - 0 %
    - 0 00 0 %
    0-0 %
    0-0 0 %
    -0 0 0 %
    Average 0%
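The per-route percentages in the table above come from grouping trips by sensor pair and dividing the uneconomical trips by the total trips for that pair, keeping only pairs with more than a minimum number of U-ML trips. A minimal sketch of that calculation follows; the trip records, sensor names, and the minimum count of 1 are hypothetical and chosen only to keep the example small.

```python
from collections import defaultdict


def uml_rate_by_pair(trips, min_uml_trips):
    """Percentage of U-ML trips per (start, end) pair with more than min_uml_trips U-ML trips."""
    totals = defaultdict(int)
    uml = defaultdict(int)
    for start, end, is_uml in trips:
        totals[(start, end)] += 1
        uml[(start, end)] += int(is_uml)
    return {
        pair: 100.0 * uml[pair] / totals[pair]
        for pair in totals
        if uml[pair] > min_uml_trips
    }


# Hypothetical trip records: (start sensor, end sensor, 1 if uneconomical else 0).
trips = [
    ("A", "B", 1), ("A", "B", 0), ("A", "B", 1), ("A", "B", 0),
    ("C", "D", 1), ("C", "D", 0),
]
rates = uml_rate_by_pair(trips, min_uml_trips=1)
print(sorted(rates.items(), key=lambda item: item[1]))
```

Sorting the resulting dictionary by percentage gives the least and most likely routes at the two ends of the list.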

FIGURE Most and Least Likely Routes for U-ML Trips (Red arrows show the most likely routes, and green arrows show the least likely routes).

The final significant variable is toll rate. As estimated by the BLR model, a drop in toll rate results in a higher likelihood of a U-ML trip. This is not surprising since lower toll rates occur during less congested times. Therefore, toll rate also acts as a peak hour variable.

CONCLUSION

The objective of this study was to identify uneconomical managed lane (U-ML) trips and the factors associated with these trips. Based on nearly years of data, % of total paid ML trips on the Katy Freeway had a negative travel time savings (that is, they were U-ML trips). Two forms of ML trip classification were examined in this study: binary classification (uneconomical and economical) and multiclass classification (uneconomical, economical, and a third group in which the travel times on the MLs and GPLs were very close). The most important factors in predicting a U-ML trip were ML traffic flow, GPL traffic flow, toll rate, travel time variability, start sensor, and end sensor. These variables indirectly bring additional trip attributes into the final models, including time of trip (toll rate), route of the trip (start and end sensor), congestion (ML/GPL traffic flow), and reliability (travel time variability).

Both the BRF and MRF models achieved a higher AUC than the BLR model. Of the two, the BRF model is preferred because binary classification is simpler than the multiclass classification used by the MRF model. Despite having a lower AUC, the BLR model has the advantage that the coefficient of each variable indicates its impact on the likelihood of U-ML trips. As observed, high ML traffic flow, low GPL traffic flow, a lower toll rate (during non-peak hours), and large travel time variability lead to a higher likelihood of a U-ML trip.
The start sensor and end sensor are categorical variables, so their impact is not uniformly positive or negative. Sensor pairs with a lower number of ML trips were found to have a higher probability of U-ML trips. Therefore, the sensors' combined effect as a pair is more critical than their individual impacts.
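The way BLR coefficient signs translate into a higher or lower U-ML likelihood can be illustrated with a small binary logit sketch. Everything numeric here is hypothetical: the coefficient magnitudes, the intercept, and the feature names are assumptions chosen only so that the signs match the directions reported above, the inputs are assumed to be standardized, and the categorical sensor terms are omitted. These are not the estimated model values.

```python
import math

# Hypothetical coefficients: only the SIGNS follow the text above
# (ML flow up, GPL flow down, toll rate down, variability up -> more U-ML trips).
COEFFICIENTS = {
    "ml_flow": 0.8,
    "gpl_flow": -0.6,
    "toll_rate": -0.4,
    "tt_variability": 0.5,
}
INTERCEPT = -2.0  # hypothetical baseline log-odds of a U-ML trip


def uml_probability(features):
    """Binary logit: P(U-ML) = 1 / (1 + exp(-(intercept + sum(beta_k * x_k))))."""
    utility = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-utility))


# Baseline trip with all (standardized) features at their mean value of zero.
baseline = {"ml_flow": 0.0, "gpl_flow": 0.0, "toll_rate": 0.0, "tt_variability": 0.0}
print(f"baseline: {uml_probability(baseline):.3f}")

# Raise one feature at a time by one unit and compare against the baseline.
for name in COEFFICIENTS:
    shifted = {**baseline, name: 1.0}
    print(f"{name} +1: {uml_probability(shifted):.3f}")
```

A positive coefficient raises the predicted probability when its variable increases and a negative one lowers it, which is the interpretability advantage attributed to the BLR model. A random forest provides variable importance rankings but no comparable signed coefficients.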

There were some limitations in this study. First, many demographic attributes, including wealth and income, may influence the U-ML trip rate. In addition, the trip purpose may be essential. No data were available on drivers' demographic characteristics.

ACKNOWLEDGMENT

The authors thank Dr. Katie Turnbull, Dr. Yunlong Zhang, and Dr. Gene Hawkins for their helpful comments and valuable time. The authors also thank the Texas Department of Transportation (TxDOT), the Harris County Toll Road Authority (HCTRA), NOAA's National Centers for Environmental Information (NCEI), and Houston TranStar for sharing the data used in this project.

REFERENCES

[] Perez, B. G., C. Fuhs, C. Gants, R. Giordano, and D. H. Ungemah. Priced Managed Lane Guide. Publication FHWA-HOP--00, 0.
[] Burris, M., C. Spiegelman, A. Abir, and S. Lee. Travelers' Value of Time and Reliability as Measured on Katy Freeway. PRC - F, Texas A&M Transportation Institute, Texas A&M University System, College Station, TX. 0. http://tti.tamu.edu/documents/prc---f.pdf. Accessed January 0.
[] Sullivan, E. Continuation Study to Evaluate the Impacts of the SR Value-Priced Express Lanes: Final Report, California Polytechnic State University. 000.
[] Burris, M., K. Sadabadi, S. Mattingly, M. Mahlawat, J. Li, I. Rasmidatta, and A. Saroosh. Reaction to the Managed Lane Concept by Various Groups of Travelers. Transportation Research Record: Journal of the Transportation Research Board, No., 00, pp. -. https://doi.org/0./-0.
[] Devarasetty, P., M. Burris, and W. Shaw. Do Travelers Pay for Managed-Lane Travel as They Claimed They Would? Before-and-After Study of Travelers on Katy Freeway, Houston, Texas. Transportation Research Record: Journal of the Transportation Research Board, No., 0, pp. -. https://doi.org/0./-0.
[] Devarasetty, P. C., M. Burris, W. Arthur, J. McDonald, and G. J. Muñoz. Can Psychological Variables Help Predict the Use of Priced Managed Lanes? Transportation Research Part F: Traffic Psychology and Behaviour, Vol., 0, pp. -. https://doi.org/0.0/j.trf.0.0.00.
[] Brownstone, D., A. Ghosh, T. F. Golob, C. Kazimi, and D. Van Amelsfort. Drivers' Willingness-To-Pay to Reduce Travel Time: Evidence from the San Diego I- Congestion Pricing Project. Transportation Research Part A: Policy and Practice, Vol., No., 00, pp. -. https://doi.org/0.0/s0-(0)000-.
[] Lam, T. C., and K. A. Small. The Value of Time and Reliability: Measurement from a Value Pricing Experiment. Transportation Research Part E: Logistics and Transportation Review, Vol., No., 00, pp. -. https://doi.org/0.0/s-(00)000-.
[] Burris, M., S. Nelson, P. Kelly, P. Gupta, and Y. Cho. Willingness to Pay for High-Occupancy Toll Lanes: Empirical Analysis from I- and I-. Transportation Research Record: Journal of the Transportation Research Board, No., 0, pp. -. https://doi.org/0./-0.
[0] Hagenauer, J., and M. Helbich. A Comparative Study of Machine Learning Classifiers for Modeling Travel Mode Choice. Expert Systems with Applications, Vol., 0, pp. -. https://doi.org/0.0/j.eswa.0.0.0.
[] Harris County Toll Road Authority (HCTRA). Toll Rate Schedule- Katy Freeway Managed Lanes. https://www.hctra.org/content/hctrardpartypages/katymanagedlanes/media/katy_toll_sched.pdf. Accessed February 0.
[] Lee, S. Real Option Analysis to Value Managed Lanes Using Big Data. PhD Dissertation, Zachry Department of Civil Engineering, Texas A&M University, College Station, Texas, 0.
[] US Department of Commerce. NOAA's National Centers for Environmental Information (NCEI). GHCND:USTXHRR0, Climate Data Online: dataset discovery. https://www.ncdc.noaa.gov/cdoweb/datasets. Accessed April, 0.
[] Dobbin, K. K., and R. M. Simon. Optimally Splitting Cases for Training and Testing High Dimensional Classifiers. BMC Medical Genomics, Vol., No., 0, p.. https://doi.org/0./---.
[] Train, K. Discrete Choice Methods with Simulation. Cambridge University Press, 00. https://doi.org/0.0/cbo00.
[] Breiman, L. Random Forests. Machine Learning, Vol., No., 00, pp. -. https://doi.org/0.0/a:000.