5.0 Beta Testing of Framework and Review of Guidelines

The purpose of the beta testing phase of the project was to test the concepts and framework, including the methodologies, as well as the guidelines for assessing traffic data quality. It was expected that data from actual State DOT projects would be used to test the applicability of the data quality assessment methodologies. To this end, the framework was sent to selected individuals at the DOTs of Florida, Georgia, Illinois, Maryland, Utah, Virginia, and Washington, who were asked to review it and, where possible, apply it using their own local data. The original intent of testing the framework with state data was abandoned when it became clear that full-fledged beta testing places an unreasonable demand on public agencies. It became necessary to request only reviews of the concepts presented in the framework. A few review comments were received and are summarized below.

In developing the guidelines for implementing the framework, estimates of the level of effort required to establish a data quality assessment system and straw man estimates of data quality acceptability levels were developed. Review comments on the guidelines and reality checks on the straw man estimates were sought from various offices of the U.S. DOT, FHWA and a few state DOTs. The review comments are also presented below. The estimates of the levels of effort and acceptability levels were revised based on recommendations by the reviewers.

5.1 Framework Review Comments

Of the seven state DOTs included in the beta testing, four provided written comments on the framework as a whole. None of the states actually applied the framework to their data. The following are the detailed review comments on the framework.

Florida DOT

Florida DOT (FDOT) agrees that the quality measures are useful but noted that some may be difficult to calculate. Completeness and validity measures can be easily calculated for continuous counters. FDOT believes that it would not be practical to compute the accuracy of counters by routinely performing manual counts and comparing them with machine counts, because (i) counting vehicles from video is very time consuming, (ii) manual counts are very error prone, and (iii) it is difficult to synchronize the times on the video with the times in the permanent counter.

As far as the timeliness measure is concerned, FDOT considers data timely as long as the data reside in the database at the time they are extracted for processing.

FDOT estimates 100 percent coverage of the state highway system every year, because the few roads that cannot be counted due to construction are estimated by applying a growth factor to the value from the latest counted year.
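The growth-factor approach FDOT describes can be sketched as a simple projection from the most recent counted value. This is an illustrative sketch, not FDOT's actual procedure; the function name, the compounding form, and the example values are all assumptions:

```python
def estimate_aadt(last_counted_aadt: float,
                  annual_growth_rate: float,
                  years_since_count: int) -> int:
    """Estimate AADT for an uncounted segment by compounding an
    assumed annual growth rate forward from the last counted year.
    (Hypothetical illustration; agencies may use simple rather than
    compound growth, or segment-group factors.)"""
    return round(last_counted_aadt * (1 + annual_growth_rate) ** years_since_count)

# e.g., a segment last counted at 12,000 AADT, assumed 2% annual growth, 1 year ago
print(estimate_aadt(12000, 0.02, 1))  # 12240
```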

Georgia DOT

Georgia DOT (GDOT) agrees with all of the quality measures and noted that the framework is well-presented and easy to follow. GDOT is currently undertaking a similar exercise in assessing data quality and determining archiving processes and needs. The results to date are generally similar to those presented by the case studies in the framework.

Washington State DOT

Washington State DOT (WsDOT) noted that the timeliness definition works for their ATR sites. However, for short counts (AADT - archived data), AADTs are calculated only at the end of the year (usually in the May-June timeframe, to coincide with the HPMS submittal), because the factors based on ATR sites are not available until February/March. WsDOT observed that the coverage measure should also mention group factors (i.e., a sufficient number of stations to accurately develop group factors).

Washington State has started using data stewards and data dictionaries, and provides a "data mart" for ISPs to access the data they need. WsDOT is pleased to see this concept in the framework.

California DOT

California DOT (Caltrans) has just deployed a quality check program for their WIM data that incorporates the accuracy, completeness and validity measures outlined in the framework.

5.2 Guidelines Review Comments

Beta testing of the guidelines focused on validating the straw man estimates of the acceptability levels of the quality measures and the estimated level of effort to implement a data quality assessment system within an agency. The following are comments on the draft guidelines.

Minnesota Department of Transportation (MnDOT) noted that data quality, and the anticipated variance in data quality, are related to so many variables that establishing parameters and targets is extremely difficult. The Office of Transportation Data and Analysis at MnDOT has traditionally approached data quality from at least three perspectives:

Data Inputs: Primary factors in data quality center on the accuracy, completeness, validity, timeliness, coverage, and accessibility of data inputs. The variability of traffic count data and traffic forecasting results is influenced by a host of factors, including the type of roadway and its current AADT; the reliability of traffic data collection equipment; the ability to capture and incorporate information on changes in trip making, demographics, land use, and traffic generators; the precision of hourly, weekly, and monthly adjustment factors; the accuracy of axle correction factors; the robustness of the program in indicating when recounts are warranted; and the availability of staff to perform recounts. Even when acceptance criteria are established for individual sites or types of roadways, there is no guarantee that the criteria will remain applicable over time as travel behavior at individual sites changes.

MnDOT has established a "census" cycle for collecting traffic data on what are called "uniform traffic data segments" in Minnesota. The counted system consists of Interstates, U.S. and Minnesota highways, county highways, and Municipal State Aid roadways. To help assure data quality, MnDOT has established "customer driven" screening criteria, based on AADT ranges, for determining when recounts of traffic volumes are desirable. The attached chart shows a general guide for MnDOT's traffic count program administrators and the field personnel who collect traffic count data. Comparisons are made between new, adjusted traffic count volumes and the annual AADT estimates from when individual sites were last counted. Recounts are taken as program priorities, budget, and time allow.

Data Uses: Another approach to considering data quality relates to how well the data match the needs of the users. Traffic data are used throughout the transportation community for planning, investment analysis, project development, environmental analysis, pavement design, operations, and maintenance. MnDOT believes that data quality targets should be tied at some level to the sensitivity of the decisions they support. As a transportation data community, there is a need to ensure that these sensitivities are more clearly defined, articulated, and universally understood by all of the stakeholders involved in providing and using traffic data.

Performance of Data Results Over Time: A third factor in considering the quality of traffic data is how well the collected data, and especially the forecasts, match actual trends and future volumes. MnDOT believes that most transportation agencies monitor their traffic data collection efforts over time and perform various trend analyses. Trend data outside expected parameters typically receive more scrutiny. With respect to travel demand modeling and forecasting, MnDOT's Metro District Office in the Minneapolis-St. Paul metropolitan area uses a confidence interval to help evaluate the accuracy of its traffic forecasting and modeling efforts. Currently, it uses a confidence interval of plus or minus 15 percent. The 15 percent range was based in part on an analysis of how well 20-year traffic forecasts made in the 1960s and 1970s for Twin Cities freeways compare with actual volumes in the forecast year. MnDOT suggested that one may wish to consider adding an attribute to the list that deals with how well the data perform over time.
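The plus-or-minus 15 percent check described above amounts to testing whether an observed volume falls within a symmetric band around the forecast. The sketch below is an assumption about how such a check might be expressed; MnDOT's actual evaluation procedure is not described in detail here:

```python
def within_forecast_band(forecast: float, actual: float,
                         tolerance: float = 0.15) -> bool:
    """Return True if the actual volume is within +/- tolerance of the
    forecast (a hypothetical rendering of MnDOT's +/-15% band)."""
    return abs(actual - forecast) <= tolerance * forecast

# A 20-year forecast of 50,000 vehicles per day, compared with actuals:
print(within_forecast_band(50000, 56000))  # True  (within +/-15%)
print(within_forecast_band(50000, 60000))  # False (outside the band)
```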

FHWA Offices

The following offices of FHWA reviewed the straw man estimates of acceptable levels for the quality measures and the levels of effort. The suggested changes to the initial estimates were incorporated.

Traffic Monitoring and Surveys Division: commented on the acceptable levels for the completeness measure and offered suggestions.

HPMS Division: provided guidance on the acceptable levels for the accuracy measure for traffic counts on rural and urban highways, for the completeness of portable machine counts, and for VMT.

Office of Safety, Highway Safety section: noted that annual VMT is used to calculate highway safety rates (e.g., fatality and crash rates) and that the data are needed within one year or less for Federal, State, and local statistical and program purposes. It was recommended that this data item and its timeliness requirement be added to the Highway Safety section of the guidelines. It was also noted that daily VMT (i.e., DVMT) is not a relevant statistic for safety analysis. Several other changes were recommended.

5.3 Other Comments

There was some discussion about the data completeness measure. It was observed that a single completeness statistic does not distinguish completeness over time from completeness across locations. For example, consider two extreme cases in which 95 percent of the possible readings are provided, i.e., completeness (or availability) is 95 percent. In the first case, data from 95 percent of the sensors are always available, but data from the other 5 percent of the sensors are never available. In the second case, data from all the sensors are available 95 percent of the time. The two cases differ significantly from each other, yet both yield an overall availability of 95 percent. By this example, a single statistic combines spatial and temporal completeness and can be misleading. It was noted that completeness needs to be addressed in greater detail, i.e., a user needs a report of when and where data are available.
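The two extreme cases can be made concrete with a small numerical sketch. This is an illustration only (the sensor and interval counts are arbitrary assumptions, not from the framework): both networks below report 95 percent of their expected readings, yet their failure patterns are entirely different.

```python
def completeness(readings):
    """Overall completeness: fraction of expected readings actually
    received. `readings` is a list of rows (one per sensor), each a
    list of booleans (one per time interval)."""
    total = sum(len(row) for row in readings)
    received = sum(sum(row) for row in readings)
    return received / total

n_sensors, n_intervals = 20, 100

# Case 1: 19 of 20 sensors (95%) always report; one never reports.
case1 = [[True] * n_intervals for _ in range(19)] + [[False] * n_intervals]

# Case 2: every sensor reports in 95 of 100 intervals.
case2 = [[True] * 95 + [False] * 5 for _ in range(n_sensors)]

print(completeness(case1), completeness(case2))  # 0.95 0.95
```

The single statistic is identical for both networks; only a breakdown by sensor (spatial) or by interval (temporal) reveals the difference.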

It is important to note, however, that a single completeness statistic as illustrated above is not inherently misleading. It provides an indication of the overall magnitude of data completeness, in the same way that a national congestion statistic summarizes congestion without providing details on specific locations. More detailed completeness statistics are desirable, and this can be achieved by showing several levels of detail for the completeness measure. It is conceivable that operations managers and other mid-level managers would appreciate a single completeness statistic (and not having to run calculations on their own to obtain one). Based on the single statistic, one can determine whether further detail is required to characterize the quality of the data. Similarly, there could be links to other levels of detail, e.g., "completeness by day" or "completeness by road," or something similar that permits detailed analysis by the desired category.

The framework is developed to provide a single statistic for the completeness measure. The single measure is intended to provide an "overall" measure of data completeness. Detailed information about the spatial and temporal variability of completeness can then be obtained if desired.