Data Detour: Analytics Will Move Transportation Forward

Image: Zahlm/Flickr

Mobile devices and networks in our homes, schools, recreation spaces and workplaces are all connected to the Internet, bringing accessibility and convenience to our lives. Even our commutes are connected. In fact, the transportation industry is a leader in creating the Internet of Everything, generating vast volumes of data each day through sensors in passenger counting systems, vehicle locator systems, and ticketing and fare collection systems, to name just a few. Over time, however, those terabytes of collected data have added complexity to IT operations and consumed immense amounts of storage in silo after silo of transportation operator data centers.

Transportation authorities already have information and advanced planning tools, each serving a specific purpose. By fusing these siloed data sources, however, they can create a much more complete picture of the dynamics and factors that contribute to the effectiveness of the network and, consequently, to the satisfaction of travelers.

Creating that richer, more complete picture of what’s happening on the ground is an opportunity for business analytics in the transportation sector: big data tools and predictive analytics can help transportation agencies improve operations, reduce costs and better serve travelers. Before agencies can arrive at those results, however, they must clear several hurdles using the available technology tools and a tailored data science process.

One current weakness in business intelligence is the lack of a coherent process that weaves the raw data acquired from different systems into meaningful information for making decisions that affect operations, revenue or customer satisfaction. Collaboration is part of the process too, because success depends on knowledge of existing systems, an understanding of the transit organization, and the participation of subject matter experts in honing the solution. A process that harnesses this collaboration allows an organization to arrive at analytic insights in an organized and efficient fashion.

As previously mentioned, massive data sets already exist in transportation organizations, so agencies must decide which of them to use and where in the organization to look for complementary data assets. The first process hurdle, gathering data, includes extracting and collecting data in any format from wherever it lives. The data must then be staged in a data lake, preferably one that is free of traditional data management constraints, before it is cleansed and organized into standardized transportation-specific models suitable for analytics.
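To make the gathering step concrete, here is a minimal sketch of cleansing two differently formatted extracts into one standardized record type. Everything in it is an assumption made for illustration: the `BoardingEvent` model, the field names, and the sample passenger-counter and fare-tap data are invented, not taken from any real agency system.

```python
import csv
import io
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BoardingEvent:
    """Hypothetical standardized record; the field names are illustrative."""
    source: str
    stop_id: str
    route_id: str
    timestamp: datetime
    boardings: int


# Invented sample extracts standing in for files staged in a data lake.
APC_CSV = """stop,route,time,on
S100,7,2015-03-02 08:05:00,12
S101,7,2015-03-02 08:09:00,
S102,7,2015-03-02 08:14:00,5
"""

FARE_JSON = """[
  {"stopId": "s100", "routeId": "7", "tapTime": "2015-03-02T08:05:30"},
  {"stopId": "S100", "routeId": "7", "tapTime": "2015-03-02T08:05:41"}
]"""


def from_apc(text):
    """Parse passenger-counter CSV rows, dropping rows with no count."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        if not row["on"]:
            continue  # cleansing rule (assumed): skip rows missing a count
        events.append(BoardingEvent(
            source="apc",
            stop_id=row["stop"].strip().upper(),
            route_id=row["route"].strip(),
            timestamp=datetime.strptime(row["time"], "%Y-%m-%d %H:%M:%S"),
            boardings=int(row["on"]),
        ))
    return events


def from_fare(text):
    """Parse fare-tap JSON; each tap counts as one boarding."""
    return [
        BoardingEvent(
            source="fare",
            stop_id=rec["stopId"].strip().upper(),  # normalize stop codes
            route_id=rec["routeId"].strip(),
            timestamp=datetime.fromisoformat(rec["tapTime"]),
            boardings=1,
        )
        for rec in json.loads(text)
    ]


events = from_apc(APC_CSV) + from_fare(FARE_JSON)
for event in events:
    print(event)
```

Once both sources share one model, later analysis no longer needs to know which system a record came from, which is the point of standardizing before analytics.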

The next process hurdle is comprehending the story told by the data. This happens by fusing the domain-specific data that was cleansed and organized in the first step with other modeled data, and then applying predictive analytics techniques to arrive at a more complete picture of the dynamics and factors that influence the outcome. Statistical analysis, simulation and optimization can be applied to exploit the relationships reflected in the data, to plan for and predict what’s likely to happen under different scenarios. These insights must then be presented in compelling data visualizations and intuitive reports that reveal and communicate the information that matters.
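As a rough illustration of fusing two modeled data sets and applying a simple predictive technique, the sketch below joins daily ridership to daily rainfall by date and fits an ordinary least-squares line. The figures are invented for the example, and a single-variable linear fit is only a stand-in for whatever statistical or simulation methods an agency would actually choose.

```python
from datetime import date

# Invented daily totals from a hypothetical transit data model.
ridership = {
    date(2015, 3, 2): 1010,
    date(2015, 3, 3): 980,
    date(2015, 3, 4): 930,
    date(2015, 3, 5): 820,
    date(2015, 3, 6): 760,
}

# Invented rainfall (mm) from a hypothetical non-transit weather source.
rainfall_mm = {
    date(2015, 3, 2): 0.0,
    date(2015, 3, 3): 2.0,
    date(2015, 3, 4): 5.0,
    date(2015, 3, 5): 10.0,
    date(2015, 3, 6): 15.0,
    date(2015, 3, 7): 1.0,  # no ridership record, so it drops out of the join
}


def fuse(riders, rain):
    """Join the two sources on date, keeping only days present in both."""
    shared_days = sorted(riders.keys() & rain.keys())
    return [(rain[day], riders[day]) for day in shared_days]


def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


pairs = fuse(ridership, rainfall_mm)
slope, intercept = fit_line(pairs)


def predict(rain):
    """Predicted ridership for a given rainfall under the fitted line."""
    return intercept + slope * rain


print(f"Change in riders per mm of rain: {slope:.1f}")
print(f"Predicted riders on a 20 mm day: {predict(20.0):.0f}")
```

The join is what makes the prediction possible: neither source alone says anything about how weather relates to demand.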

The final step, harnessing the insights for operational improvement, is how agencies realize the benefits and achieve the desired operational outcomes. Each step in this process should be repeatable and used regularly to measure performance against targets, confirm success and then look to the next set of objectives in an iterative, continuous improvement cycle.

As an example of this process, the Metropolitan Transit System (MTS) in San Diego has been working to understand in more detail how travelers move through its transportation network, such as when they make connections between bus routes or between bus and rail systems. Understanding where and when connections are made, and arriving at a comprehensive view of those activities, will allow MTS to determine whether the assumptions that underpin the planning of services are consistent with how travelers actually use the services. MTS is using analytic modeling to bring together five independent data sources supplied by disparate enterprise systems to arrive at this understanding.
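The article does not describe how MTS models connections, so the following is only a hypothetical sketch of one common way to infer them: treat two consecutive fare taps by the same card on different routes, within a short time window, as a transfer. The card IDs, route names, sample taps and the 30-minute window are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Assumed cutoff: taps further apart than this are treated as separate trips.
TRANSFER_WINDOW = timedelta(minutes=30)

# Invented fare taps: (card ID, route, tap time).
taps = [
    ("card_a", "bus_7", datetime(2015, 3, 2, 8, 0)),
    ("card_a", "rail_1", datetime(2015, 3, 2, 8, 20)),
    ("card_a", "bus_7", datetime(2015, 3, 2, 17, 30)),
    ("card_b", "bus_7", datetime(2015, 3, 2, 8, 5)),
    ("card_b", "rail_1", datetime(2015, 3, 2, 8, 25)),
    ("card_c", "bus_10", datetime(2015, 3, 2, 9, 0)),
    ("card_c", "bus_10", datetime(2015, 3, 2, 9, 10)),
]


def count_transfers(tap_records, window=TRANSFER_WINDOW):
    """Count route-to-route connections inferred from consecutive taps."""
    by_card = defaultdict(list)
    for card, route, tapped_at in tap_records:
        by_card[card].append((tapped_at, route))

    transfers = Counter()
    for card_taps in by_card.values():
        card_taps.sort()  # chronological order for each card
        for (t1, r1), (t2, r2) in zip(card_taps, card_taps[1:]):
            # A connection needs a change of route within the time window.
            if r1 != r2 and t2 - t1 <= window:
                transfers[(r1, r2)] += 1
    return transfers


transfers = count_transfers(taps)
for (origin, destination), count in transfers.most_common():
    print(f"{origin} -> {destination}: {count}")
```

Counts like these could then be compared with the connections that service planners assumed travelers would make.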

In the future, specific tools and models will work in concert with data produced by even more operational systems, as well as with non-transit sources such as traffic data, social media feeds that reveal traveler sentiment, population demographics, geospatial data, weather data, and economic and retail data, to improve operational management and planning activities for transportation.

Some examples of how agencies will use data are:

Predict the impact on roads, highways and public transit networks caused by subway line closures, planned road-works or transit maintenance projects, and recommend the optimum change in transit schedules and communication strategy to deal with the impact

Predict the impact of major unplanned events such as a transit labor strike on transportation utilization and the local economy

Detect and predict the likely occurrence of everyday unplanned service incidents like a traffic accident or vehicle breakdown, and recommend optimum responses

Pinpoint the everyday events, such as late-arriving buses, bus breakdowns or signal outages, that have the highest economic impact or cost to a transit agency, and recommend ways to eliminate the events or mitigate the economic impact

Model and predict the impact of different proposed urban development projects on transportation, and assist in the selection or modification of the projects to achieve sustainability objectives while supporting the mobility needs of a heterogeneous population

Examine the impact of a major commercial development, such as the building of a new stadium, taking into account the relationships between transit usage and other relevant factors such as demographics, geospatial data (e.g. the number of licensed restaurants and other establishments) and commerce activity, to reveal the interrelationships and how they may be affected so that public leaders can maximize the public benefit delivered by such a project

Model and predict the effect of planned expansion of transportation networks with a clear understanding of the patterns of usage and the impact of land use and development decisions, special events, holidays, weather and employment

Examine major events such as the Super Bowl to determine specifically where and when services should be adjusted or supplemented to better accommodate visitors and avoid putting additional cars on the road that cause congestion

The future of data analytics in transportation has many applications and opportunities. The challenge is certainly not the ability to generate data, because the systems in place already provide more than is currently being used. The solution, therefore, is to push forward with significantly improved means and methods of gathering and understanding the data, so that business decisions are informed by better insights.