NextTrain API

The NextTrain API provides a JSON web service for any General Transit Feed Specification (GTFS) feed. By publishing transit schedule data in a consistent format like GTFS, transit agencies enable a robust and interoperable ecosystem of third-party developer applications (e.g. Google Maps and Next Train Mobile).

Problem Statement

For performance reasons, it is not practical for client applications to consume GTFS data directly. A GTFS feed is a ZIP archive of normalized CSV files covering an agency’s entire schedule, not a concise JSON response to the parameters of a specific request.
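To illustrate the problem, here is a rough Python sketch of the work a client would have to do against a raw feed just to answer "which trips run from A to C?". The file and column names (trips.txt, stop_times.txt, trip_id, stop_sequence) come from the GTFS specification; the sample data, the helper names, and the choice of Python are assumptions for illustration only and are not part of this project.

```python
import csv
import io
import zipfile


def build_sample_feed() -> bytes:
    """Build a tiny, made-up GTFS-style ZIP in memory (illustrative data only)."""
    files = {
        "trips.txt": "route_id,service_id,trip_id\nR1,WKDY,T1\nR1,WKDY,T2\n",
        "stop_times.txt": (
            "trip_id,arrival_time,departure_time,stop_id,stop_sequence\n"
            "T1,08:00:00,08:00:00,A,1\n"
            "T1,08:20:00,08:21:00,B,2\n"
            "T1,08:45:00,08:45:00,C,3\n"
            "T2,09:00:00,09:00:00,C,1\n"
            "T2,09:25:00,09:26:00,B,2\n"
            "T2,09:45:00,09:45:00,A,3\n"
        ),
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, body in files.items():
            archive.writestr(name, body)
    return buffer.getvalue()


def read_table(archive, name):
    """Parse one CSV file inside the archive into a list of dicts."""
    with archive.open(name) as handle:
        text = io.TextIOWrapper(handle, encoding="utf-8-sig")
        return list(csv.DictReader(text))


def trips_between(feed_bytes, origin, destination):
    """Scan the whole feed and join two files by hand to answer one question."""
    with zipfile.ZipFile(io.BytesIO(feed_bytes)) as archive:
        trips = {row["trip_id"]: row for row in read_table(archive, "trips.txt")}
        stop_times = read_table(archive, "stop_times.txt")

    # Collect each trip's stop order: {trip_id: {stop_id: stop_sequence}}
    sequences = {}
    for row in stop_times:
        sequences.setdefault(row["trip_id"], {})[row["stop_id"]] = int(row["stop_sequence"])

    # Keep only trips that visit the origin before the destination.
    return [
        {"trip_id": trip_id, "service_id": trips[trip_id]["service_id"]}
        for trip_id, seqs in sorted(sequences.items())
        if origin in seqs and destination in seqs and seqs[origin] < seqs[destination]
    ]


matching_trips = trips_between(build_sample_feed(), "A", "C")
print(matching_trips)
```

Even in this toy case the client has to download the whole archive, parse every row, and join the files itself. A real feed adds calendars, routes, and many more rows.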

Solution

The NextTrain API acts as an intermediary between the client application and the raw GTFS data. It is designed to make client requests as efficient as possible.

The system pre-processes GTFS data by checking a specified GTFS feed URL for new data on a scheduled basis and storing the results in a relational database. This facilitates future on-demand querying.
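The import step might look roughly like the Python sketch below. This is not the project's actual import job. The SHA-256 fingerprint used to detect new data, the table layout, and the function name import_if_changed are all assumptions made for illustration, and the feed bytes are a stand-in for whatever the scheduled job downloads from the feed URL.

```python
import hashlib
import sqlite3


def import_if_changed(connection, feed_bytes, rows):
    """Replace stored stop times only when the feed differs from the last import.

    Returns True if an import happened, False if the feed was unchanged.
    """
    connection.execute("CREATE TABLE IF NOT EXISTS feed_state (fingerprint TEXT)")
    connection.execute(
        "CREATE TABLE IF NOT EXISTS stop_times "
        "(trip_id TEXT, stop_id TEXT, stop_sequence INTEGER, departure_time TEXT)"
    )

    # Assumption: a content hash is one simple way to notice "new data".
    fingerprint = hashlib.sha256(feed_bytes).hexdigest()
    previous = connection.execute("SELECT fingerprint FROM feed_state").fetchone()
    if previous is not None and previous[0] == fingerprint:
        return False

    # Swap the old data for the new data in a single transaction.
    with connection:
        connection.execute("DELETE FROM stop_times")
        connection.executemany("INSERT INTO stop_times VALUES (?, ?, ?, ?)", rows)
        connection.execute("DELETE FROM feed_state")
        connection.execute("INSERT INTO feed_state VALUES (?)", (fingerprint,))
    return True


connection = sqlite3.connect(":memory:")
feed_bytes = b"stand-in for the bytes of a downloaded GTFS ZIP"
rows = [("T1", "A", 1, "08:00:00"), ("T1", "B", 2, "08:21:00")]

first_run = import_if_changed(connection, feed_bytes, rows)
second_run = import_if_changed(connection, feed_bytes, rows)
row_count = connection.execute("SELECT COUNT(*) FROM stop_times").fetchone()[0]
print(first_run, second_run, row_count)
```

Running the job twice against the same feed imports once and then skips, so a frequent schedule stays cheap.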

Given an origin station, a destination station, and a date, the API responds with a list of trips running from the origin to the destination on that day. Each trip includes a complete list of its stops.
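Once the data is in a relational database, this kind of lookup becomes a single query. The Python sketch below shows one way it could work, under some loud assumptions: the table layout, the sample data, the trips_for helper, and the JSON field names are hypothetical and do not describe this API's real schema or response format. The day filter uses only the weekday columns from GTFS calendar.txt and ignores date ranges and calendar_dates.txt exceptions.

```python
import datetime
import json
import sqlite3

WEEKDAY_COLUMNS = [
    "monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday",
]


def create_sample_database():
    """Create an in-memory database with a few made-up trips."""
    connection = sqlite3.connect(":memory:")
    connection.executescript("""
        CREATE TABLE calendar (
            service_id TEXT, monday INTEGER, tuesday INTEGER, wednesday INTEGER,
            thursday INTEGER, friday INTEGER, saturday INTEGER, sunday INTEGER);
        CREATE TABLE trips (trip_id TEXT, service_id TEXT);
        CREATE TABLE stop_times (
            trip_id TEXT, stop_id TEXT, stop_sequence INTEGER, departure_time TEXT);
        INSERT INTO calendar VALUES
            ('WKDY', 1, 1, 1, 1, 1, 0, 0), ('WKND', 0, 0, 0, 0, 0, 1, 1);
        INSERT INTO trips VALUES ('T1', 'WKDY'), ('T2', 'WKDY'), ('T3', 'WKND');
        INSERT INTO stop_times VALUES
            ('T1', 'A', 1, '08:00:00'), ('T1', 'B', 2, '08:21:00'), ('T1', 'C', 3, '08:45:00'),
            ('T2', 'C', 1, '09:00:00'), ('T2', 'B', 2, '09:26:00'), ('T2', 'A', 3, '09:45:00'),
            ('T3', 'A', 1, '10:00:00'), ('T3', 'C', 2, '10:40:00');
    """)
    return connection


def trips_for(connection, origin, destination, day):
    """Return trips that stop at origin before destination on the given day."""
    # The column name comes from a fixed list, so interpolating it is safe.
    weekday = WEEKDAY_COLUMNS[day.weekday()]
    trip_ids = [row[0] for row in connection.execute(
        f"""
        SELECT o.trip_id
        FROM stop_times AS o
        JOIN stop_times AS d ON d.trip_id = o.trip_id
        JOIN trips AS t ON t.trip_id = o.trip_id
        JOIN calendar AS c ON c.service_id = t.service_id
        WHERE o.stop_id = ? AND d.stop_id = ?
          AND o.stop_sequence < d.stop_sequence
          AND c.{weekday} = 1
        ORDER BY o.departure_time
        """,
        (origin, destination),
    )]

    # Attach the full, ordered stop list to each matching trip.
    result = []
    for trip_id in trip_ids:
        stops = connection.execute(
            "SELECT stop_id, departure_time FROM stop_times "
            "WHERE trip_id = ? ORDER BY stop_sequence",
            (trip_id,),
        ).fetchall()
        result.append({
            "trip_id": trip_id,
            "stops": [{"stop_id": s, "departure_time": t} for s, t in stops],
        })
    return result


database = create_sample_database()
response = trips_for(database, "A", "C", datetime.date(2024, 1, 8))  # a Monday
print(json.dumps(response, indent=2))
```

The self-join on stop_times is what enforces direction: trip T2 also serves both stations, but in the opposite order, so it is left out.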

Collaboration

Any product which consumes GTFS data depends on the respective transit agency to publish quality data. Unfortunately, when working with data published by my local transit agency, Shore Line East, I noticed chronic data integrity issues and publishing process inconsistencies. Even though the agency was somewhat responsive when I brought these issues to their attention, I lost confidence in the agency’s ability to consistently deliver quality data.

To mitigate these issues in the future, transit agencies like Shore Line East should “eat their own dog food”. In other words, an agency’s own services (e.g. Shore Line East’s Trip Planner) should consume the agency’s own GTFS data. By becoming its own first customer, an agency increases its reliance on its own data. This reliance puts the right incentive structure in place for the agency to prioritize the quality of its data.

Adaptability

I encourage any interested developer to configure and re-deploy this API to consume GTFS data from any other transit agency. Configuration involves two steps: setting an environment variable called GTFS_SOURCE_URL to the desired GTFS feed URL, and scheduling the server to run a job (rake gtfs:import) on a recurring basis. If you do this, let me know how it goes! I’m happy to provide support.