
gtfsrt2lc

Converts GTFS-RT updates to Linked Connections following a predefined URI strategy. The strategy is expressed as URI templates according to the RFC 6570 specification, and its variables can be any column present in a related GTFS datasource, which must also be provided.

Install it

$ npm install -g gtfsrt2lc

Test it

This bundle comes with example data, both a GTFS datasource and a GTFS-RT update, belonging to the Belgian railway company NMBS, which can be used to test this tool. Once installed, the usage can be checked as follows:

How does it work?

Providing globally unique identifiers for the different entities that make up a public transport network is fundamental to lowering the adoption cost of public transport data in route-planning applications. For live schedule updates in particular, it is important to maintain stable identifiers that remain valid over time. Here we use the Linked Data principles to transform schedule updates given in the GTFS-RT format into Linked Connections, and we give the option to serialize them in JSON, CSV or RDF (Turtle, N-Triples or JSON-LD) format.

The URI strategy used during the conversion process is given following the RFC 6570 specification for URI templates. Next we describe, through an example, how the URI strategy can be defined. A basic understanding of the GTFS specification is required.

URI templates

To define the URIs of the different entities of a public transport network that are referenced in a Linked Connection, we use a single JSON file that contains the different URI templates. We provide an example of such a file here, which looks as follows:

The parameters used to build the URIs are given following an object-like notation (object.variable), where the left side references a CSV file present in the provided GTFS datasource and the right side references a specific column of that file. We use the data from a reference GTFS datasource to create the URIs because the data present in a GTFS-RT update alone may not be enough to create persistent URIs. The GTFS files that can be used to create the URIs in the current implementation of this tool are routes and trips. As for the variables, any column that exists in those files can be referenced. Next we describe how the entity URIs are built based on these templates:

stop: A Linked Connection references two different stops (the departure stop and the arrival stop). The data used to build these URIs comes directly from the GTFS-RT update, which is why we do not specify a CSV file from the reference GTFS datasource here. The variable name chosen for the example is stop_id, but it can be freely named.

route: For the route identifier we rely on the routes.route_short_name and the trips.trip_short_name variables.

trip: For the trip we append the associated connection.departureTime(YYYYMMDD) to the same variables used for the route URI. The connection entity is explained next.

connection: Finally, for a connection identifier we use its departure stop with connection.departureStop, the connection.departureTime(YYYYMMDD), the routes.route_short_name and the trips.trip_short_name. Here we reference a special entity called connection, which contains the basic data that can be extracted from a GTFS-RT update for every Linked Connection. A connection entity exposes these parameters for use in the URI definitions: connection.departureStop, connection.arrivalStop, connection.departureTime and connection.arrivalTime. As both departureTime and arrivalTime are date objects, the expected format can be given in parentheses after the variable name, as in connection.departureTime(YYYYMMDD).
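
As an illustration of the four entities described above, the following is a minimal sketch of how such templates could be written and expanded. Everything in it is an assumption made for the example: the example.org base URIs, the sample values, and the expand and formatDate helpers are hypothetical. It is a simple regex-based substitution, not a full RFC 6570 implementation and not the code this tool actually runs.

```javascript
// Hypothetical templates: one entry per entity described above.
// The example.org base URIs are placeholders.
const templates = {
  stop: 'http://example.org/stops/{stop_id}',
  route: 'http://example.org/routes/{routes.route_short_name}{trips.trip_short_name}',
  trip: 'http://example.org/trips/{routes.route_short_name}{trips.trip_short_name}/{connection.departureTime(YYYYMMDD)}',
  connection: 'http://example.org/connections/{connection.departureStop}/{connection.departureTime(YYYYMMDD)}/{routes.route_short_name}{trips.trip_short_name}'
};

// Format a Date in UTC, supporting only the YYYY, MM and DD tokens.
function formatDate(date, pattern) {
  const pad = (n) => String(n).padStart(2, '0');
  return pattern
    .replace('YYYY', String(date.getUTCFullYear()))
    .replace('MM', pad(date.getUTCMonth() + 1))
    .replace('DD', pad(date.getUTCDate()));
}

// Replace every {name}, {object.variable} or {object.variable(FORMAT)} expression.
function expand(template, data) {
  return template.replace(/\{([^}]+)\}/g, (match, expr) => {
    const parts = /^([\w.]+)(?:\(([^)]+)\))?$/.exec(expr);
    if (!parts) throw new Error(`Unsupported expression: ${expr}`);
    const [, path, pattern] = parts;
    // Walk "object.variable" through the data; a bare name is a top-level key.
    const value = path
      .split('.')
      .reduce((obj, key) => (obj == null ? undefined : obj[key]), data);
    if (value === undefined) throw new Error(`Unknown variable: ${path}`);
    const text = value instanceof Date
      ? (pattern ? formatDate(value, pattern) : value.toISOString())
      : String(value);
    return encodeURIComponent(text);
  });
}

// Made-up sample values standing in for GTFS columns and GTFS-RT data.
const data = {
  stop_id: '1001',
  routes: { route_short_name: 'IC' },
  trips: { trip_short_name: '1234' },
  connection: {
    departureStop: '1001',
    arrivalStop: '1002',
    departureTime: new Date(Date.UTC(2018, 4, 23, 8, 15)),
    arrivalTime: new Date(Date.UTC(2018, 4, 23, 8, 40))
  }
};

const uris = Object.fromEntries(
  Object.entries(templates).map(([entity, template]) => [entity, expand(template, data)])
);
console.log(uris.connection);
// prints "http://example.org/connections/1001/20180523/IC1234"
```

Because the trip and connection URIs combine a day-level date with names taken from the reference GTFS datasource, they stay the same across successive updates for the same vehicle journey, as long as those names do not change.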

How you define your URI strategy to obtain stable identifiers will depend on the actual data that exists in both the GTFS datasource and the GTFS-RT updates, and on how this data is maintained.
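
To make the dependency on the reference GTFS datasource more concrete, here is a small sketch of how the routes and trips variables could be resolved for a trip referenced in a GTFS-RT update. It assumes the update identifies the trip by its trip_id, and that trips.txt and routes.txt have already been parsed into plain objects. The rows and the resolveVariables helper are hypothetical and only illustrate the lookup; they are not taken from this tool's code.

```javascript
// Hypothetical rows, shaped like parsed lines of trips.txt and routes.txt.
const trips = [
  { trip_id: 'T1', route_id: 'R1', trip_short_name: '1234' },
  { trip_id: 'T2', route_id: 'R1', trip_short_name: '1235' }
];
const routes = [
  { route_id: 'R1', route_short_name: 'IC' }
];

// Index both files by their primary key for constant-time lookups.
const tripsById = new Map(trips.map((trip) => [trip.trip_id, trip]));
const routesById = new Map(routes.map((route) => [route.route_id, route]));

// Build the { routes, trips } objects that object.variable expressions refer to.
// Returns null when the static data does not know the trip or its route,
// in which case no URI based on these files can be built.
function resolveVariables(tripId) {
  const trip = tripsById.get(tripId);
  if (!trip) return null;
  const route = routesById.get(trip.route_id);
  if (!route) return null;
  return { routes: route, trips: trip };
}

const variables = resolveVariables('T2');
console.log(variables.routes.route_short_name + variables.trips.trip_short_name);
// prints "IC1235"
```

If a column used in a template is reassigned between versions of the GTFS datasource, the resulting URIs change with it, which is why the choice of columns matters for stability.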