The open-sourcing of Mobile Oxford has begun in earnest. On our list of things to do we have choosing a license, producing documentation, and splitting out the Oxford-specific parts.

To differentiate between the Oxford instance and the code it’s running on we’ve decided to call the project Molly.

Licensing

We’re as yet unsure as to which license we’ll be releasing Molly under, though we’re currently favouring a permissive license over a copyleft license.

To help us decide we’ll be meeting tomorrow with OSS Watch (the higher-education open-source software advisory service). OSS Watch also provide background on a number of open-source licenses, which has proved useful in getting us not-so-legally-aware types up to speed.

Choosing the right license is essential to ensure that we foster as much input as possible from other parties. By way of example, Molly is designed to integrate rather heavily with an institution’s existing systems, and it’s possible that said institution might not want to publish how those interfaces work. As I understand it (and remember, I am not a lawyer) such an interface would constitute a derivative work and require publishing were we to choose the AGPL. Additionally, the GPL is effectively permissive if used for a networked service, as the software itself is never distributed – yet the name may still hold some ‘scare factor’ and thus put off contributions from some institutions.

Sakai is licensed under the Educational Community License, and as Molly is intended to provide strong Sakai integration we should consider how we position ourselves alongside it.

Splitting out the Oxford-specific functionality

Mobile Oxford was initially intended to be a ‘demonstration location-aware application,’ but has serendipitously turned into something more. Being a demonstration, there hasn’t always been a strict separation between functionality and data sources. We intend that Molly will provide the user interface and data model for various applications, with the implementer creating a ‘provider’ to hook it up to a local system.

So far I’ve pulled out the contact search, creating three providers. These are our original screen-scraping implementation, another using an IP-restricted web service, and an LDAP-based solution for MIT’s people directory (based on their open-source MIT Mobile Web). The core functionality handles pagination, display, and linking to further details; the provider can pass in a list of results for a given query and retrieve specific people based on some unique identifier. The interface between them is based on the LDAP attributes defined in RFC 4519 and comprises three methods and two attributes, making implementation relatively easy.
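Molly’s actual provider interface isn’t published yet, so here’s a purely hypothetical sketch of what such a contract might look like; the method names, and the choice of RFC 4519 attribute names (`sn`, `givenName`, `telephoneNumber`), are my assumptions rather than the real API.

```python
from abc import ABC, abstractmethod

class ContactProvider(ABC):
    """Hypothetical provider contract: the core app handles pagination
    and display; the provider only fetches and looks up people."""

    # Attribute names each result dict is assumed to carry, loosely
    # based on RFC 4519 LDAP attributes (my assumption, not Molly's).
    result_attributes = ("sn", "givenName", "telephoneNumber")

    @abstractmethod
    def search(self, query):
        """Return a list of result dicts for a free-text query."""

    @abstractmethod
    def get(self, identifier):
        """Return a single result dict for a unique identifier."""

class DummyProvider(ContactProvider):
    """Toy in-memory provider to illustrate the contract."""
    people = {
        "1": {"sn": "Smith", "givenName": "Alice",
              "telephoneNumber": "01865 270000"},
    }

    def search(self, query):
        return [p for p in self.people.values()
                if query.lower() in p["sn"].lower()]

    def get(self, identifier):
        return self.people[identifier]
```

The point of the split is that a screen-scraping, web-service or LDAP implementation would each subclass the same contract, leaving the core untouched.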

Contact search is one of the simpler parts of the site, so it’s going to take a fair bit of effort to provide a similar level of abstraction for the remainder of the site’s functions.

Documentation

We’ll be using Sphinx for Molly’s documentation. Sphinx is the closest thing to a standard in Python documentation, being used by many Python-based projects including Python itself and the Django project. Sphinx is specifically intended for documenting Python projects, including support for cross-documentation links and coverage reporting.

Documenting is also proving very useful in formalising the internal interfaces and exposing previous poor design decisions. There’s been at least a couple of occasions so far where I’ve documented how it should have been done and then had to refactor the code to bring it in line.

I spent a day a little while back investigating how to get the raw data behind OxonTime’s live bus locations. Here’s how it works.

Please be aware that what follows is the result of a purely academic investigation, and that before using this information it may be worth contacting Oxfordshire County Council to discuss your plans, particularly if you’re going to distribute the results of your endeavours.

Performing a request

By monitoring the HTTP requests made by the applet we can deduce that it fetches data from a particular resource with parameters provided in the querystring. The resource is located at http://oxfordshire.acislive.com/pda/mainfeed.asp, and takes the following parameters:

type

One of STOPS, INIT or STATUS. The first retrieves a list of stops and their locations for a given area. The other two both return a list of buses, with STATUS being used for updates.

maplevel

I believe this only accepts the values 0 through 3, with nothing returned for 0 and 1. The results for 2 and 3 seem identical, so there seems little point in varying it.

SessionID

This seems to be a misnomer as it specifies the area you want to enquire about. We’ll explain how to convert to and from these numbers later.

systemid

Always 35 as it gets unhappy if you change it.

stopSelected

This expects an ATCO code yet seems to be ignored, so you may as well leave it as 34000000701.

vehicles

A comma-separated list of vehicle identifiers you currently believe to be in the area you’re enquiring about. This is only passed when type=STATUS, and lets you find out when the given buses have left the area.
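Putting those parameters together, constructing a request might look like the following sketch; the default values are those described above, and the example map number is arbitrary.

```python
from urllib.parse import urlencode

BASE_URL = "http://oxfordshire.acislive.com/pda/mainfeed.asp"

def build_feed_url(type_, session_id, vehicles=None):
    """Build the querystring for a STOPS, INIT or STATUS request."""
    params = {
        "type": type_,                  # STOPS, INIT or STATUS
        "maplevel": 3,                  # 2 and 3 return the same data
        "SessionID": session_id,        # really the map number
        "systemid": 35,                 # anything else makes it unhappy
        "stopSelected": "34000000701",  # ignored, but expected
    }
    if vehicles:                        # only meaningful for type=STATUS
        params["vehicles"] = ",".join(str(v) for v in vehicles)
    return BASE_URL + "?" + urlencode(params)
```

You could then fetch the result with `urllib.request.urlopen(build_feed_url("INIT", 2712))`, say (2712 being an illustrative map number from the probed range).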

Format of responses

The response in each case is a pipe-delimited list of values, with the first being the action the client should perform in updating its state. The actions you may expect are:

STOP

Gives the locations of stops in the area. Nearby stops are collected into one line.

NEW

This signifies that a bus is to be found at the given location.

DEL

These only appear when type=STATUS and signify that a bus is no longer at the location. If the bus is still within the requested area there will be a subsequent corresponding NEW action to give its new location. If it has left there will be no such NEW action. The buses that appear here will be a subset of those provided in the vehicles parameter of the querystring.
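Tokenising a response body into actions is simple enough; this sketch just splits each line on pipes (the field meanings are covered below):

```python
def parse_feed(body):
    """Split a pipe-delimited response body into (action, fields) pairs."""
    actions = []
    for line in body.splitlines():
        if not line.strip():
            continue
        parts = line.split("|")
        # The first field names the action (STOP, NEW or DEL); DEL lines
        # end with a trailing pipe, which yields an empty last field.
        actions.append((parts[0], parts[1:]))
    return actions
```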

Now that we know the form of the response, let’s see the values returned for each action. We’ll start with STOP.

As mentioned earlier, stops that are close to one another (e.g. on opposite sides of the road, or a ‘lettered’ group) are collected together, with count giving the number of stops at this location.

naptan-codes, stop-names and stop-bearings are each caret-delimited lists with fairly obvious contents. stop-bearings are given in clockwise degrees from grid north.

x and y give pixel offsets from the top-left corner of the displayed area. More on this later.

I have no idea what the 35 signifies, and currently assume it’s something to do with systemid.

A note on bus stop identifiers

The NaPTAN (National Public Transport Access Nodes) database provides two classes of identifiers, ATCO codes and NaPTAN codes. ATCO codes are up to 12 characters in length, whereas NaPTAN codes consist of nine digits. OxonTime predominantly exposes the latter; these are the numbers beginning ‘693’ displayed at bus stops. However, ATCO codes are used for the stopSelected parameter and are accepted elsewhere in place of their equivalent NaPTAN codes.
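A rough way to tell the two classes of identifier apart in code, based purely on the description above rather than any official validation rule:

```python
def classify_stop_code(code):
    """Guess whether a stop identifier is a NaPTAN or ATCO code.

    NaPTAN codes are nine digits (Oxfordshire's begin '693');
    ATCO codes are up to twelve characters. This is a heuristic only.
    """
    code = str(code)
    if len(code) == 9 and code.isdigit():
        return "naptan"
    if 0 < len(code) <= 12:
        return "atco"
    raise ValueError("doesn't look like either kind of code")
```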

The NaPTAN database is currently maintained under license from the Department for Transport by Thales. Access requires a license, which may come with a fee for commercial use. However, you may be interested to note that NaPTAN data is finding its way into OpenStreetMap.

NEW actions

These have the form:

NEW|identifier|orientation|service-name|Operators/common/bus/1|y,x

Here’s an example:

NEW|1024|4|X5|Operators/common/bus/1|45,302

identifier serves to keep track of buses between requests. I don’t know whether it has some further meaning outside of this API. orientation is an integer between 1 and 8 inclusive, being N, NE, E, SE, S, SW, W, NW respectively. service-name is the same as is used on the rest of the site (e.g. ‘S1’, ‘5’, ‘TUBE’). The next bit seems constant and can probably be safely ignored. Finally the offsets are given, only this time with y first; I have no idea why.
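Using the example above, parsing a NEW line might look like this sketch (note the y-before-x quirk being undone):

```python
# Orientation integers 1-8 map to compass points, as described above.
COMPASS = {1: "N", 2: "NE", 3: "E", 4: "SE",
           5: "S", 6: "SW", 7: "W", 8: "NW"}

def parse_new(line):
    """Parse a NEW action into a dict, swapping the offsets back to (x, y)."""
    _, identifier, orientation, service, _, offsets = line.split("|")
    y, x = (int(n) for n in offsets.split(","))  # y comes first in the feed
    return {
        "id": identifier,
        "heading": COMPASS[int(orientation)],
        "service": service,
        "offset": (x, y),
    }
```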

DEL actions

These have the form:

DEL|identifier|

Here’s an example:

DEL|1024|

These identifiers match up with those given in NEW actions. Note the trailing pipe.

By periodically making requests with type=STATUS one can process the returned lines as a stream of commands describing how to update the local state. This makes client implementation easier, as you are effectively applying a diff rather than having to compare new state to old.
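The diff-like update can be applied to a local dict of buses, something like:

```python
def apply_actions(buses, actions):
    """Update a {identifier: fields} state dict from (action, fields) pairs.

    NEW adds or moves a bus; DEL removes one. A DEL followed by a NEW
    for the same identifier nets out to a move, matching the feed's
    described behaviour.
    """
    for action, fields in actions:
        if action == "NEW":
            buses[fields[0]] = fields[1:]
        elif action == "DEL":
            buses.pop(fields[0], None)
    return buses
```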

The co-ordinate system

First off, I’d better give you a disclaimer. This API and its associated co-ordinate system is very specific to the applet that is its only intended client. As such, the co-ordinate system provides exactly what it needs and no more.

Locations are addressable using a combination of SessionID — hereafter known as map number for clarity — and an x and y pixel offset from the top-left corner of that tile.
The maps are each 418 pixels square, and are arranged in a grid aligned with the British National Grid. The important thing to note about this is that grid North is not the same as true North, and that if you intend to plot these things on (for example) a Google or OpenLayers map, you’ll need to get your projections right.

The maps seem to be numbered somewhat arbitrarily, as shown by this map of bus stops and their associated map numbers. Colours are based on a hash of the number of the map they appear on.

A map showing the relative locations of bus stops and the ACIS map numbers for each map.

These were found by requesting bus stops for all map numbers between 2500 and 3000, so are likely not complete. Predicting the map numbers for areas beyond these seems non-trivial and prone to error.

Doing the conversion

Edit: The following are for zoom-level 3 maps. Lower map numbers are at zoom-level 2 and cover nine times the area, so it probably makes more sense to retrieve and parse those.

To help you on your way, here is a Python dictionary which maps between map numbers and their top-left corners expressed as metres East and North of the SV square on the British National Grid. A pixel is equivalent to about 2.2×2.2 metres, as given by scale, making each map about 920 metres square.

These values were derived by:

For each stop, finding the absolute position given by the NaPTAN database.

Initialising two variables, ∑∆offset and ∑∆position, as zero 2D vectors.

For each pair of stops appearing on the same map, adding the absolute differences of their offsets within the map, and of their absolute positions, to the respective variables.

Dividing ∑∆position by ∑∆offset to give a conversion factor between pixels and metres (given above as scale).

For each map, averaging each stop’s position minus its offset multiplied by the conversion factor, to give that map’s top-left corner (the values in map_numbers).
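The calibration steps above can be sketched like so, using two made-up stops on the same map (offsets in pixels, positions in BNG metres; the numbers are illustrative only):

```python
def estimate_scale(stops):
    """Estimate metres-per-pixel from stops known to share a map.

    stops: list of (offset, position) pairs, offsets in pixels
    (left-down), positions in BNG metres (left-up).
    """
    sum_d_offset = [0.0, 0.0]
    sum_d_position = [0.0, 0.0]
    for i, (off_a, pos_a) in enumerate(stops):
        for off_b, pos_b in stops[i + 1:]:
            for axis in (0, 1):
                sum_d_offset[axis] += abs(off_a[axis] - off_b[axis])
                sum_d_position[axis] += abs(pos_a[axis] - pos_b[axis])
    # Metres per pixel on each axis
    return tuple(dp / do for dp, do in zip(sum_d_position, sum_d_offset))

# Two hypothetical stops 100px apart horizontally and 50px vertically,
# 220m and 110m apart on the ground:
scale = estimate_scale([((0, 0), (450000.0, 206000.0)),
                        ((100, 50), (450220.0, 205890.0))])
```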

Here’s a bit of Python to convert between a map number and x and y offsets, and the WGS84 co-ordinate system. I’ve cheated a little in using Django’s GIS functionality which in turn uses the ctypes module to call functions from the GEOS library. If you’re not using Python or don’t want such a large dependency then you may wish to read the documentation linked from this page on the Ordnance Survey website.

from django.contrib.gis.geos import Point

def to_wgs84(map_number, rel_pos):
    """
    Takes an ACIS map number and a two-tuple specifying the offset on that
    map. Returns a Point object under the WGS84 projection.
    """
    corner = map_numbers[map_number]
    # Differing signs as we're applying a left-down offset to a left-up position
    pos = (
        corner[0] + rel_pos[0] * scale[0],
        corner[1] - rel_pos[1] * scale[1],
    )
    # 27700 is BNG; 4326 is WGS84
    return Point(pos, srid=27700).transform(4326, clone=True)

def from_wgs84(point):
    """
    Takes a Point object under any projection and returns a map number and
    two-tuple for that point. Raises ValueError if the point does not lie on
    any maps we know about.
    """
    # Make sure we're using the British National Grid
    pos = point.transform(27700, clone=True)
    for map_number, corner in map_numbers.items():
        # Invert the offset arithmetic from to_wgs84 above
        rel_pos = (
            (pos[0] - corner[0]) / scale[0],
            (corner[1] - pos[1]) / scale[1],
        )
        # This is the right map if the point appears in the 418 pixel
        # square to the lower-right of the corner.
        if 0 <= rel_pos[0] < 418 and 0 <= rel_pos[1] < 418:
            return map_number, rel_pos
    raise ValueError("Appears on unknown map")

Next steps

The terms of use for the OxonTime website forbid using it for other than personal non-commercial purposes, making it an abuse of the terms to use this data in some sort of mash-up. From my reading of the terms, however, there’s nothing to stop one writing and distributing a client that uses this API directly. If you’ve got the time and inclination, why not write an iPhone/Android/mobile-du-jour client application using all sorts of fancy geolocation and free mapping data?

You could even scrape the real-time information from http://www.oxontime.com/pip/stop.asp?naptan=naptan-code&textonly=1 (just be wary about non-well-formed HTML). Obviously, make yourself aware of the terms, and if in doubt, contact Oxfordshire County Council. Be warned that this API, though likely stable, comes with no guarantee to that effect. Also, it seems a little slow at times, so be gentle and treat it with respect.

The Future of Technology in Education Conference 2009 (FOTE09) is dedicated to showcasing the hottest technology related trends and challenges impacting the academic sector over the next 1 – 3 years and builds on the success of our inaugural event in 2008.

Date: Friday, October 2nd, 2009

Venue: Royal Geographic Society, Exhibition Road, London

The 2008 conference completely exceeded our expectations, and we were taken aback by the great feedback we received for bringing together a diverse mix of speakers to give an insight into the unique technology-related challenges currently facing the academic sector.

Alex Dutton and I are on our way to see the guys in Bristol who are part of the JISC funded Rapid Innovation project “Mobile Campus Assistant” (link to follow).

En route we noticed that Marks and Spencer have started putting 2D barcodes on some of their products to provide bits of information to their customers.

Although there isn’t all that much useful on there at the moment, I think it has potential to deliver some interesting ideas in the future. I would be quite interested to see what kind of take up M&S have from this.

My personal inkling is that there may well be a large initial response as they publish these codes on their products, but unless they can deliver some compelling content the trend will die down.