Data is uploaded in GPX, TCX, or FIT format, all of which are new to me. Standard KML uploads don’t work because time stamps are required for each waypoint.
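To make the time stamp requirement concrete, here is a minimal sketch of writing a GPX 1.1 track in which every track point carries a `<time>` element. The function name `points_to_gpx`, the `creator` value, and the evenly spaced one-second time stamps are all assumptions made for illustration. Real data would carry its own recorded times, and the upload service may expect additional fields.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

GPX_NS = "http://www.topografix.com/GPX/1/1"


def points_to_gpx(points, start, step_seconds=1):
    """Build a GPX 1.1 track from (lat, lon) pairs, time stamping each point.

    Hypothetical helper: times are simply spaced step_seconds apart from
    `start`, which is assumed to be a UTC datetime.
    """
    ET.register_namespace("", GPX_NS)  # serialise with a default namespace
    gpx = ET.Element(f"{{{GPX_NS}}}gpx",
                     {"version": "1.1", "creator": "example-script"})
    trk = ET.SubElement(gpx, f"{{{GPX_NS}}}trk")
    seg = ET.SubElement(trk, f"{{{GPX_NS}}}trkseg")
    for i, (lat, lon) in enumerate(points):
        pt = ET.SubElement(seg, f"{{{GPX_NS}}}trkpt",
                           {"lat": f"{lat:.6f}", "lon": f"{lon:.6f}"})
        stamp = start + timedelta(seconds=i * step_seconds)
        # GPX times are ISO 8601 strings in UTC
        ET.SubElement(pt, f"{{{GPX_NS}}}time").text = stamp.strftime(
            "%Y-%m-%dT%H:%M:%SZ")
    return ET.tostring(gpx, encoding="unicode")


# Made-up coordinates and start time, purely for illustration
example = points_to_gpx([(45.0, 6.0), (45.001, 6.001)],
                        datetime(2018, 1, 25, 9, 0, 0, tzinfo=timezone.utc))
print(example)
```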

Along the route, photographic waypoints can be added to illustrate the journey, which got me thinking: this could be a really neat addition to the Rally-maps.com website, annotating stage maps after a race with:

photographs from various locations on the stage;

images at each split point showing the leaderboard and time splits from each stage;

pace info, showing the relative pace across each stage, perhaps captured from a reconnaissance vehicle or zero car.

Alternatively, it might be something that the WRC – or Red Bull TV, who are providing online and TV coverage of this year’s rallies – could publish?

And if they want to borrow some of my WRC chart styles for waypoint images, I’m sure something could be arranged:-)

There are a couple of reasons for tinkering with WRC rally data this year, over and above the obvious one of wanting to find a way to engage with motorsport at a data level. Specifically, I wanted a context for thinking a bit more about ways of generating (commentary) text from timing data, as well as a “safe” environment in which I could look for ways of identifying features (or storypoints) in the data that might provide a basis for making interesting text comments.

One way in to finding features is to look at visual representations of the data (that is, just look at charts) and see what jumps out… If anything does, then you can ponder ways of automating the detection or recognition of those visually compelling features, or things that correspond to them, or proxy for them, in some way. I’ll give an example of that in the next post in this series, but for now, let’s consider the following question: how can we group numbers that are nearly the same? For example, if I have a set of stage split times, how can I identify groups of drivers that have recorded exactly, or even just nearly, the same time?
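One simple way of doing this is to sort the numbers and start a new group whenever the gap to the previous number gets too big. The following is a minimal sketch of that idea rather than a definitive implementation. The function name `group_nearby` and the `max_gap` parameter are made up for illustration, and the times are invented.

```python
def group_nearby(values, max_gap=0.1):
    """Group numbers so that neighbours within a group differ by at most max_gap.

    Illustrative sketch: sort the values, then scan through them, starting
    a new group each time the step from the previous value exceeds max_gap.
    """
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)   # close enough to the last member
        else:
            groups.append([v])     # too far away: start a new group
    return groups


# Invented split times, in seconds
times = [12.02, 10.0, 10.5, 10.05, 12.0]
print(group_nearby(times, max_gap=0.1))
# [[10.0, 10.05], [10.5], [12.0, 12.02]]
```

Note that nothing here stops a group from creeping: a long run of numbers that each sit close to their neighbour ends up in one group, however far apart the first and last members are.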

It struck me that a tweak to the code could limit the range of any grouping relative to a maximum distance between the first and the last number in any particular grouping – maybe I don’t want a group to have a range more than 0.41 for example (that is, strictly more than a dodgy floating point 0.4…):
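As a rough sketch of that tweak, assuming a simple sort-and-scan grouping (the names `group_nearby_capped`, `max_gap` and `max_range` are illustrative, not taken from the original code), a number only joins the current group if it is close to the last member and also keeps the group's overall range within the limit:

```python
def group_nearby_capped(values, max_gap=0.15, max_range=0.41):
    """Group nearby numbers, capping the first-to-last range of each group.

    Illustrative sketch: a value joins the current group only if it is
    within max_gap of the last member AND within max_range of the first.
    """
    groups = []
    for v in sorted(values):
        if (groups
                and v - groups[-1][-1] <= max_gap
                and v - groups[-1][0] <= max_range):
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups


values = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
print(group_nearby_capped(values))
# [[1.0, 1.1, 1.2, 1.3, 1.4], [1.5, 1.6]]
```

The 0.41 rather than 0.4 limit gives a little headroom, so a floating point result of `1.4 - 1.0` that lands a hair either side of 0.4 still counts as being in range.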

A downside of this is we might argue we have mistakenly omitted a number that is very close to the last number in the previous group, when we should rightfully have included it, because it’s not really very far away from a number that is close to the group range threshold value…

In which case, we might pull numbers back into a group if they are really close to the current last member of the group, irrespective of whether we have passed the originally specified group range:
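Here is one way that might look, again as a sketch under my own assumptions. The `tight_gap` parameter, which sets what counts as "really close", is hypothetical, as are the function name and the default values.

```python
def group_nearby_pullback(values, max_gap=0.15, max_range=0.41, tight_gap=0.05):
    """Group nearby numbers with a range cap, but pull in very close stragglers.

    Illustrative sketch: a value joins the current group if it is within
    tight_gap of the last member (whatever the group range), or if it is
    within max_gap of the last member and within max_range of the first.
    """
    groups = []
    for v in sorted(values):
        if groups:
            gap = v - groups[-1][-1]   # distance to current last member
            span = v - groups[-1][0]   # group range if v were added
            if gap <= tight_gap or (gap <= max_gap and span <= max_range):
                groups[-1].append(v)
                continue
        groups.append([v])
    return groups


values = [1.0, 1.1, 1.2, 1.3, 1.4, 1.42, 1.6]
print(group_nearby_pullback(values))
# [[1.0, 1.1, 1.2, 1.3, 1.4, 1.42], [1.6]]
```

Here 1.42 is pulled into the first group even though that takes the range just past 0.41, because it is only 0.02 away from 1.4. The trade-off is that a long run of very tightly spaced numbers can keep extending a group, so the range limit becomes a soft one.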