Saturday, December 31, 2011

I feel the need, the need for reasonably accurate speed data

So as I mentioned, I was writing a post on how to look at speed data at Trackleaders when it occurred to me that the situation might actually be somewhat more complicated than I thought. Or maybe not. Basically, I wasn't sure how they were calculating speed. Speed is a function of time and distance, and while "time" is pretty straightforward, "distance" actually isn't.

The simplest way to calculate distance would be to take two GPS readings and calculate the straight-line distance between the two. A somewhat more complicated way to calculate distance would be to take two GPS readings and calculate the trail distance between the two, based on the trail projection shown on the tracking page. I had assumed that they were using the former, then realized that they might actually not be. So we did some poking around to find the actual GPS data they were using (it was buried a few layers down but Chris has awesome web app-fu) and did some arithmetic to see if what we were getting matched what Trackleaders was showing. There was a brief diversion caused by not taking into account that the track points were numbered starting at '1' and the data array was 0-based - you're never too old or too experienced for off-by-one errors. Ultimately what we found was that the speed calculations on Trackleaders were based on straight-line distances between two GPS points. Note that because a line, by definition, is the shortest distance between two points,

The actual distance traveled by the dog teams will always be longer than shown on the tracker

Therefore, teams are always traveling somewhat faster than is shown. How much faster depends on how straight the trail is - if it's straight, the speed on the tracker is going to be close to the actual speed, and if it's not, the speed on the tracker is potentially quite a bit lower than the actual speed. Here's a case in point (no pun intended):

This is Kristy Berington on the Susitna River, with GPS points 118 and 119. Here's a sort-of digression but not really - Trackleaders draws straight lines between GPS points - it helps visualize the track but it absolutely must not be construed as the track. Every year there are questions about this - no, the musher almost certainly hasn't gotten off the trail and hasn't gotten lost, those are just lines between data points. Remember, when in doubt trust the data from the GPS more than the data summary, whether it's the leaderboard or lines between track points. We really don't know where the dog team actually was between track points, although it's a pretty safe bet that they were on the trail. Trackleaders gives the speed between those two points as 5.3 mph, based on the assumption that her team stayed on the straight line between points 118 and 119. We can, however, take a look at the image and understand that she almost certainly stayed on the river, traveled a greater distance than is shown here (they say 3.02 miles), and therefore was traveling faster than was displayed on the tracker. The straighter the trail the more accurate the speed given by the tracker will be, and the curvier the trail the more inaccurate the speed given by the tracker will be.
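For anyone who wants to try this kind of arithmetic at home, here's a minimal sketch in Python of the straight-line calculation compared with a path that follows a bend. The haversine formula is my own choice for "straight-line distance between two GPS points" (I don't know which formula Trackleaders uses), the function names are made up for illustration, and the coordinates are invented, not Kristy's actual track points.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def path_miles(points):
    """Total distance along a list of (lat, lon) points, summed leg by leg."""
    return sum(haversine_miles(*a, *b) for a, b in zip(points, points[1:]))


def speed_mph(miles, hours):
    """Average speed over a leg; hours must be positive."""
    if hours <= 0:
        raise ValueError("elapsed time must be positive")
    return miles / hours


# Hypothetical data: two tracker fixes plus one invented point on a river bend.
fix_a = (61.50, -150.50)
bend = (61.52, -150.44)
fix_b = (61.54, -150.50)
elapsed_hours = 0.5  # hypothetical time between the two fixes

straight = haversine_miles(*fix_a, *fix_b)
along_bend = path_miles([fix_a, bend, fix_b])

print("straight-line speed:", speed_mph(straight, elapsed_hours))
print("speed following the bend:", speed_mph(along_bend, elapsed_hours))
```

The second number always comes out at least as large as the first, which is the whole point. Applied to the figures above: 3.02 miles at 5.3 mph works out to about 34 minutes between points, so if the river route were, hypothetically, 10% longer than the straight line, the real speed would have been roughly 5.8 mph.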

Note that this has implications for projections. Because the tracker is consistently underestimating speed at least a little bit, it's probably the case that mushers will tend to arrive at a checkpoint earlier than projected (taking into account things like camping, rests, etc.). It will be interesting to watch this during the Knik 200 and the Copper Basin 300, two pretty big races coming up in the next few weeks.
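To put a rough number on that, here's a small Python sketch of how an underestimated speed turns into a late projection. The 20 miles to the checkpoint and the 10% underestimate are invented for illustration, and `eta_hours` is just a name I made up; the only figure taken from the tracker is the 5.3 mph.

```python
def eta_hours(remaining_miles, speed_mph):
    """Hours to cover the remaining distance at a constant speed (no rests)."""
    if speed_mph <= 0:
        raise ValueError("speed must be positive")
    return remaining_miles / speed_mph


remaining_miles = 20.0               # hypothetical distance to the next checkpoint
displayed_mph = 5.3                  # speed shown on the tracker
actual_mph = displayed_mph * 1.10    # hypothetical: team is really 10% faster

projected = eta_hours(remaining_miles, displayed_mph)
actual = eta_hours(remaining_miles, actual_mph)
early_minutes = (projected - actual) * 60

print("minutes ahead of the projection:", round(early_minutes, 1))
```

Under those made-up assumptions the team shows up about 20 minutes ahead of the projection, before accounting for rests or camping.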

Trackleaders.com is doing a great job with some pretty sparse data. Collecting all that data and figuring out how to aggregate it in a way that makes what's happening on the trail easier to comprehend is a hard, hard problem and I think they've done a good job. But there are limits to what you can do with messy data and I hope that this discussion is not understood as criticism in any way, but as a tool to help fans understand a little better what they're seeing on the trackers.

Re. speed calculations, in cases where the race organizers do not possess a GPX file of their course (almost always the case in mushing, for many reasons), Trackleaders work with the trail boss(es) to hand-draw (sometimes quite arduously) the courses on Google Maps. The KML file route distance contained in the Gmap is then multiplied by a route factor of 1.073. This gives us what we hope is close to actual trail miles between two points and we calculate speed from there--not on a straight line between points. While we admit that sometimes projected avg. speeds might seem a touch low, it's probably not off by as much as one might think. We are highly dependent on the map line we're working with. Ideally, early (pre-race) trail breakers would always carry GPS units on snow machines, then run quickly home to email us course reroutes, etc., but with these races (generally) barely pulled off by understaffed volunteer forces, we try not to demand too much too soon in this evolution.
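The calculation described in this comment might look something like the following Python sketch. Only the 1.073 route factor comes from the comment; the function names, the 10-mile map distance, and the 2-hour elapsed time are hypothetical, and this is not Trackleaders' actual code.

```python
ROUTE_FACTOR = 1.073  # route factor quoted in the comment above


def trail_miles(map_line_miles, route_factor=ROUTE_FACTOR):
    """Estimate trail miles by scaling the distance measured along the drawn map line."""
    return map_line_miles * route_factor


def trail_speed_mph(map_line_miles, elapsed_hours, route_factor=ROUTE_FACTOR):
    """Average speed between two points using the scaled map-line distance."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed time must be positive")
    return trail_miles(map_line_miles, route_factor) / elapsed_hours


# Hypothetical leg: 10 miles along the drawn route, covered in 2 hours.
print(trail_miles(10.0))
print(trail_speed_mph(10.0, 2.0))
```

The result is only as good as the hand-drawn line it starts from, which is the dependence on the map line mentioned above.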

Amen to the limits of what one can do with messy data. We are optimistic that, as tracking matures in mushing, and if SPOT remains the transponder of choice, all parties will conspire to get it more dialed. On a related note: A burning question among organizers that Trackleaders are seeking to help resolve is *where* to place the SPOTs (on sled or person). The units really need a clear view of the sky to transmit, yet there are good reasons to place them on the musher too, should a dog team get away and run 40 miles down trail before getting hung up. Further, who should be charged with operating the SPOTs? Some in the mid-distance qualifier scene feel the mushers need to take more responsibility for operation given the huge challenge of resetting the tracking mode every 24 hours. This is a tricky one. The mushers have so much to think about already. For the Yukon Quest, the volunteer force is there to keep teams 'reset', but for bare-bones staffed mid-distance races, not so much. Believe it or not, pulling off live tracking for events is a TON of work (on both sides). Configuring the online trackers with rosters, bios, pics, links, etc. is time consuming, and then there's the implementation side: shipping the hardware between races, educating organizers, devising execution plans. Yet, still the perceived 'fiscal value' is low! At Trackleaders we pretty much work for minimum wage. The reward, for now, comes from how much the fans love it--which really makes us smile.