Sunday, January 8, 2012

How long is the track? How long is the trail?

Melinda mentioned the conversations she and I have been having about how to look at the data, where the numbers may be coming from and what it all means. So tonight, I thought I'd chime in here on the blog and take a second look at what Melinda pointed out in her post about measuring speeds: Because the SPOT track is a succession of straight line segments, she wrote, "The actual distanced traveled by the dog teams will always be longer than shown on the tracker."

How much longer? Ideally, when we receive the position of a musher from the SPOT tracker every 10 minutes, we could measure the distance travelled not simply as the straight line between these points, but along the actual race trail. This would require that we know where exactly the trail is, and how long it is between any two points on it. Unfortunately, neither of these two questions is easy to answer.

To illustrate some of the difficulties, take a look at this screenshot of Jeremy Rutledge's track while on his way out towards Skwetna. We can see two ways the GPS track deviates from the trail as it is marked in red (as always, click on the image for a larger version):

On top ("Deviation 2"), the GPS points are more or less on the Knik 200 trail, but the line that connects them cuts across a curve in the red line. This is exactly the situation we talked about, and we could fix the distance measurement by going along the red line, instead of cutting across it. However, the track points labeled "Deviation 1" show that it isn't quite that easy: Here the GPS points look like they aren't even on the trail! So was Jeremy lost or taking the wrong trail? No, I think it's the red trail line that isn't in the right place.(*)

Ouch. So on the one hand we only know where the musher is every 10 minutes, and on the other we aren't even sure precisely where the trail is.

But let's soldier on with our goal to measure the dog team's path along the trail. As it is marked in red on the map, the trail is nothing but another track, but with more points on it than the SPOT GPS track of the dog team. Now comes a trick: I can get the points of the trail track by poking around in the web page code. Or I could have asked Matthew Lee from Trackleaders.com. In any event, I procured a .kml file (the kind you can look at in Google Earth) that contains the red trail track. Neat. The file for the Knik 200 contains 544 waypoints, with latitude and longitude, from start to finish. Very roughly speaking, that works out to 5 trail waypoints between successive GPS points of a moving dog team, on average (at Lance Mackey's speed, that is). The .kml file for the Gin Gin 200 contains 1139 such latitude/longitude pairs.
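For readers who want to try this themselves, here is a minimal sketch of pulling latitude/longitude pairs out of a .kml file in Python. It assumes the trail is stored in ordinary `<coordinates>` elements, where KML lists each point as longitude,latitude with an optional altitude. The function name and the tiny sample document are made up for illustration; this is not the actual Trackleaders file.

```python
import xml.etree.ElementTree as ET

def read_kml_waypoints(kml_text):
    """Return (lat, lon) tuples from every <coordinates> element in a KML string."""
    root = ET.fromstring(kml_text)
    waypoints = []
    for node in root.iter():
        # Match on the local tag name so the exact KML namespace version does not matter.
        if not node.tag.endswith("coordinates"):
            continue
        for token in (node.text or "").split():
            parts = token.split(",")  # KML order: lon,lat[,alt]
            waypoints.append((float(parts[1]), float(parts[0])))
    return waypoints

# Hypothetical three-point trail, only to show the file layout.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document><Placemark><LineString><coordinates>
    -149.90,61.50,30 -149.95,61.52,32
    -150.00,61.55,35
  </coordinates></LineString></Placemark></Document>
</kml>"""

trail = read_kml_waypoints(SAMPLE_KML)
print(len(trail), "waypoints, first one at", trail[0])
```

A hand-drawn track may carry no altitude values at all, which is why the sketch only reads the first two numbers of each point.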

Now I have all the ingredients assembled to calculate distances along the track, as best I can. What I have to do is:

Get the latitude and longitude of two GPS points (as transmitted by the SPOT trackers).

Find the two waypoints on the trail track that are closest to these GPS points, taking into account the direction the team is travelling in, and applying precautions if there are loops in the trail (which can make it really really complicated to figure out which trail segment to assign to our dog team).

Sum up all the distances between successive intermediate waypoints on the trail between these two points.

The reader will notice that we're still summing up straight line segment lengths -- but this time along the trail, not just along the GPS track, so the more trail points we have, the more accurate this measurement will become. And if the trail information we have isn't too wrong, the result will always be longer than the first shortest-GPS-path approximation. But it will still be less than the exact distance travelled by the dog team! (**)
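The three steps listed above can be sketched in a few lines of Python. This is only an outline under stated assumptions: distances use the haversine great-circle formula (the post does not name its formula) with the Earth radius quoted further down, and "direction of travel" is handled crudely by searching only forward along the trail. It does not deal with loops or out-and-back sections, and the sample coordinates are invented.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MILES = 3958.761  # radius quoted in this post

def haversine_miles(p, q):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def nearest_index(trail, point, start=0):
    """Index of the trail waypoint closest to point, searching from start onward."""
    return min(range(start, len(trail)), key=lambda i: haversine_miles(trail[i], point))

def along_trail_miles(trail, gps_a, gps_b):
    """Sum the trail segments between the waypoints nearest to two GPS fixes."""
    i = nearest_index(trail, gps_a)
    j = nearest_index(trail, gps_b, start=i)  # look ahead only: crude direction handling
    return sum(haversine_miles(trail[k], trail[k + 1]) for k in range(i, j))

# Invented trail with one bend, plus two invented GPS fixes near its ends.
trail = [(61.0, -150.0), (61.1, -150.0), (61.2, -150.1), (61.3, -150.0), (61.4, -150.0)]
gps_a, gps_b = (61.01, -150.0), (61.39, -150.01)

print("straight line:", round(haversine_miles(gps_a, gps_b), 2), "miles")
print("along trail:  ", round(along_trail_miles(trail, gps_a, gps_b), 2), "miles")
```

Snapping to the nearest waypoint means the result can be off by up to about one trail segment at each end, so the sketch works best where trail waypoints are closely spaced.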

OK, this was long and complicated, so let's look at a practical example. What I did was take the first half of the trail, from the start to Skwetna Road House. Measuring along the red line, I find a trail length of 82.15 miles. As Matthew writes in another comment, Trackleaders has the trail length down as 84 miles - I'm reasonably close, with the main source of error being the formula I'm using to calculate distance in miles from latitude/longitude pairs (the Earth radius of 3,958.761 miles is almost certainly not accurate up here, but I didn't have time to research the best way to compensate). Now, to compare with the SPOT GPS track length, I summed up all 55 segments in Lance Mackey's track between the start and Skwetna. Result: 76.42 miles.
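For the curious, here is one common way to turn latitude/longitude pairs into miles: the haversine formula on a sphere. That this is the formula used for the numbers above is an assumption; only the radius of 3,958.761 miles is taken from the text. The function names are illustrative.

```python
from math import radians, sin, cos, asin, sqrt

MEAN_EARTH_RADIUS_MILES = 3958.761  # the radius quoted in the post

def haversine(lat1, lon1, lat2, lon2, radius=MEAN_EARTH_RADIUS_MILES):
    """Great-circle distance between two lat/lon points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * radius * asin(sqrt(a))

def track_length(points, radius=MEAN_EARTH_RADIUS_MILES):
    """Sum of straight segment lengths along a list of (lat, lon) points."""
    return sum(haversine(*p, *q, radius=radius) for p, q in zip(points, points[1:]))

# Sanity check: one degree of latitude should be radius * pi / 180 miles.
print(round(haversine(61.0, -150.0, 62.0, -150.0), 2))  # 69.09
```

Because the result scales linearly with `radius`, trying a different Earth radius is a one-argument change, which makes it easy to see how much of a discrepancy the radius alone could explain.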

So what does this mean? Based on these numbers (76.42 miles along the SPOT track versus 82.15 miles along the trail), as a rule of thumb, distance and speed measurements between successive GPS track points ("X miles traveled at Y mph") are on average too low by about 7% compared to along the trail-as-we-know-it, and possibly around 10% (with a good margin of error here!) lower than the real-world distances and speeds. Obviously, the curvier a segment of the trail, the greater our error - we knew that already - and our method also averages out all the ways the trail information is incorrect. But now I have a tool to quantify these errors and, if I want, to calculate along-the-trail speeds for any team. Even if the method is really really fiddly.
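The arithmetic behind the 7% figure is short enough to write out. The sketch below uses only numbers quoted above. The second comparison, against the 84-mile trail length that Trackleaders lists, is just one plausible reference point for the "around 10%" estimate and is included for illustration.

```python
trail_miles = 82.15    # measured along the red trail line, start to Skwetna
spot_miles = 76.42     # sum of the 55 straight SPOT segments
listed_miles = 84.0    # trail length as listed by Trackleaders

def shortfall_percent(measured, reference):
    """How far (in percent) the measured distance falls below the reference."""
    return 100 * (1 - measured / reference)

print(round(shortfall_percent(spot_miles, trail_miles), 1))   # 7.0
print(round(shortfall_percent(spot_miles, listed_miles), 1))  # 9.0
print(round(spot_miles / 55, 2))  # average SPOT segment length in miles: 1.39
```

Since the SPOT fixes arrive at fixed time intervals, the same percentage applies to speeds as to distances.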

(*) If you think about it, you'll come up with many ways that can happen: The Trackleaders team may never have received a real GPS track of the trail -- if the trailbreakers didn't have their own GPS device, they may have drawn the trail on a map, or they may be using a trail map from a previous year when the trail was in a slightly different location, especially on the river where there can be overflow. Also, when the weather's bad and the trail blows in, the trailbreakers sometimes make last-minute changes. And then there are inaccuracies in Google Maps, which are quite common up here in rural Alaska.

(**) As Matthew commented earlier, Trackleaders is multiplying the along-the-trail distance by 1.073 to get even closer to the real length. It surprises me a little that this number is apparently the same for the Gin Gin and the Knik, even though we had approximately twice as many trail points for the Gin Gin. I'd expect along-the-trail measurements for the Gin Gin to be more accurate than for the Knik.
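To illustrate why the number of trail points should matter, here is a toy experiment on a synthetic curve, not on real trail data. A sine-shaped "trail" on a flat plane is sampled at coarser and finer spacing, and the straight-segment length is compared with a very finely sampled reference. The curve shape, its parameters, and the function names are all invented for the demonstration.

```python
from math import sin, pi, hypot

def polyline_length(points):
    """Sum of straight segment lengths through a list of (x, y) points."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

def sample_curve(n_segments, length=10.0, amplitude=0.3, wavelength=2.0):
    """Sample a sine-shaped 'trail' at n_segments + 1 evenly spaced x positions."""
    xs = [length * i / n_segments for i in range(n_segments + 1)]
    return [(x, amplitude * sin(2 * pi * x / wavelength)) for x in xs]

# Very fine sampling stands in for the "true" length of the curve.
reference = polyline_length(sample_curve(10240))

for n in (10, 20, 40, 80, 160):
    measured = polyline_length(sample_curve(n))
    # The last column is the factor needed to scale the measurement up to the reference.
    print(n, round(measured, 3), round(reference / measured, 3))
```

In this toy setup the correction factor shrinks toward 1 as the sampling gets finer, so a single fixed factor would only suit one particular combination of curviness and point spacing.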

5 comments:

Nice work Chris. If only Trackleaders could convince the trail bosses of the importance of a very correct track for our calculations. Or, more realistically, we get more scientific about our route factor for given courses. I am not sure exactly when the Knik track was recorded, but we received it 3 weeks ago, so surely it was subject to some change. I believe it was obtained with a Garmin GPS on a snow machine. Going forward, ideally we start to collect good, high-res GPS tracks from a few good mushers in every race. Then we can study how dog teams run a trail. It's this track we're seeking after. Regarding the Gin Gin track, that was hand-drawn (No GPS track), so there was no conversion from GPX to KML.

Thanks, Matthew - interesting! I was wondering if one of them was hand-drawn and the other from a GPS, but I thought the Knik track was likely drawn as it contained also elevation data. Though of course they could have been added by importing a .gpx file into Google Earth. In any event, the format was quite different.

What I still don't understand is where you get your factor of 1.073 from. Interestingly, the difference between the .kml track and the SPOT track was in my test case just about 7%, but I'd think the correction factor would depend on both the scale of the curves in the actual trail (we're interested in 5 metres or larger, more or less, but not curves on the scale of a grain of sand) and the average spacing of the sample points.

Forgot to mention...Trackleaders actually bumped up the route factor for the Knik line to 1.09 due to all the circuitous river time. We were thinking it would help with speed calc. accuracy, but I'm not so sure it did due to the vagaries of teams straightening the river bends (when they can).

Regarding the diff. btwn the Gin Gin and Knik tracks, I should probably let our lead programmer, GPS wiz, Scott Morris of Topofusion speak to this, as I am def. no expert, but I believe elevation data aren't available in his processing software above 60 degrees, so he found a hack of some sort. The route factor is Scott's call too. Maybe we can get him to do an interview for a mush-tech post? Perhaps you write up some "Q" & I'll encourage him to "A". Cheers!

Sorry, one more addition: Regarding the pic above and your reference to the 'SPOT track' 'cutting the trail', the blue polyline linking SPOT points together has no significance other than to help the viewer understand chronology. It's especially useful when discerning the return leg points from the outbound points on an out-and-back course like the Knik.