Boston Marathon organizers faced a lot of issues after bombs exploded near the finish of the 2013 race. Most were related to future race security. But they also had to decide what to do about the 5,624 runners who were stopped at or before the marathon’s 40K mark.

This decision was 99 percent public relations, and half mathematics, as Yogi Berra might have said. The Boston Athletic Association passed the first test, telling all 5,624 late-race non-finishers that they were guaranteed entry in the 2014 marathon.

It turns out that the B.A.A. consulted with a team of serious statisticians to help them assign finish times to the runners who were stopped. Both the B.A.A. and the runners wanted such times to be part of the historical record. But then the B.A.A. ignored the team’s recommendation, in part because the recommendation was based on complicated calculations that would have been hard to explain to runners and media. (Three of the seven consulting mathematicians had run the 2013 Boston, finishing before the bombings.)

According to the stat team, the best way to estimate marathon finish times is to use a process called the “nearest neighbor” system. This involves finding runners from previous Bostons with splits similar to “John Smith,” a non-finisher. Then, assign John Smith a finish time that matches up with the actual finish times of similar runners from previous Bostons. To do this, you need a lot of data, and a fair amount of number-crunching ability. The stat team had both.
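To make the idea concrete, here is a minimal Python sketch of a nearest-neighbor estimate. It is not the stat team's actual code. The function name, the straight-line distance between split times, the use of the median, and the tiny `history` list are all assumptions invented for illustration.

```python
import math
from statistics import median

def nearest_neighbor_finish(target_splits, history, k=3):
    """Estimate a finish time (seconds) from the k most similar past runners.

    target_splits: cumulative split times in seconds that the stopped
                   runner actually recorded (e.g. at 30K, 35K, 40K).
    history:       list of (splits, finish_seconds) pairs for past finishers,
                   with splits taken at the same checkpoints.
    """
    n = len(target_splits)
    scored = []
    for splits, finish in history:
        # Assumption: similarity is plain Euclidean distance between splits.
        distance = math.dist(target_splits, splits[:n])
        scored.append((distance, finish))
    scored.sort(key=lambda pair: pair[0])
    # Assumption: summarize the k closest runners by their median finish.
    return median(finish for _, finish in scored[:k])

# Made-up past results: ([30K, 35K, 40K] in seconds, finish in seconds).
history = [
    ([9000, 10560, 12150], 12840),
    ([9000, 10500, 12000], 12660),
    ([9060, 10620, 12240], 12960),
    ([8400, 9800, 11200], 11815),
    ([9600, 11400, 13300], 14100),
]

john_smith = [9030, 10580, 12180]  # hypothetical non-finisher
print(nearest_neighbor_finish(john_smith, history, k=3))
```

A real version would need thousands of past results, which is why the approach depends on a lot of data.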

Many runners are more comfortable with simple manipulations of their split times. For example, the stat team’s paper, published in PLoS One, explains a “split-ratio” approach. This also involves some number crunching, since it sequentially compares each of your 5K splits to your previous 5K split. That is, it tracks the trend in your splits: Are you getting faster or slower as you hit the 40K mark?
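The paper's exact formula is not reproduced here, so the Python sketch below is only one simple way to read that description. It is an assumption, not the published method: it averages the ratio of each 5K segment to the one before it, then applies that trend to the runner's most recent pace over the remaining distance. All names are hypothetical.

```python
def split_ratio_estimate(cumulative_splits, remaining_km=2.195):
    """Project a finish time (seconds) from cumulative 5K splits.

    cumulative_splits: times in seconds at 5K, 10K, ... up to the last
                       checkpoint reached (at least two are needed).
    remaining_km:      distance left after the last checkpoint
                       (2.195 km when the last split is 40K).
    """
    cumulative = list(cumulative_splits)
    # Time taken for each individual 5K segment.
    segments = [b - a for a, b in zip([0.0] + cumulative[:-1], cumulative)]
    # Ratio of each segment to the previous one: above 1 means slowing down.
    ratios = [later / earlier for earlier, later in zip(segments, segments[1:])]
    trend = sum(ratios) / len(ratios)
    last_pace = segments[-1] / 5.0  # seconds per kilometer
    # Assumption: the final stretch is run at the last pace times the trend.
    return cumulative[-1] + last_pace * trend * remaining_km

# Hypothetical runner who is perfectly steady at 25:00 per 5K through 40K.
steady = [1500 * i for i in range(1, 9)]
print(split_ratio_estimate(steady))
```

For a runner whose segments keep getting longer, the trend rises above 1 and the projected finish moves later accordingly.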

The paper acknowledges a nice piece of work done by Raymond Britt, a web analytics and stat guy who has completed a bunch of Boston Marathons and Ironman triathlons. Britt explained his system several weeks after the 2013 Boston Marathon at his web site, runtri.com.

Basically, Britt analyzed a number of previous Boston results and found that it took runners 6 percent of their total finish time to get from 40K to the finish at 42.2K, a stretch that represents just 5.2 percent of the marathon distance. So most runners are clearly slowing down at the end. Britt’s system turns out to be the simplest of all the systems analyzed, since it requires multiplying John Smith’s 40K time by 1.06, and nothing more. (Britt came up with a factor of 1.23 for runners who reached 35K but not 40K.)
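Britt's rule fits in a few lines of Python. The sketch below uses only the two multipliers quoted above. The function names and the 3:30:00 example split are hypothetical.

```python
# Multipliers quoted for Raymond Britt's rule of thumb.
BRITT_FACTORS = {"40K": 1.06, "35K": 1.23}

def britt_estimate(split_seconds, checkpoint):
    """Projected finish time in seconds from the last split reached."""
    return split_seconds * BRITT_FACTORS[checkpoint]

def format_hms(seconds):
    """Render a number of seconds as H:MM:SS, rounded to the nearest second."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

# Hypothetical runner stopped after passing 40K in 3:30:00 (12,600 seconds).
split_40k = 3 * 3600 + 30 * 60
print(format_hms(britt_estimate(split_40k, "40K")))  # 3:42:36
```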

Boston organizers also decided to use a split-based method of assigning finish times to the 5,624. But they took a kinder, gentler approach than either the “split-ratio” approach or the Raymond Britt approach. They assumed the “constant pace” theory. No one speeds up, no one slows down. You just hold your average pace at your last recorded split (halfway, 25K, 30K, 35K, or 40K) through the remaining distance to the finish. Few runners would ever complain about this system, especially not those who have hit the wall.
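In code, the constant-pace idea is a single proportion: scale the last split time by the full marathon distance over the distance covered. The Python sketch below illustrates the idea as described here and is not the B.A.A.'s actual procedure. The names are hypothetical, and the only outside facts are the standard distances of 42.195 km for the marathon and 21.0975 km for halfway.

```python
MARATHON_KM = 42.195

# Distance in kilometers at each checkpoint mentioned above.
CHECKPOINT_KM = {"half": 21.0975, "25K": 25.0, "30K": 30.0,
                 "35K": 35.0, "40K": 40.0}

def constant_pace_estimate(split_seconds, checkpoint):
    """Projected finish in seconds if average pace so far is held to the end."""
    return split_seconds * MARATHON_KM / CHECKPOINT_KM[checkpoint]

def format_hms(seconds):
    """Render a number of seconds as H:MM:SS, rounded to the nearest second."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

# Hypothetical runner stopped after passing 40K in 3:30:00 (12,600 seconds).
split_40k = 3 * 3600 + 30 * 60
print(format_hms(constant_pace_estimate(split_40k, "40K")))  # 3:41:31
```

At 40K, constant pace amounts to multiplying by about 1.055, a shade kinder than Britt's 1.06.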

According to the stat team, their analysis has a use beyond the 2013 Boston Marathon. Many marathons now send emails and texts based on “John Smith’s” pace along the course. These updates predict his finish time, and are closely followed by media, friends, and family members. However, they are based on the “constant pace” assumption. If races wanted to be innovative, more accurate (and more mathematical), they would switch to the “nearest neighbor” algorithm.

“I wouldn’t say that implementing our system is trivial, as it would take some software engineering to convert it to the ‘athlete tracker’ systems that various marathons use,” Richard L. Smith, the University of North Carolina statistician first contacted by the B.A.A., told Runner’s World Newswire. “But it could be done, and it would produce better predictions than the systems being used currently.”

If you’d like to see all the Boston Marathon data analyzed by the stat team, it’s available here. The team challenges you to devise a better system than their “nearest neighbor” approach. They note that the mathematics involved is similar to the Netflix challenge, which tries to predict what movies someone will like.