Week 38: A Day at the Races

For week 38 we challenged the community to analyze and visualize our race data, which we produced on Sunday as Andy ran the Richmond Marathon and I competed at the Sprint Triathlon Age Group World Championship in Rotterdam.
This meant the data came out late on Sunday. I only finished my race at 4.30pm, then had to wait to collect my bike and wetsuit from transition, make my way back to the hotel, download the data, tidy it up and upload it, and of course coordinate with Andy as he got his data ready.

So thanks everyone for being patient on the timing.

I published the original vizzes and some accompanying information on our blog. And that is where we get started with Lesson 1…

LESSON 1: BE REALLY CLEAR IN YOUR COMMUNICATION

This lesson is not for you, but for Andy and me (mostly me this time, as I wrote that blog and it was my week for MM).

We clearly were not clear in our communication. We provided background information to give people a helping hand if they wanted to find out more about marathons and triathlons and the KPIs people look at when training for such races. So far so good.

We also included a fairly long list of questions and comments around the things we were interested in finding out based on our data.

That’s where I failed to state that answering those questions was COMPLETELY OPTIONAL.

As Monday came and went there were hardly any submissions and on Tuesday it was still just a ‘slow trickle’ of vizzes. None of the regulars had posted anything yet and I was getting really worried that we had missed some secret agreement the community made to just not participate this week. So we asked around and it turned out that some people were put off or intimidated by those questions and saw them as requirements to be addressed.

Nooooooooo! Our intention was to give people guidance in case they didn’t know which direction to take their analysis. And for our own benefit, we hoped we’d get some answers that would help our training and our race performance.

We tried something different this week and it didn’t work out the way we expected. Most of that can probably be alleviated by communicating more clearly, so my apologies if you felt put off or overwhelmed by the questions we posted. Don’t be discouraged: we were simply trying to give guidance to those who wanted to do ‘more’ than a makeover. We didn’t expect people to go through the entire list and answer each question.

LESSON 2: WHEN WORKING WITH DIFFICULT DATA, USE THE NECESSARY RESOURCES

Ummm, this is another one for Andy and me. This week’s dataset was challenging. The topic was different, but our community can handle different topics; they’ve proven that to us many times this year. Besides the topic, however, there were additional challenges in the fields contained in the dataset.

In the dimensions we had fields including Path ID, which is a field containing data from each point at which a GPS signal was recorded by our watches.

In the measures, there was a whole lot more. We had velocity, distance, speeds, moving time, heart rate, cadence, etc. This is very different from sales, profits, number of records.

In Tableau, working with duration and time is not straightforward. And understanding the definitions of these fields isn’t necessarily intuitive. Andy and I have worked with this data for months as we’re testing the Strava Web Data Connector that is being developed by Tableau. You guys don’t have the benefit of that background information and I failed to provide a link to additional resources. You could argue that the community is well capable of doing a Google search and I completely agree, but we may have been spoiling you by providing simpler and more self-explanatory datasets in the past, so the lesson for me here is to give more background information, including this link to the Strava API documentation which defines each of the fields:

To make the data set really useful and have all the necessary fields for analysis, additional calculations were required and Andy and I agree that we should have provided those calculations. In all honesty, there just wasn’t enough time, but here are a couple if you’re interested…

Duration = [Time]/86400 This divides the time, which is given in seconds, by the number of seconds per day. The resulting duration field can be formatted by changing the Number Format to Custom and typing ‘hh:mm:ss’ into the text box.
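If you want to check the Duration arithmetic outside Tableau, here is a minimal Python sketch. The function names and the sample value of 5025 seconds are made up for illustration and are not part of our dataset.

```python
SECONDS_PER_DAY = 86400  # 24 * 60 * 60


def seconds_to_day_fraction(seconds: float) -> float:
    """Mirror of [Time]/86400: elapsed seconds expressed as a fraction of a day."""
    return seconds / SECONDS_PER_DAY


def format_hms(seconds: float) -> str:
    """Format elapsed seconds as hh:mm:ss, similar to the custom number format."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


elapsed = 5025  # hypothetical elapsed time in seconds
print(seconds_to_day_fraction(elapsed))
print(format_hms(elapsed))  # 01:23:45
```

The fraction of a day is the number the custom format works on, while the hh:mm:ss string is what you would expect to see on the axis or label.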

Speed = [Velocity] * 3.6 This turns velocity (meters per second; location dependent) into speed (km per hour; location independent). For miles per hour use *2.24. Then change the aggregation to ‘Average’ rather than Sum, because otherwise you will get the sum of each data point (each Path ID), which is incorrect.
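To see why the aggregation matters, here is a small Python sketch of the same conversion. The four velocity samples are invented for illustration, and the plain average assumes the GPS points are roughly evenly spaced in time.

```python
MS_TO_KMH = 3.6      # 3600 seconds per hour / 1000 meters per km
MS_TO_MPH = 2.23694  # rounded to 2.24 in the calculation above

# Hypothetical velocity samples in meters per second, one per GPS point
velocities_ms = [3.0, 3.2, 3.1, 2.9]

speeds_kmh = [v * MS_TO_KMH for v in velocities_ms]

# Averaging the per-point speeds gives a believable running speed
average_kmh = sum(speeds_kmh) / len(speeds_kmh)

# Summing them just grows with the number of GPS points recorded
summed_kmh = sum(speeds_kmh)

print(f"average: {average_kmh:.2f} km/h")  # average: 10.98 km/h
print(f"sum:     {summed_kmh:.2f} km/h")   # sum:     43.92 km/h
```

With thousands of GPS points in a race, the summed figure quickly becomes absurd, which is a handy warning sign that the aggregation is still set to Sum.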

There are also a couple of lessons for the community from this week’s challenge, so let’s move on to Lesson 3…

LESSON 3: ASK QUESTIONS

We’re here to help and we run this project for the community. If you have questions, please ask us anytime.

Yes, there are questions to which you can find the answer quite easily by doing a simple Google search, and then there are other questions which we can certainly help you with. If you’re unsure of how to tackle the challenge, just ask. If you want to understand what a certain field means, let us know and we’ll clarify.

Not only will this help you with your analysis, it is also good practice for business analysts in general. Asking questions and challenging the status quo is what we’re here to do. Don’t hesitate, just fire away.

LESSON 4: SENSE CHECK YOUR RESULTS

Sometimes little errors sneak up on us because we don’t sense check our results. Now I don’t expect people to know what my average running speed should be. A quick check for reasonableness is possible though. Is it likely that I’m running at 30km/hr or that Andy completed his marathon at 3.2km/hr? Not really.

Both authors fixed it though when we pointed it out, so well done on taking the feedback on board:

Ian initially had the speed in meters per second and the alignment suggested that I was running at over 31km/hr

He made the relevant updates after we gave him feedback and the alignment and units are much more intuitive and clearer now

Francis had simply not turned the meters per second into kilometers per hour, so the labelling suggested that Andy was running his marathon at a pace more appropriate for someone twice his age 😉

A quick clarification later, he implemented the changes and updated his workbook

Compared to standard pace calculators this one actually shows not just the target pace for a desired finishing time, but also how well Andy did in his previous races with relation to that target time, i.e. how realistic and achievable is the target time based on past performance