March 14, 2018

Rody brought back a classic technique for Workout Wednesday week 11. Get all of the requirements here. Almost exactly two years ago, he wrote a guest post on this blog about how to create the unit charts he's challenging everyone to make.

One thing he mentioned in the blog is that now you can use unicode characters in 10.5, however, I've been using them for a long time. I got to this site, find the character I want to use, and paste it into the calculation or field name.

The trickiest part for me on this challenge was getting the dots in the right place, that is, starting at the bottom right. I had to swap some of the logic of the calcs around and it was done. As for the headers above each set of dots, well, I'll leave that to you to figure out. Here's my tip: don't overcomplicate it; it's quite simple.

What works well?

Showing cumulative growth since 2011 makes it easy to compare the regions.

The region colors are easy to distinguish from one another.

The subtitle tells me what the chart is about.

The legend is prominently placed so that we know it's important.

Putting the axis on the right speed processing because the axis is next to the end of the lines.

What could be improved?

Labeling the lines directly would remove the need for the legend.

Labeling the ends of the lines would tell us what the growth actually has been.

The title isn't very relevant to the data. Why does this make the Irish lucky? A better title is needed.

So what? What are we supposed to get from this?

My Goals

There are a lot of countries and a lot more years in the data set, explore them and look for stories.

Consider the regional aggregations to see if there's an significance.

Look at both volume and change. Looking at just change doesn't factor in the volume that each country imports.

This is time series data, so spend time looking at line charts.

After playing with data for a while, I remembered this great viz by Emily Chen and thought it might fit well with this data set because whiskey sales within Ireland have decreased. I used her viz as my style guide to create my Makeover Monday week 11.

March 7, 2018

Let's all start week 10 by agreeing that Luke is mean. Just kidding Luke! Workout Wednesday is supposed to be difficult and this week it sure was. This was definitely the hardest week for me so far.

We had to build this sunburst chart:

I had built sunburst charts before (link) so I opened that workbook knowing that would at least get me started. Ok, I was quickly about 75% of the way done, but my colors were all overlapping and starting at the same spot. The requirement has them essentially stacked with a gap in between.

I tried over and over again, but couldn't get the calculation right. So I went to lunch and went for a walk to get away from the computer and do some thinking. To figure it out, I thought about it as one week at a time, figuring if I get one week right, I might have it all correct.

The trick is figuring out the length of each segment within each week. Notice how they all end at the same spot. That's a pretty big hint. Then once you figure that out, you probably won't have a gap between the colors. That part is pretty simple.

Good luck!

March 5, 2018

God I hate survey data, especially survey data where the responses are so poor compared to the actual that they make it look like the data is wrong. I really, really struggled this week coming up with anything I found interesting.

I ended up skimming through parts of the main report to get a better idea as to what the survey is about and any interesting findings they might describe.

In the end, there were major themes:

Policymakers are really out of touch with the issues facing girls and women in the five countries in the study.

Policymakers "think" they know what's going on.

Really, it's quite a sad situation. When policymakers are so far disconnected from the truth yet think they are close, I suspect not enough action is taken. I guess that's politics.

It took me a good 15 minutes to comprehend what the original chart was even about. Apparently it shows how far policymakers' estimates are from the actual indicators.

What works well?

The bands for +/- 20% from the actuals helps give context to the estimates from policymakers.

The country titles and subtitles for the topic make is easy to know what each chart is about.

What could be improved?

What do the green diamonds mean? Apparently they are policymaker estimates, but there's no indication of that in the dashboard.

Why are these topics picked for these countries?

Why is Senegal excluded?

A more impactful and descriptive title would help.

It's unnecessary to include the source and legend with each chart.

My Goals

Try to understand the data; easier said than done

Understand the spread of each topic within each country

Show ALL responses

Allow the user to filter and drill in to the topic they are interested in.

Stick to the overall style guidelines from Equal Measures 2030

Include BANs for the number of policymakers that estimated within +/- 20% of the actual values

I don't love my final dashboard, but after working on it for far too long, I figured it was "good enough". It could probably use an explanation somewhere for how to read the charts.

March 3, 2018

A couple weeks ago, I demoed how to create tile maps of Runkeeper runs using Alteryx and Tableau. You can watch the recording here; I was the final speaker if you need to fast forward.

In this post, I'll detail how I created the tile maps and share the workflow and visualizations. The inspiration for this project comes from Marcus Volz and his great github tutorial on how to create small multiple visualizations of your Strava runs, which I wrote about here. This was the viz that I created based on his R code:

When I visited The Information Lab France in January, we decided that we would try to replicate Marcus' process with Alteryx and Tableau because we wanted it to be an interactive visualisation, whereas Marcus' creates a static image. This process should work for Runkeeper, Strava or any fitness app that uses GPX files.

To get all of your routes from Runkeeper:

Login to Runkeeper

Go to "Account Settings" from the gear at top right of the screen

Choose the Export Data option on the left

Select the date range

Click on Export Data

Unzip the files

To get all of your routes from Strava:

Login to Strava

Select "Settings" from the main drop-down menu at top right of the screen

I actually don't follow either of these methods. I pay a small yearly fee for a service called Tapiriik that allows you to sync your fitness data with Dropbox.

ALTERYX WORKFLOW

Since all of my files are in one place, this makes getting them all into Alteryx with a single input easy using a wildcard input. I've documented my workflow and you can download it here.

Steps

Input all of the files using a wildcard match for 2018 runs only.

Assign a unique number to every row. In these files there's a GPS reading every second. Having a unique number makes creating the routes easier because you know the sequence.

The next few steps are the magic part. These essentially give every route the same size as a square. We need them all to be the same size regardless of the actual geographical area covered.

Create points for each GPS points and connect them to create a polyline. Do this for both the route itself and the boundary

Calculate some summary stats for each run.

Bring them all back together.

Export as a SHP file.

TABLEAU VISUALIZATION

Once the shapefile is done, all you need to do is connect to it in Tableau. I created two visualizations. Click on the images for the interactive versions and to download the workbook.

1. Small Multiples

Like Marcus' viz, I created a view that spaces the runs based on the number of runs in the view. For example, in February, I recorded 22 runs, so I get a view with 5 rows and 5 columns. Each square has a route from the first run on the upper left to the most recent run in a Z pattern. There are additional details about each individual runs available on hover.

This view uses table calculations to determine the spacing.

2. Calendar View

The calendar is much simpler to create since it doesn't need any table calcs.

Double click the Geometry field to get a map

Filter to a single month

Place Weekday on the Columns

Week on the Rows

Add information for tooltips

Add total monthly mileage to the caption

Resize the rows and columns to fit the window

LESSONS LEARNED

I'm nowhere near an expert in Alteryx. I know I can improve if I practice more. Here are five lessons learned I learned that I will be taking forward:

Leverage the strength of your team to help you solve a problem and to help you learn.

Alteryx makes creating and working with spatial objects incredibly simple.

Prepping the data exactly as you need it in Tableau will make the visualization process much faster.

Data prep processes in R and Python make for great data prep exercises in Alteryx.

February 28, 2018

This week, Rody had a few tricks up his sleeve for Workout Wednesday. At first, I thought we were going to need to do something with polygons like he did in this tip about shading between two lines. I started down that road, but quickly realized I'd used up all of the shelves and I needed more, so back to the drawing board I went.

Suddenly a possible solution popped into my head (I figured it out by hovering again and again over his viz). I'm not going to give away any spoilers. Here's my solution if you get stuck...but give it a solid effort before you look at someone else's solution.

I'd like to get the difference from the first to last year for each country regardless of what year that is. When I use difference from first or a lookup, I get incorrect values for the countries that have a later year start than the others. I tried some LOD calcs, but that didn't seem to work either. Any help is appreciated!

This week's tip walks you through tackling this problem. In particular, how do you do three things?

Ignore years that have no values but a year exists

Return the value from first and last years where data exists

Compare the first and year to look at the change

The video and example below demonstrate the solution I came up with. Enjoy!

February 22, 2018

Ok, this week's workout sucked. The requirements and result you needed to achieve were super clear. What sucked was how fiddly the calculations were.

Given it's day 2 of training for DS8 in The Data School, I thought they should give this a try, as a group, with me. It was mostly them speaking the logic out loud and me translating into Tableau. It made for a really fun exercise that took us about 3.5 hours.

Looking at Rody and Luke's methods, our's was most similar to Rody's and way different than Luke's. But that's the beauty of Tableau; we all approached it very differently, yet got the same result. Tableau works with the way you think.

Thanks for the challenge Luke! You almost got us! Here's our result:

February 18, 2018

I had no idea that drug and medicine exports were $318B in 2016 before this week's Makeover Monday. That's crazy! I went in assuming that the US would be the highest since drugs are so stupidly expensive there, but I was clearly wrong; Germany is the clear leader. Apparently there are lots of big pharma companies there.

I love learning something new! Let's take a look at the original viz:

What works well?

Resizing the continents by their overall exports makes it obvious that Europe is the largest exporter.

I hate packed bubbles, however in this view, the largest countries stand out by double encoding with color.

Including the percentages for each continent provides needed context

What could be improved?

The packed bubbles make comparisons overly difficult.

The color legend uses unequal intervals.

Plotting the data on a map doesn't add any context.

There's no sense of overall ranking across all countries.

If you ask "so what?", there's no answer.

The article that accompanied the viz provided quite a few interesting statistics. It would have been great if these were included in the original viz, but they weren't, so I decided to make something super simple that: