Failing to understand small multiples

I’ve compiled some examples of this chart type here, but haven’t got round to having a go at them myself. If I’m honest, I find them a bit busy on a large scale, but having said that, I really like some of Ramon Martinez‘s examples on his blog.

Today, Chris Love posted an eye-catching viz employing this chart type, and I felt it warranted a bit of investigation.

What has deterred me to date from Small Multiples, is that I don’t understand exactly what the Row and Column splitters are actually doing. Generally, examples I have seen just leverage the calculations used from examples eons ago, and reuse them in a “new” viz. That’s fine, but I have an annoying mind that wants to understand how and why things work. When I unhid the worksheets to look at the main chart page, the Row and Column shelves looked like this:

Time to pick these apart, one by one.

Column calculation

The column splitter comprises this:

OK. So I know what Index does. It’s just a sequential count of stuff within a partition. Or, in super simple terms that register with me: how many things there are in a row or column. The Compute Using option determines which direction in which the indexing occurs. Here’s the online help definition. So Chris is counting stuff sideways (Table (across) computation – which is the default computation method).

What does the ‘%’ operator do? It represents a modulo operation. What the hell is that? I haven’t studied maths for a good 20+ years now, and that knowledge has long since dissipated.

Modulo Defined
The Modulo operation determines the remainder of division of one number by a second number. Example: 9 MOD 2 evaluates to 1 because 9 divided by 2 is 4 with a remainder of 1. Modulo can only operate on Integers.

Tableau Usage
For Modulo, the percent symbol (“%”) is used in a Tableau Calculated Field. In the above statement, 9 MOD 2 would be displayed in a calculated field as 9 % 2.

OK, so now I see a faint glimmer of light. Index counts stuff, and the modulo operator chops count that up into chunks. But it still isn’t really sinking in. I need to reel this right back to basics, so I’ll fire up Sample – Superstore and see what’s what.

We’re off. The Col calculation on Text is simply this:

The computation across the table just happens by default. The important thing for me here is that the INDEX() is just counting stuff, and I’m knocking 1 off that INDEX(), hence starting at zero. OK, I get this. Time to layer on a bit more of that first calc. Chris had it set using modulus 7. I’m going to add in a basic parameter to allow me to control that amount and observe what happens. The parameter is just this:

And I incorporate it into the Col calculation like this:

When the parameter is set to 1, this happens:

If I bump it to 4, look at this:

Who can guess what happens when I go straight in at 8? That’s right:

So it’s clear that the use of modulus chops your data up based on the value set. It then restarts at zero and loops through again. Progress. So it’s probably time to look at the Row calculation.

Row calculation

In Chris’ workbook, the calculation is this:

I *think* know what INT does! Here is is on the online help to confirm:

INT(expression)
Casts its argument as an integer. For expressions, this function truncates results to the closest integer toward zero.
Examples
INT(8.0/3.0) = 2
INT(4.0/1.5) = 2
INT(0.50/1.0) = 0
INT(-9.7) = -9
When a string is converted to an integer it is first converted to a float and then rounded.

What does this seem to mean? It just takes the result of a calculation and rounds it down to a whole number towards zero. Let’s check that out in practise. I’ll strip it back to basics first of all:

As you can see, the Row calc replicates the earlier simplified Col calc, albeit wrapped in an INT(). What happens if I remove the INT? Nothing.

Alright, lets introduce the parameter to the calc with the INT prefix:

So it’s the same effect as before – compare the the Col calc with the parameter set to 8 and we’re in the same position. What happens if I remove INT now? Nothing. So maybe it only becomes relevant when we lob them together into the same view.

Nope. Now I’m really confused. Started trying to replicate a small multiple with Sample – Superstore data and I ended up with a mess like this. I’m just guessing what I’m doing at this point:

Calcs were set up like this:

I’m heading up a blind alley here. In the dark.

If in doubt, refer to Andy Kriebel. Andy shared a video about Small Multiple line charts in March 2016 and I’m going to tell my inquisitive mind to back off. I’m just going to copy the calcs like for like with Superstore data, and if it works, I’m happy and that’s it! No more wondering how and why – I’ll leave that to somebody who truly understands it as it’s out of my league.

This is what the initial view looked like:

The Column Divider and Row Divider calculations are exactly how Andy had them in that video. How and why do they work? I don’t know, and I don’t care*

*(I do care, but I can’t figure it out 😦 )

The only important part was to Compute Using the relevant field: Region. What I don’t get, is why if I switch to using Sub-Category and update Compute Using accordingly, this happens:

Looks OK, doesn’t it? Look what happens when I switch to a line chart:

Notice how a few of the charts have a combination of lines? More than one Sub-Category is being plotted per chart, and I do not understand this at all. I can see that there at 17 charts plotted, and I know that there are 17 Sub-Categories in the default Sample-Superstore data, so I can’t understand why I have Copiers and Chairs in one chart, or why I have Fasteners, Envelopes and Copiers in another.

I’ve uploaded the workbook for any kind soul to investigate and confirm the error of my ways. This is the first chart type I’ve struggled with. I want to understand the mathematics (in simpleton’s terms please!) and also why this specific workbook has this mix of Sub-Categories.

The main issue was sparsity of data, which Michael demonstrated ably with this image:

No Envelope orders in February 2012. Data sparsity. To resolve this, I just needed to navigate to here:

With Stack Marks set to On, Tableau draws marks cumulatively along an axis, and so dissipates noise creates by missing values.

Stacking marks is relevant when your data view includes numeric axes. That is, at least one measure has been placed on the Rows or Columns shelves. When marks are stacked, they are drawn cumulatively along an axis. When marks are not stacked, they are drawn independently along an axis. That is, they are overlapping.

Update II

I still managed to forget this when whimsically recreating a Sample – Superstore viz today, so I saved a dashboard to Tableau Public to hopefully (finally) chisel this information into my brain: