Menu

Creating my first Tableau web data connector : part 3

At last, the final part of the trials and tribulations of creating my first Tableau Web Data Connector… Part 1 went through the pre-requisites, and building what I might generously term my “user interface”. Part 2 was a struggle against the forces of web security. And in this part, we battle against the data itself, until the error-message dragon is truly slayed.

That makes sense, as I didn’t yet tell it what sort of data objects to expect.

As I’m following the Tableau web data connector tutorial, which accesses data from URLs like this one, I figured I’d look at what data that URL returns precisely, compare it to the web data connector tutorial code, and then, when I understand the relationship between the code and the data from the Yahoo URL, I might be able to adapt the code to fit the format my chosen data arrives in.

Here’s the data one gets if you point your web browser to the URL that the tutorial is teaching you how to build a WDC from:

OK, stock-pickers will immediately recognise that it’s stock quote data, with dates, highs, lowers, symbols and all the other lovely gubbins that comes along with basic stock charts. It’s in JSON format, which is human-readable, but only in small doses.

So I need to understand the relationship between the above data, and this piece of code:

I can kind of see it by eye – there’s obviously JSON entries for Symbol, Date, Close, query and so on. But can’t we make it easier?

Yes we can, because there’s such a thing as a web JSON viewer. Past all that stock-related text into something like http://www.jsoneditoronline.org/ and you get a nice hierarchical visualisation of the JSON structure, like this:

So if we assume that “data” is referring to the overall set of data itself, and use dot notation to traverse the hierarchy, we can see a line in the code that says:

var quotes = data.query.results.quote;

That would seem to fit into the JSON structure above, where below the “object” level you have a level called query, a sublevel called results, and a sublevel called quote.

The variables “quotes” that the code creates therefore is basically referring to everything at/below the “quote” level of the JSON hierarchy.

Then you get a bunch of records which are the data of interest itself. These are numbered 0 for the first datapoint, 1 for the next, and so on.

If you know a bit of basic programming, you might note that the WDC code has “for loop” with a counter variable called “ii” that is set to 0 at first and is incrementing by one each time it runs. That seems to fit nicely in with the idea of it iterating through datapoint 0, 1, …n until it gets to the end (which is effectively the length of the quotes dataset, i.e. quotes.length).

Each one of these JSON records then has a few attributes – and they include the ones mentioned in the code as being part of whichever “quotes” subrecord we’re on; “Symbol”, “Date”, “Close”.

There they’ve defined the fieldnames they want to have Tableau display (which can be totally different from the JSON names we saw above in the datasource) and the field types.

Field types can be any from this list:

bool

date

datetime

float

int

string

So, let’s look at what’s returned when I make the call to my Boardgames datasource.

OK, I can see some fields in there, but it’s kind of messy. So I pasted it back into the online JSON editor.

Here’s what the relevant part looks like when formatted nicely:

So, the interesting information there is probably the boardgame’s name, year published, URLs to its image and thumbnail, and the various statuses (own, want, etc.) and the number of times its been played (numplays).

I therefore edited the “getColumnHeaders” code snippet to reflect the fields I wanted, and the type of data that would feature in them.

Now I’ve defined the fields, I can go back to the retrieving results section of code (if (data.query.results)…) and, now knowing the structure of the JSON generated by my API, parse out and assign to the above variables the data I want.

I decided to call the collection of data I was building “games” rather than “quotes”, because that’s what it is. I next noted that each individual “game” within the JSON is listed in a hierarchy below the “item” entry which itself is below “items”.

(“Amyitis” is the name of a boardgame, rather than an allergy to people called Amy, believe it or not).

So, I assigned all the items.item data to “games”

if (data.items.item) {
var games = data.items.item;

And copied the tutorial to loop through all the “items.item” entries, each of which is a boardgame, until we’d reached the end i.e. when the number of times we looped is the same as the length of the data table..

var ii;
for (ii = 0; ii < games.length; ++ii) {

Finally, it’s time to assign the relevant bit of data returned from the API to the variables I’d defined above.

At first I got a little confused, because the JSON output had a bunch of _ and $ entries that didn’t seem similar to what was returned in the tutorial dataset. But it turns out that’s nothing to worry about. Just treat them as though they were any other text.

In the above, you can think about games[ii] as being the individual boardgame record. The “for loop” we defined above substitutes each record into the ‘ii’ variable, so it’s accessing games[0], games[1] etc. which translates into data.items.item[0] and so on, if you remember how we defined the games variable above.

Then to find the boardgame’s name we need to traverse into the “name” chid of the boardgame itself, then look for the first entry below that (always the first one here, so we can refer to that as entry [0]), and look for the field that is shown as an underscore, _.

Hence:

'BoardgameName': games[ii].name[0]._,

Rinse and repeat this for each element of interest, and you’ve collected your dataset ready for Tableau to use!

Of course, the world is not perfect, and, in reality, I did not manage to do the above without making the odd mistake here and there. I got blank fields sometimes, when I knew they shouldn’t be, or fields with the wrong data in. As I was just using Notepad and the online simulator, I wasn’t really getting many useful error messages.

Lesson learned: you can use the code

tableau.log("YOUR MESSAGE HERE");

to display the message you write in the Google Chrome developer console. You might remember from part 2 that you can bring that up by pressing Ctrl Shift I in Google Chrome, and selecting the “Console” tab.

Why is that useful? Well, one, if it displays your message then it means that piece of code ran, so you can check you’re not skipping over any important section. And secondly, you can append text stored in variables to it. So for instead I could write:

tableau.log("the name of the game is " + games[ii].name[0]._);

And, if my code is correct, it should print out the name of the boardgame as it runs.

If I messed up, perhaps using the wrong index, or forgetting the underscore, it will output something different, and perhaps also an error message, guiding me towards the source of the problem.

That, as they say, is basically that! I ran it through the hosted Tableau Web Data Connector simulator again, this ticking the box so it will “automatically continue to data gather phase”. And, lo and behold, below you can see that it has interpreted the information about that famous Amyitis game into a nice tabular format.

Once you’ve got to that stage, you’ve already won. Just load up Tableau Desktop for real, select to connect to a web data connector and up should pop your interface, and later, your data.