Visually Revealing Gaps in ONCourse Data

One of the things that I never fully appreciated until I saw Martin Hawksey mention it (in an off-the-cuff comment, I think) at a JISC conference session a year or two ago was how we can use visualisations to quickly spot gaps, or errors, in a data set.

Following on from my previous University of Lincoln course data explorations (see Getting ONCourse With Course Data and Exploring ONCourse Data a Little More… for more details), I had another little play, this time casting the course data around a programme into a hierarchical tree format that can be used to feed visualisations in d3.js or Gephi (code). Call the data using URLs rooted on https://views.scraperwiki.com/run/uni_of_lincoln_oncourse_tree_json/ Parameters include: progID (the programme ID, using the n2 ID for a programme); full=true|contact|assessment (true generates a size attribute for each module relative to the points associated with the module; assessment breaks out the assessment components for each module in the programme; contact breaks out the contact hours for each module); and format=gexf|json (output in GEXF format or d3.js tree JSON format, respectively). Note: it should be easy enough to tweak the code to accept a moduleID and just display a view of the assessment or contact time breakdown for that module, or a programme ID and level and just display the contact or assessment details for the modules at a particular level, or a further core/option attribute to only display core or optional modules, etc.
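By way of illustration, here's a minimal Python sketch of how such a URL might be put together — the parameter names are as described above, but the helper function and the example programme ID are my own invention:

```python
from urllib.parse import urlencode

BASE = "https://views.scraperwiki.com/run/uni_of_lincoln_oncourse_tree_json/"

def tree_url(progID, full="true", fmt="json"):
    """Build a query URL for the tree view (hypothetical helper).

    full: "true" (module size from points), "contact" (contact hours),
          or "assessment" (assessment components).
    fmt:  "json" (d3.js tree JSON) or "gexf" (GEXF, for Gephi).
    """
    return BASE + "?" + urlencode({"progID": progID, "full": full, "format": fmt})

print(tree_url("1234", full="contact"))
```

The resulting URL can then be fetched directly, or passed as the data source to a d3.js page or a Gephi import.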

I also did a quick demo of how to view the data using a d3.js enclosure layout. For example, here’s a peek at how core and option modules are distributed in a programme, by level, with the contact times broken out:

The outermost bubble is a programme; the next largest bubbles represent levels (Level 1, Level 2, etc., corresponding to the first year, second year, and so on of an undergraduate degree programme). The structure then groups modules within a level as either core or option (not shown here? All modules are core for this programme, I think…?), and then for each module breaks out the contact time in hours (which proportionally sets the size of the contact circles) and the contact type.
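To make that nesting concrete, here's a sketch of the sort of d3.js tree JSON structure involved — the module code, contact types and hours here are all invented for the purposes of the example:

```python
# A fragment of programme tree JSON: programme > level > core/option > module > contact type.
programme = {
    "name": "Example Programme",
    "children": [
        {"name": "Level 1",
         "children": [
             {"name": "core",
              "children": [
                  {"name": "MOD1001",
                   "children": [
                       # "size" (hours) drives the area of each contact bubble
                       {"name": "Lecture", "size": 24},
                       {"name": "Seminar", "size": 12},
                   ]},
              ]},
             {"name": "option", "children": []},
         ]},
    ],
}

# Total contact hours for a module is then just the sum over its children:
mod = programme["children"][0]["children"][0]["children"][0]
total_hours = sum(c["size"] for c in mod["children"])
print(total_hours)  # 36
```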

The labelling is lacking in a flat image view (if you hover over elements you get tooltips popping up that identify module codes, whether a set of modules is core or option, the level of a set of modules, etc.), but in places the layout is hit and miss; for example, in the above example, one module must have 100% lectures, so we don't see a module bubble around it. I maybe need to seed each module bubble with a dummy/null bubble of size 0, if that's possible, to force a contact bubble to be visibly enclosed by a module bubble?

Here’s another example, this time breaking out assessment types:

Hovering over these two images (click through them to see the live versions), I noticed that for this programme there don't appear to be any optional modules, which may or may not be “interesting”?

Looking at the first image, which displays contact time, we also notice that several modules do not have any contacts broken out. Going back to the data, the corresponding element for these modules is actually an empty list, suggesting the data is not available. What these views give us, then, is a quick way of exploring not only how a programme is structured, but also which modules are lacking data. This is something that the treemaps in Exploring ONCourse Data a Little More… did not show – they displayed contact or assessment breakdowns across a course only for modules where that data was available, which could be misleading. (Note that I should try to force the display of a small empty circle showing the lack of core or option modules if a programme has no modules in that category?)
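The same gap-spotting could of course be done programmatically — a quick sketch (assuming the tree JSON structure described above, with invented module codes) that walks the tree looking for empty breakdown lists:

```python
def missing_data_modules(node, path=()):
    """Walk the tree JSON, yielding the path to any node whose
    breakdown is an empty list - i.e. where the data appears to be missing."""
    children = node.get("children")
    if children == []:
        yield path + (node["name"],)
    elif children:
        for child in children:
            yield from missing_data_modules(child, path + (node["name"],))

tree = {"name": "Prog", "children": [
    {"name": "Level 1", "children": [
        {"name": "core", "children": [
            {"name": "MOD1001", "children": [{"name": "Lecture", "size": 24}]},
            {"name": "MOD1002", "children": []},  # no contact data recorded
        ]},
    ]},
]}
print(list(missing_data_modules(tree)))
# [('Prog', 'Level 1', 'core', 'MOD1002')]
```

The visual view spots the gaps at a glance; a traversal like this could turn the same check into a routine data-quality report.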

Something else that's lacking in the visualisations is labelling to show which programme is being displayed, etc. But it should also be noted that my intention here is not to generate end-user reports; it's to explore what sorts of views over the course data we might get using off-the-shelf visualisation components (testing the API along the way), how this might help us conceptualise the sorts of structures, stories and insights that might be locked up in the data, and how visual approaches might help us open up new questions to ask over, or more informative reports to draw down from, the API-served data itself.

Anyway, enough for now. The last thing I wanted to try was pulling some of the API-called data into an R environment and demonstrating the generation of a handful of reports in that context, but I've run out of time just now…