2-BitBio

Tuesday, January 30, 2018

There comes a point in every scientist's career when it's time to leave behind the Excel chart tools.
There are many options spanning a broad range of difficulty and customisability.
Check out this post on Source for a good breakdown of the available programs and libraries.
Personally, I am a huge fan of the R library ggplot2 because of its flexibility and many add-ons, including my favorite, the XKCD add-on.
Lately though I've been working on some data visualisation projects and needed to generate plots on the fly for display in a website.
Enter: D3.js and its offspring plotly.js.
D3 is a JavaScript library that takes input data and uses it to draw, edit and render elements of a website.
The results are stunning data visualisations built right into the HTML itself.
Plotly is a plotting library built on top of D3 and is basically a series of wrappers for the most common chart types.
So why use plotly when we can go straight to the D3 source? Because D3 is difficult. Especially if you aren't completely comfortable in a js/html/css environment.
Then why use D3 when plotly can do (mostly) what we need? Because with D3 the possibilities are virtually endless. If you can dream it, D3 can probably make it.
Want to have a customised interactive plot? D3.
Want to make an interactive network diagram? D3.
How about a heatmap? D3.
Maybe you fancy a scatter plot with poop emojis?
You get the idea...
So let's give it a try. Below we'll make a simple scatter and line plot using D3 and add some interactivity.
CLICK HERE to skip the tutorial and jump to the full code.
Here's what we're making:

The plot is a basic line and scatter plot. Note the nifty hover labels and clickable points! Let's go through the code used to make it. First we need to make a simple web layout. Here I'm making a basic html page with two divs, one for the plot with the class D3Div, and one which will display some data with the id `clickTable`:

Now we are ready to write our JavaScript. We have some dependencies to load. First is the D3 library (we're using version 4, which has a different syntax from v3). We will also use the tooltip plugin tipsy for hover labels, along with jQuery, which tipsy depends on.

Now for some D3 action. The basic workflow in D3 is to select an html element then modify it, adding drawing elements or what have you, according to our data. If we get it right, the drawing elements will show up in the correct spot on the screen to make a plot or whatever it is you're trying to create.

Here let's start by selecting the D3Div div element, adding an svg layer to it, and setting the height and width. This is the layer we'll draw the plot in.
Note that first I'm taking the height and width of the D3Div div and using these as attributes for the svg layer. We use the d3.select() method to select D3Div, then we use .append() to add an svg element.
Finally we set the height and width with .attr and store everything in the var svg.

// get the height and width of the target container
// set padding for the graph
var container = document.getElementById("D3Div");
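The listing above is cut short, so here is a minimal sketch of the step just described. It assumes D3 v4 is loaded on the page and that the plot div can be found by the id D3Div, as the getElementById call suggests. The helper names `plotSize` and `makeSvg` and the padding value of 40 are placeholders of mine, not the original code.

```javascript
// Hypothetical helper: read the drawing size from the container element.
function plotSize(container, padding) {
  return {
    width: container.clientWidth,
    height: container.clientHeight,
    padding: padding // assumed value, tune to taste
  };
}

// Browser-only: select the div, append an svg layer and size it.
function makeSvg(d3, size) {
  return d3.select("#D3Div")
    .append("svg")
    .attr("width", size.width)
    .attr("height", size.height);
}

// Only runs where D3 and a DOM are available.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
  var container = document.getElementById("D3Div");
  var size = plotSize(container, 40);
  var svg = makeSvg(d3, size);
}
```

One thing to watch: an empty div has no height unless the CSS gives it one, so clientHeight would come back as 0.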

Did you notice the "chained" commands? The commands separated by "." run in series, allowing us to pack multiple functions into one D3 call.
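Chaining works because each call returns an object that the next call can act on. Here is a toy sketch in plain JavaScript; the `Box` object is made up for illustration and is not part of D3.

```javascript
// Toy object, not D3: attr() stores a value and returns the object itself,
// which is what makes chaining possible.
function Box() {
  this.attrs = {};
}
Box.prototype.attr = function (name, value) {
  this.attrs[name] = value;
  return this;
};

// Chained: one statement.
var chained = new Box().attr("width", 600).attr("height", 400);

// Unchained: the same calls, one per statement.
var stepwise = new Box();
stepwise.attr("width", 600);
stepwise.attr("height", 400);
```

In D3, .attr() returns the same selection, while .append() returns a new selection holding the appended element, so anything chained after .append() applies to the new element.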

Scales

Now before we can map data to our svg layer we need to set up a scale. This will scale the data to fit the x and y range of our drawing layer which is itself established by the div size.
The scale takes two settings, domain and range. Domain is the interval of our input data, and range is the interval it is scaled to. Check out this image from Jerome Cukier.
For our purposes we want the input interval, domain, to span our dataset, and the output interval, range, to map to the div size minus a little padding. To calculate the span of our input data we're using the built-in d3.max() function to loop through the array and find the maximum.
Then for the range, we pass our height, width and padding variables.
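To make the domain and range idea concrete, here is a rough sketch. The dataset, the sizes, and the choice to start the domain at 0 are all assumptions for illustration. `linearScale` is a hand-rolled stand-in that only shows the arithmetic, and `makeScales` shows roughly how the same thing might look with D3 v4.

```javascript
// Made-up data as [x, y] pairs, plus assumed sizes for the drawing area.
var dataset = [[1, 2], [2, 5], [3, 3], [4, 8], [5, 6]];
var width = 600, height = 400, padding = 40;

// Hand-rolled stand-in for a linear scale, to show the arithmetic only.
function linearScale(domain, range) {
  return function (value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

var xMax = Math.max.apply(null, dataset.map(function (d) { return d[0]; }));
var yMax = Math.max.apply(null, dataset.map(function (d) { return d[1]; }));
var xSketch = linearScale([0, xMax], [padding, width - padding]);
// SVG y grows downward, so the y range is flipped.
var ySketch = linearScale([0, yMax], [height - padding, padding]);

// The same two scales with D3 v4 (needs the d3 object).
function makeScales(d3) {
  var xScale = d3.scaleLinear()
    .domain([0, d3.max(dataset, function (d) { return d[0]; })])
    .range([padding, width - padding]);
  var yScale = d3.scaleLinear()
    .domain([0, d3.max(dataset, function (d) { return d[1]; })])
    .range([height - padding, padding]);
  return { x: xScale, y: yScale };
}
```

With these numbers, an x value of 0 lands on the left padding edge and the largest x value lands on the right one.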

Let's draw some stuff!!

So now we have variables which store functions for selecting the D3Div, the X and Y scales and our datasets. Time to put them together to make something!
We use the .selectAll() function to select all the circle drawing elements (even though they don't exist yet). Then we bind the data with .data() and call .enter() to get a placeholder for each data point that doesn't have an element yet. Then we use .append to draw a circle and .attr to set the x and y positions, termed 'cx' and 'cy'.
The functions we pass to the cx and cy attributes call our scale functions on the appropriate 'column' of the array. The 'r' attribute defines the radius of the circles as 6.
This is all repeated for each element of the data.
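The chain described above can be sketched as a single function. This is my reconstruction under assumptions, not the original listing: `drawCircles` is a hypothetical name, `svg` is taken to be a D3 selection, and the data is assumed to be an array of [x, y] pairs.

```javascript
// Sketch of the select / bind / enter / append chain.
// svg is a D3 selection; xScale and yScale are scale functions.
function drawCircles(svg, dataset, xScale, yScale) {
  return svg.selectAll("circle")   // select circles that don't exist yet
    .data(dataset)                 // bind one datum per circle
    .enter()                       // placeholders for data without an element
    .append("circle")              // create a circle for each placeholder
    .attr("cx", function (d) { return xScale(d[0]); })
    .attr("cy", function (d) { return yScale(d[1]); })
    .attr("r", 6);
}
```

Returning the selection makes it easy to chain more onto the circles later, such as an event handler.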

Not bad! The dots (circles) have been arranged by their cx and cy attributes according to the scaled data. Go ahead and right click and 'inspect element' on the plot to see how it's broken down.
Next we want to add a line. To do that we'll make another variable to store a function using d3.line() to set the x and y coordinates of the path.
Then we append that line to svg using svg.append("path"). We pass the dataset using .datum() this time since we are only drawing one element. Confused? More on .datum() versus .data() by the developer here.
Finally, we set the "d" attribute (which is an SVG attribute that defines a path to follow) by calling our line function with .attr("d", line):
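As a rough sketch of this step: `pathString` is a hand-rolled illustration of what ends up in the "d" attribute for straight segments, and `drawLine` shows how the D3 v4 line generator might be wired up. Both function names and the stroke styling are my assumptions.

```javascript
// Hand-rolled version of the string a straight-line path needs in its "d"
// attribute: M = move to the first point, L = draw a line to the next.
function pathString(points) {
  return points.map(function (p, i) {
    return (i === 0 ? "M" : "L") + p[0] + "," + p[1];
  }).join("");
}

// With D3 v4 the line generator builds that string for us.
function drawLine(d3, svg, dataset, xScale, yScale) {
  var line = d3.line()
    .x(function (d) { return xScale(d[0]); })
    .y(function (d) { return yScale(d[1]); });

  return svg.append("path")
    .datum(dataset)               // one path, so bind the whole array once
    .attr("d", line)
    .attr("fill", "none")         // assumed styling so only the stroke shows
    .attr("stroke", "black");
}
```

Because the whole array is bound as a single datum, the line function receives every point at once and returns one path string.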

Axes

Now it's time for some axes. We should be used to storing functions in variables by now and here we make variables of the d3 functions d3.axisLeft() and d3.axisBottom() and pass our scale functions so they're the correct size.
Then we need to position them. We can use our previously defined height and padding variables to tell them where to sit in the div.
Finally, we add them to the plot using .append() and .call(). We set their position by passing xAxisPosition and yAxisPosition to the transform attribute. Note that translate(x,y) shifts an element by x pixels horizontally and y pixels vertically:
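A minimal sketch of the axis step, assuming the same sizes as before. `xAxisPosition` and `yAxisPosition` are the names used in the text, while `drawAxes` and the numbers are placeholders of mine.

```javascript
var width = 600, height = 400, padding = 40; // assumed sizes

// translate(x,y) shifts a group x pixels right and y pixels down.
var xAxisPosition = "translate(0," + (height - padding) + ")";
var yAxisPosition = "translate(" + padding + ",0)";

// Needs the d3 object, an svg selection and the two scales.
function drawAxes(d3, svg, xScale, yScale) {
  var xAxis = d3.axisBottom(xScale);
  var yAxis = d3.axisLeft(yScale);
  svg.append("g").attr("transform", xAxisPosition).call(xAxis);
  svg.append("g").attr("transform", yAxisPosition).call(yAxis);
}
```

Each axis is drawn into its own g element, so moving the group moves the axis line, ticks and tick labels together.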

Axis Labels

Now we just need some labels.
We simply set their position using the height, width, and padding variables and add them to the plot using the now familiar .append and .attr functions.
Here we can also add a bit of styling using .style:
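One way the labels might be placed, as a sketch only. The helper names, the exact offsets and the label text are assumptions and will need tuning against your own axis ticks.

```javascript
// Hypothetical helper: where to anchor each label.
function labelPositions(width, height, padding) {
  return {
    xLabel: { x: width / 2, y: height - padding / 4 },
    yLabel: { x: padding / 2, y: height / 2 }
  };
}

function drawLabels(svg, pos) {
  svg.append("text")
    .attr("x", pos.xLabel.x)
    .attr("y", pos.xLabel.y)
    .style("text-anchor", "middle")
    .text("x value"); // placeholder label text

  svg.append("text")
    // move to the anchor point, then rotate so the text reads bottom to top
    .attr("transform", "translate(" + pos.yLabel.x + "," + pos.yLabel.y + ") rotate(-90)")
    .style("text-anchor", "middle")
    .text("y value"); // placeholder label text
}
```

Setting text-anchor to middle centres each label on its anchor point, which saves measuring the text width.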

Hey, that's a proper plot right there! But wait, there's more! Since this is JavaScript we can do all the fun javascripty things like write callback functions!

Making plots interactive:

Here let's make a function that makes the dots (circles) clickable. To do that, we make a function 'clicked'. The function has four major steps:
1 : Select and set all the circles to black (to undo previous color changes).
2 : Invert the scale to go from pixels to data and call findNearest() to get the closest data point. This function from Andy Aiken on scottlogic.com is defined just below. It loops through the data and compares each data point to the clicked data and finds the smallest difference i.e. the closest data point. Clever huh?
3 : Use nearest to set the div clickTable to include those points.
4 : Change the color of the clicked dot to red (and kind of blink while doing so).
We invoke the function clicked by adding it to the end of the circle drawing code chunk (since it targets only the circles).
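The four steps above could look roughly like this. It is a sketch under assumptions: `findNearest` here is my own minimal version that compares x values only (not the scottlogic.com code), `makeClicked` is a hypothetical wrapper, and the blink timings are made up.

```javascript
// My own minimal take on a nearest-point search: compare every point's
// x value with the clicked x and keep the closest.
function findNearest(dataset, clickedX) {
  var nearest = null;
  var smallest = Infinity;
  dataset.forEach(function (d) {
    var diff = Math.abs(d[0] - clickedX);
    if (diff < smallest) {
      smallest = diff;
      nearest = d;
    }
  });
  return nearest;
}

// Builds the click handler. D3 v4 calls it with `this` set to the clicked circle.
function makeClicked(d3, dataset, xScale) {
  return function clicked() {
    // 1: reset every circle to black
    d3.selectAll("circle").style("fill", "black");
    // 2: pixels -> data, then look up the closest point
    var xClicked = xScale.invert(d3.mouse(this)[0]);
    var nearest = findNearest(dataset, xClicked);
    // 3: show the point in the clickTable div
    d3.select("#clickTable").text("x: " + nearest[0] + ", y: " + nearest[1]);
    // 4: turn the clicked circle red, growing and shrinking it as a "blink"
    d3.select(this)
      .style("fill", "red")
      .transition().duration(150).attr("r", 10)
      .transition().duration(150).attr("r", 6);
  };
}
```

To hook it up, add `.on("click", makeClicked(d3, dataset, xScale))` to the end of the circle chain. Note that d3.mouse belongs to the v4 API used here; later major versions of D3 handle pointer coordinates differently.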

Great! But doesn't plotly do all this?

As discussed in the introduction, plotly is great for routine charting.
If you want to do anything outside the bounds of plotly though, D3 can make it happen.
For example, here's the same plot drawn with poop emojis!
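I can't show the original emoji code here, but one plausible way to get this effect is to draw an svg text element at each point instead of a circle. This is a sketch under that assumption; `drawEmojis` and the `emoji` class name are hypothetical.

```javascript
// Assumption: swap each circle for an svg text element holding an emoji.
function drawEmojis(svg, dataset, xScale, yScale) {
  return svg.selectAll("text.emoji")   // class keeps axis label text out of the selection
    .data(dataset)
    .enter()
    .append("text")
    .attr("class", "emoji")
    .attr("x", function (d) { return xScale(d[0]); })
    .attr("y", function (d) { return yScale(d[1]); })
    .style("text-anchor", "middle")
    .text("💩");
}
```

The chain is the same select, bind, enter, append pattern as for the circles, with x and y in place of cx and cy.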

Putting it all together:

In conclusion, I usually don't recommend re-inventing the wheel. So if plotly provides what you're looking for, that might be the way to go. If, however, you want to make something completely new, or just like a challenge, D3 is worth taking the time to look into. The learning curve for D3 is steep, but once you get a handle on the workflow it comes quite naturally.
Also check out bl.ocks.org for inspiration and code. Good luck!
Here's all the code in one block for copy-pasting:

/* You have to select the circles even before they're created. The data function binds data to the circles, the enter function creates the circles, and the append function sticks them on. Then we set the attributes for each: x, y and r (radius). */

Friday, December 22, 2017

One of my favorite things about R is the massive number of community-built packages. Besides the data science essentials like tidyverse, there are some quirky, fun ones like emoGG for when you want :hankey: instead of the normal pch icons. So it was to my nerdy elation that I found the ggplot add-on xkcd to draw ggplots in the style of the legendary webcomic. Let’s give it a try:

You can also include stick figures! They are a bit tough to draw though. I recommend experimenting with the angles until you get it right. The vignettes do a good job of explaining how the different parameters map.