DRYing up D3

Stop Repeating Yourself.

D3.js is a powerful, extensible library for data
visualization. It makes some fairly advanced data visualization ideas available
to anybody who can bind data to DOM elements. The set of
supported features is vast, including everything from layouts like the stream
graph to an arsenal of map projections for the geodesist. However, D3 achieves
this amazing breadth of utility through an unconventional programming pattern.
For example, to add new SVG circle elements to an existing <svg> element, you’d
do something like:

    // select all the circles in `svg` and return them
    var existing_circles = svg.selectAll("circle")

    // bind `your_data_array` to the circles
    var data = existing_circles.data(your_data_array)

    // make a placeholder for each data element without a circle
    var placeholders = data.enter()

    // and append a circle for each of the placeholders above
    var all_circles = placeholders.append("circle")

This style of declaration is a double-edged sword. It is terse and
powerful: You can bind data to DOM elements, execute transitions, enter new
elements, and remove obsolete elements all in a single block of code. On the
other hand, it defies naive attempts to avoid repetition. This can lead to some
pretty awful spaghetti code. Hence, it’s important to take a proactive stance
on code repetition: noting where it happens and abstracting it away. The first
thing to do is to have a decent working definition for the task. This
semi-formal definition is my working concept of what a data visualization is:

Chart: Let data be the set of possible data points. Let the set of
graphical elements be called graphics. Finally, let mappings be a collection
of functions transforming dimensions of data into graphics elements on
the screen. Then a Chart is a triple (mappings, data, graphics).

data is an exogenous variable, and choosing graphical elements is largely a
design decision. Of course, this can get complicated, but much of D3’s rich
feature set is targeted at resolving the intricacies of this problem. However,
the treatment of the mapping element of the triple is bare bones.

D3 does the heavy lifting with the d3.scale object, which implements a variety
of common mappings from data space to screen space. For example, in this area
chart, courtesy of Mike Bostock, d3.scale.linear and d3.time.scale map data to
the vertical and horizontal dimensions of the screen, respectively.

`d3.time.scale` lives apart from `d3.scale.linear` and its other
relatives because of how strange time is, *not* because it is conceptually
different.

However, d3.scale also leaves a few steps to the user, and for the most part,
people implement these steps over and over and over again, and they do so for
each mapping function in their data visualization.

This isn’t good: it’s tedious, error-prone, and certainly not
D.R.Y. There are
essentially two operations that are responsible for most of the repetition:
Accessing data elements and computing the output of the mapping function.
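
To make the repetition concrete, here is a rough sketch of the pattern in question. The `linear_scale` helper is a hypothetical stand-in for a configured `d3.scale.linear`, included only so the snippet runs without the library, and the `day` and `close` fields are made up for illustration.

```javascript
// Hypothetical stand-in for a configured d3.scale.linear():
// maps a numeric domain onto a numeric range.
function linear_scale(domain, range) {
  return function(value) {
    var t = (value - domain[0]) / (domain[1] - domain[0])
    return range[0] + t * (range[1] - range[0])
  }
}

// Made-up data for illustration.
var data = [{day: 0, close: 0}, {day: 5, close: 30}, {day: 10, close: 15}]

var x = linear_scale([0, 10], [0, 500])
var y = linear_scale([0, 30], [300, 0])   // SVG y grows downward, so the range is flipped

// The same two steps (access, then scale) written out once per use.
function line_x(d) { return x(d.day) }
function line_y(d) { return y(d.close) }
function dot_x(d) { return x(d.day) }
function dot_y(d) { return y(d.close) }

var points = data.map(function(d) { return [line_x(d), line_y(d)] })
var dots = data.map(function(d) { return {cx: dot_x(d), cy: dot_y(d)} })
```

Every new graphical element that depends on `day` or `close` tends to grow another pair of these one-line functions, which is exactly the duplication worth abstracting away.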

Initially these were the only functionalities my solution addressed, but there
are a few additional operations we almost always do once we have the accessor
function, the scale, and the data. The first is computing
the extent of the data, so we can figure out the ratio between units of data
and units on the screen (whether they be in Cartesian coordinates or RGB).
Second, we should always include an axis for the graph, although interaction
can alleviate some of that pressure. In any event, I can’t imagine a case in which
you would use d3.svg.axis without eventually using a scale, so it
makes sense to keep these ideas together.
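
As a sketch of the extent step, the snippet below computes the minimum and maximum of one data dimension and derives the data-to-screen ratio from them. The `extent` function is a hand-rolled stand-in for `d3.extent` so that the example runs without the library; the data and the 280-pixel height are assumptions chosen for illustration.

```javascript
// Hand-rolled stand-in for d3.extent(array, accessor):
// returns [min, max] of the accessed values.
function extent(data, accessor) {
  var values = data.map(accessor)
  return [Math.min.apply(null, values), Math.max.apply(null, values)]
}

// Made-up data for illustration.
var data = [{close: 12}, {close: 40}, {close: 20}]

var domain = extent(data, function(d) { return d.close })

// With a plot area `height` pixels tall, each unit of data spans this many pixels.
var height = 280
var pixels_per_unit = height / (domain[1] - domain[0])
```

Here `domain` is `[12, 40]` and `pixels_per_unit` is `10`. In practice the extent is handed to the scale's `domain`, and the scale does this arithmetic for you.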

*N.B.* I use fellow Airshipper Chris Dickinson's [style
guide](https://gist.github.com/chrisdickinson/c82ae21ef6c962f59f3d). It's a
little idiosyncratic, but before you flame his inbox for the absence of
semicolons, you should understand his reasoning, laid out [in the unabridged version](https://gist.github.com/chrisdickinson/243ce66e936d95fab40b).

I think the amount of repeated thought involved in creating D3 scales and axes
is a bit of a wart, so I wrote a class, called Mapping, to relieve
the pain. This class lets you stop thinking about what a scale actually is,
and provides a few convenience methods for the common scale-related tasks
described above.

You initialize the class with the base d3.scale object of your choice, and
the accessor function responsible for computing inputs to the scale. Commonly
the accessor is as simple as function(d){return d.x}, but even in those cases, once
you’ve defined it you never need to think about it again. An example Mapping
might be:
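
A minimal sketch of such a construction follows. It assumes only what the text states: the constructor takes a scale and an accessor. The constructor body, the `x_scale` stand-in (used so the snippet runs without D3), and the name `x_mapping` are all hypothetical.

```javascript
// Hypothetical minimal constructor: a Mapping is built from a scale
// and an accessor, and keeps both.
function Mapping(scale, accessor) {
  this.scale = scale
  this.accessor = accessor
}

// Stand-in for a configured scale, so the sketch runs without D3.
// It behaves like a linear scale with domain [0, 100] and range [0, 500].
function x_scale(value) { return value * 5 }

// With D3 v3 loaded, the first argument would instead be something like
// d3.scale.linear().range([0, 500]).
var x_mapping = new Mapping(x_scale, function(d) { return d.x })
```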

If you need the value of a data element p, just call
mapping.accessor(p). More commonly you’d just call mapping.place to map p
into the screen space.

Finally, there are two convenience methods: mapping.compute_domain, which takes
the data array, calculates the extent of the data set, and updates the domain
of the d3.scale object; and mapping.create_axis, which returns a newly created
d3.svg.axis() with the scale attribute already set.

Methods involving the scale or the axis return the appropriate object, so you
can continue method chaining like you’re used to.
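
Putting the pieces together, the following is a hypothetical reconstruction of the Mapping class from the description above, not the original implementation. The method names `place`, `compute_domain`, and `create_axis` come from the text; their bodies are assumptions. The small `d3` object at the top is a stand-in for the slice of the D3 v3 API used here (`d3.scale.linear`, `d3.extent`, `d3.svg.axis`), so the sketch runs without the library.

```javascript
// Stand-in for the slice of the D3 v3 API used below, so the sketch runs
// without the library. With D3 v3 loaded, delete this object.
var d3 = {
  scale: {
    linear: function() {
      var domain = [0, 1], range = [0, 1]
      function scale(v) {
        var t = (v - domain[0]) / (domain[1] - domain[0])
        return range[0] + t * (range[1] - range[0])
      }
      scale.domain = function(d) {
        if(!arguments.length) return domain
        domain = d
        return scale
      }
      scale.range = function(r) {
        if(!arguments.length) return range
        range = r
        return scale
      }
      return scale
    }
  },
  svg: {
    axis: function() {
      var current_scale = null
      var axis = {}
      axis.scale = function(s) {
        if(!arguments.length) return current_scale
        current_scale = s
        return axis
      }
      return axis
    }
  },
  extent: function(data, accessor) {
    var values = data.map(accessor)
    return [Math.min.apply(null, values), Math.max.apply(null, values)]
  }
}

// Hypothetical reconstruction of Mapping; method bodies are assumptions.
function Mapping(scale, accessor) {
  this.scale = scale
  this.accessor = accessor
}

// Map a data element into screen space: access, then scale.
Mapping.prototype.place = function(d) {
  return this.scale(this.accessor(d))
}

// Set the scale's domain to the extent of `data`; returns the scale.
Mapping.prototype.compute_domain = function(data) {
  return this.scale.domain(d3.extent(data, this.accessor))
}

// Return a new axis already bound to this mapping's scale.
Mapping.prototype.create_axis = function() {
  return d3.svg.axis().scale(this.scale)
}

var data = [{x: 2}, {x: 6}, {x: 10}]
var x = new Mapping(d3.scale.linear(), function(d) { return d.x })

// compute_domain returns the scale, so the chain continues with .range()
x.compute_domain(data).range([0, 400])

var placed = data.map(function(d) { return x.place(d) })
var axis = x.create_axis()
```

With this data, `placed` is `[0, 200, 400]`. One caveat of writing `place` as a prototype method: passing `x.place` directly as a callback loses `this`, so it has to be wrapped as above or bound with `x.place.bind(x)`.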