Archive

Auto-writing Software

Good software writes itself. We all know the patterns and following them makes things easy. Not because we have memorized any monumental volume on design patterns, but because years of experience taught us what should be done on an almost intuitive level.

Also in a negative sense. We avoid duplication, globals, encapsulation violations. We know that planting a global variable somewhere to “facilitate” object interaction as innocent as it appears at the moment, throws a huge wrench into extensibility and new feature development. (This is known by a majestic name: legacy). Even if in a moment of weakness we allowed ourselves to relax unnecessarily the software itself would soon make us realize the error of our ways by resisting any attempt to extend it. And then with a sigh we go back regretting a momentary weakness and do things right.

In this case, however, software design dictated the user experience.

The Context

Last year, I did a brief stint on the DNA Storage Project at Microsoft Research. My task was to create an app that would streamline the process of encoding, synthesizing then decoding and analyzing the data on DNA molecules.

The UX Failure

Our intrepid UX designers sketched a pretty simple UX. The workflow appeared uncomplicated enough with just a few interacting entities. It was natural to split it into the “synthesis” and “analysis” parts, and then a few hierarchical grids or grids with links should have done the job. The MEAN stack fit the need and Azure was to host all the storage and backend processing.

Everything worked great until it came to the UX. No matter how much I huffed and puffed it would not yield. Whatever I did, I could not build the MVVM workflow. The model on the “synthesis” and the “analysis” side consisted of pretty much the same entities (files, pools, runs), however, their semantics while similar in some aspects were drastically different in others. While it seemed natural to have a model with a few abstract base classes wrapping the commonalities, the View-Model became an insurmountable obstacle: different View interactions demanded a lot of “similar but different” code in the VM; duplication was everywhere. The code was coming out entwined, and unmaintainable.

d3.select.enter

The inspiration came from the House episode Failure to Communicate. House is in Baltimore justifying some of his cases to a Medicare review officer. One of the cases involved prescribing Viagra to a woman with heart problems because she did not tolerate Nitroglycerin. Medicare was refusing to cover the off-label use.

Using D3 for UX development is an off-label use of D3. Perhaps I could have picked a different framework, but this was the one I was familiar with. The idea was to throw away MVVM and turn the entire experience in a data-driven “video-game” as one of the UX designers on the project called it. Then all the conceptual difficulties simply dissolved. With D3, I could easily streamline the peculiarities of the workflow, and, while new challenges emerged (entities interactions, drag & drop, etc), the software resumed its course “writing itself”.

Usage is simple: you pick a color scheme and add it to the class of your parent element that will contain the actual SVG elements displayed, e.g.: Spectral. Then, one of the “qi–n classes are assigned to these child elements to get the actual color.

So for instance:

The main SVG element on the Crime Explorer visual looks like this:

<svg class="Spectral" id="svg_vis">...</svg>

Then, each of the “circle” elements inside this SVG container will have one of the qi-9 (I am using 9 total colors to display this visualization so i ranges from 0..8).

All of this is supported by the BubbleChart class with some prodding.
You need to:

Pass the color scheme to the constructor of the class derived from BubbleChart upon instantiation:

allStates = new AllStates('vis', crime_data, 'Spectral')

Implement a function called color_class in the derived class, that will produce a string of type “qi-n”, given an i. The default function supplied with the base class always returns “q1-6”.

@color_class =
d3.scale.threshold().domain(@domain).range(("q#{i}-9" for i in [8..0]))

In my implementation, I am using the d3 threshold scale to map a domain of values to the colors I need based on certain thresholds. The range is reversed only because I want “blue” colors to come out on lower threshold values, and “red” – on higher ones (less crime is “better”, so I use red for higher values). See AllStates.coffe for a full listing.How this is hooked up tho the actual data is discussed in the next section.

Data Protocol

This is key: data you pass to the BubbleChart class must comply with the following requirements:

It must be an array (not an associative array, a regular array). Each element of this array will be displayed as a circle (“bubble”) on the screen.

Each element must contain the following fields:

id – this is a UNIQUE id of the element. It is used by BubbleChart to do joins (see d3 documentation for what these are)

value – this is what the “value” of each data element is, and it is used to compute the radius of each bubble

group – indicates the “color group” to which the bubble belongs. This is what is fed to the color_class function to determine the color of each individual bubble

With all these conditions satisfied, the array of data is now ready to be displayed.

Figure a blog without pictures or conversations is just boring, so, here it is.

Robbery in Cali

Lately, I have been dealing a lot with data visualization. This was a brand new area for me and while we do use F# for data extraction, all of the front end is done using d3, an amazing toolkit by Mike Bostock.

First of all, I owe the fact that my projects got off the ground to Mike and Jim Vallandigham. Jim taught me all I know about how to draw bubble “charts” and use d3 force layouts. His blog is invaluable for anyone making first steps in the area of data visualization. Code snippets I am going to post here are due to Jim’s and Mike’s generosity. So, thank you Mike and Jim.

One may ask, if there are already tutorials on how to put together visuals, why assault the world with more musings (as a Candace Bushnell character once wrote). The answer is: my goal in these posts is not to exploring creation of visuals, but rather sharing experiences on how to put together projects that involve visualizations.

These are very different problems, since your task is not just to create a single document or web page for a single purpose, but to create something that can dynamically build these documents or pages, and maybe, within each such document provide different views of the same data. Questions of:

design

reuse

coding practices

come up right away, not to mention general problems:

What are data visualizations?

What are they used for?

Are they needed at all?

So, for these posts, we will be building a project that visualizes crime statistics in the US for the year 2008. The data source for this is found here and the complete solution will look like this.