Update on WEAVE government data visualization software

User-friendly visualization tools are coming early next year.

On Tuesday I heard the dynamic University of Massachusetts at Lowell
professor Georges Grinstein talk about WEAVE (Web-based Analysis and Visualization Environment), a visualization tool for
public data. One of the coolest things about WEAVE is the very idea of
it. About 10 government agencies decided three years ago (before the
Gov 2.0 movement was hot) to put their data out for easy public
consumption, and to collaborate around it with the hope of eventually
being able to combine all their data. These governments have combined
into the Open Indicators Consortium to fund and guide development.

When WEAVE started it was pretty ground-breaking; now one can cite
lots of related projects. Data.gov alone (the major US federal site
for data) boasted 305,692 datasets when I checked it right after
Grinstein’s talk. But as Grinstein points out, most sites think
they’re being hip just by putting out computer-consumable data
sets. These are a big step up from PDFs, but the missing piece is ways
to interpret the data, which is being left up to outside programmers.

Between vision and utopia lies a lot of stumbling and fumbling, and
Grinstein let us in on a little of it on Tuesday. One stumble was
dropped Internet connectivity in our high-style hotel right in
Cambridge’s Harvard Square — a pocket of Third-World deprivation in the
midst of one of the world’s broadband utopias. So we didn’t get to see
much of the visualizations, and all the remote viewers who wanted a
live feed were shut out, but from the stills I could tell that WEAVE
offered many of the same animated, interactive visualizations that one
can build with the Processing language.

These apps typically use color, texture, size, and position in
creative ways and then let you drag, click, zoom in and out, and
manipulate the data yourself. For instance, if you see a state map and
click on a county, the county might expand to fill the screen while in
the background your browser requests detailed county data from the
server.
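
The click-then-fetch pattern described above can be sketched in a few
lines. This is only an illustration in Python, not WEAVE's code (WEAVE
runs on Flash/Flex, and I didn't see its internals). The class
CountyMap, the function fetch_from_fake_server, and the figures in
FAKE_SERVER are all made up for the example, and a real client would
make the request asynchronously over HTTP.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for the server. The county names and numbers are
# invented for illustration; a real client would issue an HTTP request.
FAKE_SERVER: Dict[str, dict] = {
    "County A": {"population": 100, "households": 40},
    "County B": {"population": 200, "households": 75},
}


def fetch_from_fake_server(county: str) -> dict:
    """Pretend to ask the server for one county's detailed data."""
    return FAKE_SERVER[county]


class CountyMap:
    """Tracks which county is in focus and loads its detail on demand."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, dict] = {}
        self.requests_made = 0
        self.focused: Optional[str] = None

    def click(self, county: str) -> dict:
        # The display would zoom to the clicked county here.
        self.focused = county
        # Only go to the server the first time a county is clicked.
        if county not in self._cache:
            self.requests_made += 1
            self._cache[county] = self._fetch(county)
        return self._cache[county]


if __name__ == "__main__":
    state_map = CountyMap(fetch_from_fake_server)
    print(state_map.click("County A"))
    print(state_map.click("County A"))  # served from the cache
    print("requests made:", state_map.requests_made)
```

The cache is the design choice worth noting: the state map can load
quickly with summary data, and each county's detail crosses the network
only once, when someone actually asks for it.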

Java-based Processing is supposed to be able to handle data sets many
orders of magnitude greater than the ones amenable to Flash/Flex,
which is the basis of WEAVE. (Apple’s decision not to support Flash on
iOS devices is clearly another stumble that lies outside of
Grinstein’s control.) The WEAVE team is excited about the potential of
rewriting the display engine in HTML5, but they have to see whether
their backers will fund development. I suspect that the government
agencies don’t understand what HTML5 will enable (although telling
them it will run on iOS devices may persuade them) but I also trust
that the WEAVE team will get the port done by hook or by crook.

The WEAVE team designed their server to use generic, open source
components so that installation would be easy. Even so, the security
configurations of the different host servers caused difficulties, and
the team had to spend a lot of unbudgeted time on sysadmin support,
another stumble outside their control.

Tuesday’s presentation was typically Cantabrigian, from the Boston
accents (we heard a lot about “visualizing dayter”) to the
interminable questions about formats, architecture, and other
technical details from an audience of non-profit reps who had spent as
much time tuning a computer system as distributing meals to the needy.
There wasn’t time for a lot of technical discussion, but I caught that
WEAVE doesn’t have to run on the server of the agency that provides
the data; the agency can feed data to a WEAVE server running somewhere
else. The end-user needs nothing except a browser with a Flash plugin.

We also had a discussion about the code and license status of WEAVE,
which will be released in or shortly before March of next year. The
University of Massachusetts at Lowell decided there’s too much
intellectual property tied up in WEAVE to release it as open source,
and Grinstein feels fine about that because in his experience, good
open source projects mature in a closed environment. I thought of two
counter-examples that I’m using right now, GNU/Linux and GNOME, and I
exchanged some email with Grinstein where he gave his interpretation
of their history and how they too reflect the importance of a long
gestation.

As it is, the source code will be published and anyone can use WEAVE
free for non-commercial use, while commercial users will pay very
modest fees. Donations of code will definitely be appreciated, but
Grinstein expects that one-quarter of the team’s time will be spent
checking over, testing, and vetting donations. Not a free lunch for
them. Grinstein wants to make sure that consortium members have good
code, because their public users will have a low tolerance for bugs.

Still, in a few years, WEAVE may well go out under an open source
license.

What’s in the future for WEAVE? One of the most intriguing features
they’re considering is collaboration. Even upon the first release, you
should be able to run a visualization, save it, and pass the URL
around for others to comment on. Eventually they hope to let users
work together to produce and view interactive animations in real time.
They also are looking for ways to filter data on the server side so
that less needs to be transmitted over the network.
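
To make the server-side filtering idea concrete, here is a minimal
sketch in Python. It is an assumption about how such filtering could
work in general, not a description of what the WEAVE team plans. The
function names filter_rows and payload_size, the field names, and the
sample values in ROWS are all hypothetical.

```python
import json
from typing import Iterable, List, Optional, Sequence

# Made-up sample records standing in for an agency's indicator data.
ROWS = [
    {"county": "A", "year": 2009, "indicator": "unemployment", "value": 5.1},
    {"county": "A", "year": 2010, "indicator": "unemployment", "value": 6.3},
    {"county": "B", "year": 2010, "indicator": "unemployment", "value": 7.0},
    {"county": "B", "year": 2010, "indicator": "poverty", "value": 12.4},
]


def filter_rows(
    rows: Iterable[dict],
    year: Optional[int] = None,
    indicator: Optional[str] = None,
    columns: Optional[Sequence[str]] = None,
) -> List[dict]:
    """Keep only the rows and columns the visualization asked for."""
    selected = []
    for row in rows:
        if year is not None and row["year"] != year:
            continue
        if indicator is not None and row["indicator"] != indicator:
            continue
        if columns is None:
            selected.append(dict(row))
        else:
            selected.append({name: row[name] for name in columns})
    return selected


def payload_size(rows: List[dict]) -> int:
    """Number of bytes the rows occupy once serialized as JSON."""
    return len(json.dumps(rows).encode("utf-8"))


if __name__ == "__main__":
    subset = filter_rows(
        ROWS, year=2010, indicator="unemployment", columns=["county", "value"]
    )
    print(subset)
    print(payload_size(ROWS), "bytes unfiltered")
    print(payload_size(subset), "bytes filtered")
```

The client names the year, indicator, and columns it needs, and the
server serializes only that slice, so the browser never downloads rows
it would throw away.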

Development process has a big effect on a project’s success, and the
WEAVE process is hard to classify — a bit gawky, in my impression,
although they are using agile methods. WEAVE seems like a “get it out
there” kind of project rather than a grand-vision kind of project,
which is fine and may be the key to success. But it means such
compromises as making it easy for agencies to submit spreadsheet
content instead of trying to formalize a scheme for accepting
well-formatted data.

And despite the leaning toward open software, WEAVE is very much a
U.Mass. Lowell Computer Science department project, beholden to the
dictates of the university and the research needs of the students.
This doesn’t mean the software will be bad — in fact, Grinstein tells
me he has built commercial-grade software and built entire companies
using student programmers.

WEAVE also has a clearly delineated set of funders, whose priorities
will direct development more than user reaction to the
visualizations — or rather, user reactions are important but will be
filtered heavily through the funders. But WEAVE has generated a lot of
excitement among the public anyway, so I’m sure they’ll line up to try
it, and in three to six months we’ll be able to judge its value.

http://praxagora.com/andyo/ Andy Oram

I believe the project is not ready to do a general release yet. They test the software with their sponsors first. The software should be available cost-free to non-commercial users when the team thinks it’s ready.

Jim Farnam

Weave and the Weave API were released in beta under open source licenses on June 15, 2011. We are looking for collaborators and developers to join us! See project web site at http://www.oicweave.org.
