Archive for March 7, 2012

Last week was a thought-provoking one for me, with Big Data Camp on Monday, February 27, a “Great Demo” workshop on February 29, and a tour of the Strata 2012 exhibit hall on March 1. At all three events I encountered either examples or stories of visualizations that required hundreds to thousands of CPU hours to create.

Thanks to the commoditization that cloud computing is now driving, you can spend $100 and get 1,000 CPU hours: the equivalent of a dozen racks of computers in a data center for a day, depending upon how many chips and cores you can usefully fit into an enclosure. It’s roughly equivalent to running one computer for six weeks to get an answer.
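For anyone who wants to check the back-of-envelope math, here is a minimal sketch in Python. It assumes a flat rate of a dime per CPU hour and treats "one computer" as a single core running around the clock; the function names are mine, purely for illustration, and real cloud pricing varies by instance type.

```python
HOURS_PER_DAY = 24
HOURS_PER_WEEK = HOURS_PER_DAY * 7


def cost_in_dollars(cpu_hours, dollars_per_cpu_hour=0.10):
    """Cost of a job, assuming a flat (hypothetical) price per CPU hour."""
    return cpu_hours * dollars_per_cpu_hour


def weeks_on_one_machine(cpu_hours):
    """Wall-clock weeks if a single core ran the whole job non-stop."""
    return cpu_hours / HOURS_PER_WEEK


def cores_for_one_day(cpu_hours):
    """Number of cores needed to finish the same job in 24 hours."""
    return cpu_hours / HOURS_PER_DAY


if __name__ == "__main__":
    job = 1000  # CPU hours
    print(f"Cost: ${cost_in_dollars(job):.2f}")
    print(f"One machine: {weeks_on_one_machine(job):.1f} weeks")
    print(f"Finished in a day: {cores_for_one_day(job):.1f} cores")
```

Under those assumptions 1,000 CPU hours works out to about $100, just under six weeks on a single core, or roughly 42 cores for one day. How many racks that fills depends on how many cores you pack into each enclosure.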

A lot of these visualizations involve streaming data, realtime processing, or graph calculations where one size does not fit all: SQL and Hadoop are not going to be the pervasive computing building blocks. I think we are in for a period of intense experimentation and a search for new technologies (or perhaps old algorithms that can be given the computing horsepower they need to get rapid solutions).

A lot of weird new data stores, visualization techniques, and parallel processing methodologies are going to find problem architecture niches and become familiar tools in the next few years.

“When the going gets weird, the weird turn pro.”
Hunter S. Thompson

Three questions for those playing the home game

What’s the strangest useful visualization you have seen in the last year?

What picture is worth a thousand CPU hours–or about $100 at a dime an hour on Amazon–to you?

What models for “Insight as a Service” do you see emerging?

Related Blog Posts

Hadoop Summit 2009 Quick Impressions
The show reminded me a lot of INTEROP 88, the year that Interop transitioned from workshop to trade show with a few dozen vendors at the Santa Clara Convention Center. The vendor ecosystem for Hadoop is not yet as diverse, but the focus was clearly on system administration and technology, with the applications discussed in highly technical language. The crowd seemed to be researchers and system programmers for the most part, but the potential business impacts are starting to become a lot clearer.