HDInsight: A Vignette – Part Two of the Cortana Cadence Series

Valuable data insights are more accessible than you might think. Sometimes all it takes is a step back to see how ubiquitous and attainable they really are. Let me paint you a picture based on some of our experiences…

I was considered a heretic by everyone in the room. I felt like Galileo standing before the Inquisition, bombarded with leading questions, emotional arguments, and pernicious attacks on my motivations. Just a week earlier, Richa had called Northwest Cadence as the new CIO of an advertising agency. We had done some process improvement work with her at her previous company, where we had a brief lunchtime discussion about the escalating importance of big data. Now she was in a position to actually leverage big data, which was why she called us. I was there to introduce her nontechnical company to big data and build out a proof of concept solution. The only problem was, this was not a popular decision and Richa was nowhere to be found. Although the company was only slightly profitable, I found myself on the wrong side of a very hostile room espousing the immense value of big data. “Thanks, Richa,” I thought.

Preaching the significance of big data as a competitive advantage was clearly not working, so I took a different tack. “Forget big data for a moment,” I said in all sincerity. “How do you measure the success of a campaign? How do you know what works and what doesn’t?” After a few uncomfortable moments of silence, I could feel the audience shifting gears. The responses wearily trickled in, becoming more measured and less emotional. Mostly they relied on a combination of gut feelings and delayed metrics, like an increase in units sold or website hits, to gauge the success of a marketing effort. “Great!” I declared, scribbling bullet points of success metrics on the whiteboard as they were shouted out. “Now, what kind of information would be nice to have? For example, what about Tweets, YouTube comments, or email sentiment?” I asked. Slowly I started to see a confluence of approving nods, so I added the suggestions to our list. As we continued brainstorming, the atmosphere shifted again, from traditional marketing reflexes to creative collaboration. This was their strong suit. Before long I was squeezing potential data sources onto the board sideways just to capture them all. “Now imagine that we have all of this data in our very own supercomputer. What questions could we ask it?” I said, taking a seat. A number of scattered suggestions were put forth, but then the woman beside me really wowed everyone when she casually noted, “I suppose we could do faster A/B and split testing.” That was it. They were starting to see the promise of big data.

I just had one last hurdle to overcome: a thick, gruff man named Tom. Just as I began to feel at ease, Tom flared up. “We don’t have the time or money for all of this pie in the sky [stuff]. And who the [heck] is going to manage it? I know Richa won’t and I sure as [turd] ain’t going to do it.” “Tom has a good point,” I said with an amenable expression. “Typically big data solutions require a lot of overhead. You need lots of computers to create a cluster; you need lots of physical space with air conditioning to house those computers; and you need a trained and dedicated admin to maintain the infrastructure. That usually requires a lot of time and money. Fortunately for us, though, we can build our big data solution in the cloud.” I went on to explain how Azure lets us create and dynamically size Hadoop clusters in minutes with HDInsight. All of the physical hardware and scaling issues are handled automatically by Microsoft. Once it’s set up, everything can be automated, from data input to graphical dashboards. It even lets us pipe data through Azure Event Hubs and Azure Machine Learning models with minimal effort. Moreover, the data persists in blob storage, so if anything ever goes wrong while processing the data, it won’t be corrupted or lost. Although he was losing ground, Tom remained intractable and reiterated, “What about price? If it’s not affordable all that [stuff] is moot.” After doing some quick and dirty usage estimates, we used the Azure Pricing Calculator to determine that the cost would ultimately be negligible, especially compared to the business insights they could gain.

After the meeting, I asked Tom to sit down with me and build a proof of concept architecture in Azure. He reluctantly agreed, and together we built out a big data solution. Within the hour we had a fully automated pipeline consuming raw, unorganized data, which was then processed and displayed as graphical summaries on a real-time Power BI dashboard. Tom never let on that he was impressed, but during our follow-up call with Richa we learned that Tom is now their self-appointed “Azure-data-guy”. In fact, Richa mentioned that HDInsight has brought about a culture shift in the company. People are starting to think in terms of big data, asking big questions, and getting even bigger results.