Data-driven decisions

Big Data is all the rage right now. Industry analysts and pundits of every stripe are singing the praises of analytics the way snake oil salesmen once hawked miracle potions to help us all live longer, healthier, more fulfilling lives.

Would that data analytics were as simple as buying a bottle of potion.

Analytics comes in two flavors

Descriptive analytics (DA) includes what we old folks call reporting, i.e., pulling information from some kind of data set and presenting it in either summary or detailed form; for example, a report on the number of employees with better-than-average 2012 performance results.

Prescriptive analytics (PA) asks and answers more difficult questions, sometimes of the “what if?” variety and other times of the “how do we really do X well?” or “what should we be paying attention to now?” variety. In all cases the idea is to identify patterns in current data sets and use them to predict, on a best-guess basis, the likely patterns in future or currently less complete data sets.

DA exercises are all about asking practical questions, and then thinking hard about the data you have and how best to present it. A list of employees who have not yet taken a certain certification examination might be a good start. A percentage breakdown of those employees who passed, failed and/or have not yet taken the test might be more useful. A chart showing the percentage breakdowns further filtered by location, job role and business unit might be even more useful (dashboard reporting). An exception report concentrating solely on the failures and failure-to-finish employees might be immediately actionable, i.e., with an accurate employee exception list you could reach out and make the errant employees take (and pass) the test, fire them or get them into remediation of some sort. It’s all good reporting and nobody does enough of it. But none of it is prescriptive.
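To make the progression above concrete, here is a minimal sketch in Python of the percentage breakdown, the per-location breakdown and the exception list. The employee records, field layout and function names are all made up for illustration; a real report would pull from your HR or learning management system.

```python
from collections import Counter, defaultdict

# Hypothetical records: (employee, location, certification status).
records = [
    ("Ana", "Boston", "passed"),
    ("Ben", "Boston", "failed"),
    ("Cho", "Austin", "not_taken"),
    ("Dev", "Austin", "passed"),
    ("Eli", "Boston", "passed"),
]


def breakdown(rows):
    """Return the percentage of rows in each certification status."""
    counts = Counter(status for _, _, status in rows)
    total = len(rows)
    return {status: 100 * n / total for status, n in counts.items()}


def breakdown_by_location(rows):
    """Return the same percentage breakdown, computed per location."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)
    return {location: breakdown(group) for location, group in groups.items()}


def exceptions(rows):
    """Return the names of employees who failed or have not taken the test."""
    return [name for name, _, status in rows if status != "passed"]


overall = breakdown(records)
by_location = breakdown_by_location(records)
exception_list = exceptions(records)

print(overall)
print(by_location)
print(exception_list)
```

Each function answers one practical question, which is the whole point of descriptive work: the hard part is deciding which question to ask, not the arithmetic.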

Prescriptive analytics, the PA type, is all about trying to glean insight. In this case you might ask a more open-ended question like: “What kind of new hire is likely to fail the certification examination and why?” Then you let some algorithms loose on the data sets. The answers may surprise you. Perhaps left-handed musicians are most likely to make time to take the test (and most likely to pass) while Type-A-Personality MBAs with less than five years of post-graduate work experience and a background in investment banking are most likely to fail the tests or ignore the certification requirements altogether.

In the PA world you might start mixing and matching data sets, for example, HR records plus sales figures plus client satisfaction survey reports plus recruitment histories. The idea is to determine multivariate correlations, if there are any, and then to determine what those correlations might suggest about workforce optimization options going forward.
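As a rough illustration of mixing data sets, the Python sketch below joins three made-up data sets on a shared employee ID and computes a Pearson correlation for each pair. The data, the names and the choice of Pearson correlation are all assumptions for the example. Pairwise correlation is far simpler than real multivariate modeling, and a strong correlation on its own says nothing about cause.

```python
from itertools import combinations
from math import sqrt

# Hypothetical data sets, each keyed by employee ID.
hr_tenure_years = {"e1": 1, "e2": 3, "e3": 5, "e4": 7}
sales_in_thousands = {"e1": 110, "e2": 150, "e3": 170, "e4": 230}
client_satisfaction = {"e1": 4.5, "e2": 3.9, "e3": 4.1, "e4": 3.2}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise this divides by zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(datasets):
    """Correlate every pair of data sets over the employee IDs they share."""
    shared = sorted(set.intersection(*(set(d) for d in datasets.values())))
    result = {}
    for a, b in combinations(datasets, 2):
        xs = [datasets[a][key] for key in shared]
        ys = [datasets[b][key] for key in shared]
        result[(a, b)] = pearson(xs, ys)
    return result


correlations = pairwise_correlations({
    "tenure": hr_tenure_years,
    "sales": sales_in_thousands,
    "satisfaction": client_satisfaction,
})

for pair, r in correlations.items():
    print(pair, round(r, 2))
```

With four invented employees the numbers mean nothing, of course. The sketch only shows the mechanics: line the data sets up on a common key, then look for relationships worth a closer look.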

In the DA world you might still mix data sets but only to ask a different kind of question, a simpler question such as, “How much extra new-hire training budget do I need to request for next year given this year’s likely retirement and separation figures?” If the costs and attrition rates can be measured or at least accurately estimated, then spitting out next year’s budget numbers isn’t that hard for a good data solution.
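The budget question comes down to simple arithmetic. Here is a small Python sketch, assuming every departure is backfilled one-for-one and training cost per new hire is flat; the function name and all the figures are hypothetical.

```python
def extra_training_budget(retirements, separations, cost_per_hire, current_budget):
    """Extra new-hire training budget to request for next year.

    Assumes each departing employee is replaced by exactly one new hire
    and that every new hire costs the same amount to train.
    """
    expected_new_hires = retirements + separations
    needed = expected_new_hires * cost_per_hire
    # Never request a negative amount if the current budget already covers it.
    return max(0, needed - current_budget)


# Hypothetical figures: 12 retirements, 30 separations, $4,000 per new hire,
# and a current training budget of $120,000.
extra = extra_training_budget(12, 30, 4_000, 120_000)
print(extra)
```

In this made-up case, 42 replacements at $4,000 each come to $168,000, so the extra request on top of the current $120,000 is $48,000.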

Exercises in both the DA and PA worlds will yield value, sometimes enormous value. The DA exercises are easier to accomplish. DA intelligence is the low-hanging fruit of the big data world. But the pundits are not talking about better reporting. They are talking about game-changing, counter-intuitive insights discovered by in-house expert teams sifting through massive data sets. They’re talking about hype-cycle news such as zip codes being the best predictors of new-hire attrition rates or women’s buying habits at Target stores being good evidence of pregnancy status.

That kind of deep data analysis is hard.

Let’s be honest, most of us are never going to get there

No money. No time. No in-house expertise. Prescriptive analytics is hard to do at all, let alone do well. But perhaps there are other options.

I just looked at one cloud aggregator of analytics talent, Kaggle. It’s too early to tell if Kaggle is the right way forward in the PA world but it does seem to be gaining some traction. Kaggle runs contests. Data owners post their data sets (securely) and data scientists compete to create statistical analysis models that best answer the data owners’ questions. The winners get prize money.

It’s a game

The most interesting thing about Kaggle is that every exercise is a competition. The data scientists, some 40,000 of them so far, win real money (as much as US$3 million for one contest). Just as importantly, the scientists get recognition from their peers. The data mavens win, place and show in iterative rounds until a winner is declared. Then all plaudits go to the top teams.

It’s social

Sort of . . . Kaggle has clear network effects. The scientists can chat and form themselves into teams for each new exercise. Non-specific social network technologies struggle hard to get usage. Kaggle’s social scene has no such problem. Data scientists looking to self-select and self-organize to solve problems and win money are motivated. All 40,000 get chatted up regularly.

It’s global

The hottest Kaggle team is headed by a French businessman based in Singapore. Right now, 537 different teams from around the world are competing in one contest (try organizing that in-house).

They know more than you do and they learn faster (and have no bias issues)

Kaggle’s Chief Scientist, an Australian, says that the mathematicians and programmers beat the industry analysts and other subject matter experts hands down, time after time. Being an industry expert does not help in the world of big data. Industry experts are good at framing questions and that’s it.

So what now?

If you have a big, hairy question and lots of data to work with, use Kaggle or one of Kaggle’s competitors. It will be cheaper, faster and yield better outcomes than you’d ever get by putting together your own team. If you have fairly simple PA questions and the ability to run different statistical tests on your data, then go ahead and do it in-house.

Otherwise, I’d strongly recommend you ignore just about everything the industry experts are telling you about the miracle of prescriptive analytics and instead focus on getting more out of your descriptive analytics efforts.

You can up your game considerably with better reporting and better executive dashboards. Learning and performance management scenarios, budget planning, governance, risk and compliance profiling, recruitment feedback and workforce planning will all be clearer and better managed with more focus on DA reporting.

By the way, we’ll be announcing new tools to help you with your talent-related descriptive and prescriptive initiatives shortly . . .