Data-themed articles, essays, and studies

It Doesn’t Take An Einstein

I learned last week that there are people who believe that the Sandy Hook elementary school shootings (which took place just about four years ago) were actually a conspiracy.

I shouldn’t have been surprised. If, like me, you were blissfully unaware of this bizarre notion, let me save you a little Googling. The idea is that President Obama (or simply “Obama,” in the vernacular of conspiracy theoreticians) hired actors to pretend they were being shot at Sandy Hook, in furtherance of a gun-control agenda. It’s not clear whether “Obama” had the time to personally direct and instruct his actors, or was too busy hiring other actors to run the federal government. (Who knows, that’s probably another conspiracy.)

This, along with post-election hand-wringing about our existence in a fact-free society, led me to re-ask a practical question: how do we know if something is a fact or not? It’s not just a matter of debunking dubious musings about what our president does in his spare time. The presumption of fact, if I can call it that, impacts analytics thinking as well.

This isn’t really a binary question. We need a term for statements between an arbitrary assertion – which could be true or false – and a fact, which is (so my dictionary tells me) indisputably the case. I like the word summary for a statement that describes a limited set of observations. Summaries are provisional facts – they might eventually be confirmed as facts, after testing the statement on new situations. A summary might prove out within certain situations, but not in others – then it’s a limited fact. Or it might hold in any circumstance we can think of – which is really great (and really rare). And very often our statement might just be another conceptual junker, and we’ll jettison it. One way or another, summaries become facts (or not) through the process of testing them – really, of trying to prove them wrong – in any serious inquiry.

From that perspective, I believe the notion of one person’s facts is a misconception. Facts are not arbitrarily malleable. But there can certainly be one person’s summaries – provisional but untested notions based on limited information. Conspiracy ideas don’t really fly in the face of reality; they just ignore reality. Although it might be painful to admit, I do think that when we work within information “silos” we are much the same – it’s a matter of degree, not of kind. Regaining control from those trying to fool us may start by responding to summary statements with “Really? So prove it.” We would find that many statements in common currency would fall by the wayside if they were subjected to even the most basic challenges. To paraphrase Bertrand Russell, life would be much simpler if we stopped calling all of our summaries facts.

The same holds in analytics, perhaps especially so. The mechanics of analytics exist to summarize data – to produce summary statements – and not particularly to establish facts. So some data points that are described by a line are just a set of points and a line until we have more context. To really know if we have a true fact on our hands, we will probably need to look well beyond the original data set – I’ve seen many data sets that conformed to analytical models, only for the models to break down when unexamined assumptions were eventually violated. It’s fine to craft a summary using analytics as a tool, but it is a problem to believe we’ve finished the job of understanding our data when we’ve really just started. Often it’s only by learning when our summary fails us that we know how well we’ve done. I personally don’t take entirely standalone empirical models as valid, no matter how sophisticated the data analysis.
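To make the points-and-a-line idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the “true” process (y = x squared), the narrow range we happened to observe (x from 10 to 12), and the helper names are all made up. A straight line summarizes the observed points very well, and then fails badly once we test it on new situations.

```python
# Hypothetical illustration: a line "summarizes" data from a curved process
# well inside the observed range, and poorly outside it.

def true_process(x):
    """The (assumed, normally unknown) process that generates the data."""
    return x ** 2

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def max_abs_error(slope, intercept, xs):
    """Largest gap between the line's prediction and the true process."""
    return max(abs(true_process(x) - (slope * x + intercept)) for x in xs)

# The limited set of observations: x between 10 and 12 only.
observed_xs = [10 + 0.1 * i for i in range(21)]
observed_ys = [true_process(x) for x in observed_xs]

slope, intercept = fit_line(observed_xs, observed_ys)

# Test the summary where it was built, then on new situations.
in_range_error = max_abs_error(slope, intercept, observed_xs)
new_xs = [0, 5, 20, 30]
out_of_range_error = max_abs_error(slope, intercept, new_xs)

print(f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"worst error on observed range: {in_range_error:.2f}")
print(f"worst error on new situations: {out_of_range_error:.2f}")
```

Inside the observed range the line is off by less than one unit on values above 100; on the new points it misses by hundreds. Nothing in the original data alone would have warned us.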

In short, without exploring the limits of validity we don’t have facts, only summaries. And even the very best facts seem to have limits. One of the most famous facts of all is Newton’s law of universal gravitation. After it was published in 1687, I have little doubt that many of Newton’s competitors thought “that sanctimonious prick!” and worked tirelessly to prove the law wrong, or at least limited in some way. Although it took over 200 years, those limits were eventually discovered. And this time it really did take an Einstein to get to a better answer, in the form of General Relativity. That may in turn have its limits – that’s a question still to be determined.