A place to discuss topics related to Cloud, Big Data, Government, Marketing, working at big companies, working at small companies, traffic in Tyson's Corner, Business Travel, Not so Business Travel, Music, Books, Movies and anything else that might strike my fancy!

Friday, May 18, 2012

Garbage In - Garbage Out

Over the past few weeks, I've been engaged in a search for a new home. Part and parcel of this activity is, of course, interfacing with the financial services industry. Oh joy! Now, those of you who work in the industry, please bear with me. I know the vast majority of individuals who dedicate their time to helping the rest of us with our financial needs are competent, friendly and really do care, so I'm not writing this post to in any way insult the people of the industry. That said, I'd like to use the systems of the financial services industry, in particular those of the credit scoring agencies, to make a point, so hang with me.

History lesson: Way back in 1961, my parents opted to name me after my dad: Robert Lee Caudill, Jr. Nice sounding name, right? I've always been proud to use it. However, like many people's names, mine has picked up a number of different versions over the years. My friends all call me "Bobby Caudill"; at work, I've been "Robert Caudill", "Robert L. Caudill", "Robert Lee Caudill", "Robert Caudill, Jr." and any other twist you can imagine. And, as luck would have it, my dad's name has been similarly altered.

Why am I going on about this? Enter the credit reporting agencies. Can you imagine the havoc this situation has wreaked on my credit report? (Yes, my dad's report is all screwed up too!) For years, we've been trying to untangle our credit profiles, and after at least 25 years of trying, we are no closer to getting it done once and for all. Every time either of us makes a life change, we both end up disputing entries on our credit reports. Sad as it is, we've long since accepted this as an unfortunate reality we must deal with in this age of technology.
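For the technically inclined, here is a minimal Python sketch of how I imagine this kind of mix-up can happen. To be clear, I have no idea how the credit agencies actually match records. The `naive_key` function, the account labels and my dad's birth year are all made up for illustration.

```python
import re
from collections import defaultdict

# Name suffixes that a sloppy matcher might throw away (illustrative list).
SUFFIXES = {"jr", "sr", "ii", "iii"}


def naive_key(name: str) -> str:
    """Hypothetical match key: lowercase, strip punctuation and suffixes,
    then keep only the first and last name."""
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    tokens = [t for t in tokens if t not in SUFFIXES]
    return f"{tokens[0]} {tokens[-1]}"


def group_accounts(records, key_fn):
    """Group account labels under whatever key key_fn produces."""
    groups = defaultdict(list)
    for rec in records:
        groups[key_fn(rec)].append(rec["account"])
    return dict(groups)


# Made-up records: 1961 is my birth year; 1935 is a placeholder for my dad's.
records = [
    {"name": "Robert Lee Caudill, Jr.", "birth_year": 1961, "account": "son-mortgage"},
    {"name": "Robert L. Caudill", "birth_year": 1961, "account": "son-card"},
    {"name": "Robert Caudill", "birth_year": 1935, "account": "dad-card"},
    {"name": "Robert Lee Caudill", "birth_year": 1935, "account": "dad-auto-loan"},
]

# Matching on the name alone lumps father and son into one "person".
by_name = group_accounts(records, lambda r: naive_key(r["name"]))

# Adding a second attribute (here, birth year) keeps the two profiles apart.
by_name_and_year = group_accounts(
    records, lambda r: (naive_key(r["name"]), r["birth_year"])
)

print(by_name)
print(by_name_and_year)
```

Matching on name alone puts all four accounts on a single profile, while adding the birth year splits them back into two. Real matching is surely more sophisticated than this, but the failure mode is the same: once two people collapse into one key, everything downstream inherits the mistake.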

So, data. What an interesting thing it is. Practically everything we do today is influenced or even controlled by data that's been collected, mined, sliced, diced and analyzed to death. When the data in question is guaranteed accurate and authentic AND the questions being asked of it are appropriate and well thought out, the results can be quite useful. But what happens when the data is merely assumed to be accurate and/or the questions being asked are simply not appropriate? Well, as I can attest based on my current credit score, anything can happen.

As the world continues down the path of storing and analyzing every single bit (yes, I mean 'bit' in the context of a 'byte') of digitally generated 'information', I often wonder just how frequently people (and systems) come to the wrong conclusion because the data set contains incorrect information.

Working for Cleversafe, I talk a great deal about how we help customers indefinitely store huge volumes of data, keeping it reliably accessible, safe and secure for all their future processing and analytic needs. With my focus being on government customers, it is quite rewarding to know that I am helping to preserve our nation's information through to the end of the republic. Yet sometimes I wonder: how much of the data being stored and used for big data projects is truly accurate?