Bobby Caudill

So, this really is a big deal. Last Thursday, April 10th, 2014, the U.S. Senate unanimously passed the Digital Accountability and Transparency Act (DATA Act). The DATA Act, a mandate to the U.S. Treasury and the Office of Management and Budget to develop a standardized format for the reporting and publication of all federal spending disclosures, will ultimately allow the public access to financial information in a manner more suitable for use in the world of modern technology.

Consider this. The DATA Act is easily the most significant open-government legislation since the Freedom of Information Act of 1966. The bill now heads to the House of Representatives, where passage is expected soon. It has been three years since U.S. Senator Mark Warner (D-VA) first introduced this data transparency legislation, which has the potential to transform the relationship between the federal government and its citizens. Further, on behalf of Teradata, I would like to thank our headquarters home state senator, U.S. Senator Rob Portman (R-OH), for his sponsorship of this milestone bill.

To date, most government spending information is published in document-based formats. While these may be convenient for humans to read, they pose a significant hurdle to those seeking access to the raw data. To some, the reason for wanting access to the raw data is apparent; for those who are less familiar, allow me to explain.

First, what is raw data? Think of raw data as data that has not been processed in any way: no analytics, no summarization, no mathematics, etc. Such data is usually generated and stored in enterprise systems as a byproduct of the systems performing their intended functions. For example, in a banking system, each transaction a customer performs creates data that is stored in the system’s database, with each element of the transaction (time, date, amount, customer, transaction type, etc.) stored separately in a unique field. At this point, I’m getting perilously close to having to explain how a database works, so I trust you’ll forgive me if I refer you to Google should you wish to learn more. The key point to understand is that every element of data is stored separately and distinctly from every other element.
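To make this concrete, here is a small Python sketch of how a single banking transaction might be stored as raw data, every element in its own distinct field. The field names and values are purely illustrative, not taken from any real banking system:

```python
# A hypothetical raw transaction record: every element is stored
# separately, with nothing summarized or pre-computed.
transaction = {
    "date": "2014-04-10",
    "time": "14:32:07",
    "customer_id": "C-10482",       # illustrative identifier
    "transaction_type": "withdrawal",
    "amount": 60.00,
}

# Because each element is distinct, any one of them can be
# queried or analyzed on its own.
print(transaction["transaction_type"])  # withdrawal
print(transaction["amount"])            # 60.0
```

A real database would hold millions of such records, but the principle is the same: each element lives in its own field, ready to be queried.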

Ok, so what? If you have ever looked at a detailed spreadsheet with hundreds of columns and hundreds of thousands of rows, you have seen a pretty good representation of raw data. And, you might be asking, why in the world would anyone want that? It’s seemingly impossible to discern any insight beyond “This is a lot of data.” However, data that is highly structured into rows and columns is exactly the format a computer, and anyone who wishes to analyze data, needs.
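That row-and-column structure is precisely what makes analysis possible. As a hypothetical illustration (the agencies, categories and amounts below are invented), a few lines of Python can summarize raw rows in a way no human eye scanning a spreadsheet could:

```python
from collections import defaultdict

# Hypothetical raw spending rows: (agency, category, amount).
# In practice there could be millions of these.
rows = [
    ("Agency A", "IT", 120_000.0),
    ("Agency A", "Travel", 15_000.0),
    ("Agency B", "IT", 300_000.0),
    ("Agency B", "Travel", 40_000.0),
]

# Summarize total spending per category across all agencies.
totals = defaultdict(float)
for agency, category, amount in rows:
    totals[category] += amount

print(dict(totals))  # {'IT': 420000.0, 'Travel': 55000.0}
```

The same loop works unchanged whether there are four rows or four hundred million, which is exactly why analysts want the raw rows rather than a pre-summarized document.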

At the end of the day, this is why the DATA Act is so important. It mandates that the financial data generated, collected and stored by the Federal Government be made available so that computers, and the people who wish to analyze that data, actually can. That’s right; it really does boil down to something this simple.

Access to raw data in “a widely-accepted, nonproprietary, searchable, platform-independent computer readable format” will allow organizations outside government to analyze the information, and the potential benefits to the American people are dramatic. With new people and organizations analyzing the data, here are a few of the new insights we could see:

The ability to see where tax dollars are spent at a detailed level

Discrepancies in spending that may suggest fraud, waste, abuse

Ideas to more efficiently or effectively spend tax dollars for the betterment of all

New business opportunities that could lead to the creation of new jobs

There are many other possibilities, limited only by the creativity and curiosity of those who care to dig into the data.
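To give a feel for what “a widely-accepted, nonproprietary, searchable, platform-independent computer readable format” might look like in practice, here is a toy example using plain JSON. The field names are my own assumptions for illustration, not the actual schema Treasury and OMB will define:

```python
import json

# A hypothetical federal award record in a machine-readable form.
award = {
    "award_id": "EXAMPLE-0001",    # illustrative, not a real identifier
    "recipient": "Example Corp",
    "agency": "Example Agency",
    "amount": 250_000.00,
    "date": "2014-04-10",
}

text = json.dumps(award)      # serialize: any platform, any language can read this
restored = json.loads(text)   # and parse it right back, with nothing lost
print(restored["amount"])     # 250000.0
```

Contrast this with the same figures locked inside a PDF: a human can read both, but only the machine-readable form can be loaded, searched and analyzed by software without manual re-keying.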

Sounds good, right?

Of course, creating a law and implementing a law are two different things, and there will undoubtedly be challenges in getting this implemented, not the least of which may be agreeing on the structure and labeling of the data (unique identifiers for Federal awards and entities receiving Federal awards that will be consistently applied Government-wide). However, this is not as much a technical challenge as it is a people challenge, as some folks are going to have to do things a little differently.

Hopefully, all those involved will come together to get this done and not get too caught up, as people are wont to do, in the old cliché, “but that’s not how we've always done it.”

If you want to learn more about the DATA Act and the Data Transparency Coalition, I encourage you to visit www.datacoalition.org.

As far back as I can remember, I’ve always been fascinated with what comes before the “final result”. For instance, I was much more interested in HOW and WHY my toys worked than in the fact that they did work. Over the years, my curious nature led me to at least partly disassemble many a toy, including some sophisticated ones, along with real automobiles, musical instruments and computers, all to satisfy my curiosity and my need to understand more.
More recently, I’ve been focused on breaking down data-driven projects in government to learn how they come about and how they mature. The types of projects that fascinate me most are the ones that have delivered insights far greater than ever intended, especially those resulting in some form of “social good”. What some government organizations are beginning to realize is that the wealth of data they have previously collected to support a particular mission or goal can often serve as a starting point for additional initiatives.

Here’s an example. Consider the type and nature of data collected to support a government agency’s efforts to combat fraud. To have any hope of discovering fraud, data regarding all transactions, including enrollments, claims, payments, etc., must be made available to a data analytics platform. The data set will include information regarding patients, times, dates, locations, costs, prescriptions, providers, diagnoses, treatments and outcomes, all of which is needed to discover patterns of unlawful behavior.

So, now that all this data has been collected, what more can be done with it? Rather than speculate, we can look to CMS, the Centers for Medicare and Medicaid Services, to see what they are doing with their Integrated Data Repository. Beyond using data to uncover fraud, waste and abuse in the system, the agency is leveraging the past seven years of data to further support better data-driven decisions for these activities:

Medical trend and utilization analysis – past and future

Healthcare costs and assessment – past and future

Policy analysis and development

Provider profiling and management

Quality and effectiveness - pay for performance

Program integrity – past and future

Rapid response to legislative inquiries

Supporting data requirements for external analysts and researchers

Talk about hitting multiple birds with one stone! Beyond addressing efficiencies, CMS is now leveraging the same data to make programs more effective, to drive compliance and to become more transparent. Also notable, beyond using the same data to support multiple goals, is the nature of the questions being asked. As with many projects, the original intent was to look back over time and ask “What happened?”, providing users some form of report or analysis. More recently, however, advanced predictive analytics are being leveraged to answer questions such as “What may happen?” For decision makers, this is a huge leap forward. Of course, business experience and expertise will always be important; however, forward-looking decisions can now be made with the benefit of supporting data.
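That leap from “What happened?” to “What may happen?” can be illustrated with something as simple as a trend line fitted to historical data. The sketch below is a toy example with invented utilization figures, not CMS’s actual analytics:

```python
# Toy example: fit an ordinary-least-squares line to seven years of
# hypothetical utilization counts, then project the next year.
years = [2007, 2008, 2009, 2010, 2011, 2012, 2013]
counts = [100, 104, 109, 115, 118, 124, 130]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(counts) / n

# Least-squares slope and intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts)) \
        / sum((x - mean_x) ** 2 for x in years)
intercept = mean_y - slope * mean_x

# "What happened?" is the history above; "What may happen?" is the projection.
forecast_2014 = slope * 2014 + intercept
print(round(forecast_2014, 1))
```

Real predictive analytics are far richer than a straight line, of course, but the principle is the same: historical data becomes the raw material for forward-looking decisions.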

CMS is but one government agency successfully leveraging existing data to solve additional business challenges. How do agencies like this typically go about it? Many factors contribute to their success, but here are a few commonalities to consider:

Don’t Assume - Data that may have no value today may be the critical link to answering a question in the future

Technology Matters – beware of single-purpose applications and platforms, there’s no “silver bullet” – the technologies you opt to use should be agile and scalable for performance, capacity, workload, complexity, user type, etc. – a platform that can perform one type of query blazing fast becomes worthless if you need a different type of query – it should be capable of supporting users of different skill levels

Curiosity and Creativity – a great way to innovate is to allow people to discover hidden insights by asking new questions – encourage and reward your innovators – the more questions you can answer, the better you can serve your citizens and constituents

I am always on the lookout for new examples of government using data for social good or some other innovative project. If you are aware of or affiliated with such a project and you’d like me to write about it, please contact me directly at any of the following places.

Being a member of the Data Transparency Coalition has some interesting benefits, not the least of which is insight into data-related activities happening on Capitol Hill. There are times for those of us who believe in the power of open, transparent data to rejoice, such as the DATA Act being passed by the U.S. House on November 18, 2013, moving it one step closer to becoming law, and there are days of scratching our heads and wondering, “where did THAT come from?”

Today is one of those latter days. Reading a blog from Hudson Hollister, founder and Executive Director of the coalition, I learned of a proposed bill intended to exempt a large number of small businesses from using a data standard called XBRL as the approved means to file company financial statements with the SEC. For sure, this is one of those times I’m found scratching my head.

I’ve been a cross-breed technologist/marketer for, well, a long time, and I’ll be the first to admit that getting standards implemented properly takes time, effort and skill. However, in the long run, the benefits usually FAR outweigh the challenges.

Having been associated with XBRL for the past 8 years, I can state with some authority that this is a complex standard. However, it is being used quite successfully in a number of implementations around the world, even though it seems to be getting some negative press with regard to the SEC’s use of the standard. And the response on the Hill? Exempt some companies from the need to adhere to the standard.

What?

The reason the standard was introduced in the first place was that the process of filing financial statements with the SEC was fraught with errors and fraud, and the collected data was all but unusable because there was no uniform means to digest it into the SEC’s systems. And now the answer is, “Let’s go back to the way it was.” Why? Because it’s hard? Granted, it IS hard. But perhaps, rather than leap backwards, the SEC could take a note from other success stories around the world and implement XBRL in a way that will lead to success.

For more details, read Hudson’s blog. He does a great job outlining the benefits to all of us of squashing this proposal here and now.

Here’s a question. Say you are the facilities manager for a government building. How do you keep track of the operating status of all the equipment and infrastructure within the building? Would you know if a chiller was not operating at peak efficiency? Now, hold that thought…

Wind the calendar back to 2008. Recognizing a way to cut costs, save energy and help the environment all at the same time, the Maryland General Assembly passed Governor Martin O’Malley’s legislation called the EmPOWER Maryland Energy Efficiency Act. The goal? To cut electricity consumption and demand by 15% by 2015. I had the pleasure of recently speaking with Lauren Buckler, Energy & Engineering Director for the Maryland Department of General Services (DGS). The story she related to me started much the same way most stories of new and innovative ideas start, “We didn’t have a baseline to start with and had to start from scratch.” Faced with collecting information from 58 separate state agencies with a combined total of 120 accounts payable departments, she and her team got started. Not only did her team collect electric bills, they also pulled in natural gas, water and propane bills as well. After a great deal of heavy lifting, by 2011, DGS Secretary Alvin Collins and the energy team had finally coaxed enough participation from the agencies to create a baseline of energy consumption.

Ah, data! Now they could get down to the business of analyzing the data and seeking out ways to achieve the targeted savings goal, right? Wrong. Let’s face it, for all the people working in the agencies, remembering to report energy usage on a regular basis was not exactly a high priority. Now, here’s where the story gets really interesting, where it becomes about the good government does with data.

Rather than take a punitive approach, the smart team at DGS created an annual competition in 2011, called the 16 Agency Energy Competition, and in 2014 expanded it to include an awards ceremony, the Maryland Energy Cup. This award recognizes the agencies that deliver the best performance across a number of categories. As Lauren describes it, “We decided the best way to hold the agencies accountable to participate in this program was to measure and put them into friendly competition with one another. We have taken a carrot and stick approach, with the carrot being offered publicly and the stick being gently administered in private in the form of offering assistance.” And talk about a carrot, just a few weeks ago the 2013 winners were presented their awards publicly with Governor O’Malley in attendance. Not a bad way to get people to participate if I do say so myself.

Who do you think the winners are? Well, if you are interested in who received awards, you can trot right over to the DGS press release. However, I’ll submit to you that the REAL winners in this competition are the people of the State of Maryland. According to the DGS Energy Management website, the state has funded measures that will save taxpayers $21.5M annually, equating to $310M over time! And what created this tremendous savings? You guessed it. Data. Ah, the good being done by government with data. What a shining example.

Oh, and by the way, back to that chiller. Access to data has a way of offering insights never before possible. Because DGS has been tracking energy consumption for multiple years now, analytics can readily identify outlier situations. In this case, the analytics flagged a significant year-over-year increase in energy consumption in a single building, a jump that had previously gone unnoticed, buried in the summary of many buildings. The culprit? You guessed it. A faulty chiller.
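The chiller story is a classic outlier check: compare each building’s consumption against its own prior year and flag unusual jumps. Here is a minimal sketch of the idea in Python, with invented building names and kWh figures:

```python
# Hypothetical annual electricity use (kWh) per building.
usage = {
    "Building A": {"2012": 410_000, "2013": 418_000},
    "Building B": {"2012": 275_000, "2013": 281_000},
    "Building C": {"2012": 390_000, "2013": 505_000},  # the faulty chiller
}

# Flag any building whose consumption rose more than 10% year over year;
# a jump like Building C's is invisible in a portfolio-wide total, where
# the other buildings' normal usage drowns it out.
flagged = [
    name
    for name, years in usage.items()
    if (years["2013"] - years["2012"]) / years["2012"] > 0.10
]
print(flagged)  # ['Building C']
```

The 10% threshold is an arbitrary choice for the sketch; a production system would tune it, or use a statistical baseline, but the per-building comparison is the heart of the technique.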

It seems events such as the Super Bowl can have the effect of bringing out the best and the worst of society. In researching stories of government using data for the public good, I stumbled upon an article entitled “Super Bowl Prostitution Digitally Mapped by Data Trackers.” I did a quick scan, and it triggered a thought: I have a personal friend, Kimberly Lewis Grabert, who knows a great deal about this topic because she is the Statewide Human Trafficking Prevention Director for the Florida Department of Children and Families. I sent her a private message on Facebook and immediately started peppering her with questions. We ended up on the phone and talked for over an hour, and we were just getting started.

Yep, here’s a great story of how government is using data for the public good!

According to data from the Polaris Project, a leading organization in the global fight against human trafficking and modern-day slavery, Florida receives the 3rd highest rate of calls to the National Human Trafficking Hotline, so the problem is high on the minds of those seeking to have an impact. A big change came when Florida altered its perspective and approach to the problem, treating children trapped in these dire circumstances as victims rather than criminals. A law called the Safe Harbor Act was put in place to ensure the safety of child victims who have been trafficked for sex, offering those victims a wealth of treatments. The Florida Safe Harbor Act, passed in 2012 and enacted in January 2013, accomplished several things. The legislation allows law enforcement to use discretion regarding the arrest of minors for prostitution. Instead, the child can be turned over to the Florida Department of Children and Families, which can work with the child's family to provide services to stabilize the child and the family. In cases where no caregiver or parent is able to care for the child, the child can be deemed dependent and placed in a specialized safe environment for victims of commercial sexual exploitation.

So, what’s going on behind the scenes? Yes, that’s right: data. From the beginning, advocates for the victims recognized the power of quantifying the problem with factual, data-based information. Such an approach had the desired effect of making the problem much more personal and real to decision makers with the power to make a difference. According to Kimberly, “We were able to use data to illustrate the impact of the problem to individual decision makers within the context of their jurisdictions. Many were surprised, which galvanized them into action.”

Today, data is shared among numerous agencies, from law enforcement, to the judiciary, to human service agencies to help identify victims and provide an appropriate array of services around the State. Insights gleaned from data are used to justify funding, to determine the most effective allocation of resources and to educate other states around the country.

The State originally focused on looking at data from a historical perspective, but Kimberly was happy to share its more recent focus on using predictive analytics to get ahead of the problem. The net effect has been to enable the State to move from reactively responding to being more proactive in its efforts to eliminate human trafficking.

As I write this post, the State is putting the finishing touches on its latest report. Kimberly agreed to supply me with a copy. I am excited to see the State’s progress quantified so I can follow up with another post.

Data, data, data. It seems to be important to everyone these days. Technologists talk about where to put it, how to manage it and how to secure it. Data scientists talk about using modern analytics to answer questions that were previously difficult, if not impossible, to answer. Business leaders talk about it as the means to become more efficient, more effective or to gain advantage. Most everyone else talks about how Facebook and Amazon are using it to put offers in front of them, for better or worse.

So, not to be outdone or left out, I’m going to talk about it too, from the perspective of the good things government is doing with it. Over the coming weeks, I’ll be highlighting different examples of government using data to improve the lives of people. Going well beyond using data simply to save money and become more efficient, the examples I’ll be bringing to your attention will illustrate meaningful impact directly on people’s lives.

I have a number of ideas in progress, however, I’m always on the hunt for new stories as well. If you happen to have a story you would like me to highlight, please, drop me a note @ bobby.caudill@teradata.com.

I’m looking forward to bringing great stories to your attention and hopefully some inspiration to help guide you to new ways data can be used.

This week, I had the opportunity to attend Governing’s Outlook in the States and Localities event held in DC. Like every year, the room was chock full of industry folks hoping to hear something that would enable them to improve their relationship with state and local government, each listening with a unique filter for that little nugget of gold.

Of course, I was there for exactly that reason as well and I did discover a couple nuggets of gold. Not being the greedy type, I thought I would take a couple moments and share what I heard. The two key themes that jumped out at me were:

The importance of innovation

The desire to further improve citizen engagement

The discussions around innovation and reinventing government were quite interesting. The government speakers sounded like forward-thinking private industry executives. I heard phrases such as: think different, be bold, fail fast, fail forward. There was a focus beyond the typical topic of efficiency, moving to that of effectiveness; doing something cheaper that is not also effective is still a waste of resources. Further, the panelists discussed how to get an innovation to stick. It seems, far too often, someone in government will come up with a creative approach to solving a problem only to have it later squashed by the sheer weight of bureaucracy and culture. The advice? Protect the reformers, protect the reforms. Make room for people to try new things, knowing that some will fail. Allow failure (not to be confused with negligence) to be acceptable as the price to pay for innovation. Give new ideas the time to take hold and nurture them during the early phases.

Not bad advice. Sounds like a few successful tech companies I know of.

Moving on to citizen engagement. I believe the most insightful comment was the need to eliminate the transaction-based relationship with citizens and replace it with a long-term relationship that takes into account the entirety of a citizen’s situation, wants and needs. Oakland County, MI, has created a “One Stop Shop” environment intended to address ALL the needs of an individual at one time, from one source. What a novel idea! It’s kind of like a Super Target, where you can buy everything from clothing to electronics to housewares to food to, well, you get the idea. Additional points that stuck with me were:

Know what the citizen wants and give it to them

Know what the citizen values and deliver that

Know the citizen’s expectations and out-perform them

Re-build trust by being truly transparent – no spin

Mine data to learn and engage citizens with the gleaned insights

So, what do these two themes have in common? They both require people in government to be in a position to make much better decisions. To see what they couldn't see before. To prove quantitatively an idea that was only a gut feeling in the past. To truly understand the impact of government on the individual citizen. To do more things in a much “smarter” fashion.

Is it making sense why these were my two big take-away themes? Do we want a government that is smarter and better engaged? Of course we do, and the path to both is found, yes, that’s right, in the data.

You can’t open a government trade publication these days without bumping into some article about Edward Snowden, and not just US-focused publications. His story has had global coverage. There is no doubt his actions are forcing change, for better or worse, within governments around the world.

Just this morning, as I was perusing various articles, I came across the latest submission in FCW, entitled “Agencies pay for public distrust in post-Snowden era”. Because I’m personally interested in this topic, I decided to dive in, thinking I was going to be reading more about the impacts and ramifications for the Intelligence Community (IC). However, from the very first paragraph, it was obvious this was not about the IC; rather, it was about cyber security, with insights gleaned from ACT-IAC’s 2014 Cybersecurity Forum. I was curious where this was heading, so I read the rest of the article. For the most part, it went on to discuss the importance of cyber security efforts within the Federal government, all of which made sense.

However, after a bit of reflection, I started to wonder: is Snowden the real reason citizens are reluctant to provide personally identifiable information (PII), or is he being used as a timely and convenient excuse? No doubt the Snowden situation has heightened the public’s awareness of the collection capabilities of the NSA, I’ll not dispute that; however, I’m not convinced his “revelations” are the driving factor in the public’s hesitancy to share with government. Perhaps they are a legitimate driver in limiting the public’s willingness to share PII (or anything else, for that matter) on the Internet with anyone, but not with government specifically.

I’d like to offer a somewhat different culprit. I assert that the more significant driver of the reluctance to share PII is the growing number of successful cyber-attacks on both public AND private entities. Everywhere you look, organizations are taking fire from cyber-attackers, and far too often, the attackers are winning. This is not a result of anything Edward Snowden revealed; these successful attacks are the result of increasingly capable attackers combined with over-burdened, typically under-funded defenders.

While I’m not necessarily advocating a blind eye should be turned to the privacy concerns being raised based on the alleged capabilities and actions of the NSA, I am suggesting perhaps the greater enemy, the more immediate threat within the context of protecting PII is coming from the criminal community, not the IC. To me, it’s a matter of possible versus probable. It is possible the IC could do something nefarious with PII, but not necessarily probable. That community generally has bigger fish to fry. However, without a doubt, it is probable, if not guaranteed, that the criminal community will do something nasty with any PII it can get its hands on.

In my long years of experience, I’ve found it’s generally best to focus more on the probable and give the possible time to work itself out. More often than not, the possible proves to be an edge case and fades away on its own. While we are all getting wrapped around the axle of what the NSA might do, we are being robbed blind by criminals who know exactly what they are going to do.

I’m always amazed (frustrated?) at how often two or more groups can investigate a situation and walk away with very different conclusions, each substantiated by “facts”. One would think in this age of data and analytics it would be ever easier to have groups come to the same or, at least, similar conclusions because, in theory, the more data, the less assumption.

However, in practice, this does not (yet) seem to be the case. To illustrate my point, I recently read this article from InformationWeek Government regarding a report released by the Inspector General of the DOE. The report details the findings of the investigation into the data breach DOE suffered in July.

Of all the interesting nuggets, what caught my attention was the notion that the “data breach may be more extensive than realized.” Ignoring the obvious reason this initially caught my attention, I zeroed in on the fact that the IG and DOE officials are in disagreement over the number of people possibly affected. Of course, this is in the media, so I expect there’s an element of PR here, but regardless, what this suggests to me is a data problem. I’m not referring to the data stolen, but to the data collected, stored and analyzed (or not) by the various security systems, sensors and teams. Perhaps the IG and the security team are looking at different sources of data.

I find myself asking questions such as:

Did the IG and the security team have access to ALL the same data?

Was the data from multiple sources integrated together in the same manner?

How much data was used? How far back does the data go? Are there gaps in the data? Where did the data come from?

What does the analytics environment look like? Did the IG and the security team use the same tools?

What was the skillset of the analysts on both teams?

As the frequency and sophistication of cyber attacks continue to grow, access to well-integrated, detailed data from ALL possible sources will become more critical than ever, not only for forensics and after-action investigations, but also to support detection and remediation. It is such an approach that will offer organizations the ability to ask questions and “see” answers that were previously “hidden” across disparate data repositories.
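A small sketch of what “well integrated” can mean in practice: joining records from two separate security systems on a shared key so that a pattern spanning both becomes visible. The log fields and hosts below are invented for illustration:

```python
# Hypothetical daily summaries from two separate security systems.
firewall_events = [
    {"host": "srv-01", "blocked_connections": 2},
    {"host": "srv-02", "blocked_connections": 847},
]
auth_events = [
    {"host": "srv-01", "failed_logins": 1},
    {"host": "srv-02", "failed_logins": 312},
]

# Integrate the two sources by host.
auth_by_host = {e["host"]: e for e in auth_events}
merged = [
    {**fw, "failed_logins": auth_by_host[fw["host"]]["failed_logins"]}
    for fw in firewall_events
    if fw["host"] in auth_by_host
]

# A host that looks merely noisy in one source and merely unlucky in the
# other only stands out once the data is combined.
suspicious = [m["host"] for m in merged
              if m["blocked_connections"] > 100 and m["failed_logins"] > 100]
print(suspicious)  # ['srv-02']
```

Real security analytics integrate far more sources and far more fields, but the principle is the same: insight that lives in no single repository emerges when the repositories are joined.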

Of course, this is not in any way unique to the world of cyber security. Consider how many opportunities you have in your world to bring together data from disparate data sources.

The more complete the data, the more accurate the insight, and the more relevant the conclusions…

Officials in the Maricopa County Community College District (MCCCD), located in Arizona, have a serious dilemma on their hands. They are one of the more recent victims of a data breach, one that has the potential to affect nearly 2.5 million people. Ouch!

I feel for everyone in this situation, with the exception of the criminals. Citizens really do wish to trust government entities with the personally identifiable information they share and government, I know, really does take the protection of this information seriously. Sad but true, government is a target rich environment for cyber criminals. The vast amount of personal information stored in databases, documents and systems can be a treasure trove for the unscrupulous.

However, it is far too often the case these days that the criminals want the information more desperately than many believe. Even with all the news, there are not enough people in government (yet) who fully comprehend the current level of sophistication of cyber criminals, nor the scope of the threat. ALL of government has become a viable target, even small agencies and jurisdictions.

For officials in cash-strapped jurisdictions, agencies and departments, especially those who've never faced a true data breach, it can be incredibly hard to get their heads around all this, to fully appreciate the threats and to commit to spending money on a modern cyber defense, money that is intended for other pressing needs. I appreciate that money is tight, but, as the folks at MCCCD learned, you can decide for yourself to proactively spend to defend now, or you can allow a cyber-criminal to force you to reactively spend to clean up a breach PLUS defend later. Either way, spending on an active cyber defense is no longer optional or a nice-to-have. When faced with absolutely no other choice, the district’s governing board was able to free up $7M just for the cleanup program alone. This is on top of whatever funding they decide to allocate to defend against future breaches.

So, stating the obvious, data breaches are costly in terms of money, not to mention the damage done to public trust. And, yes, a modern cyber defense is expensive, but not nearly as expensive as adding in the cost of a cleanup after the fact. Do you still believe this problem can be kicked down the road, or that it will never happen to your agency?

About Us

Teradata's vast experience spans the entire range of major industries worldwide. We have learned that there’s no substitute for experience and hands-on training, which is why we offer the services of the largest and most experienced staff of consultants in the industry. Keep up to date with the latest industry developments with the Teradata Industry Experts blog.