Getting in the big-data game

By Frank Konkel

Jan 14, 2014

Every day, people, machines, sensors and systems produce more data than they did the day before.

The rapid growth of available information presents opportunities for the U.S. government -- the largest holder of information in the world -- but getting into the big-data game poses serious challenges for individual agencies.

Understanding those challenges is essential if agency employees are to make a successful case for big-data programs and ultimately succeed in their big-data initiatives, said Rajib Roy, president of Equifax Identity and Fraud Solutions.

The challenges

"There are several challenges as you embark in big data," said Roy, a panelist at the Big Data Technology Symposium in Washington on Jan. 14. "The first is technology, the second is culture, and the third is ecosystem."

The world produces more data right now than it can keep up with, Roy said, but much of it is unstructured information from which it is difficult to extract useful insights.

And although technology evolves quickly, culture changes much more slowly, especially in the public sector. Traditional, manpower-intensive approaches to problems such as fraud and government waste are ripe for analytics solutions, but changing the status quo in leadership positions is difficult. Similarly, existing laws and policies make the public sector a difficult arena for big data: even though systems produce more data than ever, that data does not flow freely.

"Even in the Government Accountability Office, we face a lot of legal obstacles and challenges getting access to data," said Jamie Berryhill, a senior analyst at GAO's Forensic Audits and Investigative Service. He added that investigations at GAO have saved the government tens of millions of dollars, but when it comes to accessing certain kinds of information, even the watchdogs are limited.

Big-data analytics today

Roy said leading private-sector companies and particular agencies -- mostly in the intelligence community and military -- are moving beyond performing regression-based analytics on sample data. Rather than examining small samples of data to look for trends, leading big-data innovators examine entire swaths of data with machine-based algorithms. Algorithms are unbiased, Roy said, and machine-learning analytics surface correlations that humans do not.
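The contrast Roy draws between analyzing a small sample and scanning an entire data set can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the records are synthetic, and the names make_records and flag_anomalies, the threshold and the sample size are hypothetical rather than anything described by the panelists.

```python
import random


def make_records(n, rare_every, seed=0):
    """Build n synthetic records; every `rare_every`-th one is anomalous.

    All values are invented for illustration only.
    """
    rng = random.Random(seed)
    records = []
    for i in range(n):
        amount = rng.uniform(10, 100)   # ordinary transaction amount
        if i % rare_every == 0:
            amount += 10_000            # rare, unusually large amount
        records.append({"id": i, "amount": amount})
    return records


def flag_anomalies(records, threshold=1_000):
    """Return the ids of records whose amount exceeds the threshold."""
    return [r["id"] for r in records if r["amount"] > threshold]


records = make_records(n=100_000, rare_every=10_000)

# Sample-based approach: inspect 500 randomly chosen records.
sample = random.Random(1).sample(records, 500)
sample_hits = flag_anomalies(sample)

# Full-scan approach: inspect every record.
full_hits = flag_anomalies(records)

print("anomalies found in sample:", len(sample_hits))
print("anomalies found in full scan:", len(full_hits))
```

Because the anomalous records are rare, a 0.5 percent sample can easily contain few or none of them, while the full scan is guaranteed to reach every one. The sketch uses a fixed threshold rather than machine learning, so it shows only the coverage argument, not the algorithms the panelists referred to.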

"We're doing everything we can in big data," said Bryan Jones, deputy assistant inspector general for analytics and director of the Data Mining Group at the U.S. Postal Service. "We're trying to bring analytics to the level where frontline investigators use it every day."

Jones cited several instances in which USPS was able to reduce the resources spent on mail-theft investigations by using technology to focus its efforts rather than setting up large surveillance operations that were less likely to succeed.

"It's important to not limit yourselves to the ways you've done business in the past," Jones said. "The science and technology [are] real, and we try to think differently to utilize [them] on the front lines of our organization."

Talent is changing, too. Data scientists are in demand in both the private and public sectors. Although the world has not yet decided what makes a perfect data scientist, Roy said, most experts believe it involves a combination of creativity, tech savvy and a true understanding of the surrounding ecosystem.

Making your big-data case

Each agency mission differs, and each agency has different problems that big data and corresponding analytics can address, so there is no one-size-fits-all big-data or analytics solution.

Herb Strauss, assistant deputy commissioner for systems and deputy CIO at the Social Security Administration, said that although "big data" is a popular term today, feds should home in on what their problems are before diving into big data.

He also encouraged officials to have vendors provide solutions specific to their agency's needs.

"Come in and tell us against our mission and lines of business how your capabilities can improve our operations," Strauss said.

"This stuff is real. This is worth taking a risk on," he said. "So many people get to the point in their careers where they don't want to take risks [in order] to protect themselves. I think you're taking a risk if you pass this up. Start with something small, build a relationship, identify a problem, and prove this out. The future becomes whatever it can be."


Reader comments

Fri, Jan 17, 2014
Chuck Brooks
Virginia

According to recent industry reports, we produce more data every other day than we did from the inception of early civilization until the year 2003 combined. Therefore, organizing, managing and analyzing data is more important than ever. Big data and data analytics are collapsing the information gap and giving businesses and governments the tools to uncover trends in population movements, customer preferences, demographics, commerce traffic, transportation, etc. These tools can also help several industries, including the customer service industry by identifying caller trends, the healthcare industry by flagging potential fraud and the financial services industry by proactively flagging a borrower who is on the verge of lapsing in payment. Agencies and businesses cannot ignore the value of data analytics, which can improve productivity, efficiency and decision making and open up new business activities.

Wed, Jan 15, 2014
Big Data Queen

Frank, nice article. With the explosion of big data, companies are faced with data challenges in three different areas. First, you know the type of results you want from your data, but they are computationally difficult to obtain. Second, you know the questions to ask but struggle with the answers and need to do data mining to help find them. And third is the area of data exploration, where you need to reveal the unknowns and look through the data for patterns and hidden relationships. The open source HPCC Systems big data processing platform can help companies with these challenges by making it quick and simple to derive insights from massive data sets. Designed by data scientists, it is a complete integrated solution from data ingestion and data processing to data delivery. Its built-in Machine Learning Library and Matrix processing algorithms can assist with business intelligence and predictive analytics. More at http://hpccsystems.com
