Gold in the data, but a shortage of miners

By Mark Rockwell

Sep 17, 2013

Universities are graduating only a fraction of the data analysts the government needs.

Demand for analysts able to mine the mountains of data generated by federal agencies, including intelligence and homeland security, continues to soar as the government competes with the private sector for promising prospects and universities struggle to fill the need.

"There is a shortage of big data experts," said Michael Rappa, director of advanced analytics and distinguished professor at North Carolina State University. "I don't see the gap narrowing. Universities aren't producing enough. We have 80 grads per year" with master's degrees in analytics. "We could be producing 800 per year and still not meet demand. With each class, the demand goes up."

Details of the U.S. intelligence community's formerly secret $52.6 billion budget -- reported by the Washington Post from documents provided by former National Security Agency (NSA) analyst Edward Snowden -- show that across-the-board staffing for intelligence agencies is declining, but there is a continued strong draw of analytical personnel to deal with data generated by intelligence gathering projects. Those projects include programs that capture information on billions of Internet communications, telephone calls, instant messages and texts, and programs that draw data from overseas surveillance.

These operations depend on data analysts to make sense of the information they generate.

The term, however, is somewhat vague. "Data analyst" can apply to a variety of scientists, statisticians and even data entry personnel. Specialists who can turn huge piles of data into useful information for employers in a team environment aren't common among more traditionally trained computer scientists and statisticians.

The master of science in analytics degree combines math, analytics, business and science curricula to produce an understanding of not only data analytics, but also how to optimize that information for an employer. The math, analytics and science courses are necessary to do the heavy lifting of massive data analysis. The business curriculum, said Rappa, includes "intensive communications and teamwork training" critical to address the needs of complex organizations with hundreds or thousands of employees. The "single smart guy in a room" approach to analyzing data is dead, he said. Analysts have to learn to play as part of a team.

A handful of universities have budding programs aimed at producing graduates with a master's in advanced analytics, data science or a similarly titled degree. The demand for those kinds of specialists from both the federal and private sectors, according to Rappa, is vastly outstripping the supply.

More than two dozen universities across the nation have begun similar programs in the last three years or so. Northwestern University has offered a master's in predictive analytics since 2011.

U.S. universities produce only about 300 such specialists per year, according to Diego Klabjan, program director in analytics at Northwestern's Robert R. McCormick School of Engineering and Applied Science.

Klabjan said Northwestern had about 30 students in its initial master of science in analytics class. The program, he said, is full time for 16 months. Northwestern is very selective. It accepted only 30 of 400 applicants, looking for the best science, math and IT-based candidates.

The supply problem isn't unique to government. Rappa and Klabjan say demand is strong across the private and public sectors, with health care, banking and other private industries struggling with their own needs for people to distill huge volumes of data into understandable information.

Federal agencies, however, face added hurdles in this highly competitive environment. The private sector has more flexibility in pay and benefits.

Security concerns are another obstacle. Klabjan said some students who had indicated interest in working for federal agencies got faster, and possibly more lucrative, offers from the private sector as federal agencies took weeks to grind through background security checks.

In the post-Snowden era, background checks for federal employees who work with huge volumes of sensitive data are likely to get more stringent and time-consuming. "There is interest in working for the intelligence community among students," said Klabjan, "but acceptance took so long" they abandoned the effort.

Federal agencies "will be fighting for analysts," said Rappa. "Realistically, this is a long-term issue. We're going to be playing catch-up for the next five years."

Rappa and Klabjan both pointed to a 2011 McKinsey Global Institute study that forecast a 50 percent to 60 percent gap between the supply of and demand for people with deep analytical talent, including those with advanced training in statistics or machine learning as well as the ability to analyze large data sets. It projected between 140,000 and 190,000 unfilled positions of data analytics experts in the United States by 2018 and a shortage of 1.5 million managers and analysts who could understand and make decisions using big data.

Some argue that certain departments with a growing need for analysts, such as DHS, might not be quite as strapped as intelligence agencies -- depending on how "data analyst" is defined.

"Different mission nuances [require] different types of analyst," Jayson Ahern, former acting commissioner of Customs and Border Protection, now a principal at the Chertoff Group, told FCW. DHS's mission, he contends, differs from that of the intelligence agencies and the department doesn't compete for the same pool of users. "Some are doing [signal intelligence], some are analyzing imagery and others do drug or other crime trends," he said of DHS analysts. "All analysts are not the same and skills need to be adapted to the mission."

Ahern said in an interview that CBP, with its increasing emphasis on border security capabilities, should beef up its analytical personnel to handle a coming wave of new data generated by border ground sensors, cameras, unmanned aircraft and other surveillance technology.

Klabjan said some of that might be wishful thinking. Areas that DHS has some interest in, such as text analytics (monitoring Web sites, tweets and instant messaging), draw on the same big data analysts that are being snapped up by private industry and intelligence agencies. Even analysts who monitor data from ground sensors along the border could be considered "big data" analysts who might be called on to analyze data holistically, said Klabjan. Video image analysis, he said, is arguably not in the same category as big data, though, because it involves different imaging technology and analysis techniques.

NSA is taking matters into its own hands.

In August, the spy agency partnered with N.C. State to create its own Laboratory for Analytic Sciences (LAS) on the university's Centennial Campus near Raleigh's Research Triangle Park.