Are You Recruiting A Data Scientist, Or A Unicorn?

Many companies need to stop looking for a unicorn and start building a data science team, says CEO of data applications firm Lattice.

The emergence of big data as an insight-generating (and potentially revenue-generating) engine for enterprises has many management teams asking: Do we need an in-house data scientist?

According to Shashi Upadhyay, CEO of Lattice, a big data applications provider, it doesn't make sense for organizations to hire a single data scientist, for a variety of reasons. If your budget can swing it, a data science team is the way to go. If not, data science apps may be the next best thing. "If you look at any industry, the top 10 companies can afford to have data scientists, and they should build data science teams," Upadhyay told InformationWeek in a phone interview.

But the solution is less clear for smaller organizations. "The pattern that I've seen now, having done this for over six years, is that very often medium-sized companies think of the problem as, 'I need to go and get me one data scientist,'" said Upadhyay.

But the shortage of data scientists, a problem that's only expected to worsen in the next few years, makes that approach a risky proposition.

For example, a company may hire one or two people, Upadhyay said, "but before you know it, because the supply for this talent group is so far behind demand, they have lost this person [who] has gone to the next company. And all of a sudden, all that good work is lost. And you ask yourself, 'Why did that happen? And how can I manage against it?'"

One common problem, he noted, is that companies simply don't understand data scientists and how they work. The job generally requires knowledge of a wide array of technical disciplines, including analytics, computer science, modeling, and statistics. "They also tend to be fairly conversant in business issues," Upadhyay added.

But it's often difficult to find these divergent skills in a single human being. "It's a little bit like looking for a unicorn," Upadhyay said.

When medium-sized companies -- those that fall below the top five in a given industry, for instance -- hire just one or two data scientists, they often can't provide a long-term career path for those people within the company. As a result, the data scientists get frustrated and move on to the next thing.

In Silicon Valley, where data scientists command six-figure salaries and are in great demand, it's very difficult to retain talented people.

The better solution? Build a team.

"You will absolutely get a benefit if you hire a data science team," said Upadhyay. "Go all the way [and] commit to creating a career path for them. And if you do it that way, you will get the right kind of talent because people will want to work for you."

The team approach seems to be winning. "I rarely see teams that are one or two people in size," Upadhyay observed. "Obviously people have those teams, but they tend to evaporate over time. Until they get to a team of 10 people or more, [companies] can't justify it."

So what does a data science team cost, and what's the payoff?

Upadhyay offered this example: Say you hire a team of 10 data scientists with an average annual cost of $150,000 per employee. "That's $1.5 million for a data science team," he said. "So they better be creating at least $15 million dollars in value for you -- 10 times [the expense] -- to be worth it."
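
The arithmetic behind that example can be sketched in a few lines of Python. The function name, the parameter names, and the default 10x multiple are illustrative assumptions taken from Upadhyay's figures, not a formula published by Lattice.

```python
def required_value(team_size: int,
                   cost_per_employee: float,
                   target_multiple: float = 10.0) -> tuple[float, float]:
    """Return (annual team cost, value the team must create to hit the multiple)."""
    team_cost = team_size * cost_per_employee
    return team_cost, team_cost * target_multiple


# Figures from the example: 10 data scientists at $150,000 each, 10x target.
team_cost, value_needed = required_value(team_size=10, cost_per_employee=150_000)
print(f"Team cost: ${team_cost:,.0f}")        # Team cost: $1,500,000
print(f"Value needed: ${value_needed:,.0f}")  # Value needed: $15,000,000
```

Changing `target_multiple` shows how the bar moves if a company accepts a lower return while the team is still ramping up.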

Laurianne, that is a very good question. Large firms have the capacity to predict into the future; however, I am not too sure about the ability of small and medium firms to invest in data scientists in the same way.

It makes me wonder: given outsourcing, cloud technologies, unemployment at around 7% in the U.S. at the moment, and per capita income at around 50k, it might be possible to put together an online team of many to do the job of one. If anyone has come across a startup like this, it would be interesting to study it.

Training up people to be analytics specialists and then embedding them inside business groups -- where they could combine the analytics knowledge with the business group knowledge, say marketing or manufacturing logistics -- sounds quite appealing. But are companies willing to invest in the training?

Instead of throwing money at a small group of people (who'll be recruited like crazy even if you manage to snag them), companies should use these inflated budgets to expand the ranks of, let's call them, "data-competent" workers inside every part of the business. Employees aware of the basic ideas around big data and data analytics--and the data assets that exist inside the organization--will dream up all kinds of interesting uses, I bet. Put another way, the first step is coming up with an interesting problem or scenario that'll be valuable to the business.

"Upadhyay offered this example: Say you hire a team of 10 data scientists with an average annual cost of $150,000 per employee. 'That's $1.5 million for a data science team,' he said. 'So they better be creating at least $15 million dollars in value for you -- 10 times [the expense] -- to be worth it.'"

I wonder how long a team would be given to climb to that level. You're not going to get to 10x right away, but would a year or two of 2x or 3x be satisfactory during the buildup, or would the team usually be shut down first?
