Big data to be normal by 2020

If we can convince our kids to be data scientists

Big data is big business right now, and it will not only be embiggening as this decade rolls on, but it will become normal, like enterprise resource planning, supply chain and customer management, and other perfectly normal parts of the corporate computing landscape.

But that will only happen, it seems, if we get people involved in becoming data scientists and experts in related fields.

The hotshots at IT market researcher Gartner are hosting their Symposium/ITxpo shindig in Orlando, Florida, and the company trotted out some prognostications, as is the norm for the annual event.

The first thing that Gartner did was provide its latest projection for global IT spending this year, and while you might not put much stock in these things, the IT vendors and your IT department are at the very least indirectly affected by all of the chatter and guesstimation about what is going to happen with overall IT spending. When the CEO and CFO hear that IT spending is going down all over the world, it is a little more difficult for you to ask for more dough.

The good news is that Gartner is projecting that global IT spending is going to rise by 3.8 per cent to $3.7 trillion. In its previous projection, from back in July, Gartner had cut its growth forecast for this year to only 3 per cent, with spending across hardware, software, and services (including telecom equipment and services) reaching $3.63 trillion.

And once again, as it has done several times this year, Gartner has rejiggered what it thinks was spent in 2011 as well as changing the forecast amount for 2012 spending. Managers being human beings, what most of them will remember is the percentage, not that the before and after numbers keep changing.
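A back-of-the-envelope sketch shows why the 2011 base has to be moving, too. Dividing each projected total by one plus its growth rate gives the 2011 figure that each forecast implies. The helper and variable names here are our own invention, and the inputs are the rounded figures quoted above, so treat the results as approximate.

```python
# Back out the prior-year spending implied by a projected total and growth rate.
# Inputs are the rounded figures quoted in the article, so results are approximate.

def implied_base(total_trillions: float, growth_pct: float) -> float:
    """Return the prior-year total implied by a projection (hypothetical helper)."""
    return total_trillions / (1 + growth_pct / 100)

july_base = implied_base(3.63, 3.0)    # July projection: 3 per cent to $3.63tn
latest_base = implied_base(3.70, 3.8)  # latest projection: 3.8 per cent to $3.7tn

print(f"2011 base implied in July:       ${july_base:.3f}tn")
print(f"2011 base implied by the latest: ${latest_base:.3f}tn")
print(f"Shift in the implied 2011 base:  ${(latest_base - july_base) * 1000:.0f}bn")
```

On these inputs the two implied 2011 bases do not match, which is consistent with the rejiggering, although rounding in the published figures could account for some or all of the gap.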

By the way, one of the reasons the numbers keep changing in the current year is that the value of the US dollar keeps bouncing around, and Gartner converts all projections in other currencies to greenbacks when it talks about them.

The big data market is expected to grow considerably faster than the IT market overall, with sales rising 21.4 per cent to $34bn in 2012. Gartner said in a statement that this year only $4.3bn, or about 12.6 per cent, of total spending on big data will be for new software licenses and that most of the dough will be blown on adapting "traditional solutions" to the velocity, variety, and voluminous data needs that make big data different from data warehousing and online transaction processing.

As you might expect, social network and clickstream analysis will represent the largest portion of big data spending this year, about 45 per cent of that $34bn pie if you look at these as a single unit. Risk analysis and other financial services workloads are also increasingly adopting Hadoop and other data munching tools, which are at the same time being pushed from being batch tools to real-time, much as transaction processing moved from batch to online many decades ago.
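For those who like to check the sums, here is a minimal sketch of the arithmetic behind the big data figures above. The constant names are ours, the inputs are the rounded numbers Gartner quoted, and the implied 2011 figure is our own derivation rather than anything Gartner published.

```python
# Sanity-check the big data spending figures quoted in the article.
# All inputs are rounded, so the derived values are approximate.

BIG_DATA_2012_BN = 34.0          # projected 2012 big data spending, $bn
GROWTH_PCT = 21.4                # projected growth over 2011, per cent
NEW_LICENSES_BN = 4.3            # portion going to new software licenses, $bn
SOCIAL_CLICKSTREAM_SHARE = 0.45  # share for social network and clickstream analysis

# Share of the total going to new software licenses
license_share_pct = NEW_LICENSES_BN / BIG_DATA_2012_BN * 100

# 2011 spending implied by the growth rate (our derivation, not a Gartner figure)
implied_2011_bn = BIG_DATA_2012_BN / (1 + GROWTH_PCT / 100)

# Dollar value of the social network and clickstream slice
social_clickstream_bn = BIG_DATA_2012_BN * SOCIAL_CLICKSTREAM_SHARE

print(f"New licenses share:       {license_share_pct:.1f} per cent")
print(f"Implied 2011 spending:    ${implied_2011_bn:.1f}bn")
print(f"Social/clickstream slice: ${social_clickstream_bn:.1f}bn")
```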

While big data might seem distinct, Gartner thinks what many of us do, which is that this is an artificial category that is just as fake as cloud computing. Big data – by which we mean using funky modern tools to do big and fast things that plain vanilla relational databases can't do – is just another kind of computing, just like cloud is just the next phase of distributed computing.

Big data is just being smart and making use of log file information that you thought was junk, if you even kept it at all, to spy on your customers to try to serve them better. It involves mashing up your operational data with other data to try to make correlations. The kind of things humans do all day long, and often with disastrous results because we incorrectly correlate or don't close loops. (Shhh. Don't tell anyone that the Internet has been trying for months to get me to buy dresses for my wife that I already bought for her. Not only did I look at them, supercookie, but I bought them and you didn't seem to notice.)

"Despite the hype, big data is not a distinct, stand-alone market, but it represents an industry-wide market force which must be addressed in products, practices and solution delivery," explained Mark Beyer, a research vice president at Gartner who dices and slices the big data market. (Ah, but does it dice and slice him back? Only seems fair.)

"In 2011, big data formed a new driver in almost every category of IT spending. However, through 2018, big data requirements will gradually evolve from differentiation to 'table stakes' in information management practices and technology. By 2020, big data features and functionality will be non-differentiating and routinely expected from traditional enterprise vendors and part of their product offerings."

So on one side of Gartner, big data is looking like the new normal by the end of the decade, when we should be approaching exascale systems – exaflops of computing, exabytes of storage, tens of megawatts of electricity, and gigabucks of cost – if all goes well. But on the other side of Gartner, analysts are reminding people that there will be a shortage of data scientists who understand all this voluminous data and how to make use of it.

Peter Sondergaard, senior vice president and global head of research at Gartner, put out his own statement, saying that between now and 2015, the IT sector will create 4.4 million job openings, with 1.9 million of them generated in the United States.

Each of those big data jobs has a multiplicative effect, creating additional jobs downstream, or so it is hoped. In this case, for the United States, those 1.9 million big data jobs will create another 5.7 million jobs outside of the IT department. El Reg would probably argue that it is getting difficult to tell the marketing and sales departments from the IT department, given all of the automation people are doing these days.
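The multiplier implied by those two US figures is easy enough to work out. This is only a sketch using the rounded numbers quoted above; the variable names are ours, and Gartner's statement as reported here does not spell out the multiplier itself.

```python
# Work out the job multiplier implied by the US figures quoted in the article.

US_BIG_DATA_IT_JOBS_M = 1.9  # US IT jobs, in millions
US_DOWNSTREAM_JOBS_M = 5.7   # additional US jobs outside IT, in millions

# Downstream jobs created per IT job
multiplier = US_DOWNSTREAM_JOBS_M / US_BIG_DATA_IT_JOBS_M

# IT and downstream jobs combined
total_us_jobs_m = US_BIG_DATA_IT_JOBS_M + US_DOWNSTREAM_JOBS_M

print(f"Downstream jobs per IT job: {multiplier:.1f}")
print(f"Total US jobs implied:      {total_us_jobs_m:.1f} million")
```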

"But there is a challenge," Sondergaard said. "There is not enough talent in the industry. Our public and private education systems are failing us. Therefore, only one-third of the IT jobs will be filled. Data experts will be a scarce, valuable commodity."

Once again, we are left scratching our heads. IT people are looking for jobs, but apparently they don't have the skills to do the jobs that are open.

This looks like a situation that could be fixed if companies spent less buying back shares and wasting money in countless other ways and actually invested in training the big data workforce they think they will need. Companies always grouse about not having the right quality of talent, and still somehow they manage to get by.

And El Reg will make this prediction: Just like companies let SAP and Oracle take over the running of their businesses by bending those businesses to fit the shape of the SAP and Oracle software, there will be application and service providers in the big data arena that will bend companies to their wares.

Companies will not invest in the training for big data experts except where absolutely necessary, and only the biggest companies with the fattest compensation will get the experts. Just like you have to pay a fortune to have a good Java programmer or database admin who understands business, you're going to have to pay lots for a smart data scientist who understands your business and the data that will be useful to you. ®