January 2, 2010

This morning I read the article “The Decade of Data: Seven Trends to Watch in 2010” and found it a fitting retrospective and perspective piece. I have been working in data analytics for the past 15 years, so naturally I went searching for similar articles with more of a focus on analytics, but came back empty-handed 😦

I wish I could write a similar post, but the task feels too big to take on. A systematic review with a vision of the future would require much more dedication and effort than I can afford at this point. However, I do have a couple of thoughts, and I went ahead and gathered some evidence to share. I’d love to hear your thoughts; please comment and provide your perspectives.

The above chart shows search volume indices for several data analytics related keywords over the last six years. There are many interesting patterns. The one that caught my eye first is the birth of Google Analytics: Nov 14, 2005. Not only did it cause a huge spike in the search trend for “analytics”, marking the first day “analytics” surpassed “regression”, it also became the driving force behind the growth of web analytics and the analytics discipline in general. Today, more than half of all “analytics” searches are associated with “Google Analytics”. Anyone who writes the history of data analytics will have to study the impact of GA seriously.

I wish I could do a chart on the impact of SAS and SPSS on data analytics in a similar fashion, but unfortunately it is hard to isolate SAS searches for statistics software vs other “SAS” searches. When limited to the “software” category, it seems that SAS has about twice the volume of SPSS, so I used SPSS instead.

Many years ago, before Google Analytics and the “web analyst” generation, statistical analysis and modeling dominated the business applications of data analytics. Statisticians and their predictive modeling practice sat in their ivory tower. Since the early years of the 21st century, data mining and machine learning have become a strong competing discipline to statistics – I remember the many heated debates between statisticians and computer scientists about statistical modeling vs data mining. New jargon came about, such as decision trees, neural networks, association rules and sequence mining. To whoever had the newest, smartest, most mathematically sophisticated, efficient and powerful algorithm went the spoils.

Google Analytics changed everything. Along with data democratization came the democratization of data intelligence. Who would’ve guessed that today, for a large crowd of (web) analysts, analytics would become near-synonymous with Google Analytics, and that building dashboards and tracking and reporting the right metrics would become the holy grail of analytics? Those statisticians may still inhabit the ivory tower of data analytics, but the world is already owned by others – the people – as democracy would dictate.

No question about it, data analytics is trending up and flourishing as never before.