How big data delivers data-driven stories

Sourcing a data-driven story is a complicated process. The sheer volume of information now available in even medium-sized datasets makes it difficult to know exactly which vein of information contains the most impressive or significant story, no matter how good a person may be at pattern recognition or spotting trends. Even if you believe you have found something insightful and original, there might be an even more interesting take on the same data that is only visible when the data is viewed at its most granular level.

Databases and data visualisation tools have become invaluable for finding the narratives in big data. However, the technical constraints of such tools mean that the larger a dataset is, the further it needs to be shrunk to become manageable and to reduce processing time. This sacrifices crucial data granularity to the extent that interesting stories which rely on high levels of detail are lost.

So, what can be done? Let's look at a real-life example of how all of the data can come into play when you have the right tools available, and why the most detailed data can be the most valuable.

Data-driven storytelling

In November 2016, InterWorks' Tableau Zen Master Robert Rouse took part in the 'Iron Viz' competition with several other experts at Tableau's worldwide customer and partner conference. The challenge was for each contestant to demonstrate the best use of Tableau when creating a data-driven story, with each contestant drawing from the same 14 gigabytes and 161 million rows of business data on New York taxi journeys. Robert analysed how snowfall and public holidays affected trip counts and overall taxi fares, with an emphasis on visualising the effect that snow days had on New York taxi fares. The contest provided a perfect demonstration of Tableau's utility when creating data-driven stories through visualisations.

For the competition, Robert worked from a reduced dataset created by taking a daily aggregate of trips taken and fares earned over the course of 2014. For the map visualisation, the dataset was further shrunk to a three-day period, chosen to best highlight the changes in trip frequency that a single snow day caused across New York. The dataset needed to be reduced this much because it was simply not possible to address the complete dataset within the visualisation tool, and this diluted dataset lost much of its granularity and detail.
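To make the idea of a daily aggregate concrete, here is a minimal sketch in Python. It is not Robert's actual workflow, which used Tableau data extracts; the record layout, the `daily_aggregate` function and the sample values are all hypothetical and made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical trip records: (pickup timestamp, fare in dollars).
# These values are invented for illustration, not taken from the taxi dataset.
trips = [
    ("2014-01-02 08:15:00", 12.50),
    ("2014-01-02 18:40:00", 9.00),
    ("2014-01-03 07:05:00", 22.00),
    ("2014-01-03 07:55:00", 6.50),
    ("2014-01-03 23:30:00", 14.00),
]


def daily_aggregate(records):
    """Collapse individual trips into one (trip_count, total_fare) pair per day."""
    totals = defaultdict(lambda: [0, 0.0])
    for pickup, fare in records:
        # Keep only the calendar date; the time of day is discarded here.
        day = datetime.strptime(pickup, "%Y-%m-%d %H:%M:%S").date().isoformat()
        totals[day][0] += 1
        totals[day][1] += fare
    return {day: (count, total) for day, (count, total) in sorted(totals.items())}


daily = daily_aggregate(trips)
for day, (count, total) in daily.items():
    print(day, count, total)
```

Five trips become two rows, which is what makes the data manageable. The cost is that the time of day and the individual fares are gone, so any story that depends on them can no longer be told from the aggregate.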

Robert chose to re-run the analysis from the competition for a webinar, again to demonstrate the effectiveness of visualising data, but this time the dataset was handled by EXASOL's in-memory analytic database instead of Tableau data extracts. In this re-run, Robert showcased the level of work required to identify and extract a story from such a large dataset, and demonstrated how a powerful analytic database allowed him to draw from the complete dataset. A story that had previously taken hours to compile could now be put together in minutes by working from the whole dataset, and additional insights could be discovered, such as comparisons between whole years, something not possible with subsets of the data.
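The benefit of keeping row-level data can be sketched in the same way: when nothing has been pre-aggregated, the same records can be regrouped at whatever grain the question needs. The example below is an illustrative assumption only. The timestamps are made up, `count_trips` is a hypothetical helper, and it says nothing about how EXASOL or Tableau process such queries internally.

```python
from collections import Counter
from datetime import datetime

# Hypothetical row-level pickup timestamps spanning two years (made-up values).
pickups = [
    "2013-02-08 09:10:00",
    "2013-02-08 17:45:00",
    "2014-02-13 09:20:00",
    "2014-02-13 09:50:00",
    "2014-07-04 21:05:00",
]


def count_trips(timestamps, grain):
    """Count trips per bucket; `grain` is a strftime pattern such as '%Y' or '%H'."""
    parsed = (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts in timestamps)
    return Counter(dt.strftime(grain) for dt in parsed)


# The same rows answer two different questions without being re-extracted.
trips_per_year = count_trips(pickups, "%Y")  # compare whole years
trips_per_hour = count_trips(pickups, "%H")  # look at the time of day
```

A daily aggregate limited to 2014 could answer neither question: the year-on-year comparison needs rows from outside 2014, and the hourly view needs the time of day that the aggregate throws away.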

It is therefore clearly of great benefit to have complete access to the data contained within a dataset at the most detailed level possible when searching for a narrative thread. Transactional databases can sometimes prove too slow for effectively processing data of this magnitude, making exploration of the data frustrating and cumbersome. As Robert demonstrated, having the ability to turbocharge the dataset with a fast analytic database saves time and effort.

The importance of maximising speed and high data granularity

InterWorks' Robert Rouse said: "When you're still at that level where you're trying to figure out what the data means and find the story that needs to be told, you need to have access to as much data as you can at the finest level of granularity that you can manage. Being able to consider the data over a long period of time, rather than restricting yourself to a month or quarter, means you can see trends and patterns that might not otherwise be visible."

Using a combination of tools for the best data analytics experience

Data analytics is on course to become further integrated into businesses' organisational models in the years to come, and a tangible shift is underway as companies experiment with data analytics by combining software packages into unique solutions that meet their various needs. This approach reduces reliance on IT resources and data scientists and allows for the creation of self-service business intelligence, which in turn reduces the complexity and time investment associated with data science.

With a fast database purpose-built for analytics, a data preparation tool such as Alteryx, and a visualisation tool such as Tableau, a complete "big data analytics stack" can be created combining the strengths from multiple sources.

How does data-driven analytics benefit business?

So what impact can data analytics have on business? Over the years, there has been a trend within business of collecting more and more data, sourcing it from systems such as sales and CRM applications as well as from web traffic. There is value in this data and it is important to bring it all into a single data warehouse for analysis and reporting. More than this, it is important to have the right software solutions in place to ensure you make the best use of the data with fast analytics and the best visualisations. Only with the right technology will organisations be able to make better decisions in every corner of the business.

Intelligent analysis of data can elevate a company to the next level and provide it with a competitive edge over its rivals through the insights it can achieve. In-memory analytic databases are making it possible for the smallest start-ups to compete on the same level as the big enterprises. This is levelling the playing field and providing competition in a marketplace that was once dominated by those with big budgets. Businesses that act now and implement advanced analytics will reap the benefits of a data-driven business.