A small how-to guide to Big Data

What is Big Data and what makes it so difficult?

In short, Big Data is a term that gets mentioned everywhere. The problem with this popularity is that the meaning becomes unclear. This is nothing unusual for buzzwords, but in the end everyone benefits from a clear definition. Wikipedia defines ‘Big Data’ as “… a broad term for data sets so large or complex that traditional data processing applications are inadequate”. In hindsight, this development was easy to see coming. With the cost of bandwidth and data storage going only one way, down, an increase in the amount of data is a logical result. Following that line of reasoning, it seems logical that at some point datasets will grow so large that conventional analysis will no longer do the trick. But even with that definition, many things are still unclear.

Let’s try to tackle the description of Big Data in a little more detail. Because the definition of Big Data is still fuzzy, a lot of people also use the term for the process of obtaining, analyzing and using Big Data. Applying a rather narrow definition (so more bits and bytes and less ‘how to use’), a few key terms pop up, all conveniently starting with a V (and one C):

Volume – That’s why it’s called Big…

Variety – Great differences exist in forms of data (videos, tweets, data from wearable tech and so forth)

Velocity – This one has two sides; how quickly does the data accumulate, but also how quickly can you access and analyze it?

Variability – Some parts of the dataset can be inconsistent or contain errors, due to the fact that the data is gathered quickly (speed compromises quality)

Veracity – The quality of the data can vary greatly as a result of its size

Complexity – Saving, querying or even accessing can be a difficult task when the data is distributed over multiple datacenters

How to harness the power of Big Data

After you read a fraction of the available information on Big Data, the next question is: where do I start?

The secret of using Big Data is knowing that small, incremental improvements can result in massive gains.

This is shown by the example of Carnival Cruises (CC), a company that has 80 million ‘passenger-days’ on its fleet every year (a passenger-day meaning a day spent aboard by one passenger). This means that if $1 more is spent during every passenger-day, CC makes $80 million in additional revenue. $1 on an average expenditure of a few hundred can seem pretty insignificant, which raises an interesting question: how can you make people spend just one more dollar? CC does this by analyzing all of the interactions passengers have on its cruise ships.

This way, CC is able to meet its customers’ needs. Where does all the data come from? Sources include website searches, booking operator inquiries (both on- and offline), travel trends and so on. Using these analyses, the company has opened up a few cruise lines in Asia (where demand was apparent) and can dynamically change the number of beds ‘opened up’ in each city based on expected demand.
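The Carnival Cruises arithmetic can be written out as a small sketch. The 80 million passenger-days and the extra $1 come from the example above; the average spend of $300 is purely an assumption standing in for ‘a few hundred’, and the function names are made up for illustration.

```python
# Figures taken from the Carnival Cruises example in the text.
PASSENGER_DAYS_PER_YEAR = 80_000_000
EXTRA_SPEND = 1.00  # extra dollars spent per passenger-day

# ASSUMPTION: the text only says "a few hundred"; 300 is an illustrative value.
ASSUMED_AVERAGE_SPEND = 300.00


def extra_revenue(passenger_days: int, extra_spend: float) -> float:
    """Additional yearly revenue if every passenger-day brings in extra_spend."""
    return passenger_days * extra_spend


def relative_uplift(extra_spend: float, average_spend: float) -> float:
    """How large the extra spend is compared to the average spend."""
    return extra_spend / average_spend


revenue = extra_revenue(PASSENGER_DAYS_PER_YEAR, EXTRA_SPEND)
uplift = relative_uplift(EXTRA_SPEND, ASSUMED_AVERAGE_SPEND)

print(f"Extra revenue per year: ${revenue:,.0f}")
print(f"Uplift relative to the assumed average spend: {uplift:.2%}")
```

Under the assumed average, the improvement is about a third of a percent per passenger, yet it adds up to $80 million across the fleet. That is the point of the example: a tiny change multiplied by a very large volume.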

Using Big Data in your own company

The big question, of course, is: how do companies start when they want to harness the power of Big Data?

You can try to emulate the success of CC by gathering data within your own company. The question then is how and where to start. One of the main insights to have before you start is that the secret of applying Big Data to your specific situation lies in seeing the potential of the individual datasets you already collect. You could be gathering data without even knowing it. Do you have a website? Then you can harness the power of search-engine analytics by seeing which ‘key phrases’ visitors used to end up on your site.

“The trick is connecting the dots, literally”

You can also see at which point visitors decided that your product or service was interesting enough to buy (or not). But there is also a lot of data hidden in the interactions with your customers, whether online, on the phone or face to face.
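To make ‘connecting the dots’ a little more concrete, here is a minimal sketch that links two small datasets: the key phrase each visitor used to find the site, and the orders that were placed. All records, field names and the `conversion_by_phrase` helper are hypothetical; real analytics and order exports will look different.

```python
from collections import defaultdict

# HYPOTHETICAL data: which key phrase brought each visitor to the site.
visits = [
    {"visitor_id": "v1", "key_phrase": "cheap widgets"},
    {"visitor_id": "v2", "key_phrase": "widget repair"},
    {"visitor_id": "v3", "key_phrase": "cheap widgets"},
    {"visitor_id": "v4", "key_phrase": "widget repair"},
]

# HYPOTHETICAL data: orders, keyed by the same visitor id.
orders = [
    {"visitor_id": "v2", "amount": 120.0},
    {"visitor_id": "v3", "amount": 40.0},
    {"visitor_id": "v4", "amount": 90.0},
]


def conversion_by_phrase(visits, orders):
    """Share of visitors per key phrase who placed at least one order."""
    buyers = {order["visitor_id"] for order in orders}
    counts = defaultdict(lambda: {"visitors": 0, "buyers": 0})
    for visit in visits:
        stats = counts[visit["key_phrase"]]
        stats["visitors"] += 1
        if visit["visitor_id"] in buyers:
            stats["buyers"] += 1
    return {
        phrase: stats["buyers"] / stats["visitors"]
        for phrase, stats in counts.items()
    }


rates = conversion_by_phrase(visits, orders)
for phrase, rate in rates.items():
    print(f"{phrase}: {rate:.0%} of visitors bought something")
```

Neither dataset says much on its own. Joined on the visitor id, they show which key phrases bring in visitors who actually buy.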

Where to go from here

Now, of course, defining which ‘dots’ to connect can be challenging. Even knowing which dots they have is enough of a challenge for a lot of companies. The power lies in simply starting; one of Netflix’s major Big Data tenets is “The longer you take to find the data, the less valuable it becomes.” So don’t waste time waiting, just start (small) and try making data-driven decisions right away. In the end it all comes down to knowing your company down to the core. From there, you can decide which data you’d like to use to improve your business. If starting still seems a little daunting, try the following steps:

1. Pick a topic/field

Picking a topic ensures focus, but it also incurs an opportunity cost: picking one topic necessarily means setting another aside. Creating a framework makes the process a lot easier to handle: define specific goals, key actions and process ownership up front.

2. Pick a goal

It’s also important to pick a specific goal; this should not be something too big, but it should be something to aim for. An example could be to increase the average time visitors spend on your website. By tweaking website design and content, the average visitor can be motivated to stay longer. This, in turn, improves the chance that those visitors will convert into leads, and so on.
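A goal like this is only useful if it can be measured. The sketch below computes the average visit duration for two website designs so the effect of a tweak can be compared. The session records, the ‘old’ and ‘new’ labels and the `average_duration` helper are all invented for illustration.

```python
from statistics import mean

# HYPOTHETICAL session log: visit duration in seconds per website design.
sessions = [
    {"design": "old", "seconds": 40},
    {"design": "old", "seconds": 60},
    {"design": "old", "seconds": 50},
    {"design": "new", "seconds": 70},
    {"design": "new", "seconds": 90},
    {"design": "new", "seconds": 80},
]


def average_duration(sessions, design):
    """Average visit duration in seconds for one design."""
    return mean(s["seconds"] for s in sessions if s["design"] == design)


old_avg = average_duration(sessions, "old")
new_avg = average_duration(sessions, "new")
change = (new_avg - old_avg) / old_avg

print(f"Old design: {old_avg:.0f} s, new design: {new_avg:.0f} s")
print(f"Change in average visit duration: {change:+.0%}")
```

With only six made-up sessions this proves nothing, of course. With real traffic, the same comparison shows whether a design change moves the number you picked as your goal.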

3. Pick the data

Finally, pick the data that needs to be gathered. This can be easy (in the case of a website) or really hard. However, data is key, and it is better to have lots of it. After the data has been gathered, it should be analyzed. Data analysis is a very important task and should not be marginalized.

Conclusion

The thing is, it’s not even the BIG in Big Data that’s interesting. What companies should care about is the data itself. There are many, many opportunities in this potential treasure trove of datasets. While analysis of truly Big Data may only be feasible for a few very large companies, data analysis in itself is very much possible on any budget. Do not kid yourself that your company does not have valuable data. Everyone does; you just have to know where to look, how to look and how to act on it.