I remember the days when salesmen would knock on our doors during lazy afternoons and try to sell us encyclopedias (yes, way before Google existed!), the latest electronic food processor (if I can call them that!), or a new detergent on the market that could supposedly remove every dirt stain.

What was wrong with this, apart from the fact that it woke you up from your afternoon nap?

It was not effective. The salesman had no idea whether the person who opened the door was really interested in buying the product.

Door-to-door marketing was cumbersome for the salesman and the buyer alike.

The salesman had no idea whatsoever how many sales he would make.

In short, he had little or no data (or information) about his potential consumers, and no way to know whether he was targeting the right customer, the one who would actually buy his product at the end of the day.

The Present…

We live in the digital age. We have the Internet. We have the latest smartphones. We have social media. We have the technology. We have people who interact with these modern devices and use technology every day to make their lives easier. And, most importantly, we have the DATA. And this data keeps growing, which a marketer can leverage to reach his target. This ever-increasing, voluminous data, which covers every aspect of a person and their behavior on the Internet, in structured or unstructured formats, is called BIG DATA.

Back to the Past…actually not so far in the past…

Marketers have consumed data from databases in the past. There were tables where each row was a record and each column was an attribute. Multiple tables could be related to each other. To fetch a set of users, one would run a query (using SQL) on these tables. This method is not extinct, but it is not enough to handle our BIG DATA. We want a system that can –

Scale

Perform Parallel Processing

Handle data in any format

Store large amounts of data

Handle failures

Traditional databases cannot do this effectively.
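As a quick refresher, the traditional relational approach described above can be sketched with Python's built-in sqlite3 module; the users table, its columns, and the data here are purely illustrative:

```python
import sqlite3

# Illustrative relational table: each row is a record, each column an attribute
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, country TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, "Asha", "IN"), (2, "Ben", "US"), (3, "Chen", "US")])

# Fetching a set of users means running a SQL query on the table
rows = conn.execute("SELECT name FROM users WHERE country = ?", ("US",)).fetchall()
print([r[0] for r in rows])  # ['Ben', 'Chen']
```

This works beautifully at small scale; the trouble starts when the data no longer fits one machine or one rigid schema.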

Let’s cover these goals one-by-one.

Scalability

We want a system that can handle the ever-increasing data seamlessly. The transition of data from bytes to gigabytes to petabytes should be smooth without affecting the processing time of the data.

Perform Parallel Processing

Processing data piece by piece takes a lot of time. It could take days to run a simple query on the kind of data (big data!) we are talking about. The advantage of parallel processing in such scenarios is that the problem can be broken into small chunks, which can then be solved simultaneously. A simple analogy: say you have to count the occurrences of the word “beautiful” in a book with 20 pages. If one person is assigned this task, he will have to go through each page and keep a count of every occurrence. Instead, if the problem is broken down, i.e. the 20 pages are distributed among 20 different people, each of them can simultaneously count the occurrences of the word on the page assigned to them. The individual counts can then be collected and summed to get the total. Breaking the problem into smaller chunks and processing them simultaneously leads to better performance and faster results.
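The 20-page analogy maps directly onto a parallel “count, then sum” sketch. Here is a minimal version using Python's multiprocessing module; the book contents are made up for illustration:

```python
from multiprocessing import Pool

def count_on_page(page_text):
    # One "person" counting the word on a single page
    return page_text.lower().split().count("beautiful")

if __name__ == "__main__":
    # Hypothetical 20-page book, one string per page
    pages = ["a beautiful morning", "nothing here", "beautiful beautiful"] + [""] * 17
    with Pool() as pool:
        # Pages are counted simultaneously across worker processes
        per_page_counts = pool.map(count_on_page, pages)
    # Collect the individual counts and sum them for the total
    total = sum(per_page_counts)
    print(total)  # 3
```

This is exactly the idea behind frameworks like MapReduce: a per-chunk step run in parallel, followed by a step that combines the partial results.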

Handle data in any format

We see and interact with different forms of data.

Consider any social networking website. We see text, images, audio, and video, all in different formats. Big data is not limited to any particular format or structure. We have all kinds of data, and we want to store it as is and be able to gain insights from it.

Store large amounts of data

Now, we know that our big data is ever increasing. But where do we store it? The data is growing by leaps and bounds, and we need to store every single byte of it. Every piece of data is important, and we need to retain historical data as well, because it can lead to some wonderful insights.

Handle failures

Things can go wrong. Machines can fail and queries can go wrong. That is why we need a system that is resilient to all kinds of failures and makes them invisible to the user, because it can handle them internally. Also, there is no greater joy than being able to retrieve data you accidentally deleted, because your system internally replicates the data it stores.
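The replication idea can be shown with a toy sketch: write every value to several nodes, so losing one machine loses nothing. This is a simplified illustration, not how any particular system implements it; all names and numbers below are made up:

```python
import random

class ReplicatedStore:
    """Toy sketch: each value is written to several nodes so that
    a single node failure does not lose data."""

    def __init__(self, num_nodes=4, replicas=3):
        self.nodes = [dict() for _ in range(num_nodes)]
        self.replicas = replicas

    def put(self, key, value):
        # Write the value to `replicas` distinct nodes
        for node in random.sample(self.nodes, self.replicas):
            node[key] = value

    def get(self, key):
        # Read from any node that still holds a copy
        for node in self.nodes:
            if key in node:
                return node[key]
        raise KeyError(key)

    def fail_node(self, i):
        self.nodes[i].clear()  # simulate a machine crash

store = ReplicatedStore()
store.put("report", "Q3 numbers")
store.fail_node(0)            # one machine dies...
print(store.get("report"))    # ...but the data is still readable
```

With three replicas spread across four nodes, any single failure still leaves at least two copies standing, which is why the read after the crash succeeds.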

The amount of data available today, and being created every single day, is huge. And it is growing every second. We have the data with us; all we need to do is analyze it effectively and tap big data’s potential. Big data can help marketers understand their buyers better and reach the right buyer at the right time. Big data is another tool to master the art of effective marketing!