The personal view on the IT world of Johan Louwers, especially focusing on Oracle technology, Linux and UNIX technology, programming languages and all kinds of nice and cool things happening in the IT world.

Monday, February 13, 2012

How did data become big-data

Big data is currently a buzzword, and as we all know buzzwords are not always good. It has happened in the past that a perfectly good solution or product was killed because it was so hyped that it could never live up to the expectations. Big data is currently seen as a solution to everything, just as cloud computing is. As long as your solution has big data and cloud computing in its foundation, it must be a great solution. This sounds crazy, however a lot of less tech-minded people do tend to believe it somewhere deep in the back of their minds.

Big data, even though it is a buzzword, is something to pay attention to. Big data is very real and we have to take into consideration the amounts of data that are becoming available. Every day, 2.5 quintillion bytes of data are created, and 90% of the data in the world today was created within the past two years. IBM has done quite some research on the growth of data and you can read some interesting figures on their website. Data now comes from all kinds of devices, some operated by humans as part of human interaction and others fully automated and providing sensor data, and all of it is now stored. One of the presentations on big data by Pentaho, as well as the blog post "sub transactional big-data and data analysis", refers to this as data lakes.

Where we used to throw away all data we could not use because of storage and handling costs, we now store all data we are able to receive in what are called data lakes. We might not be able to give meaning to it at this moment in time, however within one or two months it might turn out that this data is of vital importance. The data might also never be of any value to us, yet be of extreme value to other people and companies.

Saving data in data lakes and handling enormous sets of data is part of what we refer to as big data. From a technical perspective we are now becoming capable of receiving, storing and handling these massive amounts of data, however we still have to learn what we can do with it in the time to come.

As an example, the video below shows a new way of shopping introduced by Tesco in Korea, where people can shop by scanning QR codes in the subway and have their goods delivered to their home at a later moment.

In the past we were able to know which goods were sold on which day thanks to simple storekeeping. More recently loyalty cards were introduced, which let us bundle purchases per customer and state that a person who bought product A was most likely to also buy product B a couple of days later. This is already the start of a huge amount of data. With the above example of Tesco you can also state when someone bought the product, where this person was when doing so, and where this person lives. This makes it possible to add a geo-location part to a customer profile, which gives an extra dimension to your set of data. The more you know about your customers, the better the profiles you can create and base decisions on. For example, decisions on where to open your new shop, where to place your advertisement, or even how to arrange the products in your store.
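To make the "product A, then product B" idea a bit more concrete, here is a minimal sketch in Python of how such a statement could be derived from purchases bundled per loyalty card. The function name `co_purchase_confidence` and the sample baskets are made up for illustration, and the sketch ignores the timing of the purchases ("a couple of days later"); it only counts how often two products show up for the same customer.

```python
from collections import Counter
from itertools import permutations


def co_purchase_confidence(baskets):
    """Return {(a, b): share of baskets containing a that also contain b}."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore duplicates within one basket
        item_counts.update(items)
        # Count every ordered pair (a, b) of distinct products in the basket.
        pair_counts.update(permutations(items, 2))
    return {(a, b): n / item_counts[a] for (a, b), n in pair_counts.items()}


# Hypothetical purchases, one list per loyalty card (made-up data).
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
]

confidence = co_purchase_confidence(baskets)
# Share of bread buyers who also bought butter.
print(confidence[("bread", "butter")])
```

On real data the same counting would have to run over far more customers and products, which is exactly where the storage and processing side of big data comes in.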

As stated, we are now capable of storing this data in data lakes and we are able to process it. What remains is to start thinking about all the possibilities this gives us and how we can make use of them.