Benefitting from data lakes

Jessie Rudd, Technical Business Analyst at PBT Group

Previously, I discussed some of the defining characteristics of a data lake and contextualised it in a business environment. In this article, the focus shifts to its importance for South African companies trying to remain relevant in an increasingly digital age.

One of the most significant elements of a data lake is that it holds data in its raw formats. Instead of accessing preconfigured data sets (as with a data warehouse), organisations have access to a more free-flowing, natural environment. This means data scientists can find and access what they need faster and more effectively than they could through the more rigid traditional process.

‘Drowning’ in data

However, this fluid approach can be quite intimidating, especially given the sheer amount of data (both structured and unstructured) available to businesses. There is a risk of getting lost in the data and, consequently, failing to retrieve the information needed to drive decision-making.

Given the nature of a data lake, the temptation is to use it as a repository for everything. Yet, while it might seem contradictory, keeping the company’s data lake organised so that it stays useful and relevant should be a priority.

Enter the need for the data ponds I mentioned in the previous article. These are self-contained collections of data of the same kind that are easily searchable and manageable. Instead of wading through all the data at the organisation’s disposal, data scientists can perform targeted searches in relevant pockets of data.
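
To make the idea of data ponds concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular data lake platform: the record layout, the "kind" field, the sample values and the helper names build_ponds and search_pond are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical raw records as they might land in a lake. The "kind" field
# and every sample value here are assumptions made for this illustration.
RAW_LAKE = [
    {"kind": "social_post", "customer": "C001", "text": "Love the new app"},
    {"kind": "transaction", "customer": "C001", "amount": 250.0},
    {"kind": "social_post", "customer": "C002", "text": "App keeps crashing"},
    {"kind": "call_log", "customer": "C002", "duration_s": 340},
]


def build_ponds(records):
    """Group raw records into ponds keyed by their kind."""
    ponds = defaultdict(list)
    for record in records:
        # Records without a kind go to a catch-all pond instead of being lost.
        ponds[record.get("kind", "unclassified")].append(record)
    return dict(ponds)


def search_pond(ponds, kind, predicate):
    """Scan only the named pond and return the records matching predicate."""
    return [record for record in ponds.get(kind, []) if predicate(record)]


ponds = build_ponds(RAW_LAKE)

# Targeted search: only the social media pond is scanned, not the whole lake.
complaints = search_pond(
    ponds, "social_post", lambda record: "crash" in record["text"].lower()
)
```

The point of the sketch is simply that a search touches one pond rather than every record in the lake. In practice the grouping would more likely be handled by the storage layout or a data catalogue than by in-memory lists.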

Given how relatively inexpensive storage (cloud or otherwise) has become, and the fact that data can be stored there indefinitely, the business can extract information in real time whenever required. This can significantly aid decision-making, as the latest customer data and competitive trends can be factored in.

Architecture management

Where it becomes expensive is in the choice of data lake platform and the extent to which it is integrated into existing operations inside the organisation. In the South African context, where big data is still being positioned as a business differentiator, data lakes are going to remain a hard sell for some time.

Despite this, the ability a data lake provides to analyse data that was previously inaccessible (think social media posts and other digital communications) and to develop better-defined bespoke customer solutions can significantly improve the business bottom line.

Data lakes do provide advantages, yet there needs to be a change of approach from the business. It is no longer a matter of how and when to access data, but rather of using it in real time to improve agility and business readiness for the digital world.

Organisations are starting to understand the importance of analysing and understanding data more effectively. Once they start embracing data lakes, the momentum will shift towards establishing data-rich businesses that use their understanding of market needs to create product and service differentiation.