Data Lakes: 8 Enterprise Data Management Requirements

Security

Security in the various data lake backends is also evolving, and it is addressed at different levels. Hadoop supports Kerberos authentication and UNIX-style authorization via file and directory permissions. Apache Sentry and Cloudera's RecordService are two approaches to fine-grained authorization within Hadoop data files. There is no universal agreement on an approach to authorization, so not all Hadoop tools support all of the different approaches. That makes it difficult to standardize at the moment: whichever authorization approach you select restricts the tools you can use.
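
To make "UNIX-style authorization via file and directory permissions" concrete, here is a minimal sketch of how an owner/group/other permission check works. It is an illustrative model only, not Hadoop's implementation: the names FileStatus and is_allowed are hypothetical, and it leaves out superuser bypass, ACLs and directory traversal checks.

```python
from dataclasses import dataclass

# Permission bits within one rwx triplet (standard UNIX values).
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class FileStatus:
    """Hypothetical stand-in for a file's ownership metadata."""
    owner: str
    group: str
    mode: int  # octal permission bits, e.g. 0o640


def is_allowed(user: str, user_groups: set, status: FileStatus, action: int) -> bool:
    """Return True if `user` may perform `action` on the file.

    Exactly one permission class applies: owner first, then group,
    otherwise "other". This is an assumption-level model of the
    UNIX-style check, not code taken from any Hadoop release.
    """
    if user == status.owner:
        bits = (status.mode >> 6) & 0o7
    elif status.group in user_groups:
        bits = (status.mode >> 3) & 0o7
    else:
        bits = status.mode & 0o7
    return (bits & action) == action


# Usage: a file readable and writable by its owner, readable by its group.
sales = FileStatus(owner="etl", group="analysts", mode=0o640)
owner_can_write = is_allowed("etl", set(), sales, WRITE)
analyst_can_read = is_allowed("ann", {"analysts"}, sales, READ)
analyst_can_write = is_allowed("ann", {"analysts"}, sales, WRITE)
outsider_can_read = is_allowed("bob", {"marketing"}, sales, READ)
```

The check works on whole files and directories, which is why column- or row-level control needs a separate fine-grained mechanism such as the ones named above.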

The lack of a standard makes it difficult for commercial products to provide comprehensive support at this time. In the interim, however, commercial products can serve as a gateway to the data lake and provide enough security functionality to help enterprises meet their security requirements in the short term, then adopt standardized mechanisms as they become available.

2016 is the year of the data lake. It will surround and, in some cases, drown the data warehouse, and we'll see significant technology innovations, methodologies and reference architectures that turn the promise of broader data access and Big Data insights into a reality. But Big Data solutions must mature and go beyond the role of being primarily developer tools for highly skilled programmers. The enterprise data lake will allow organizations to track, manage and leverage data they've never had access to in the past. New data management strategies are already leading to more predictive and prescriptive analytics that are driving improved customer-service experiences, cost savings and an overall competitive advantage when there is the right alignment with key business initiatives.

So whether your enterprise data warehouse is on life support or moving into maintenance mode, it will most likely continue to do what it's good at for the time being: operational and historical reporting and analysis (a.k.a. the rear-view mirror).

As you consider adopting an enterprise data lake strategy to manage more dynamic, poly-structured data, your data integration strategy must also evolve to handle the new requirements. Thinking that you can simply hire more developers to write code or rely on your legacy rows-and-columns-centric tools is a recipe for sinking in a data swamp instead of swimming in a data lake. In this slideshow, Craig Stewart, VP of product management at SnapLogic, identifies eight enterprise data management requirements that must be addressed in order to get maximum value from your Big Data technology investments.