Adapting Data Governance to Tend the Changing Data Landscape

Emerging technologies are outpacing data governance at a rapid clip. Specifically, the rate of growth and development of emerging technologies in areas such as artificial intelligence (AI), the Internet of Things (IoT), and machine learning (ML) drastically exceeds the current speed and willingness of businesses to change their governance models to manage and protect their data and information assets. Unfortunately, the larger the delta becomes between the advancements in technology and the changes in governance, the greater the risks and losses for the business.

IoT, ML, and AI will undoubtedly play significant roles in nearly all industries, from retail to military missions and everything in between. However, to realize the potential value of these capabilities, businesses must adapt their approaches to data governance. These changes need to start now, not at the buckling point where advanced technologies render traditional governance ineffective and failure abounds.

Why Traditional Data Governance Isn’t Enough

Traditional data governance provides a strategic, rigorous framework designed to establish data standards, outline roles and responsibilities, and create policies and procedures for data management and use throughout the enterprise. In fact, traditional data governance is necessary to maximize productivity and efficiency in the use of core business data assets in transactional and data warehousing environments. With a focus typically on trust, data quality, and overall protection of data, these conventional methods serve well for recognized data sources with known business value. But once unknown or unstructured data sources, or sources with undetermined business value such as big data or IoT, are introduced, traditional data governance models fall short. Add the capabilities of AI and ML, and the shortcomings become even more evident. The rigid nature of conventional data governance policies and procedures limits the possibilities created by advanced data and analytics technologies by expecting them to conform to standards designed for legacy data platforms and infrastructure.

Emerging Technologies Are Changing the Data Landscape

Data Collection and Generation

IoT represents an opportunity for billions of unrelated data sources to connect. IoT devices are not just data sources; they are data gatherers and generators. Wearable devices, sensors, actuators, or essentially anything in which a computing system can be embedded can collect data by the millisecond and stream that data into a cloud of infinite possible consumers. AI and ML technologies mine and analyze this data (often in real time) to determine patterns and relationships, learn from them, and act accordingly. These are autonomous actions based on the data, not on explicit programming or instruction. So, as IoT devices gather and generate data, AI and ML technologies can not only analyze the data presented and make decisions on it but also recognize gaps or additional data needs and send subsequent requests back to the IoT devices to generate or collect new data.

Onboarding IoT devices or ingesting data from these uncertified data sources is extremely difficult in an environment governed by conventional validation and authorization requirements. To foster AI and ML in these early stages of the data lifecycle, data governance should not seek conformity with predefined rules or standards. Instead, governance should enable quick, efficient incorporation of new data and provide mechanisms to mitigate risks, encourage exploration, and maximize value.
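
One way to picture this shift is an ingestion step that accepts data from uncertified sources and tags it for risk instead of rejecting it at the gate. The sketch below is only an illustration under stated assumptions: the trust tiers, the `ingest` function, the `IngestedRecord` type, and the two risk flags are hypothetical names chosen for this example and do not come from any particular governance product or standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical trust tiers; the names are illustrative assumptions.
CERTIFIED = "certified"
EXPLORATORY = "exploratory"


@dataclass
class IngestedRecord:
    source_id: str
    payload: dict
    trust_tier: str
    received_at: datetime
    risk_flags: list = field(default_factory=list)


def ingest(source_id, payload, certified_sources):
    """Accept every record; tag uncertain data rather than rejecting it."""
    flags = []
    if source_id in certified_sources:
        tier = CERTIFIED
    else:
        # Uncertified sources are still onboarded, but marked for exploration.
        tier = EXPLORATORY
        flags.append("uncertified_source")
    # Illustrative risk check: flag (but keep) payloads without a timestamp.
    if "timestamp" not in payload:
        flags.append("missing_timestamp")
    return IngestedRecord(
        source_id, payload, tier, datetime.now(timezone.utc), flags
    )


# Example: a wearable that has never been certified is admitted with flags.
record = ingest("wearable-123", {"heart_rate": 72}, certified_sources={"erp-core"})
print(record.trust_tier, record.risk_flags)
# prints: exploratory ['uncertified_source', 'missing_timestamp']
```

In a design like this, downstream consumers could decide how much weight to give exploratory data, so that risk is managed through labeling and access rules rather than through up-front rejection.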

Retention and Storage

As the volume and variety of data have exponentially increased in the age of big data, so have the needs for data storage. Data storage is often confused with data integration and provisioning. It is important that governance address these appropriately and separately.

Storage specifically refers to how data is physically retained by the business. In traditional data management approaches, the technology on which data is stored determines the storage requirements such as physical model/structure and size limitations. Combined with budget constraints and retention policies (often driven by compliance), these requirements considerably restrict how much data the business can store at any given time.