Ideally, a data management solution will provide a means to monitor data itself – the status of data as reflected in its metadata – since this is how data is instrumented for management in the first place. Metadata can provide insights into data ownership at the application, user, server, and business process level. It also provides information about data access and update frequency and physical location.

A real data management solution will offer a robust mechanism for consolidating and indexing this file metadata into a unified or global namespace construct. This provides uniform access to file listings to all authorized users (machine and human) and a location where policies for managing data over time can be readily applied.
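As a rough illustration of the idea, a global namespace can be thought of as a single index that maps a logical file path to its metadata, including where the file physically lives. The sketch below is a minimal in-memory version; the `Entry` and `Namespace` names, the fields, and the sample paths are all hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical metadata record for one file in the namespace."""
    logical_path: str        # the path users and applications see
    physical_location: str   # where the file is actually stored
    owner: str


class Namespace:
    """Toy global namespace: one index over files on many storage systems."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Index (or re-index) a file under its logical path.
        self._entries[entry.logical_path] = entry

    def locate(self, logical_path):
        # Resolve a logical path to its current physical location.
        return self._entries[logical_path].physical_location

    def owned_by(self, owner):
        # List every logical path belonging to one owner, wherever it is stored.
        return sorted(e.logical_path for e in self._entries.values()
                      if e.owner == owner)


ns = Namespace()
ns.register(Entry("/finance/q1.xlsx", "nas01:/vol3/q1.xlsx", "finance"))
ns.register(Entry("/finance/q2.xlsx", "archive-bucket/q2.xlsx", "finance"))
ns.register(Entry("/hr/handbook.pdf", "nas02:/vol1/handbook.pdf", "hr"))
```

Because every lookup goes through the index, a file can move between storage targets without its logical path changing, which is what makes the namespace a convenient place to attach policies.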

That suggests a second function of a comprehensive or real data management solution. It must provide a mechanism for creating management policies and for assigning those policies to specific data to manage it through its useful life.

A data management policy may offer simple directions. For example, it may specify that when accesses to the data fall to zero for thirty days, the data should be migrated off expensive high-performance storage to a less expensive, lower-performance storage target. However, data management policies can also define more complex interrelationships between data, or they may define specific and granular service changes that are to be applied at different times in the data lifecycle. Initially, for example, data may require continuous data protection in the form of a snapshot every few seconds or minutes in order to capture rapidly accruing changes to the data. Over time, however, as update frequency slows, the protective services assigned to the data may also need to change, for example from continuous data protection snapshots to nightly backups. Such granular service changes may also be defined in a policy.
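The two kinds of policy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FileMetadata` fields, the tier names `fast` and `capacity`, the service labels, and the threshold of 100 updates per day are all made up for the example. Only the thirty-day idle rule comes from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileMetadata:
    """Hypothetical per-file metadata a policy might inspect."""
    path: str
    tier: str                # "fast" or "capacity" (assumed tier names)
    last_access: datetime
    updates_per_day: float


def storage_tier(meta, now, idle_days=30):
    """Simple policy: move data off fast storage after idle_days without access."""
    if meta.tier == "fast" and now - meta.last_access >= timedelta(days=idle_days):
        return "capacity"
    return meta.tier


def protection_service(meta, busy_threshold=100.0):
    """Granular policy: pick a protection service from update frequency.

    busy_threshold is an arbitrary illustrative value, not a recommendation.
    """
    if meta.updates_per_day >= busy_threshold:
        return "cdp_snapshots"
    return "nightly_backup"


now = datetime(2024, 3, 1)
active = FileMetadata("/db/orders.dat", "fast", datetime(2024, 2, 28), 500.0)
dormant = FileMetadata("/db/orders_2019.dat", "fast", datetime(2024, 1, 1), 0.2)

for meta in (active, dormant):
    print(meta.path, storage_tier(meta, now), protection_service(meta))
```

Keeping each rule as a small function of the metadata makes it straightforward to re-evaluate the same data later in its lifecycle and get a different answer as access and update patterns change.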

The policy management framework provides a means to define and use the information from a global namespace to meet the changing storage resource requirements and storage service requirements (protection, preservation and privacy are defined as discrete services) of the data itself. The work of provisioning storage resources and services to data, however, anticipates two additional components of a data management solution.

In addition to a policy management framework and global namespace, a true data management solution requires a storage resource management component and a storage services component. The storage resource management component inventories and tracks the status of the storage that may be used to provide hosting for data. This component monitors the responsiveness of the storage resource to access requests as well as its current capacity usage. It also tracks the performance of various paths to the storage component via networks, interconnects, or fabrics.

The storage services management component performs roughly the same work as the storage resource manager, but with respect to storage services for protection, preservation and privacy. This management engine identifies all service providers, whether they run as software on dedicated storage controllers, as part of a software-defined storage stack on a server, or as stand-alone third-party software products. The service manager tracks the load on each provider to ensure that no single provider is overloaded with service requests.
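One simple way to picture the load-tracking role is a selector that sends each new request to the least-loaded provider of the required service type. The sketch below assumes load can be summarized as active requests against a fixed capacity; the `Provider` fields, the provider names, and the `pick_provider` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """Hypothetical view of one storage service provider."""
    name: str
    service: str           # "protection", "preservation", or "privacy"
    active_requests: int
    max_requests: int      # assumed fixed capacity of the provider


def pick_provider(providers, service):
    """Return the least-loaded provider of a service, or None if all are full."""
    candidates = [p for p in providers
                  if p.service == service and p.active_requests < p.max_requests]
    if not candidates:
        return None
    # Compare relative load so that large and small providers are treated fairly.
    return min(candidates, key=lambda p: p.active_requests / p.max_requests)


providers = [
    Provider("array-snapshots", "protection", 8, 10),
    Provider("sds-backup", "protection", 3, 10),
    Provider("encryptor", "privacy", 5, 5),
]

chosen = pick_provider(providers, "protection")
print(chosen.name if chosen else "no capacity")
```

Returning `None` when every provider is at capacity leaves the decision about queuing or deferring the request to the caller, whether that is a human administrator or an automated engine.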

Together with the policy management framework and global namespace, storage resource and storage service managers provide all of the information required by decision-makers to select the appropriate resources and services to provision to the appropriate data at the appropriate time in fulfillment of policy requirements. That is an intelligent data management service – with a human decision-maker providing the “intelligence” to apply the policy and provision resources and services to data.

However, given the amount of data in even a small-to-medium-sized business computing environment, human decision-makers may be overwhelmed by the sheer volume of data management work that is required. For this reason, cognitive computing has found its way into the ideal data management solution.

A cognitive computing engine – whether in the form of an algorithm, a Boolean logic tree, or an artificial intelligence construct – supplements manual methods of data management and makes possible the efficient handling of extremely large and diverse data management workloads. This cognitive engine is the centerpiece of “cognitive data management” and is rapidly becoming the sine qua non of contemporary data management technology and a key differentiator between data management solutions in the market.

Welcome to our blog on cognitive data management at DMI. This is intended to become a forum for the community of data managers who are interested in simplifying, streamlining and automating the data management workload through the application of cognitive computing technology.

"Cognitive" sounds so trendy. What "cognitive" means varies depending on whom you ask.

In some cases, cognitive computing is metaphorical. It refers to a fairly common software engine that simply executes predefined instructions written in any number of scripting or programming languages.

In other cases, cognitive computing refers to the application of algorithms to data in order to discern and respond to recognizable patterns.

In still other cases, cognitive computing refers to machine learning: a set of sophisticated programs that evaluate collected data, compare them to data management policies (criteria, standards, etc.) and determine what actions, if any, to take.

This blog provides a location to learn more about the theory of cognitive data management (CDM) and the capabilities of the current generation of vendor products purporting to provide cognitive data management services. Ultimately, we agree that the volume of data that is amassing in most organizations already exceeds the capability of human administrators to manage; automated tools are needed to support the effort.

Let's learn more about CDM and share our experiences with data management generally using this forum.