Data Ownership and Collaboration

Renewed interest in exploring ideas about data ownership prompts me to revisit the topic of data ownership and collaboration, which we discussed in the March 2003 issue of DM Review. Recent client work on establishing data standards has exposed aspects of ownership that transcend our earlier discussion, mostly inspired by the question of who owns data (and meta data) that is exchanged.

The first question arises as a byproduct of a Web services application. In the client environment, a number of participants in a data exchange network coordinate their exchanges through a centralized Web service. For example, consider a product market in which suppliers register their products and prices with a centralized service, and customers visit the service to search for and potentially purchase products. A standard for exchanging information is established by which requests for information are propagated to the suppliers and by which their responses are forwarded back through the service. The service essentially participates as a value-added pass-through for data.
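To make the pass-through arrangement concrete, here is a minimal sketch in Python. All of the names (Supplier, ExchangeService, the sample catalog) are hypothetical stand-ins for illustration, not the client's actual system: the service enrolls suppliers, fans a customer's query out to each of them, and forwards the aggregated responses.

```python
# Minimal sketch of a pass-through exchange service (hypothetical names).
# Suppliers register products and prices; the service propagates a request
# to every supplier and forwards their responses to the requestor.

class Supplier:
    def __init__(self, name):
        self.name = name
        self.catalog = {}          # product -> price

    def register(self, product, price):
        self.catalog[product] = price

    def respond(self, product):
        """Answer a forwarded request, if we carry the product."""
        if product in self.catalog:
            return {"supplier": self.name,
                    "product": product,
                    "price": self.catalog[product]}
        return None

class ExchangeService:
    def __init__(self):
        self.suppliers = []

    def enroll(self, supplier):
        self.suppliers.append(supplier)

    def query(self, product):
        """Propagate the request to all suppliers; forward the responses."""
        responses = (s.respond(product) for s in self.suppliers)
        return [r for r in responses if r is not None]

service = ExchangeService()
acme = Supplier("Acme")
acme.register("widget", 9.95)
service.enroll(acme)
offers = service.query("widget")   # one matching offer from Acme
```

Note that the service itself never originates any product data; it only collects and synthesizes what the suppliers provide, which is precisely what makes the ownership question below ambiguous.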

The first question is: who "owns" the data provided by the centralized service? On one hand, one might simply say that the suppliers own the data because they supply it. However, there is an equally good argument that the centralized service owns the data because the service collects and synthesizes the product that is eventually displayed to the data consumer. The question again comes down to responsibility for the quality of the data. Because flaws can be introduced in more than one administrative domain, gaps at the boundaries of governance may allow low-quality data to propagate under everyone's radar.

The second question, which is the more interesting one, is: who owns the meta data associated with data exchange? In fact, let's take this one step further - who owns data standards in general? Or, to look at it another way: in an environment where there is an expectation of conformance to a data standard, who is responsible for ensuring the quality of the data being passed?

As a concrete example, the introduction of XML as a framework for defining schemas for data interchange has only accelerated the proliferation of data standards. (This has prompted one of my colleagues to remark, tongue-in-cheek, that it is great to have so many standards to choose from.) However, embedded within XML schemas - and even within traditional fixed-field format layouts - are scores of data quality rules: data domain definitions, format constraints, cross-field constraints. For example, in a schema we recently worked on, the client introduced data elements that were what you might call "conditionally mandatory" - a group of elements is optional, but if the group is present, one of a set of three data elements must be present.

Validation of some of these rules can be performed simply through schema validation. Others, such as our conditionally mandatory fields, must be handled by an application - schema validation won't cut it. Consider the scenario: a requestor sends a request and anticipates a response. The service forwards the request to the suppliers, who respond. The service then forwards the response to the original requestor, who is the ultimate consumer of the data. Who is responsible for validating the data - the requestor, the service or the supplier?
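Whichever party takes on the job, the conditionally mandatory rule itself is straightforward to express in application code. A minimal sketch in Python, using the standard library's XML parser - the element names (contact, phone, email, fax) are hypothetical stand-ins, not the client's actual schema:

```python
# Application-level check for a "conditionally mandatory" rule
# (hypothetical element names): the <contact> group is optional, but if
# it is present, at least one of <phone>, <email> or <fax> must appear.
import xml.etree.ElementTree as ET

REQUIRED_ONE_OF = ("phone", "email", "fax")

def conditionally_mandatory_ok(xml_text):
    """Return True if the record satisfies the conditional rule."""
    root = ET.fromstring(xml_text)
    group = root.find("contact")
    if group is None:
        return True    # the group itself is optional
    # Group is present: one of the listed elements must also be present.
    return any(group.find(tag) is not None for tag in REQUIRED_ONE_OF)

ok = conditionally_mandatory_ok(
    "<record><contact><email>a@b.com</email></contact></record>")
```

A check like this could run at any of the three points in the exchange, which is exactly why the responsibility question matters: each party can plausibly assume one of the others has already done it.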

Rather than assigning blame, it is more relevant to consider whose business process is most impacted by data that does not conform to the standard. In addition, data standards are effectively "common areas" to be tended by contributions from all participants. Unless all participants address governance of the standards that delineate the exchange of data, they put their shared investment at risk.