Extended Enterprise Ontology

In a recent post I mentioned comments by Sir Tim Berners-Lee concerning the overlap between enterprise information models and semantic web ontology supporting the concept of linked data. Sir Tim argued that the overlap is already sufficient to have a transformative effect on mainstream IT. I think he is right, but also that we are not there yet. There are many obstacles to adoption, not the least of which is the inertia of enterprise IT. Disruptive approaches to software development typically require ten years or so to cross the chasm from visionary and early adopters to the mainstream. We are only a few years into this and the technology is not ready.

First, let’s establish that there is plenty of semantics available for reuse now. There are existing models, some of which are well-designed, mature, and widely used. Unfortunately, most of what exists has little apparent relevance to enterprises. There is little on this diagram that would draw the attention of an enterprise architect, for example.

Now, if there were an ontology for customer relationship management (CRM) or eXtensible Business Reporting Language (XBRL) in this diagram, it would get more attention. I suggest that semantic web technology will be approaching the enterprise when one of these shows up.

Still, some mature and widely-used ontologies may offer enterprise benefits. For example, the “friend-of-a-friend” ontology provides the following concepts:

Every enterprise has some of FOAF’s basic elements, like surnames, organizations, and projects. By linking their enterprise models to FOAF ontology, directly or indirectly, people across the enterprise and its value chain gain visibility to and flexibility in using information that would otherwise be expensive to provide and use. There is a lot to this that I will not go into here, including its relationship to business intelligence and enterprise performance management (but see entity analytics at IBM, for example). Suffice it to say that there is meat on this bone.
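To make the linking idea concrete, here is a minimal sketch in plain Python, with triples held as tuples rather than in a real RDF store. The FOAF, RDF, and RDFS URIs are the published ones; the `http://example.com/crm#` namespace and its `Contact` and `Account` classes are hypothetical stand-ins for an enterprise model, and the inference shown covers only direct subclasses.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
FOAF = "http://xmlns.com/foaf/0.1/"
CRM = "http://example.com/crm#"  # hypothetical enterprise namespace

triples = {
    # Linking statements: enterprise classes declared as kinds of FOAF classes.
    (CRM + "Contact", SUBCLASS_OF, FOAF + "Person"),
    (CRM + "Account", SUBCLASS_OF, FOAF + "Organization"),
    # Enterprise data, typed only with enterprise classes.
    (CRM + "contact/42", RDF_TYPE, CRM + "Contact"),
    (CRM + "account/7", RDF_TYPE, CRM + "Account"),
}


def instances_of(cls, triples):
    """Subjects typed as `cls` or as a direct subclass of `cls` (one inference step only)."""
    classes = {cls} | {s for s, p, o in triples if p == SUBCLASS_OF and o == cls}
    return {s for s, p, o in triples if p == RDF_TYPE and o in classes}


# A consumer that only knows FOAF can still find the CRM contact.
people = instances_of(FOAF + "Person", triples)
```

The point of the sketch is that the two linking statements are the only cost; everything typed with the enterprise classes becomes visible to anyone who understands FOAF.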

As ontologies go, FOAF is small, at least in terms of its model, which consists of classes and properties. Generally speaking, the more mature and widely-used ontologies today are small models. Unfortunately, they are also isolated from one another. They have been developed by small communities that are substantially independent of one another. This reality is a symptom of a hurdle that needs to be crossed before semantics will cross the chasm to the mainstream.

Without collaboration, ontologies will remain small and independent. There will be isolated examples of large ontologies, such as OpenCyc, and interesting combinations of multiple ontologies, such as UMBEL, but they will have no uptake until they are managed by a large community that includes enterprise members.

Let’s look at the problem of enterprise modeling head on. Consider, for example, defining the model underlying Oracle CRM and Salesforce.com. Or consider defining an application that produces the XML format now required of public companies by the Securities and Exchange Commission: XBRL. These ontologies would be huge compared to any that have been widely adopted on the semantic web. And yet, if such an ontology existed for SEC filings, it would have tremendous corporate relevance, especially to CFOs and in financial analysis at commercial banks and by the capital markets. As for CRM, you don’t have to go far beyond FOAF to get the benefits of linked data and entity analytics. So much more functionality would emerge than any vendor can provide or maintain, such as in mash-ups using open technology and linked data, if CRM were ontological. (Mash-ups demonstrate that creativity is very powerful when unleashed.)

So, unless the vendors agree on an ontology for CRM or unless XBRL shifts from data modeling (e.g., XML schema definitions) to semantics, users are either doomed to redundant investments or trans-enterprise collaborative development of ontology is needed. (I would not hold my breath on the former!) I am ignoring a third alternative for the moment, which is that a technology startup or a professional organization gains significant traction or market leadership in providing such an ontology as a product or service offering. On the other hand, given the size of commercial ontologies such as these, even such a disruptive innovator will need to overcome the collaboration bottleneck with respect to ontology, which is my focus here.

The most promising approach for models that are common across enterprises may be to develop them using a wiki, as in Wikipedia. Certainly, some collaborative platform that exists “outside the firewall” is necessary. But we need this platform to avoid the problems of spam and nuisance editors that plague Wikipedia and other completely open platforms. We want to embrace the web’s lack of authority but we don’t want our platform to be completely egalitarian. We want the community to direct itself to a collective objective, which is ontology worthy of broad enterprise adoption.

Ontologies as standards

If we are successful, completely open standards – the ontologies themselves – will be defined without committees. There are challenges to be sure, but it seems possible that such ontological standards are within reach, driven by those who will derive the most benefit: enterprise consumers. There are things we need from ontology in order for it to be an enterprise-worthy standard, however. For one thing, we need version control. An enterprise cannot use an ontology that changes at the whim of a Wikipedia editor.

Take Semantic MediaWiki, for example. If we use SMW as a platform for collaborative ontology and we adopt that ontology in resources that our CRM or financial systems make available as linked data, what happens when someone deletes or renames a concept that we use?

The solution is not as simple as making a copy of the ontology at a moment in time. Don’t forget that people from around the world are editing at every hour. Asking the world to stop for a snapshot won’t work. Nor will asking the world to unanimously agree that every concept in an ontology is right and final, at least for this version. No, more sophisticated change management is needed.

All this should sound familiar to enterprise architects. We hold our models increasingly tightly as we approach a release. And then they are fixed in stone until the next release. We avoid the challenges of collaboration above. We share at first and gradually take control. It works.

But this approach won’t work across enterprises and is doomed to redundant investment within the enterprise. It may be tough to sell across enterprises but there is definite value and ROI if the collaboration can overcome the issues of change management.

Unfortunately, ontology standards and platforms simply don’t address this problem yet. Chalk this up as another bridge for semantic technology to cross in order to cross the chasm. Some ideas for crossing that bridge follow.

Social semantics

We need the collaborative platform to allow convergence towards a stable ontology. At the very least, we need to converge on some parts of the ontology. If the community never “finally” agrees on even a single concept or property, a standard will certainly not be born. Fortunately, social media and change management may combine to address this problem.

Consider, for example, that social media ubiquitously support voting and rating functionality, whether five stars, thumbs up or down, “digg it”, or favorite. We can use this kind of functionality within the wiki to gauge the quality of content. We can also take collective opinions on the need for elaboration. There are other possibilities and details concerning voting and rating. Among them are important details about aggregate opinions versus individual opinions.

Wiki communities

In the context of a wiki supporting enterprises, there might be a taxonomy of organizations, groups, and projects, each of which defines a community for which voting or rating might be tracked. A community is a collection of individuals where those collections are organized taxonomically. Most wikis have only a single community that anyone can join, including anonymously. Within this community there might be individuals with different roles who can perform certain operations that typical users may not. A more general notion might define communities as requiring permission for an individual or community to join. The privilege to grant membership might be given to those who play an administrative role for the community. And the level of privilege or permission granted to an individual could use the voting and rating functionality described above.
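As a rough illustration of tracking ratings per community, here is a small sketch. The community names, the `parent` map, and the rule that a community's aggregate includes ratings from its sub-communities are all assumptions made for the example, not features of any particular wiki.

```python
from statistics import mean

# Hypothetical taxonomy: each community points at its parent (None at the top).
parent = {
    "acme": None,
    "acme/finance": "acme",
    "acme/finance/xbrl-project": "acme/finance",
    "globex": None,
}

# Ratings of one concept, recorded as (rater's community, stars).
ratings = [
    ("acme/finance", 4),
    ("acme/finance/xbrl-project", 5),
    ("globex", 2),
]


def within(community, ancestor):
    """True if `community` is `ancestor` or sits anywhere beneath it."""
    while community is not None:
        if community == ancestor:
            return True
        community = parent[community]
    return False


def community_rating(ratings, community):
    """Average stars from raters in `community` or its sub-communities, or None."""
    stars = [s for c, s in ratings if within(c, community)]
    return mean(stars) if stars else None


acme_view = community_rating(ratings, "acme")
globex_view = community_rating(ratings, "globex")
```

Here the two enterprises hold quite different aggregate opinions of the same concept, which is exactly the distinction a single global score would hide.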

Wiki democracy

As mentioned above, authority is not a natural phenomenon in open societies, such as the web. Respect may be earned, but authority must be given, not taken. The obvious technique for giving authority is by voting (possibly including a veto mechanism). For example, a community may adopt the policy of extending the role of administrator along with its privileges to the top quartile of rated members. Or, the community might follow a normal nomination and election process. Or the community might elect a president or board of directors to whom it delegates the assignment of roles to community members.
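The top-quartile policy is simple enough to sketch. The following assumes ratings are plain numbers collected per member and that a member's standing is the average of the ratings received; both the averaging and the tie-breaking by name are my assumptions.

```python
from math import ceil


def top_quartile_admins(ratings):
    """Return members whose average rating places them in the top quartile.

    `ratings` maps a member name to the list of ratings that member received.
    Ties are broken by name so the result is deterministic.
    """
    averages = {
        member: sum(scores) / len(scores)
        for member, scores in ratings.items()
        if scores  # members with no ratings are not eligible
    }
    ranked = sorted(averages, key=lambda m: (-averages[m], m))
    quota = ceil(len(ranked) / 4)  # top 25%, rounded up
    return ranked[:quota]


ratings = {
    "alice": [5, 4, 5],
    "bob": [3, 3],
    "carol": [4, 4, 4],
    "dave": [2, 5],
    "erin": [],
}
admins = top_quartile_admins(ratings)
```

With four rated members the quota is one, so only the highest-rated member is extended the administrator role; a community could of course choose a different cut-off or require a minimum number of ratings.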

Wiki citizens

Unfortunately, there are people who will be disruptive or cheat in any system. Unchecked, they will post commercials or vote more than once for or against matters for various reasons. So, it is important to know who is eligible to vote and ensure that there is at most one vote per citizen and that enough citizens vote on specific matters of governance.
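These three checks (eligibility, one vote per citizen, and sufficient turnout) can be sketched as follows. The `Ballot` class, the 50% default quorum, and the simple-majority rule are illustrative assumptions; a real community would configure its own.

```python
from dataclasses import dataclass, field


@dataclass
class Ballot:
    """A single matter put to a community vote (hypothetical structure)."""

    eligible: set                # authenticated citizens allowed to vote
    quorum: float = 0.5          # assumed minimum turnout, as a fraction of eligible voters
    votes: dict = field(default_factory=dict)  # citizen -> True (for) / False (against)

    def cast(self, citizen, in_favor):
        if citizen not in self.eligible:
            raise PermissionError(f"{citizen} is not eligible to vote")
        if citizen in self.votes:
            raise ValueError(f"{citizen} has already voted")
        self.votes[citizen] = in_favor

    def result(self):
        """Return 'passed', 'failed', or 'no quorum'."""
        if not self.eligible or len(self.votes) / len(self.eligible) < self.quorum:
            return "no quorum"
        in_favor = sum(self.votes.values())
        return "passed" if in_favor > len(self.votes) - in_favor else "failed"


ballot = Ballot(eligible={"ann", "bo", "cy", "di"})
ballot.cast("ann", True)
early = ballot.result()   # only one of four citizens has voted so far
ballot.cast("bo", True)
final = ballot.result()   # turnout now meets the quorum
```

Everything here depends on knowing who the citizens are, which is why authentication matters more for this kind of wiki than for a fully open one.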

It’s nice to have the template of Democracy so well established in the world. Let’s use it.

Freeze-free releases

Now, with wiki democracy and communities, we have the basis for controlling release cycles. Communities can configure their democracy, especially with regard to membership and the roles assigned to members. More specifically, they decide the governance process by which versions are gradually stabilized. They do this by assigning privileges or permissions to individuals, roles, or sub-communities. These privileges or permissions allow them to “freeze” parts of the ontology. But we cannot really allow the wiki (or its ontology) to be frozen. Rather, we mark particular versions of parts of the ontology as being frozen with respect to a target release for our community. Then, when we have frozen all parts of the ontology that our community needs in its ontology, we extract those versions into a release. Other communities are unaffected by our release cycle.

The wiki might even show a community view of its content. That is, it might show the versions of the ontology (and related wiki content) that are consistent with a community view. Two different individuals might see two distinct views for the same pages. Each might see the version of content that was frozen for their next release or the current content for parts that are not frozen for their community.
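Here is a minimal sketch of freezing without locking, together with the community view just described. The `OntologyWiki` class and its method names are hypothetical; the essential assumption is that every edit creates a new version and that a freeze is only a per-community pointer to one of those versions.

```python
class OntologyWiki:
    """Toy version store: every edit appends a new version; nothing is ever locked."""

    def __init__(self):
        self.history = {}  # concept -> list of definitions, oldest first
        self.frozen = {}   # community -> {concept: index of the frozen version}

    def edit(self, concept, definition):
        self.history.setdefault(concept, []).append(definition)

    def freeze(self, community, concept):
        """Pin the current version of `concept` for `community`'s next release."""
        latest = len(self.history[concept]) - 1
        self.frozen.setdefault(community, {})[concept] = latest

    def view(self, community, concept):
        """Frozen version if the community pinned one, otherwise the latest edit."""
        versions = self.history[concept]
        index = self.frozen.get(community, {}).get(concept, len(versions) - 1)
        return versions[index]

    def release(self, community, needed):
        """Extract a release once every needed concept is frozen for the community."""
        pinned = self.frozen.get(community, {})
        missing = [c for c in needed if c not in pinned]
        if missing:
            raise ValueError(f"not yet frozen: {missing}")
        return {c: self.history[c][pinned[c]] for c in needed}


wiki = OntologyWiki()
wiki.edit("Customer", "A party that buys goods")
wiki.freeze("acme", "Customer")
wiki.edit("Customer", "A party that buys goods or services")  # editing continues

acme_sees = wiki.view("acme", "Customer")      # the version frozen for acme
globex_sees = wiki.view("globex", "Customer")  # the current version
acme_release = wiki.release("acme", ["Customer"])
```

Note that the second edit is accepted even though acme has frozen the concept; the freeze affects only what acme sees and releases.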

The careful IT reader might at this point be wondering about forks and other change management functionality. I do not want to try to address such issues here, but I would welcome your comments and suggestions, especially if you think merges are important. I would argue that forks are to be avoided. They might be frozen for the practical reasons described here, but we need individual communities to converge with the entire community over time.

Bureaucracy

As the enterprise-worthy ontology underlying the collaborative platform grows and matures, the electorate will inevitably demand more stability in what will become the core or upper ontology common to most enterprises. Perhaps an amendment to the constitution will be offered and adopted by super-majority. And a new community will form at the top of the taxonomy with permission to govern aspects and versions of parts of the ontology delegated to it by the electorate. When this happens, the chasm will have been crossed. Or, perhaps the wiki would start with such a board of directors acting as a senate or superior court, if you will.

Permissions

The bureaucracy will want to effectively freeze certain parts of the ontology for all communities other than itself. Even while frozen however, the upper ontology might still receive votes, ratings, and annotations (which is an important and substantial topic in itself). The mechanism for enforcing this is familiar to almost anyone who would use such a wiki: permissions. The typical approach is to map types of users to things they can or cannot do with, to, or on a particular thing (and to support inheritance between things as appropriate). For example, Revelytix supports some of what I have described for communities as well as role-based permissions in Knoodl, its semantic wiki.
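The usual shape of such a permission scheme can be sketched in a few lines. This is a generic illustration, not a description of how Knoodl or any other product works; the `Resource` class, the role names, and the default-deny rule are assumptions.

```python
class Resource:
    """A part of the ontology; permissions fall back to the parent when unset."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.grants = {}  # (role, action) -> True (allow) / False (deny)

    def grant(self, role, action, allowed=True):
        self.grants[(role, action)] = allowed

    def allows(self, role, action):
        """Walk up the inheritance chain until some resource states a rule."""
        node = self
        while node is not None:
            if (role, action) in node.grants:
                return node.grants[(role, action)]
            node = node.parent
        return False  # default deny when nothing is stated


upper = Resource("upper ontology")
party = Resource("Party", parent=upper)

# Frozen for ordinary members, yet still open to votes and ratings.
upper.grant("member", "edit", allowed=False)
upper.grant("member", "vote")
upper.grant("bureaucrat", "edit")

member_can_edit = party.allows("member", "edit")
member_can_vote = party.allows("member", "vote")
bureaucrat_can_edit = party.allows("bureaucrat", "edit")
```

Because `Party` states no rules of its own, it inherits everything from the upper ontology; a rule set directly on a child would take precedence over the inherited one.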

Convergence

As the modifications to the content concerning a part of the ontology diminish, it is either becoming less relevant or more stable. Voting or rating can determine which. Combined with voting and rating information, version control helps identify parts of the ontology that are (close to being) ready for inclusion in a “frozen” version.

On the other hand, volatility in votes, ratings, and versions indicates potential controversy. Wiki democracy can govern controversy. For example, the electorate can dictate or delegate the definition of policies concerning limits and resolution of controversy.
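One crude way to rank parts of the ontology by volatility is sketched below. The scoring formula (spread of recent ratings plus a weighted count of recent edits) and the `edit_weight` parameter are my own assumptions for illustration, not an established metric.

```python
from statistics import pstdev


def volatility(ratings, edits, edit_weight=0.5):
    """Assumed score: spread of recent ratings plus a weighted count of recent edits."""
    spread = pstdev(ratings) if len(ratings) > 1 else 0.0
    return spread + edit_weight * edits


def rank_by_volatility(parts):
    """`parts` maps a concept name to (recent ratings, recent edit count)."""
    return sorted(parts, key=lambda name: volatility(*parts[name]), reverse=True)


parts = {
    "Customer": ([5, 5, 4, 5], 0),  # consistent ratings, no recent edits
    "Lead": ([1, 5, 2, 5], 6),      # split opinions and frequent edits
    "Account": ([4, 4, 4], 1),      # agreed ratings, one recent edit
}
ranking = rank_by_volatility(parts)
```

Concepts at the top of the ranking are candidates for the controversy policies mentioned above, while those at the bottom are candidates for freezing, provided their ratings are also favorable.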

More practically, if membership in the wiki is limited to people with a vested interest in the emergent and continuous use of the resulting ontology, such as by authentication and other policies, such as nomination and formal acceptance, many of the problems experienced by Wikipedia can be effectively eliminated.

Both of these approaches are becoming increasingly common in popular wikis.

Authority

Social media demonstrate a variety of techniques for gauging authority, whether globally or per community, generally using voting and rating, but in some cases more objectively. In addition, communities, particularly where they correspond to organizations, may explicitly delegate authority concerning their operations without “resorting” to democracy. The wiki should support both kinds of authority, but it can leverage the former (within or across communities) in gauging the quality of content or to resolve controversy.

With appropriate views of the ontology, people can see what parts of the ontology have many positive or negative votes or ratings filtered by community and level of authority and ranked by volatility. Additional criteria that may help assess the stability and adequacy of ontological knowledge (e.g., concepts, whether categories, collections, or individuals, and properties) are certainly possible and we solicit your ideas and suggestions.

Private Content

Our interest is in deriving enterprise benefits. We believe that semantics offers significant benefits within enterprises. These are most widely recognized in the areas of data governance and integration but will become increasingly relevant in business process management, enterprise knowledge management, and policy automation. And we are looking forward to leveraging increased understanding of enterprise objectives, in business intelligence and enterprise performance management (e.g., leveraging XBRL for CFOs).

Enterprises who define their business processes, policies, and objectives semantically want their business process models (BPMN), their semantics of business vocabulary and rules (SBVR) and business motivation model (BMM) to be closely related to the ontology wiki. In effect, they want a wiki within their firewall that is seamlessly integrated with the extended enterprise ontology wiki. They want control over their contributions to the community.

In the extreme, enterprises want to host their own wiki as inheriting from the version of the external wiki that their community is working with and towards. As they edit their content they want careful control over what will be published to the broader community and what will be retained as proprietary. If there is a firewall between their wiki and the global wiki, they want to ensure that none of their proprietary content penetrates the firewall.

This is an extended effect of permissions that interacts with an inheritance scheme between a taxonomy of two wikis. Although this inheritance can be generalized to support a taxonomy of wikis rooted globally but extended internally, the typical use case is inheritance from one external wiki to one internal wiki. This permissions and inheritance mechanism should support overrides and exclusions such that the enterprise can prefer its own definitions which improve or conflict with the global wiki without disclosing them to the broader community.
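The typical case, one internal wiki inheriting from one external wiki, might look like the following sketch. The `InternalWiki` class, its method names, and the choice to treat local definitions as private unless stated otherwise are assumptions for illustration.

```python
class InternalWiki:
    """Enterprise wiki layered over a snapshot of the external wiki's ontology."""

    def __init__(self, external):
        self.external = external  # concept -> definition, inherited as-is
        self.local = {}           # local definitions, including overrides
        self.private = set()      # local concepts that must never be published
        self.excluded = set()     # inherited concepts hidden inside the enterprise

    def define(self, concept, definition, private=True):
        """Add or override a definition; private unless explicitly shared."""
        self.local[concept] = definition
        if private:
            self.private.add(concept)
        else:
            self.private.discard(concept)

    def exclude(self, concept):
        self.excluded.add(concept)

    def lookup(self, concept):
        """Local definitions win; exclusions hide inherited ones."""
        if concept in self.local:
            return self.local[concept]
        if concept in self.excluded:
            raise KeyError(concept)
        return self.external[concept]

    def publishable(self):
        """Only what may cross the firewall to the broader community."""
        return {c: d for c, d in self.local.items() if c not in self.private}


external = {
    "Customer": "A party that purchases goods or services",
    "Prospect": "A potential customer",
}
acme = InternalWiki(external)
acme.define("Customer", "A party with an active contract")      # private override
acme.define("Channel Partner", "A reseller", private=False)     # shared contribution
acme.exclude("Prospect")

internal_customer = acme.lookup("Customer")
outbound = acme.publishable()
```

Defaulting to private is the conservative choice: proprietary content stays inside the firewall unless someone deliberately marks it as a contribution.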

Collaborative Protégé

Although it does not provide wiki functionality, Collaborative Protégé addresses many of the issues discussed above. For example, the following shows Protégé’s support for an extensible set of annotations.

Protégé uses annotations to encode version information, which is cumbersome but stays within the framework of using an RDF representation. The following shows that even fine-grained constructs, such as annotations, are maintained with author and date/time information:

Note that the changes tab would show the history for a specific object in the ontology. Protégé also supports proposals and voting or rating within a discussion forum metaphor:

Workflow support is planned for future versions of Collaborative Protégé.

Overall, Protégé is a good tool for the ontological aspects but insufficient when compared to the blog, wiki, content management, and discussion forum functionality common across the social, read-write web. Unfortunately, as with other platforms discussed below, it does not yet go far enough with regard to communities, permissions, and inheritance (or other mappings) between federated ontologies.

Semantic MediaWiki

Semantic MediaWiki (SMW) has the virtue of facilitating evolutionary consensus on what concepts and relations between them mean by allowing authors to converge on a few paragraphs of text. SMW pages may correspond to ontological objects, such as concepts or properties and links between SMW pages may correspond to ontological properties.

SMW is an extension of MediaWiki, the platform used for Wikipedia. MediaWiki has adequate version control functionality but ignores conflicting edits; the last writer wins. This may be fine in practice, but other wikis provide automatic locking with timeouts to avoid conflicting edits.
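For readers unfamiliar with the locking alternative, here is a minimal sketch of page locks that expire automatically. It does not describe any specific wiki's implementation; the `EditLocks` class, the ten-minute default, and the injectable clock (used so the behavior can be tested without waiting) are assumptions.

```python
import time


class EditLocks:
    """Per-page edit locks that expire so an abandoned edit cannot block others."""

    def __init__(self, timeout_seconds=600, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.locks = {}  # page -> (editor, expiry time)

    def acquire(self, page, editor):
        """Grant or renew the lock unless another editor holds an unexpired one."""
        now = self.clock()
        holder = self.locks.get(page)
        if holder and holder[0] != editor and holder[1] > now:
            return False
        self.locks[page] = (editor, now + self.timeout)
        return True

    def release(self, page, editor):
        """Give up the lock early, for example when the edit is saved."""
        if page in self.locks and self.locks[page][0] == editor:
            del self.locks[page]


fake_now = [0.0]  # stand-in clock for the demonstration
locks = EditLocks(timeout_seconds=10, clock=lambda: fake_now[0])

alice_ok = locks.acquire("Customer", "alice")
bob_blocked = locks.acquire("Customer", "bob")
fake_now[0] = 11.0  # alice's lock has timed out
bob_ok = locks.acquire("Customer", "bob")
```

The timeout is the important part: without it, an editor who opens a page and walks away would block everyone else indefinitely.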

Conflicting edits can be particularly important when modifying ontological structure, especially where inheritance is involved, as in SMW. Furthermore, SMW content is more technical to edit than MediaWiki content, which is itself technical. Although SMW provides an adequate backbone for merging ontology with wiki content, its unstructured approach to editing and collaboration may prove inadequate for convergent ontological development.

MediaWiki supports limited group functionality (i.e., Bureaucrats, Administrators and Everyone) with correspondingly little support for permissions. Consequently, SMW falls short with regard to communities and convergence.

Content Management Systems

CMS, such as Drupal, can provide wiki-like functionality in addition to blog and other functionality which may prove attractive and useful to a community. Drupal also provides discussion forum functionality. Although MediaWiki provides discussion pages, the thread functionality provided by Drupal and other systems (including Protégé) is much more suitable for collaborative development. Drupal also allows content to be edited in a more structured manner and to be more flexibly and dynamically composed.

Blogs

Wiki articles typically correspond to specific topics, as in an encyclopedia. WordPress and TypePad articles are more free-form, while content management systems are best suited for on-line publications, including magazines and newspapers.

In a blog, the discussion is typically more in-line with the content than a wiki’s discussion or “talk” pages. Many blogs support threaded commentary as in a more full-fledged discussion forum platform. Blog-like commentary has been adopted in content management but is generally more limited in its presentation, such as when an article is viewed individually at

The wiki metaphor aligns better with the conceptual structure of an ontology. This is demonstrated, for example, in the Semantic MediaWiki extension of MediaWiki (SMW). The primary principle underlying SMW is that the ontology is documented by a page per concept or property. Conceptually, the categories of MediaWiki become classes in an SMW ontology and the classes of the ontology correspond to MediaWiki categories. Instances within the ontology also have their own pages.

Social Networks

Social networks, such as Facebook and Twine, facilitate interpersonal communication, groups, interest tracking, and content or contact rankings and recommendations. Clearly, such features should be considered extensions of a collaborative ontology platform. Among them, ranking is the most important, as discussed above.

Bookmarking

Services that originally provided bookmarking, such as Digg, have evolved into social networks. Nonetheless, their initial functionality was to provide tagging of bookmarks that would otherwise have remained unorganized within web browsers. The tags assigned to bookmarks are typically less formal than ontological classification. Some such services surreptitiously organize user tags and content ontologically, however. Some argue that such ontological structure enables better recommendation, search, and browsing functionality.

In a platform for collaborative ontology development, it makes sense to support explicit tagging using ontological concepts. Adding a tag would correspond to choosing (or defining) a concept in the ontology. If in a wiki presentation, the pages for concepts could have a region or tab for bookmarks that use the category. Moreover, semantic pivoting and drill-down could be supported, as is common in semantic web applications. As with other content, bookmarks could be openly shared or limited to communities or individuals.
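Tagging with concepts rather than free-form strings, and the drill-down it enables, might be sketched like this. The `ConceptTags` class, the single-parent concept hierarchy, and the `example.org` URLs are assumptions for illustration.

```python
class ConceptTags:
    """Bookmarks tagged with ontology concepts instead of free-form strings."""

    def __init__(self):
        self.broader = {}    # concept -> broader concept (None at the top)
        self.bookmarks = {}  # concept -> set of bookmarked URLs

    def define(self, concept, broader=None):
        """Defining a tag means defining a concept in the ontology."""
        self.broader[concept] = broader

    def tag(self, url, concept):
        if concept not in self.broader:
            raise KeyError(f"unknown concept: {concept}")
        self.bookmarks.setdefault(concept, set()).add(url)

    def is_a(self, concept, ancestor):
        while concept is not None:
            if concept == ancestor:
                return True
            concept = self.broader.get(concept)
        return False

    def drill_down(self, concept):
        """Bookmarks tagged with `concept` or any narrower concept, sorted."""
        return sorted(
            url
            for tagged, urls in self.bookmarks.items()
            if self.is_a(tagged, concept)
            for url in urls
        )


tags = ConceptTags()
tags.define("Organization")
tags.define("Supplier", broader="Organization")
tags.tag("https://example.org/acme", "Supplier")
tags.tag("https://example.org/w3c", "Organization")

all_orgs = tags.drill_down("Organization")
suppliers = tags.drill_down("Supplier")
```

A bookmark tagged `Supplier` shows up under `Organization` without anyone having tagged it twice, which is what informal tags cannot offer.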

A Foundation for Semantic Collaboration

We are investigating platforms and formulating requirements for such a collaborative platform for developing and maintaining extended enterprise ontologies. We will be writing further on this area in the future but welcome your thoughts and interests in the meantime.

One Comment

This entry is fascinating and got me thinking about the conversations I’ve had with others about ontology. One of the biggest questions in my mind is “how to capture knowledge”. Should a particular piece of knowledge be modeled by entities? Should it be captured as a rule? Should it be captured as a mixture of entities and rules? Deciding how a piece of knowledge is modeled often affects the usage. If I model it wrong, the system could become rigid and brittle. If I model it too abstractly, others might not understand it.

For me RDF style semantic web still looks like simple tagging. It isn’t as mature or robust as traditional AI knowledge base approach. I know that Sir TB Lee has stated numerous times “semantic web is not AI”. To me RDF/OWL centric semantic web won’t be able to make a breakthrough. A lot of the existing and failed semantic web projects using RDF fail to learn from history and prior art. It’s really unfortunate, but I guess it’s not fashionable to spend time learning from history. It’s much more glamorous for people to “reinvent” stuff and repeat the same mistakes.