Issues in Automated E-Commerce and the Semantic Web

Methods for requirements specification and for software engineering are widely studied because of their economic importance. For example, Agile Modeling [1] and Extreme Programming [2] are just two
of many approaches.

As organizations become increasingly interdependent via e-commerce and the web, such methods become even more important. EDI- and XML-based approaches [3], among other emerging standards, seek to
ease the manual interfacing of software systems across organizational boundaries.

There are strong economic reasons why systems should be designed to be self-integrating over the web. In distributed manufacturing, and in other e-commerce activities, there are new
opportunities for automated matching of requirements to capabilities. The emerging second-generation Semantic Web is expected to play a key role in this kind of automation.

This paper looks at these issues as we move from single systems, to manually integrated systems, and towards systems that will be self-integrating. First, we look at some issues in developing single
systems. Then, in section 3, we look at the situation when several systems must be manually integrated. In section 4, we mention some difficulties in applying a manual, standards-based approach to
virtual organizations, and we look at an approach to systems that will self-integrate across organizational boundaries. Finally, we conclude with a suggestion that a new method for the
specification and implementation of software systems is of interest for each of the situations we discussed.

2. ISSUES IN DEVELOPING SINGLE SYSTEMS

The Economist [4] has this to say about developing a software system:

A common cause of disaster in software development is that the end product is precisely what the customer ordered. In a world moving at Internet speed, a
customer’s objectives are constantly being revised, so programmers have to be able to hit a moving target.

To look at this in another way, according to Standish [5], 51 percent of all [software] projects are over budget and/or late, another 15 percent of all
projects fail altogether, and just 34 percent are completely successful.

These numbers may provide some comforting context when an IT department has to justify a cost overrun—“Look, we may not be doing so well, but all these other folks are doing so much
worse”. However, the situation plainly calls out for improvement.

Figure 1 illustrates some of the things that tend to happen in requirements gathering and software development.

Figure 1 Current Requirements Gathering and Software Development Methods

In the picture, business opportunities are lost because the application design cannot anticipate all future needs. Also, a business policy change in imprecise English must be translated into
precise code in a language such as Java. To some extent, programmers must try to understand the business, and business people must try to understand programming.

In my company, Reengineering, we work on a system called Internet Business Logic [6], with the aim of changing the situation pictured above to a more direct scenario like this.

Figure 2 A More Direct Scenario for Requirements Gathering and Software Development

We’ll return to this idea in talking about the Semantic Web in Section 4.

3. ISSUES IN INTEGRATING SYSTEMS MANUALLY

One of the problems in interfacing software systems across organizational boundaries is that organizations may have different names for items that are the same, or almost the same. Here is an
example based on the paper “Semantic Resolution for E-Commerce” [7].

A retailer would like to order computers from manufacturers. In the retailer’s terminology, the kind of computer needed is called a PC for Gamers.

A particular manufacturer makes a computer, and in the manufacturer’s catalog, it is called a Prof Desktop.

At first sight, there is no match between the retailer’s requirement and the item that the manufacturer is offering. We can think of this as a “semantic distance”. If the retailer
uses a search engine, such as Google, to look for PCs for Gamers, the manufacturer of Prof Desktops will not be found. If the retailer asks for a more general search on, say, Computers, the
manufacturer of Prof Desktops will only be found if it is listed under Computers, and there will be a very large number of results that are not relevant to the search for PCs for Gamers.

In this scenario, the only hope for a match comes if the retailer searches for computer manufacturers in general. With luck, the retailer may then talk to the relevant manufacturer’s sales
people, and it may emerge during the conversation that the PC for Gamers and the Prof Desktop match on many features. So, the conversation has negotiated away most of the semantic distance. The
retailer and the manufacturer can then ask their respective IT people to make entries in the relevant databases so that a purchase order for a PC for Gamers is taken to mean that Prof Desktops
should be supplied.
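
As a rough illustration of what such manually maintained database entries amount to, the sketch below uses a hand-written lookup table. The names TERM_MAP and translate_order are hypothetical and are not taken from any system described in this paper; the point is only that every agreed pair of terms must be entered by hand.

```python
# Hypothetical, hand-maintained translation table: retailer term -> manufacturer term.
# In the scenario above, IT staff would add each entry after the sales conversation.
TERM_MAP = {
    "PC for Gamers": "Prof Desktop",
}


def translate_order(retailer_item: str, quantity: int) -> dict:
    """Rewrite one retailer purchase-order line into the manufacturer's terms."""
    if retailer_item not in TERM_MAP:
        # Nobody has negotiated this term yet, so the order cannot be routed.
        raise KeyError(f"no agreed manufacturer term for {retailer_item!r}")
    return {"item": TERM_MAP[retailer_item], "quantity": quantity}


order = translate_order("PC for Gamers", 10)
print(order)
```

Any term that has not been negotiated in advance simply fails to translate, which is why this approach scales poorly beyond a handful of organizations.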

The above kind of scenario may work sometimes in simple cases with two or three organizations, but better ways of matching requirements to capabilities could yield significant economic advantages.

So far, we have looked at single systems, and at two systems that must be integrated, and we have seen that there are economically important issues in how the systems are specified and implemented.
Next, we look at the situation when many systems must be dynamically integrated across organizational boundaries.

4. ISSUES IN SELF-INTEGRATING SYSTEMS

In manufacturing in particular, and in e-commerce in general, there is a growing interest in the notion of a virtual organization (VO), one that could knit together several component companies to
meet a particular requirement. A VO might only exist for a few weeks, till it meets the requirement, then be dissolved. A particular company might be a member of several VOs simultaneously. The
manual method of integration described in the last section clearly falls short for this kind of VO. What is needed is a way for potential component companies to post information about their
capabilities and capacities in a way that a VO matchmaker can search and then integrate. However, the issue of how things are named now becomes a likely show-stopper.

One approach to this issue is to say that we need a standard that spells out how each company should post information on its capabilities. But, as Steve Ray of the US National Institute of
Standards and Technology has pointed out [8], “[There is] an increase in the number of interconnections among information systems supporting the manufacturing supply
chain as well as other businesses. Each of these interconnections must be carefully prescribed to ensure interoperability. However, the sheer number of interconnections and the resulting complexity
threaten to overwhelm the ability of the standards community or industry to provide the necessary specifications—a way out of this impasse must be found.”

In the paper, Steve Ray outlines the elements that must be developed so that systems can usefully self-integrate. One of the elements listed is “A
reasoning or inferencing capability within the communicating systems, to enable the systems to make judgments and draw conclusions about the meanings of terms.”

Over the past few years, there has been substantial research and development effort directed at implementing a successor to the web as we know it today. The successor is called the Semantic Web
(SW), since it aims to add meaning to the existing web. At a basic level, the SW does this by adding a label to each link in the web. So, if the web has a link from PCs for Gamers to Computers, the
SW will add a label with the information “is-a-kind-of” to the link. Then, the SW will also add reasoning and inferencing methods, so that if the SW has links “DroidBlaster500
is-a-kind-of PC-for-Gamers” and “PC-for-Gamers is-a-kind-of Computer”, then it can also reason to conclude that the DroidBlaster500 is-a-kind-of Computer.
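
The inference step just described can be sketched in a few lines. This is a minimal illustration only: it stores the labeled links as plain tuples and treats is-a-kind-of as transitive, and the names LINKS and infer_is_a_kind_of are assumptions made for the example rather than part of any Semantic Web toolkit.

```python
# Labeled links as (subject, label, object) triples, taken from the example above.
LINKS = {
    ("DroidBlaster500", "is-a-kind-of", "PC-for-Gamers"),
    ("PC-for-Gamers", "is-a-kind-of", "Computer"),
}


def infer_is_a_kind_of(links):
    """Return the given links plus every is-a-kind-of link implied by transitivity."""
    inferred = set(links)
    changed = True
    while changed:  # repeat until a full pass adds no new link
        changed = False
        for (a, label1, b) in list(inferred):
            for (c, label2, d) in list(inferred):
                if label1 == label2 == "is-a-kind-of" and b == c:
                    new_link = (a, "is-a-kind-of", d)
                    if new_link not in inferred:
                        inferred.add(new_link)
                        changed = True
    return inferred


closure = infer_is_a_kind_of(LINKS)
print(("DroidBlaster500", "is-a-kind-of", "Computer") in closure)  # True
```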

The actual notations currently used in the SW are rather technical. As indicated at the bottom of Figure 3, one might use the Resource Description Framework (RDF) [9] to write down the labeled
links like the two just mentioned, and one might use the Web Ontology Language (OWL) [10], together with a reasoner, to draw the conclusion. Up to a point, this is fine if the RDF and OWL notations are just used by computers to
communicate and reason with each other.

However, this leads to a picture of the future in which networked computers create, run, and dissolve VOs without any oversight from business people. The issues of requirements specification and
software engineering outlined in Sections 1 and 2 above threaten to overwhelm any effort at business control and auditability. We can picture the situation like this.

Figure 3 Semantic Disconnects between People and Machines

So, a Semantic Web based on notations that only computers and a few technical people can read and understand seems to lack a key component—one that would put business people and regulators in
control.

A candidate for a component to fill this gap is a system that we have been working on, called Internet Business Logic (IBL). One aim of our work on the IBL is to replace the picture in Figure 3,
with the one in Figure 4.

Figure 4 Removing the Semantic Disconnects between People and Machines

The IBL system supports the writing and running of business rules in English, and for many purposes this can replace programming in conventional languages such as Java or OWL. The system can reason
and make inferences, and it can automatically generate and run queries and transactions over networked relational (SQL) databases.

So, in the scenario in Figure 4, business people have direct control, in English, over what kinds of reasoning and inference their networked computers will do. As shown in the Figure, the machines
can supply English explanations of what they are doing, or even more importantly, of what they propose to do.

Of course, within a system like the IBL, there is a complex translation, in both directions, between the English rules that a person specifies and the technical notations that actually run in the
machines. However, the translation is an encapsulated, invisible service that the system provides. So, when a business person wishes to change the inferencing, or to get an explanation of something
that a VO proposes to do, there is no longer any need to call upon human programmers.

Let’s flesh out the example about a retailer and a manufacturer to get an idea of how this can work in practice. Recall that in the retailer’s terminology, a computer is called a PC for
Gamers, while in the manufacturer’s terminology, it is called a Prof Desktop. Let’s say that the retailer and the manufacturer have each included in their requirements and capability
statements that they are interested in something called Worksts/Desktops. The retailer also knows that a PC for Gamers belongs to the class of Worksts/Desktops, although the manufacturer does not
know this. Similarly, the manufacturer knows that Prof Desktop belongs to the class of Worksts/Desktops, although the retailer does not know this.

So, we are dealing with three kinds of naming conventions, sometimes known as namespaces: one for the retailer, one for the manufacturer, and one that is shared in common between them.

In the IBL system, we can write down the above information as two tables, each with an English heading, like this:

for the retailer the term PC for Gamers has super-class this-class in the this-ns namespace
===========================================================================================
Computers to order    retailer
Worksts/Desktops      shared

for the manufacturer the term Prof Desktop has super-class this-class in the this-ns namespace
==============================================================================================
Worksts/Desktops      shared
Computer Systems      manufacturer

Then, we can tell the IBL system how to reason so as to make a bridge between the retailer’s and manufacturer’s internal ways of naming things. We do this by writing a rule like this:

for the retailer the term some-item1 has super-class some-class in the some-ns namespace
for the manufacturer the term some-item2 has super-class that-class in the that-ns namespace
--------------------------------------------------------------------------------------------
the retailer term that-item1 and the manufacturer term that-item2 agree – they are of type that-class

The first two lines are premises, and the rule tells the system that if both premises can match up with things in the tables, then the system should reason to conclude that the last line also
holds.

When we run the system, it can conclude for us that

the retailer term PC for Gamers and the manufacturer term Prof Desktop agree – they are of type Worksts/Desktops
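
To make the reasoning step concrete, the sketch below expresses the same match as a join over two relational tables, using Python's built-in sqlite3 module. It is an illustration under stated assumptions, not the SQL that the IBL actually generates: the table and column names are invented for the example, and the rule is read as requiring both the super-class and its namespace to be the same on the two sides.

```python
import sqlite3

# In-memory database holding the two tables from the text.
# Table and column names are hypothetical, chosen only for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE retailer_terms (term TEXT, super_class TEXT, ns TEXT);
    CREATE TABLE manufacturer_terms (term TEXT, super_class TEXT, ns TEXT);
    INSERT INTO retailer_terms VALUES
        ('PC for Gamers', 'Computers to order', 'retailer'),
        ('PC for Gamers', 'Worksts/Desktops', 'shared');
    INSERT INTO manufacturer_terms VALUES
        ('Prof Desktop', 'Worksts/Desktops', 'shared'),
        ('Prof Desktop', 'Computer Systems', 'manufacturer');
""")

# The two premises become a join: a retailer row and a manufacturer row
# match when they name the same super-class in the same namespace.
rows = conn.execute("""
    SELECT r.term, m.term, r.super_class
    FROM retailer_terms AS r
    JOIN manufacturer_terms AS m
      ON r.super_class = m.super_class AND r.ns = m.ns
""").fetchall()

for retailer_term, manufacturer_term, shared_class in rows:
    print(f"the retailer term {retailer_term} and the manufacturer term "
          f"{manufacturer_term} agree - they are of type {shared_class}")
```

Only the Worksts/Desktops row in the shared namespace appears on both sides, so the join yields the single conclusion quoted above.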

We can then add further rules and tables, so that the system can reason about the extent to which a Prof Desktop has a fast enough processor, sufficient memory, the right kind of graphics card, and
so on. In fact, this whole example (and other related ones) can be viewed and run by pointing a browser to the Reengineering web site [11].

5. CONCLUSIONS

We described how some of the same issues occur in requirements gathering and software engineering:

in single systems,

in multiple systems that are manually integrated, and

in future systems that will have to be self-integrating.

We said that the issues are progressively more important as we move towards virtual organizations that integrate a number of companies for a specific but temporary purpose. We argued that a move
away from current software engineering techniques is needed to address the issues, and we suggested that direct specification, in the form of business rules in English, is a useful approach.
Examples of the approach can be run live, online, including an example in which a retailer’s requirements are matched to a manufacturer’s capabilities.