One of the
enduring arguments in the coding community is that of Object Oriented
techniques vs classical procedural techniques. Whilst many programmers
may correctly point out that the issue is in a sense a non-argument,
since each model reflects a different aspect of a common reality, it is
worth exploring some of the trends in current OO methodology. This
month's feature will discuss some of the central issues and take a
brief look at the Unified Modelling Language (UML) and its various
implications.

The most fundamental issue which underpins the whole software
engineering methodology debate is that of dealing with the extreme
complexity of modern software products. Complexity is in many respects
the barrier which limits achievable functionality, and frequently also
interoperability, in software products.

This was not always the case. If we look back over the last
two to three decades, the complexity of a software product would have
been bounded, in practical terms, by factors which are external to the
product itself. Key factors which evolution has since rendered either
irrelevant or incidental are:

Operating system functionality. The advent of modern
operating systems, with powerful interprocess communication and
scheduling facilities, has removed what were previously difficult
constraints on getting data or messages between processes in complex
applications, and on controlling the manner in which those processes
can interact.

Runtime environments. Modern runtime environments provide
powerful facilities such as multithreading, which allow for much more
complex control flow management techniques within an application
process.

Common interfaces to services. The wide adoption and
standardisation of POSIX interfaces to operating systems promotes the
reuse of mature and proven pieces of code to provide low level
services, especially basic libraries and I/O routines.

Hardware memory size and cost. Whilst the advent of
virtual memory allowed applications to occupy enormous address spaces,
the performance penalties of swapping these in and out of memory
presented a serious obstacle to building very large and complex
programs. With commodity desktop machines now cheaply available with
hundreds of megabytes of memory, there are few obstacles to the
implementation of genuinely enormous programs. Moore's Law will
continue to drive this process.

Processor compute performance. Moore's Law has yielded
remarkable results over the last decade. Flavour of the month desktop
machines are now equipped with copper metallised microprocessors which
run at clock speeds between 1.4 and 1.6 GHz, and deliver compute
performance more than an order of magnitude better than a decade ago.
As a result, even very large, memory and cache bound applications can
execute quite quickly (more on this in next month's issue).

Development tools. Tools for crafting, compiling and
debugging code have improved significantly, generally in a manner which
facilitates the development of large and complex code. With compilers
running on gigahertz class processors, compile times need not be the
overnight batch jobs they once were, even for relatively large programs
or systems.

Languages. The extension of ubiquitous C into C with
Classes, later formalised as C++, provided a vehicle for a large scale
shift in programming technique and methodology from the classical
procedural approach to the OO approach, thus facilitating levels of
complexity not manageable using older techniques. The plethora of other
mature or maturing OO languages in the market provides a developer with
many choices (a brief illustrative sketch of the procedural versus OO
styles follows this list of factors).

Object interface standards. The emergence of OMG's CORBA
and its proprietary equivalent(s) provided a common and standardised
interface model via which objects could be accessed and used. In a
networked or even basic multiprocessing environment, the CORBA model
allows the construction of very complex applications in which
components interact in a well defined manner (a second sketch below
illustrates the interface-contract idea).

These eight items represent the results of almost two decades
of focussed evolution in technology, all directly or indirectly aimed
at facilitating the design and implementation of increasingly complex
pieces of code, or interacting systems of code. Applications which were
uncompilable, unrunnable, undebuggable, unable to reliably communicate
internally, and unaffordable in previous years due to basic technology
limitations, are now technically feasible and, in terms of basic
technology, implementable.

Yet the collective experience is still that bugginess and poor
reliability are endemic and expensive problems, whether we are
observing the behaviour of a shrinkwrapped application or a flight
control system on a rocket booster (the only difference between a BSOD
and an exploding Ariane booster being the scale of the outcome and its
consequences).

Inevitably, any problem in a software product results in
finger pointing. The code cutters got it wrong, the testers missed it,
the user did something silly, the marketeers misunderstood the
requirement; indeed, the number of ways in which responsibility for an
adverse outcome can be assigned is limited only by the imagination of
the party seeking to avoid it.

The root cause, in the most fundamental sense, is complexity.

The REAL Enemy - Complexity

In the broadest philosophical sense, the trend toward
increasing complexity seems to be an artifact of evolution, be it
biological or technological. Trends in software are no exception, and
in recent times programs with sizes of the order of millions of lines
of code are becoming common. This is not only true of shrinkwrapped
commodity products, but also of large commercial products and larger
embedded systems, such as those found in space vehicles, large
industrial plants and military or commercial aircraft.

Complexity is thus unavoidable. Just as the strands of DNA
which make up the more evolved mammalian species have become more
complex over time, code too will simply continue to grow in complexity.

The big difference between nature and man-made entities like
software is that the former is subject to Darwinian evolution over
enormous timescales. Software is driven by Lamarckian evolutionary
behaviour, and time to market and use are thus do-or-die parameters in
the evolutionary process of a software product.

The traditional programming model and software engineering
approach involved some omniscient chief software engineer or programmer
attempting to coordinate the activities of a small group, each member
of which would craft his own component. With enough iterations and
enough haggling over interfaces the system could be made to work.

In practice, this technique ran into difficulties with sizes
of hundreds of thousands of lines. While a program of this size can be
successfully developed and maintained by two dozen or perhaps fewer
programmers, the odds are that all participants will need a solid depth
of experience and preferably as much insight as possible into the
specific product being maintained. Reduce the level of programmer
experience and difficulties will arise very quickly.

As with all complex problems, the proven and most robust strategy
is to divide and conquer. In the most fundamental sense, the problem is
broken down into smaller chunks, ideally chunks which are small enough
to be well understood by individual code cutters or small teams.
Gigantic monolithic programs are not a very common sight.

Where extreme complexity bites hardest, even with a rigorous
divide and conquer methodology, is in one key area - the definition of
the interrelationships between the components in the program and the
interfaces which support these interrelationships.

This is frequently so, for both good reasons and bad:

An interrelationship between components of a program may
be inherently difficult to define, or may indeed change depending on
the state of the program.

Different components of a program may be implemented in
different languages, with dissimilar parameter passing conventions.

Different components of a program may be implemented by
different programmers or teams of programmers, who may interpret the
intended interrelationship between these components differently.

Frequently, interrelationships between components of a
program fall between areas of responsibility in program development,
and thus less attention is paid to them in comparison with each
component itself.

The formal definition of the interrelationships between
components of a program, where it exists, may not faithfully replicate
the intended function of the program.

The complexity of the application may be so great that no
individual programmer can properly understand how all of the major
components are intended to interact, let alone the lesser parts of the
program.

This problem of extreme complexity leading to severe
difficulties, especially in integrating the various components of a
product, is incidentally not confined to the software industry alone.
The aerospace industry is replete with examples. Two notable case
studies are the US 1960s TFX fighter development program, and the UK
1970s-1980s Nimrod AEW program. In both instances, the biggest problems
arose in getting various major components to operate together in the
manner intended. The first of these projects eventually succeeded, the
second crashed and burned. Both incurred many times the development
costs originally envisaged.

Dealing With Extreme Complexity

One might argue that with enough discipline and rigour applied
in the development process, the spectre of component interrelationship
mis-definition and interface failure can be avoided. This may well be
true, but in practice the kind of regime required to impose that level
of discipline and rigour upon a group of developers may not be either
managerially or politically implementable within an organisation. The
natural human propensity to want to do things independently always
works against an organisationally imposed scheme of straitjacketing how
designs are put together.

The other difficulty which arises is that evolvability in the
design may be lost in the process. Where the user requirements for the
function of the design may evolve during and after the process of
developing the design, whatever model is employed to define the
structure of the design and the interrelationships between components
of a design must be capable of also evolving in step, preferably
without unreasonable expense.

Ideally, the basic technology should impose the required
quality of evolvability on the architecture of the design, yet also
provide the framework for a rigorous and disciplined development
process.

The OO paradigm developed in large part with these aims in
mind. It is customary in many discussions of OO technology to focus on
the details of implementation, rather than the broader systemic
implications of this model. This distracts from a more fundamental
issue, which is that of how the paradigm itself facilitates the design
and implementation of highly complex programs.

OO programming languages provide the basic brick and mortar
portion of the technology base, facilitating implementation. They do
not implicitly provide a mechanism for formally representing the high
level structure of large and complex programs.

That is the function of a higher level modelling language,
which is used to capture the critical interrelationships between the
components of the program. Such a language provides a means of
describing these in a format which is both rigorous and evolvable.

The Unified Modelling Language (UML), devised primarily by
Rational, is a product of the latter half of the nineties, and is now
the OMG ratified industry standard for this purpose.

Unified Modelling Language

UML was created by the fusion of ideas developed in three
second generation software engineering methodologies, Booch, Objectory,
and OMT, devised by Grady Booch, Ivar Jacobson and Jim Rumbaugh, but
also incorporates ideas produced by a large number of other CASE
methodology theorists. The extended UML for Real-Time incorporates
features from the Real-Time Object-Oriented Modeling language (ROOM).

The process of creating UML started in 1994 when Booch and
Rumbaugh decided to unify their respective Booch and OMT methods. Ivar
Jacobson's use cases were incorporated, and Jacobson soon after joined
the unification effort which led to the current UML specification. The
decision to unify the three established methods was based on the
following criteria (Rational - UML FAQ by Booch, Rumbaugh and Jacobson,
http://www.rational.com/):

First, these methods were already evolving toward each other
independently. It made sense to continue that evolution together rather
than apart, thus eliminating the potential for any unnecessary and
gratuitous differences that would further confuse users.

Second, by unifying these methods now, we could bring some
stability to the object-oriented marketplace, allowing projects to
settle on one mature method and letting tool builders focus on
delivering more useful features.

Third, we expected that our collaboration would yield
improvements in all three earlier methods, helping us to capture
lessons learned and to address problems that none of our methods
currently handled well.

The stated goals of the unification effort were:

To model systems (and not just software) using object-oriented
concepts.

To establish an explicit coupling to conceptual as well as executable
artifacts.

To address the issues of scale inherent in complex, mission-critical
systems.

To create a method usable by both humans and machines.

These four goals, in the authors' own words, encapsulate, very
concisely, much of the argument presented earlier. Importantly, the UML
model is not unique to software, but provides a paradigm which is quite
general and thus applicable to defining the attributes and behaviour of
highly complex systems of any type.

UML comprises a number of components. A metamodel is used to
describe the semantics and syntax of the elements of the language. The
long term aim is to refine this using formal logic. A graphical
notation is used to provide a graphical syntax which can be read by
humans and by tools. The language also includes a set of idioms to
describe usage.

UML employs a set of models which are used to describe the
system:

Use-case diagrams, adopted from Objectory, are employed to
describe use cases.

Class diagrams, a feature of Booch and OMT, are used to
describe the static semantics of the classes in the system.

State-machine diagrams are used to describe the dynamic semantics of
classes.

Message-trace diagrams, object-message diagrams, and
process diagrams, adopted from the Booch, OMT, and Fusion schemes,
describe the dynamic semantics of collaborations of objects.

Module diagrams are employed to describe the developer's
view of the system.

Platform diagrams are used to describe the organisation
and topology of the hardware upon which the system executes.

Deployment diagrams show the configuration of the hardware
and software components of the system at run-time.

Capsules are complex active components, which interact
with the surrounding environment through boundary objects called ports.

Ports are objects which implement the interfaces between a
capsule and the external world. Ports are signal based, to provide
portability across platforms and distributed implementations, and
implement the protocols via which they communicate.

Connectors describe the communication relationships
between capsules.

State Machines describe the functionality of a simple
capsule. More complex capsules are described using internal
sub-capsules, interacting through connectors. A brief sketch of these
concepts follows below.

A well implemented UML toolset will provide extensive
facilities for binding the UML models to the object implementations in
an OO programming language, and some toolsets also provide reverse
engineering facilities which can produce UML descriptions of an
existing program. Whether the code is implemented in C++, Ada,
Smalltalk or any other applicable language, the toolset provides the
means of transferring a definition into a framework for implementation
in code.

UML is not a panacea. It is a mechanism via which the
behaviour of a complex system can be exactly described and defined, to
facilitate the process of creating code. Even with a perfect UML
description, poorly implemented and buggy code modules will cause
difficulties. However, bugs of this ilk are much easier to identify and
fix, typically, in comparison with bugs which arise at an architectural
level in the product design.

In terms of dealing with complexity, the widespread adoption
of UML will yield important benefits in the robustness and
predictability of the development and maintenance process, compared
with older techniques. A likely consequence, in coming years, is that this will
push complexity up even further beyond current bounds, introducing
difficulties which have yet to be seen.

Programs with tens of millions of lines of code will present
some very interesting challenges.