Anyone who has studied human geography has probably come across
the work of Walter Christaller. In 1933 he proposed a model to
describe the settlement patterns of southern Germany - see side
bar. While this model shed some light on where settlements had
occurred, nobody really expected it to give the actual locations of
settlements; it described settlement patterns in an abstracted
way.

Models, by their nature, make assumptions to simplify things:
when Airfix produced their model F-111, the company attempted to
show you what a particular jet bomber looked like, but they never
claimed it would fly (OK, I'm sure I was not the only boy who tried
to fly the odd Airfix model out of the bedroom window!)

From a programming perspective we are concerned with information
models: this places us closer to the models of Christaller than to
those of Airfix.

If at this point you wonder what all this has to do with
software, let me spell it out: when we create computer systems we
are creating models. Sometimes these are obvious: I once worked in
a department modelling the electricity market, and its predictions
were used directly by management to decide which electricity
contracts to sign. Sometimes these models are less obvious: a
customer relationship management (CRM) system models the expected
interactions between company and customer; when the model fails we
find post-it notes on people's terminals: "If Jack Smith phones
transfer him to Jo."

Our models have boundaries: within them, conditions and
scenarios are dealt with. Some boundaries are explicit, some are
tacit. Economists are familiar with these boundaries: every
economics model comes with the "all other things being equal"
pre-condition. Their models attempt to describe activity
provided every other boundary condition remains unchanged. In many
cases these models make sweeping generalisations. The monetarist
model (see side bar) attempts to predict inflation in a closed
economy. In itself it is useless because it is so general, but it
does allow economists to reason about an economy. It also forms the
starting point for the sophisticated models used to make economic
predictions.

Central Place Theory

Originally devised to explain settlement patterns in southern
Germany, the theory has since been applied to other settlement
patterns and found to fit North American geography well. After
studying southern Germany Christaller devised a model which
assumed a uniform flat plain, with no natural advantages of one
place over another; settlements could occur anywhere. Given this,
where would towns occur? He concluded that people cluster in groups
equidistant from one another. Each town would have a hinterland of
equal size. Circular spheres of influence would leave some areas
uncovered, so the model uses hexagons instead of circles, each
with a town at the centre.
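The gap left by circles can be made concrete: even in the densest possible packing of equal, non-overlapping circular hinterlands, only about 91% of the plain is covered, so roughly 9% of the land would belong to no town at all. A quick sketch of the arithmetic (the packing-density formula is a standard geometric result; the variable names are mine):

```python
import math

# Densest packing of equal, non-overlapping circles on a flat plain:
# each circle sits inside a hexagonal cell and covers pi/(2*sqrt(3)) of it.
circle_coverage = math.pi / (2 * math.sqrt(3))   # roughly 0.907
uncovered = 1 - circle_coverage                  # roughly 9.3% unserved land

# Hexagons, by contrast, tile the plain with no gaps at all -
# which is why the model swaps circular hinterlands for hexagonal ones.
hexagon_coverage = 1.0
```

Nothing here proves Christaller right, of course; it only shows why circles cannot carve up a plain exhaustively while hexagons can.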

In Christaller's model natural features like rivers and mountains
were ignored. The model could suggest where settlements would
arise - if there was a river close by, the settlement would be a bit
closer to it; natural features could be allowed for when looking at
real settlements. But the model allowed geographers to reason about
settlement patterns, and it provided a common standard to measure
and compare settlement patterns by - it provided a language for
exchanging information.

Nobody ever expected to find a perfect example of the model;
that wasn't the point. It was a tool for modelling the real
world.

What are the models we build?

In software development we build many different kinds of model;
on any project the different models look at the problem domain from
different perspectives.

Metaphors are small models

Drawing an analogy by metaphor is setting up a small, quick
model: we rely on the fact that someone already knows the metaphor.
So when Kevlin compares software development to gardening, he is
relying on the fact that most of us have an idea of what gardening
entails.

Metaphors are used in one of two modes:

Educators use a metaphor to describe a new concept to us in
terms of a known one

We use it to reason about a system: because X is like Y in one
respect, is it similar in other respects?

These are the same thing at different points. We tie our subject
(e.g. software engineering) to a target (e.g. gardening) by
pointing out a similarity, then follow this up with a less
obvious similarity. In doing this we leverage knowledge in one area
to increase knowledge in another. Think of it as proof by
induction.

Specifications

In traditional development a business analyst writes a
specification document which is then implemented. Some organisations
still work this way, but many companies make do with a vague
statement of intent: "We intend to build an equities trading
system". This relies on the developers' understanding of what
features an equities trading system should contain; the developers
have their own mental model of what the system needs to do. (This
explains why banks prefer to hire people with experience in the
financial markets, and why these people can command premium
wages.)

Monetarist theory of money

One of the simplest economics models, it is also one of the most
far-reaching: advocated by Milton Friedman and, at least in public,
believed by Margaret Thatcher and Ronald Reagan.

M x V = P x T

Where, in any given time period:

M = the quantity of money in circulation

V = the velocity of circulation

P = the price of all goods

T = the number of transactions

Spending is money multiplied by the speed at which it changes
hands; this must equal all the goods sold in the same time
period at their prevailing price. Obvious really, when you think
about it: everything we spend must equal everything we buy.

Hence, if we increase the amount of money in the system,
all other things being equal
(velocity and number of transactions remain constant), prices must
rise - inflation, q.e.d.
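The derivation can be checked with a few lines of arithmetic. The figures below are invented purely for illustration:

```python
# Quantity theory: M * V = P * T, rearranged as P = M * V / T.
# V and T are held constant - the "all other things being equal" boundary.
V = 4.0        # velocity of circulation (illustrative figure)
T = 1000.0     # number of transactions (illustrative figure)

def price_level(money_supply):
    return money_supply * V / T

p_before = price_level(250.0)        # initial money supply
p_after = price_level(275.0)         # 10% more money in circulation
inflation = p_after / p_before - 1   # prices rise by exactly 10%
```

With V and T pinned down, any change in M flows straight through to P; relax either boundary condition and the neat q.e.d. disappears.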

Again, this is an idealised model: it ignores little factors
like savings, investment and what actually constitutes money, but
the idea is clear. Some elements of reality can be abstracted away
to show the central concept. More complex models used by banks,
firms and governments peek into the future; each of these makes
assumptions, and each has some mathematical model at its heart that,
almost by definition, is inaccurate.

At heart both Christaller's and the monetarist models are
attempts to reason about information and systems, which is not that
different from what we do when we design and write a system.

Whether it is a large document, lives in developers' heads or sits
on many individual CRC cards, the specification is a model of the
problem. Yet it is usually incomplete, and frequently inconsistent
with itself.

Design

While I hope all systems have a design, I believe most use the
Topsy Design Pattern: they just grow.

A design is a high-level representation of the source code, and as
such it is indisputably a model. It is a model of the solution, not
the problem.

Like any model, it makes assumptions and abstractions. It is
important that everyone working on the system understands the model:
if I think we are building an Airfix Lancaster bomber and you think
we are building a B-17, we may well get something that looks like a
four-engined World War II bomber, but it will be neither one thing
nor the other. Auntie Dotty may think it is a good model, but her
terms of reference are different from ours.

Source code

Our ultimate model of the solution: the point where we start to
discover the inconsistencies and holes in the specifications.

I once worked on a train timetabling system. The specification
was long and inevitably contained omissions and errors. These were
fixable; even when they occurred late in the development cycle we
could add new rules and change existing ones. The most difficult
problems occurred when rules conflicted with each other. Usually
this wasn't obvious until the source code was examined and we found
that a fix for requirement A had introduced a bug. Requirement A was
quite respectable, but nobody foresaw that, when implemented, the
result contradicted requirement B. Only when the specification was
codified in the pure logic of code was it clear that neither A
nor B represented the true requirement.

Our voluminous specification model was neither complete nor
self-consistent, and much was still locked in people's heads.

Other models

Specification, design and code may be the first models that
spring to mind but there are other models in software
development:

Test suites: tests sit part way between problem and solution;
they attempt to apply the solution to the problem.

Process models: we define models for how we develop software -
our processes. SSADM, Extreme Programming, waterfall and such are
models of process. Those who read my article on Extreme
Programming[3] will recognise this as one of my criticisms: Beck
sets out a model called Extreme Programming (XP) and then says you
cannot modify this model; if you do so it is no longer XP. I can't
accept this. XP is a process model, and since no team will ever have
the exact conditions of the C3 team, it must be adapted for each
case.

Delivery schedules: need I say more?

What are our tools?

When we use CASE tools like Rational Rose our modelling is
obvious, but even when we write C++, Java and Pascal we are
codifying our logic in a language model. The languages and machines
which run our models are all Turing equivalent so no computer or
language is really more powerful than another, but each brings
different techniques for modelling the problem, for thinking about
the problem, and this is where their power lies.

Object oriented Java is not more powerful than procedural Pascal
because it runs faster; it is more powerful because it allows us to
think, to model, in different concepts.

Beyond language we have notations: when I draw a UML chart you
know that a rectangle means one thing and a circle means another.
Actually, I will take exception to my own argument here: I think
many of our notations, especially UML, rely too much on subtleties.
A folded corner on a rectangle means it is different from a regular
rectangle, while a dotted line is different from a solid line. I
think we often try to put too much information into our
notations.

I found the following story in Software Fundamentals:
"if the presenter showed a block
diagram, Dave [Parnas] would ask about the semantics - the meaning
of different block shapes ... the meaning of an arrow; whether an
unfilled arrow meant something different than a solid, filled
arrow... Usually these frills had no meaning. They certainly didn't
aid careful analysis, and they often got in the
way."[2]

In recent years the patterns community has moved to define more
labels for more models. In the case of the GoF book they defined a
meta-model that could be used to describe all their
models[3]. Now when I say Singleton,
Chain-of-Responsibility or Mediator you know what I mean - OK, I
deliberately included Mediator because it is not so well known, and
this is one of the problems the patterns community faces: it has
been so successful in defining patterns that, outside of a core
half dozen, few are widely known.
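For readers who have not met the less familiar label, here is a minimal Mediator sketch; the class and method names are my own invention, not the GoF sample code. The point of the pattern is that colleagues never address each other directly - all traffic goes through the mediator, which keeps the colleagues decoupled from one another.

```python
class ChatRoom:
    """The mediator: routes every message between members."""
    def __init__(self):
        self.members = []

    def join(self, member):
        member.room = self
        self.members.append(member)

    def broadcast(self, sender, text):
        # Deliver to everyone except the sender.
        for m in self.members:
            if m is not sender:
                m.receive(sender.name, text)

class Member:
    """A colleague: knows only its mediator, not the other members."""
    def __init__(self, name):
        self.name = name
        self.room = None
        self.inbox = []

    def say(self, text):
        self.room.broadcast(self, text)

    def receive(self, sender_name, text):
        self.inbox.append((sender_name, text))

room = ChatRoom()
alice, bob = Member("alice"), Member("bob")
room.join(alice)
room.join(bob)
alice.say("hello")   # reaches bob only via the mediator
```

Knowing the label means the whole of the above can be conveyed in one word - which is exactly the communication advantage the paragraph describes.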

Why are our models inexact?

Like Christaller we can never expect our models to exactly
describe a situation - by their nature they are abstractions. The
simplifications we make to create and generalise the model return
to confront us in real life. This brings us to the realm of Chaos
Theory: a small variation can, over time, when repeated, magnify
into a significant difference.

We also face the problem of Catastrophe Theory - when multiple
parameters are varied things start to break down. And as if this
weren't bad enough we also have to face the law of diminishing
returns[4].

The key with a model is identifying variability[5]. However, we
frequently miss points of variability and need to adjust our models
accordingly - but add too many points of variability, too many ifs
and buts, and instead of a model we have just a list of special
cases.

The more parameters a model has, the less useful it is. There is
no model of the game of soccer because there are too many
parameters which can affect the game; we can only make
generalisations: Manchester United usually win, Everton usually
lose[6].

We should not expect our models to be exact, nor should we
expect to follow them blindly. Sometimes it just doesn't make
sense. Yes, we would like our singletons to be nicely destroyed at
the end of the program, but what does it matter if the OS will
clean things up? Sometimes the work involved is not justified:
consider the GPS system in a cruise missile - what does a memory
leak matter in the final few milliseconds before impact?
Attempting to have a GPS singleton delete itself neatly is more
work for the programmers and adds more variability to the system at
the exact moment when it must be totally predictable.
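The trade-off can be sketched as a deliberately "leaky" singleton: created on first use, never torn down, leaving reclamation to the OS at process exit. The names here are hypothetical, chosen only to echo the GPS example:

```python
class GpsReceiver:
    """Hypothetical GPS singleton: built on first use, never destroyed.

    There is deliberately no shutdown or delete code. We rely on the
    OS reclaiming everything at process exit, trading a harmless
    'leak' for less code and more predictable behaviour at the
    critical moment.
    """
    _instance = None

    def __init__(self):
        self.position = (0.0, 0.0)   # placeholder state

    @classmethod
    def instance(cls):
        # Lazy creation; no matching teardown anywhere in the program.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

# Every caller shares the one instance; nobody ever cleans it up.
a = GpsReceiver.instance()
b = GpsReceiver.instance()
```

The design choice is the interesting part: omitting the destructor is not laziness but a deliberate judgement that, within this model's boundary, cleanup buys nothing.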

And not only with code and patterns: the process models of
Yourdon, Jackson and Beck are really just more Christaller models.
Yes, we can learn from them, we can compare ourselves to them, but
to attempt to adopt them exactly as laid out by the authors is about
as sensible as ignoring a mountain range when building your
village!

Summary: Just what does a model give us?

Models provide us with many benefits:

They give an idea or concept a label

A label allows us to communicate more efficiently

They allow us to compare different concepts

They qualify the main characteristics of an idea

However, they come with drawbacks:

No model will ever exactly describe an idea: if it does it is
not a model

They selectively hide elements, and this selectivity can be
manipulated to the advantage of the modeller

Applied incorrectly they can waste resources: you can't
apply Christaller to mountainous regions; to do so would simply
waste your time.

The value of a model lies in the abstractions it makes: by
focusing on the important elements we are not distracted by the
irrelevant ones. This is a classic definition of software
abstraction, but it also brings us back to Christaller: on the
German plains Christaller accurately isolated the abstractions that
describe settlement patterns. But you would never expect to find
Christaller's settlement patterns exactly, because there are things
like rivers, hills, particularly fertile areas of land and such.
Equally you should not expect your software model, your pattern, to
describe your software exactly.

No more green fields

I carry this model of the perfect job in my mind: at the
interview I'm told, "we have a few people here, we have a problem
and we have no idea how we are going to solve it". OK, maybe I will
never hear this, so I settle for: "our current team has decided to
develop a completely new product without using any existing
code".

What I want is green field development: a brand new development,
with no old code to maintain, no creaking database schema, no
legacy source code control system...

In my old age, in my scepticism, I don't believe there are any
green field projects left. So much has been done that every area
has been touched by legacy systems. Sometimes this is obvious: we
must keep the current system working. Sometimes it is
contradictory: "we are writing a new product but we plan to salvage
as much as we can from the old one". Even if they were to throw
away all the code, they would want to keep the user interface, or
the database format, or the file format.

Yesterday's models form part of today's reality: people's
expectations have changed - 10 years ago a text-based calendar
system was magic; today people want an easy to use GUI, voice
control, and, and...

Your perspective is always shaped by what you know and what has
come before; today's problems are shaped by what exists. So even if
a company doesn't have an application installed, it will have
users accustomed to some systems[7].

Some of these advances are good: defining an XML schema is
better than defining a byte-by-byte file format. But such
developments can be limiting: you may need to use XML because it is
a buzzword - never mind that you are writing Space Invaders,
you must use XML somewhere! Models can constrain us too.

It is not only technology that limits us. If you think of your
problem domain as a blank canvas, or better, a Christaller-like
uniform plain, it is still bounded:

Our Easterly edge is compatibility: we must take data from this
system, or produce output for that system; we are asked to reuse as
much of the existing system as we can - even though we are supposed
to replace it!

A project deadline and costing form our Westerly edge: maybe we
have a drop-dead project deadline, or maybe it simply affects our
bottom line! Maybe if we don't deliver in time the company will
cease to be. Even worse can be no deadline at all: an abyss on the
Western edge into which our team disappears on a blue-sky mission to
seek out and explore strange new technologies.

To the North there is the advance of technology: our project
will be neither the first nor the last system already out of date
when delivered; or maybe you must target 386 Windows 3.1 machines
to save on costly upgrades.

And to the South our co-workers: we usually don't get to choose
the people we work with. Some are there when we arrive, some are
hired without our involvement; some may be fresh from college and
lacking in real world experience, others may be jaded by too many
failed projects or over-exposure to COBOL and Lisp.

And then when the terrain is not flat and fertile:

Barren patches mean that crops don't take: we may advance
Extreme Programming, templated designs or code reviews, but if
people are unenthusiastic and management won't back you, nothing
will grow.

Quagmires: you advance a generic design, but before you can cut
a line of code you are taken at your word and now your design must
work for your department and four others. You are bogged down in
endless meetings, advocacy, design reviews. Sir Humphrey would be
proud!

The seasons change: in the autumn the company decides to
standardise on Oracle for all database work, and suddenly you must
abandon SQL Server; winter brings a cash flow crisis and nothing
will get signed off; spring brings a thaw, but suddenly Java is in
favour; and then summer, when everyone is on holiday, nothing is
decided and even less done!

And through the centre of the plain runs the rift valley of
office politics: will your proposals prove too radical for the
management? Would co-operation with the New York office further
someone's career? Does the office secretary hate contractors and
make their life difficult? Could you use COM in your design? Is the
senior architect convinced he knows all the answers? Is the
director's ear bent by a long standing employee who grew up on
Pascal and frankly, doesn't know the first thing about objects?
Sometimes you have to choose your battles.

Conclusion

Models are an essential way of abstracting problems and
patterns. Using models we can codify and communicate ideas - this
in turn allows us to learn and to share ideas. However, as other
disciplines know, models have limits; we should not expect our
idealised models to be usable straight out of the box. Every model,
whether it is a design pattern, a process or a programming style,
must be adapted to our present circumstances.

[4] See any good economics textbook for a
description of diminishing returns. I do not give references for
Chaos Theory, Catastrophe Theory, Christaller or monetarism either;
these can likewise be found in good mathematics, human geography
and economics textbooks. Google searches provide plenty of sources
on all.

[5] See Coplien, Multi-Paradigm Design in
C++, Addison-Wesley, 1998, for discussions of commonality and
variability analysis.