Abstract

Models of social systems generally contain free parameters
that cannot be evaluated directly from data. A calibration phase is
therefore necessary to assess the capacity of the model to produce the
expected dynamics. However, despite the high computational cost of this
calibration it doesn't produce a global picture of the relationship
between the parameter space and the behaviour space of the model. The
Calibration Profile (CP) algorithm is an innovative method extending
the concept of automated calibration processes. It computes a profile
that depicts the effect of each single parameter on the model
behaviour, independently from the others. A 2-dimensional graph is thus
produced exposing the impact of the parameter under study on the
capacity of the model to produce expected dynamics. The first part of
this paper is devoted to the formal description of the CP algorithm. In
the second part,we apply it to an agent based geographical model
(SimpopLocal). The analysis of the results brings to light novel
insights on the model.

Keywords:

Calibration Profile, Model Evaluation

Introduction

Agent-based modelling is widely used in models of social
systems because it is well suited to describe individual-centred
dynamics with non-linear interactions. This modelling technique,
however, introduces many free parameters which critically influence the
behaviour of the model. Hand-picking parameters is a time-consuming
process and makes it hard to evaluate the ability of a model to
reproduce empirical data or formalised knowledge. Automatic calibration
of models have the potential to free the modeller from this burden and
to make the modelling process more rigorous by reducing the number of
arbitrary choices.

The current prevalent method for automatic calibration is
to see it as an optimisation problem in which the differences between
known data and model predictions have to be minimised. This
optimisation can be performed with different optimisation algorithms
like "Approximate Bayesian Computation" (Lenormand
et al. 2012) or evolutionary algorithms (Schmitt et al. 2014; Stonedahl 2011).
Nevertheless, these methods find a single set of parameters, or, in the
case of multi-objective evolutionary algorithms, several sets of
parameters that represent optimal trade-off with regard to several
model quality criteria. In both cases, the result of the calibration is
solely that the model can reproduce the data with a
given precision. These calibration processes do not say anything
about how often parameter sets lead to realistic behaviours, and about
how each parameter changes the behaviour of the model. For instance, it
is often interesting to know when some parameter values would prevent
the system to reach a realistic behaviour, rather than only knowing a
single set of "optimal" parameter values.

To compute a more global view of the parameter space, we
have designed a novel method that exposes the sensitivity of a single
parameter on the calibration of a model independently of the other
parameters. A high level representation of this method is shown on
Figure 1. It computes the
lowest calibration error that can possibly be obtained when the value
of a given parameter is fixed and the others are free. It computes this
minimal error for many values of the parameter under study. The values
of the parameter are sampled all along its domain of definition to
produce a so called calibration profile. For each
sample value, the value of the remaining parameters are optimised in
order to find the lowest possible calibration error. The profile can
then be drawn on a 2-dimensional chart that depicts the influence of
the parameter under study on the model calibration.

Conceptually, CP are similar to likelihood profiles (Pawitan 2001), a classic tool
in statistics[1].
However, computing likehood profiles requires an analytical expression
of the model, which is infortunately not available in most agent-based
simulations, whereas calibration profiles can be used with any model. A
calibration profile can thus be seen as a way to approximate the
likelihood profile when the analytical expression of the model is not
available or is too complicated.

To produce a calibration profile, a naive approach would
consist in executing an entire calibration algorithm for each value of
the parameter under study. Current automated calibration algorithms are
too computationally intensive to make this approach tractable in
practice. To tackle this problem we have designed an algorithm that
computes the numerous points composing a calibration profile
altogether. The design of this algorithm has been inspired by the
recently published MOLE method (Clune
et al. 2013; Mouret
& Clune 2012),
which pictures 2-dimensional phenotype landscapes for evolutionary
biology. MOLE and the CP method are both stochastic, population-based
search algorithms that exploit the fact that a small change to a good
solution is likely to lead to another similarly good solution (Deb et al. 2002). Nonetheless,
while classic stochastic optimization algorithms search for a single
optimum of a function, the CP method aims to fill a 1-dimensional array
of "categories" (a 2-D map for MOLE) that should each contain the best
performing sample for each possible value of the parameter under
scrutinity (after a discretization). To do so, the algorithm starts by
generating a a few random samples, evaluates their performance, and
stores the best sample of each category; these best samples are then
randomly modified to generate new samples, thus having a chance to
either improve the best samples of the current categories, of fill new
categories in the neighborhood with high-peforming samples.

The first section of this paper exposes a formal
description of the CP algorithm as well as a reusable implementation of
the method. In the second section we show how the application of the CP
algorithm to a real world geographical simulation model produces novel
insights toward the validation of the model.

The
Calibration Profile (CP) algorithm

Intuition

To compute the calibration profiles we propose a new
algorithm designed in the framework of evolutionary algorithms. Figure 2 illustrates the
progress of this algorithm through the example of the computation of a
9 points profile along x1 of a function f(x1, x2). In this example, the
function f computes the evaluation of a 2-parameter model (x1 and x2).
The calibration objective is to minimise f(x1, x2). In the step 1 the
algorithm randomly samples points (random values of x1 and x2) and
computes f(x1, x2). In the step 2, the algorithm divides the definition
domain of x1 in 9 disjoint even intervals and keeps only the sampled
point with the lowest f(x1, x2) in each of these intervals (this
constitutes the elitism stage). The points that have been selected
constitute a first approximation of the calibration profile of the
model f along x1. In step 3 new samples (x1, x2) are generated by
mutating the points in the current approximation of the calibration
profile and f(x1, x2) is evaluated for each of these new points. The
newly evaluated samples are merged with the existing ones and the
algorithm iterates to the step 2. This iteration stops once a given
stopping criterion is met after step 2. The projection of the last
selected points along x1 constitutes an approximation of the
theoretical continuous profile (step 4).

Figure 2. Intuition of the
profile algorithm.

Objective

The CP method takes inspiration from evolutionary
algorithms to provide a thorough analysis of the role played by each
parameter. Such analysis is independent from one parameter to the
other. A function that evaluates the performance of each particular
model is supposed to be given (typically such evaluation takes into
account the fitting of the model to the data).

Let be the set of possible models under
consideration and let
A
= A1×…×An⊆n be the domain of the
parameter vectors that we consider as determining the models, where n
is the number of parameters. We denote by
the map that provides a
model for a given
parameter vector. Such map can be viewed as a meta-model, that is a
model with free parameters.

Let
be the function providing an evaluation for each
model. We can obviously evaluate a parameter vector with the function
. In this context, when we
use the word solution
we just mean a parameter vector
∈A.
By convention we suppose that best solutions are closer to 0.

This function produces, for a fixed value of the k-th
parameter (i.e. ak = x),
the best possible performance of the model. In our view, this function
is a good indicator of the impact of a parameter and provides valuable
information for further analysis.

Our aim is then to compute, for each
k∈{1,…,
n}, a good approximation of hk.
A naive approach would consist in sampling Ak,
say by choosing
{a1k,…,
alkk}⊂Ak,
and running one optimisation algorithm for each ajk
to approximate hk(ajk).
Unfortunately this method is very inefficient. To overcome the
computational limits of this approach, we propose a new algorithm that
converges directly to the whole set
{hk(a1k),…,
hk(alkk)}.

Description of the algorithm

In this section we consider
k∈{1,…,
n} to be fixed and we describe the CP algorithm
for the analysis of the k-th parameter. As
mentioned earlier, the goal is to approximate hk(x),
that is to provide sufficiently good approximations of
hk(ajk),
for certain
a1k,…,
alkk∈Ak.

In order to explore Ak
uniformly, we use an evolutionary algorithm with diversity pressure
mechanism (Goldberg et al. 1992;
Toffolo & Benini 2003;
Mahfoud 1997; Mouret & Doncieux 2012).
Let C1,…,
Clk
be lk different semantic
categories to which one can associate an element belonging to Ak.
In our approach, we consider that categories can be identified with
intervals
I1,…,
Ilk⊆Ak,
and that such intervals form a partition of Ak:

We assume that these categories are good enough to
characterise Ak, that is
they are a good discretisation of Ak.
For each ak∈Ak,
we denote by cat(ak)
the category of ak, that is
the unique category Cj such
that ak∈Ij.
We also denote by
Cat
: A→{C1,…,
Clk}
the map defined by Cat() = cat(ak),
where ; it defines the category of a parameter vector
as the category of its k-th coordinate. The
algorithm is devised to find parameter vectors
{,…,},
with
and aji∈Ai,
such that:

Each vector lies in a different category:
Cat()
= Cj
for j =
1…lk. This means that, when
projecting on the k>-th coordinate, our
sample is representative of the whole space Ak.

We can consider, by induction, that XN
has already been constructed at this point.

Let us define the sets
Si
= {∈XN
: Cat() = Ci}
for i
= 1,…, lk (we may get
Si
= ∅ for some values of i). Then we apply elitism,
that is we select only the best parameter vector in each category: We
define the sets
Ei
= {∈Si
: f ()≤f ()
∀∈Si}
for i
= 1,…, lk, and
. In practice all the sets Ei
will be either empty, if Si
= ∅, or containing only one element (if
|Ei|≥2
we can randomly choose one element in Ei
and consider Ei composed of
just that element).

Generate H* from H
by randomly modifying the parameter vectors in H.
We can for instance generate each vector in H*
by applying a gaussian variation to a vector in H.
Define the set XN+1 as the
union of H and H*
and go to Step 2.

When the stopping criterion is met, we can output our approximation to
the function hk(x):
If {,…,}
is the last H computed, our approximation to hk(x)
is the set of 2-tuples
{(,
f ()),…,(,
f ())}. Let us note that
m≤lk
by construction, but in practice, when the number of iterations is high
enough (see section 2.1
point 1)
we get m = lk.

Validity

As we said in paragraph 2.7,
our approach consists in estimating the impact of the k-th
parameter on the performance of the model, by finding a good
approximation to the function hk(x).
To clarify why our method provides indeed such a good approximation, we
make the following remarks:

At each iteration we randomly generate new solutions
from old ones. Given a solution such that that Cat() = Cj,
the probability of getting a new solution from such that Cat()
= Cj', with j'≠j,
is different from 0 (this is clearly the case if we use, for instance,
a gaussian variation). Hence, if enough iterations are repeated, every
category will eventually be reached by as many solutions as we wish.

Let us consider u = inf{f
()
: Cat() = Cj}
= inf{hk(a)
: cat(a) = Cj}.
If the f is sufficiently regular (it is enough if
it is continuous, or if it is just semi-continuous on its local
minima), then, by the analogous argument to that in point 1., we will
eventually find an solution such that Cat() = Cj
and f () is as close to u
as we want.

These remarks, together with the fact that we can consider
as many categories as needed, justify our claim. We should note that,
as it is usually the case with evolutionary algorithms, we do not
provide a precise speed of convergence for our algorithm; however,
experimental results are very satisfactory (see section 3).

Heuristic improvement

In paragraph 2.11
we
have presented the most basic version of our algorithm, for the sake of
clarity. However in practice we use a variation of it, heuristically
devised, to improve its performance. In this section we present these
modifications. We use the same notation as in paragraph 2.11.

The main point of this modified version is the random
generation of new solutions (step 3). In this part we rely on the
standard techniques used in genetic algorithms, in which the random
variation comes from two biologically-inspired operators: mutation
and crossover. These operators are not applied to
the whole set H
of solutions, but rather on a
subset H0⊆H
which is generated by a certain selection mechanism
(we will explain below this selection mechanism). From H0
we generate H* in two steps:
We first generate a new set of solutions H'0
by applying a SBX-crossover to H0
(see Deb & Agrawal 1994),
and then we apply a self-adapting Gaussian mutation to H'0
(see Hinterding 1995).
The result is our H*.

In the selection of
H0⊆H
we follow a standard selection mechanism: Binary tournament selection.
It consists in randomly choosing two solutions
x,
y∈H, comparing them via a
certain function
g
: H→∪{∞}
and then keeping the best (x is kept if
g(x)≥g(y));
we repeat this procedure until we reach the desired size of H0.

Note Although we have
defined H0 as a set
(elements in H0 are unique),
we can also define H0 as a
tuple, allowing a same solution to appear more than once.

One feature that is specific to our approach is the
definition of the function g, that we call local
contribution.

Definition Let
H
= {,…,}
and . Given a solution
∈H {,},
let Δj denote the area of
the triangle in 2
formed by the 2-tuples (, f ()),
(,
f ())
and (,
f ()).
Then we define the local contribution of, as
g()
= (- 1)eΔj,
where e = 0 if the point (, f ()) is below the straight
line from (, f ()) and (,
f ()),
and e = 1 if it is not. If
is extreme, i.e. ∈{,},
then we define g()
= ∞.

We have just described how to generate a new set of
solutions H* from H.
The heuristics behind this procedure is to give some priority to the
generation of new solutions that are either locally good, which
accelerates the approximation to the local minima in each category, or
on the extremes, which widens the exploration of the space.

In order to describe the version of the algorithm that we
use in the experiments, the other point we should mention is the
stopping
criterion. A natural choice would be based on the stability of the
successive H computed: If we detect no important
changes from one H to the next one (during several
iterations), the algorithm stops. We can measure these changes in
different ways, for instance considering the function
F(H)
= f
().
However, in practice, we consider a maximal computation time as the
stopping criterion and the stability becomes an element of subsequent
analysis.

Generic implementation

In order to make the CP algorithm usable we provide along
with this paper a generic implementation in the free and open source
library called MGO (Multi Goal Optimisation)[2]
programmed in Scala[3].
MGO is a flexible library of evolutionary algorithms. It leverages the
Scala "trait" system to provide a high level of modularity.

This section shows how to compute a calibration profile
on
a toy model with MGO. In this example, the model is the Rastrigin
function and the objective is to minimise this function. The objective
function in MGO should be implemented through a specialised trait of
the "Problem" trait. Listing 1
shows how to implement the Rastrigin function into MGO.

To compute a calibration profile of this Rastrigin
function
it should be composed with the "Profile" trait that provides the CP
algorithm. Listing 2
shows how to achieve this composition. In this example two other traits
are used: A termination criterion based on a maximum number of steps
("CounterTermination") and a trait to specify that the profile is
computed on a part of the genome ("ProfileGenomePlotter").

Like most evolutionary algorithms, the CP algorithm
compute
the evaluation function many times. Those numerous evaluations could be
computationally costly. To cope with this, MGO takes advantages of the
multi-core parallelism avaiable in modern computers. This kind
of parallelism is suitable for many models but can be limiting in case
of large stochastic agent based models for instance. To overpass this
limit we have integrated the CP algorithm in the OpenMOLE[4] (Open Model
Exploration) framework (Reuillon et al. 2013). OpenMOLE enables the
distribution of the CP algorithm on large scale distributed computing
infrastructures including multi-core computers, computing clusters and
even world wide computational grids. In the following section we used
this distributed implementation of CP framework to study a real world
agent based model.

A model to simulate the emergence of structured urban
settlement system

SimpopLocal is a stylised model of an agrarian society in
the Neolithic period, during the primary “urban transition” manifested
by the appearance of the first cities (Schmitt
et al. 2014). It is designed to study the emergence
of a structured and hierarchical urban settlement system by simulating
the growth dynamics of a system of settlements whose development
remains hampered by strong environmental constraints that are
progressively overcome by successive innovations. This exploratory
model seeks to reproduce a particular structure of the Rank-Size
distribution of settlements well defined in the literature as a
generalised stylised-fact: For any given settlement system throughout
time and continents, the distribution of sizes is strongly
differentiated, exhibiting a very large number of small settlements and
a much smaller number of large settlements (Liu
1996; Durand-Dastès
et al. 1998; Berry
1964; Fletcher 1986).
This kind of distributions can be modeled by a power-law when small
settlements are not considered or a log-normal when small settlements
are considered (Robson 1973;
Favaro & Pumain 2011).
Such distributions are easily simulated by rather simple and non
spatial statistical stochastic models (Gibrat
1931; Simon 1955).
But for theoretical considerations and to specify the model in a
variety of geographical frames, we think it is necessary to make
explicit
the spatial interaction processes that generate the evolution of city
systems. According to the evolutionary theory of system of cities (Pumain 1997, 2009), the growth dynamics of
each settlement is controlled by its ability to generate interurban
interactions. The multi-agent system modeling framework enables us to
include mechanisms, derived from this theory, that govern the
interactions between settlements (Bretagnolle
et al. 2006; Pumain
& Sanders 2013). The application of this concept
resulted in several Simpop models (Bura
et al. 1996; Bretagnolle
& Pumain 2010; Sanders
et al. 1997) in which the expected macro-structure
of the log-normal distribution of sizes emerges from the differentiated
settlement growth dynamics induced by the heterogeneous ability of
interurban interactions. Therefore the aim of SimpopLocal is to
simulate the hierarchy process via the explicit modelling of a growth
distribution that is not entirely stochastic as in the Gibrat model (Gibrat 1931) but that emerges
from the spatial interactions between micro-level entities.

Agents, attributes, and mechanisms of the model

The SimpopLocal model is a multi-agent model developed with
the Scala software programming language. The landscape of the
simulation space is composed of a hundred of settlements. Each
settlement is considered as a fixed agent and is described by three
attributes: The location of its permanent habitat, the size of its
population, and the available resources in its local environment. The
amount of available resources is quantified in units of inhabitants and
can be understood as the carrying capacity of the local environment for
sustaining a population. This carrying capacity depends on the
resources exploitation skills that the local population has acquired
from inventing or acquiring innovation. Each new innovation acquired by
a settlement develops its exploitation skills. This resource
exploitation is done locally and sharing or trade is not explicitly
represented in the model. The growth dynamics of a settlement is
modelled according to the assumption that its population size is
dependent on the amount of available resources in the local environment
and is inspired from the Verhulst model (Verhulst
1845) or logistic growth. For this experiment, we assume a
continuous general growth trend for population - this may be different
in another application of the model. The annual growth factor rgrowth
is expressed on an annual basis, thus one iteration or step of the
model simulates one year of demographic growth. rgrowth
has been empirically estimated to 0.02. The limiting factor of growth RMi
is the amount of available resource that depends on the number M
of innovations the settlement i has acquired by the
end of the simulation step t. Pti
is the population of the settlement i at the
simulation step t:

The acquisition of a new innovation by a settlement allows
it to overtake its previous growth limitation by allowing a more
efficient extraction of resources and thus a gain in population size
sustainability. With the acquisition of innovations the amount of
available resources tends to the maximal carrying capacity Rmax
of the simulation environment:

The mechanism of this impact follows the Ricardo model of
diminishing returns, also a logistic model (Turchin
2003). The InnovationImpact parameter
represents the impact of the acquisition of an innovation and has a
decreasing effect on the amount of available resources RMi
with the acquisition of innovations:

Acquisition of innovations can occur in two ways, either by
the emergence of innovation within a settlement or by its diffusion
through the settlement system. In both cases, interaction between
people inside a settlement or between settlements is the driving force
of the dynamics of the settlement system. It is a probabilistic
mechanism, depending on the size of the settlement. Indeed, innovation
scales super-linearly: The greater the number of innovations acquired,
the bigger the settlement and the higher the probability of innovating (Lane et al. 2009; Diamond 1997). To model the
super-linearity of the emergence of innovation within a settlement, we
model its probability by a binomial law. If Pcreation
is the probability that the interaction between two individuals of a
same settlement is fruitful, i.e. leads to the creation of an
innovation, and N the number of possible
interactions, then by the binomial law, the probability Π
of the emergence of mcreation innovations can be
calculated. It is thus possible to compute the probability Π(mcreation
> 0) that at least one innovation appears in a given
settlement i. This probability then can be
used in a random
drawing:

The diffusion of an innovation between two settlements
depends both on the size of populations and the distance between them.
If Pdiffusion is the
probability that the interaction of two individuals of two different
settlements is fruitful, i.e. leads to the transmission of the
innovation, and K the number of possible
interactions, then by the Binomial law the probability Π
of the diffussion of mdiffusion innovations can
be calculated. It is thus possible to compute the probability Π(mdiffusion
> 0) that at least one innovation appears in a given
settlement i. This probability then can
be used in a random
drawing:

But in this case, the size K of the
total population interacting is a fraction of the population of the two
settlements i and j which is
decreasing by a factor DistanceDecay with the
distance Dij between the
settlements, as in the gravity model (Wilson
1971):

However another mechanism controls this innovation
diffusion process. Each innovation can only be diffused during a time
lap just after its acquisition by a settlement. This time lap of
diffusion is controlled by the parameter InnovationLife.
According to this deprecation mechanism, an innovation is considered
obsolete after InnovationLife simulated years and
cannot be diffused any more.

Objective function

As defined in paragraph 2.2, CP
requires the definition of an objective function (called f
). This function computes a scalar value that is a measure of the
quality of the model dynamics for a given set of parameter values. In
case of SimpopLocal, f has been designed
by aggregating three objective functions. Each of these objective
functions evaluate a different aspect of the stylised
facts we seek to simulate with the model.

The SimpopLocal model being stochastic, the simulation
outputs vary from one simulation to the next for a given parameter
setting. Therefore, the evaluation of a single parameter setting on the
three objectives must take into account this variability. Each
evaluation is therefore computed on the output of 100 simulations. We
have shown in Schmitt et al. (2014)
that it was a suitable compromise between capturing the variability and
not increasing to much the computation duration.

Contrary to some calibration methods such as ABC (Lenormand et al. 2012;
Beaumont 2010) which
make direct use of stochasticity, genetic algorithms can be strongly
biased by noisy evaluation functions. To compensate the bias of
over-evaluated set of parameters (that could occur despite the 100
replications), we periodically re-evaluate them. The newly computed
evaluation simply replace the old one. In the following experiments 1
percent of the computing time has been dedicated these re-evaluations
in order to avoid keeping mistakenly-well-evaluated set of parameters
in the population for too long. This re-evaluation scheme has the
advantage of simplicity but may be improved in the future.

Here is how we test each set of parameters according to the
three objective functions (a more detailed explaination of the
calibration objectives can be found in Schmitt
et al. 2014):

The objective of distribution quantifies the ability of
the model to produce settlement size distributions that fit a
log-normal distribution. First we evaluate the outcome of each subset
of 100 simulations corresponding to one parameter setting by computing,
according to a 2-sample Kolmogorov-Smirnov test, the deviation between
the simulated distribution and a theoretical log-normal distribution
having the same mean and standard deviation. Two criteria are reported,
with value 1 if the test is rejected and 0 otherwise: The likelihood of
the distribution (the test returns 0 if p - value
> 5%) and the distance between the two distributions (the test
returns 0 if D - value <
Dα). In order to summarise those tests in a
single quantified evaluation, we add the results of the two tests on
the 100 simulations. The best possible score on this objective is thus
0 (all tests returning 0) and the worst 200 (all tests returning 1). By
dividing this score by the worst possible value (200), we get a
normalised error objective noted o1.

The objective of population (noted o2)
quantifies the ability of the model to generate large settlements that
have an expected size. The outcome of one simulation is tested by
computing the deviation between the size of the largest settlement and
the expected value of 10,000 inhabitants: (populationOfLargestSettlement
- 10000) / 10000. The evaluation of the parameter setting reports the
median of this test on the 100 simulations.

The objective of simulation duration (noted o3)
quantifies the ability of the model to generate an expected
configuration in a suitable length of time (in simulation steps). The
duration of one simulation is tested by computing the deviation between
the number of steps of the simulation and the expected value of 4000
simulation steps: (simulationSteps - 4000) / 4000.
The evaluation of the parameter setting reports the median of this test
on the 100 simulations.

Given these three objectives, f is
defined as an aggregated objective value that summarises the
calibration error of a set of parameter values. This summarised
objective is defined as: f = max(o1,
o2, o3). The lowest f
is, the better a set of parameter values. This function compels the
algorithm to select sets of parameter values that score well on the
three objectives all together. It prevents from selecting parameter
values that would provides efficiency on solely a subset of the
objectives as it could be the case with compensative aggregation such
as average or a sum function. Indeed, a good calibration requires low
scores on all three objectives, none should be disregarded.

To analyse the produced results, we have established that
only input parameters leading to values of f of
less than 0.1 of error are considered valid. Indeed, the empirical data
and theoretical knowledge that led to the definition of the objective
function are not precise enough to justify a more thorough analysis of
the model. This threshold is largely exceeded for some parameter
values, however rendering this threshold of acceptability explicit
enables the definition of credible bounds for each of the free
parameter of the model. These bounds define a validity domain
of each of the parameters. Note that the profile algorithm iteratively
refines the computed profiles from high values toward lower ones
through an iterative process, therefore the proposed bounds are more
restrictive than the exact ones.

This section exposes results of the CP algorithm on each of
the six free parameters of SimpopLocal. Each profile presented in this
paper has required the achievement of 100,000 CPU hours time (meaning
that it would have required 100,000 hours of computation on a single
core of a modern processor). To achieve such a huge computational load
we have distributed it across the numerous CPU of the European
computation grid EGI (European Grid Infrastructure) using the
distributed implementation of CP that we have integrated into OpenMOLE.
The effective user time was around 15 days to compute all the profiles
achieving a gain of 2200. The OpenMOLE script files used to compute
those profiles have been made available[5].

Results

The value of the parameter InnovationLife
depicts the duration during which an innovation can diffuse in the
system of settlements. Figure 3
shows the application of the CP algorithm this parameter. When InnovationLife
is set to 0 the model can not be calibrated. For such a value the
innovations are instantly deprecated just after their creation, meaning
that no diffusion of innovation is possible. This particular point
shows that the diffusion process of the model is mandatory to produce
acceptable dynamics.

Under the value 150 the calibration error is higher than
our validity threshold and then we consider that the model cannot be
calibrated. From a thematical point of view: The deprecation of the
innovations is too fast, thus the innovations cannot percolate in the
system leading to unrealistic dynamics.

Over the value 150 the obsolescence of the innovations is
slow enough to have no noticeable effect on the calibration of the
model. In particular when InnovationLife is set to
4000, the calibration error is lower than our acceptance threshold. For
this particular value, the matching mechanism of innovation
obsolescence could be considered as disabled: The maximum duration of a
simulation is 4000 steps, thus when InnovationLife
is set to 4000, the innovations are never considered deprecated. This
profile shows that this mechanism does not constrain much the model and
is useless to reproduce the expected dynamics. Seeking for a more
parsimonious model we have removed this mechanism. The following of the
article is based on the SimpopLocal model without the deprecation
mechanism.

Rmax

The parameter Rmax
constraints the maximum size of settlements by limiting the impact of
the acquisition of an innovation on the size of a settlement. The
calibration profile for this parameter is shown on Figure 4.

The calibration objective for the size of the biggest
settlement has been set to 10,000 inhabitants. When Rmax
is set to a value lower than 10,000 it is impossible for the model to
reach the 10,000 population objective. Thus, by construction, low
values of this parameter can not achieve acceptable calibration errors.

The profile exposes a narrowed range of very low
calibration error between [9500;11500]. Surprisingly the calibration
algorithm is not able to find low calibration errors for Rmax
above 10,500. It indicates that this mechanism is necessary to produce
acceptable dynamics and reaching settlements of a size matching
empirical evidences.

InnovationImpact

The InovationImpact parameter controls
the impact of the acquisition of an innovation on the growth of a
settlement. Figure 5
show
the calibration profile for this parameter. The section of the profile
containing low calibration errors is very narrowed, only 2 solutions
with calibration errors under 0.1 have been found. This profile has
been computed using the definition domain provided by the modellers
[0;2]. From Figure 5
it can
be deduced that the definition domain is too wide compared to the
validity domain for this parameter. Therefore another profile has been
computed for a much more restraint definition domain: [0;2.10-2].

The zoomed profile is shown on Figure 6.
It shows that when the impact on the settlement growth of the
innovation is too low the dynamic is too slow to reach the calibration
objectives. On the contrary, when InnovationImpact
is too high the growth of the settlements is too fast to match credible
dynamics (whatever the value of the other parameters can be). This
profile shows that the validity domain of this parameter is
[0.006;0.01].

Pcreation

The Pcreation
parameter determines the capacity of a settlement to create a new
innovations. Figure 7
shows
the calibration profile for this parameter. Once again the definition
domain given by the modellers [0;0.01] is too wide to establish
directly a validity domain. However, this profile shows that for high
values of Pcreation the
model can not reach low calibration errors. The inflexion of the
profile near the value 0 indicates that low calibration errors are
located in a much more narrowed domain close to 0.

Figure 8 shows
the calibration profile of the Pcreation
parameter within the domain [0;1e-5].
From this profile it can be deduced that the validity domain of Pcreation
is [0.4×10-6;2×10-6]. For
low values of Pcreation the
model can not be calibrated. The innovations are generated too slowly
to engender a sufficient growth (whatever the value of the other
parameters can be). For high values of Pcreation
the system races and growth is too fast.

Figure 8. Detailed calibration
profile of the parameter Pcreation

Pdiffusion

The parameter Pdiffusion
governs the exchange of innovation between settlements. Figure
9 shows the calibration
profile for
this parameter which is very flat and contains only high calibration
errors. This profile exposes a shortcoming of the algorithm. If the
definition domain is very wide compared to the validity domain, the
method might take a lot of computation time to find the parameters
settings matching low calibration errors and give the false impression
that they don't exist. In the case of SimpopLocal we know that low
calibration errors exists because they were found when computing the
other profiles. By deduction from the model mechanisms, low calibration
errors should be found in a domain closer to 0.

A very interesting aspect of this profile is that low
values of Pdiffusion prevent
the model from producing acceptable dynamics. Noticeably when Pdiffusion
= 0 it disables entirely the diffusion mechanism of the model. At this
particular point the calibration error is unsatisfying, thus this
profile shows that the diffusion mechanism is mandatory in order to
produce realistic behaviours.

DistanceDecay

The parameter DistanceDecay acts on the
mechanism of diffusion of the innovations. It introduces a decaying
effect of the distance between settlements on the probability to
diffuse an innovation from one settlement to another. Figure 11 shows the calibration profile for
this parameter. High values of this parameter prevent the diffusion.
The calibration errors computed with high values of DistanceDecay
are almost equal to the calibration errors computed with Pdiffusion
set to 0 (that also inhibits the diffusion process). This matching
calibration error values provide an additonnal hint that the model and
its implementation are coherent.

The validity domain of DistanceDecay is
[0.2, 1.1]. The mechanism piloted by DistanceDecay
introduces a distance effect in the interaction process creating
spatial heterogeneity. For low values of DistanceDecay
the profile shows only unsatisfying calibration errors. It means that
spatial heterogeneity is mandatory to produce acceptable dynamics.

Conclusion

This paper proposes a new method to explore models. This
method, called Calibration Profile, computes 2-dimensional graphs that
depict the effect of each parameter on the model dynamics using a
calibration objective. It thus provides a global view of the local
effect of each parameter on the expected behaviour. This constitutes a
new form of sensitivity analysis of parameters on a calibration
objective that differs from classical sensitivity analysis methods (Saltelli et al. 2000).
Classical sensitivity analysis methods are based on statistical
analysis that are either local or global. Local sensitivity analysis
provides a measurement of the effect of the model parameters on the
outputs in a narrowed portion of the parameter space. Global
sensitivity analysis provides few aggregated global indicators of the
effect of each parameters in the whole parameter space. The CP method
produces a global picture of the local effect of a parameter on the
calibration objective all along its definition domain. Furthermore the
CP method seeks for extreme behaviours (the best ones) of the model and
that's why it is based on an optimisation algorithm. This kind of
information is not captured by classical sampling-based sensitivity
analysis. This CP method has been shown very useful to better
understand the dynamical behaviours of a geographical multi-agent
simulation model. It has lead to important novel understandings:

The mechanism linked to the parameter InnovationLife of
the model has been shown useless. This mechanism has thus been removed
from the model, making it more parsimonious.

Except from this mechanism all the other mechanisms
(i.e spatial heterogeneity, diffusion of the innovations, creation of
the innovations...) have been shown mandatory to produce acceptable
dynamics. From a geographical point of view, these results are
extremely interesting. They show how the consideration of space and of
spatial interaction plays a central role in the simulation of credible
settlement system growth dynamics.

For all the parameters (InnovationLife excepted) a
connected validity domain has been proposed, these boundaries are
summarised in Table 2. The
seemingly very narrow validity domain of several of those parameters
(in particular InnovationImpact, pCreation, pDiffusion) is not
preoccupying to our senses. On the one hand, this difference between
the exploration domain and the validity domain can be directly assigned
to the lack of empirical knowledge on the processes of innovation.
Indeed the probability of innovation or even the impact of innovation
adoption are not easily measurable variables in the real world,
especially for the ancient periods that the model focuses on.
Unsurprisingly the exploration domains that we defined a priori for
these parameters has been found much larger than the actual validity
domain. In the framework of the model hypothesis, it seems correct that
these values are very small. Hence that seemingly very narrow validity
domain of several parameters is not due to a lack of robustness of the
model but rather to lack of knowledge and empirical measures that
prevent the definition of a relevant observational scale. On the other
hand, the value of those parameters is also directly linked to the
definition of the concept of innovation implemented in the geographical
model and to how we can quantify it (such as the average number of
innovation per time period for example). Therefore the impact of each
innovation is strongly related to the number of all
considered/simulated innovations.

This model exploration using the CP method has shown very
instructive on the model behaviour. More methodological development is
being achieved to include this method in an extensive modelling
framework encompassing several complementary tools for model
evaluations. We think that CP can play a significant role in modelling
methodologies such as KISS (Keep It Simple Stupid) for which the
simplicity of the models prevails. Using CP during the modelling
process may constitute a very efficient tool to achieve progressive
complexification of a very simple model towards a realistic one.

Table 2: Validity
boundaries for SimpopLocal parameters

Parameter

Definition

Validity domain

Rmax

[1;40, 000]

[9, 500;11, 500]

InnovationImpact

[0;2]

[6×10-3, 1×10-2]

Pcreation

[0;0.1]

[4×10-7;2×10-6]

Pdiffusion

[0;0.1]

[3×10-7;8×10-6]

DistanceDecay

[0;4]

[0.2;1.1]

Acknowledgements

This work is funded by the ERC Advanced Grant Geodivercity and the
Institut des Systèmes Complexes Paris Île-de-France. The results
obtained in this paper were computed on the biomed and the
vo.complex-system.eu virtual organization of the European Grid
Infrastructure (http://www.egi.eu). We thank the European Grid
Infrastructure and its supporting National Grid Initiatives
(France-Grilles in particular) for providing the technical support and
infrastructure. We also thank the Réseau National des Systèmes
Complexes (RNSC) and anonymous reviewers for their help in improving
the first version of the paper.

Notes

1Let y
be the data and θ a vector of parameters that can
be decomposed as θ = (δ, ξ).
The likelihood function can be written as (θ| y)
= (δ,
ξ| y). If δ
is the parameter of interest, then the profile likelihood is the
function defined as Lp(δ)
= (ξ,
δ| y). See Pawitan (2001, p. 64) for a detailed
description.