Tag: Causality

As opposed to conventional modelling and simulation techniques, agent-based modelling is often claimed to be particularly suited for the purpose of explanation. But what does it mean to explain a phenomenon, that is, to find an explanans for a given explanandum?

First of all, the answer to this question clearly depends on the type of simulation model, its purpose and its domain; explaining a phenomenon in a generator model (see my previous posts) has a different flavour than in a predictor model. The nature of explanation also depends on the background of the person attempting it. Statisticians often define explanation as the mere identification of sufficient conditions: explanation reduces to the analysis of correlated variables. As Marks put it, “for prediction, sufficiency suffices. There is no need to know which if any alternate conditions will also lead to the observed endogenous behaviours” [1]. In this context, conditions are defined in terms of correlations between variables. For example, A is positively correlated with B if the joint probability of A and B is significantly higher than the product of their individual probabilities, i.e. higher than it would be if A and B were independent. This clearly makes sense. If the sole purpose of an agent-based simulation is to extrapolate historical trends, and if enough individual-level data is available to train the individual agents, then there is no point in explaining why the agents acted as they did. All the modeller needs to do is to make sure that the behaviour of the model reproduces the historical data with a sufficient degree of accuracy (i.e. shows a good fit). In this case, validation becomes mere line-fitting; correlation becomes the explanans.
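
To make the correlation criterion concrete, here is a minimal sketch in Python. It estimates the joint probability of two binary behaviours and compares it with the product of their individual probabilities. The function name `joint_vs_product` and the ten observations are made up for illustration, and the sketch deliberately leaves out the significance test that a real analysis would need.

```python
def joint_vs_product(observations):
    """Estimate P(A and B) and P(A) * P(B) from a list of (a, b) boolean pairs."""
    n = len(observations)
    p_a = sum(1 for a, _ in observations if a) / n
    p_b = sum(1 for _, b in observations if b) / n
    p_ab = sum(1 for a, b in observations if a and b) / n
    return p_ab, p_a * p_b


# Hypothetical data: ten observations of two binary agent behaviours A and B.
data = (
    [(True, True)] * 4
    + [(True, False)]
    + [(False, True)]
    + [(False, False)] * 4
)

p_joint, p_if_independent = joint_vs_product(data)
# A joint probability above the independence baseline indicates positive correlation.
print(p_joint, p_if_independent)
```

In this made-up sample the joint probability exceeds the independence baseline, which is all a purely predictive model needs to know: it tells us that A and B go together, not why.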

It is also clear that the situation is rather different in the case of generator models, which can often be found in the social sciences. In general, social scientists impose stricter requirements on the definition of explanation. Instead of mere correlation between variables, they tend to give explanation a causal flavour and define it as the identification of generative mechanisms, i.e. the ‘cogs and wheels’ that are able to bring about the phenomenon under consideration [2]. Consider again Schelling’s segregation model. Instead of fitting historical data (which would be hard or even impossible anyway, given the abstract nature of the model), the model’s purpose is to show how a reasonable behavioural hypothesis about individuals can lead to a puzzling and to some extent counterintuitive outcome at the macro level. In this case, explanation means answering the question of why the overall behaviour emerged which, in turn, requires unveiling the mechanisms, the cogs and wheels that lead to the global outcome¹. In this case, the causal relationship, that is, the mechanism, becomes the explanans.
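
For readers who have not seen the model, the following is a rough sketch of a Schelling-style simulation in Python. It is not Schelling’s original formulation: the wrapping grid, the 8-cell neighbourhood, relocation to a random empty cell, and all parameter values (`size`, `vacancy`, `threshold`, `sweeps`) are assumptions chosen for brevity. The point is only that the micro-level rule (move if fewer than 30% of your neighbours are like you) is explicit in the code, so the mechanism behind the macro-level outcome can be inspected.

```python
import random


def similarity(grid, r, c):
    """Share of occupied neighbouring cells holding the same type as (r, c)."""
    size = len(grid)
    same = occupied = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue
            other = grid[(r + dr) % size][(c + dc) % size]  # grid wraps around
            if other is not None:
                occupied += 1
                same += other == grid[r][c]
    return same / occupied if occupied else 1.0


def mean_similarity(grid):
    """Macro-level measure: average similarity over all agents."""
    size = len(grid)
    values = [similarity(grid, r, c)
              for r in range(size) for c in range(size)
              if grid[r][c] is not None]
    return sum(values) / len(values)


def run_schelling(size=20, vacancy=0.1, threshold=0.3, sweeps=50, seed=0):
    """Return mean similarity before and after the agents have relocated."""
    rng = random.Random(seed)
    n_empty = int(vacancy * size * size)
    n_agents = size * size - n_empty
    cells = ([None] * n_empty + ["A"] * (n_agents // 2)
             + ["B"] * (n_agents - n_agents // 2))
    rng.shuffle(cells)
    grid = [cells[i * size:(i + 1) * size] for i in range(size)]
    before = mean_similarity(grid)
    for _ in range(sweeps):
        positions = [(r, c) for r in range(size) for c in range(size)]
        rng.shuffle(positions)
        moved = False
        for r, c in positions:
            if grid[r][c] is None or similarity(grid, r, c) >= threshold:
                continue  # empty cell or satisfied agent: nothing to do
            empties = [(er, ec) for er in range(size) for ec in range(size)
                       if grid[er][ec] is None]
            er, ec = rng.choice(empties)  # unhappy agent moves to a random empty cell
            grid[er][ec], grid[r][c] = grid[r][c], None
            moved = True
        if not moved:
            break  # every agent is satisfied
    return before, mean_similarity(grid)


before, after = run_schelling()
print(before, after)
```

Comparing the two printed values shows how far the average share of like neighbours rises above what any individual agent demands, which is exactly the counterintuitive macro-level pattern the model is meant to explain.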

The explanatory power of agent-based simulations has been debated extensively in the literature [3, 4]. An interesting overview of the epistemological issues surrounding agent-based simulation is given by Epstein in his paper about the foundations of agent-based generative social science [5], as well as by David [6]. According to Epstein, the central contribution of agent-based modelling is its ability to facilitate generative explanation. By growing a macro-level phenomenon from assumptions about individual behaviour, generative sufficiency can be attained. However, as Epstein argues, “generative sufficiency is a necessary, but not sufficient condition for explanation”. This is because, even if an agent-based simulation is capable of generating a certain macro-level phenomenon, there may be other (even better) models that are equally capable of generating it. In fact, as mentioned above, finding the model which explains the phenomenon as well as possible is the central purpose of validation; it is a continuous quest for a better model².

The epistemological issues related to (agent-based) modelling are highly philosophical in nature, but it becomes clear that they cannot be fully neglected even in the context of an entirely practical simulation experiment. If the purpose of the simulation is to explain a phenomenon, and the task of validation (internal or external) is to assess whether the simulation satisfies its purpose with a sufficient level of accuracy, then a definition of explanation is necessary in order to define a proper validation technique.

We can thus conclude that assessing the validity of an agent-based simulation requires the analysis of its explanatory qualities which, in turn, requires the exposure of internal mechanisms. An important concept in this context is that of an artefact [7], i.e. a mechanism which has a causal effect on the observed behaviour where, in fact, it should not. A typical example of an artefact is a particular random number generator significantly influencing the qualitative outcome of the model. The detection of artefacts is a central issue in validation. In order to identify a mechanism as an artefact, one needs to perform two tasks: (i) confirming the existence of the mechanism, and (ii) establishing that it indeed has a causal effect—i.e. explaining it. Causal analysis requires the comparison of different models. Changing attributes and parameters and assessing their effect on the model’s outcome is also the purpose of exploration. As a consequence, explanation often requires exploration.
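
The random number generator example can be sketched in a few lines of Python. Everything here is hypothetical and chosen for illustration: the toy model is a one-dimensional random walk, `PoorLCG` is a deliberately weak generator with a period of only 64, and `mean_outcome` simply averages the model’s outcome over several seeds. The same model is run with two generators and the outcomes are compared; if the choice of generator changes the result, the generator is acting as an artefact.

```python
import random


class PoorLCG:
    """Deliberately weak generator with a period of 64 (illustration only)."""

    def __init__(self, seed):
        self.state = seed % 64

    def random(self):
        self.state = (13 * self.state + 7) % 64
        return self.state / 64


def walk_displacement(rng, steps=640):
    """Toy model: absolute final position of a +1/-1 random walk."""
    position = 0
    for _ in range(steps):
        position += 1 if rng.random() < 0.5 else -1
    return abs(position)


def mean_outcome(make_rng, seeds):
    """Average the model outcome over several seeds for one generator."""
    results = [walk_displacement(make_rng(seed)) for seed in seeds]
    return sum(results) / len(results)


seeds = range(20)
reference = mean_outcome(random.Random, seeds)  # Python's default Mersenne Twister
suspect = mean_outcome(PoorLCG, seeds)          # short-period generator
print(reference, suspect)
```

With the short-period generator the walk returns to its starting point for every seed, because 640 steps cover exactly ten full cycles of the generator, each containing as many values below 0.5 as above. The outcome is thus caused by the generator rather than by the model, and the comparison across generators (a small piece of exploration) is what exposes it.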

Let me summarise the ideas discussed so far. In order to assess the correctness of a model, explanation is essential: it is an integral part of the process. However, different types of models yield different definitions of explanation. The type of model determines how deeply we need to delve into the model in order to analyse its dynamics. For a predictive model, the detection of correlated variables and behaviours may suffice. For a generative, explanatory model, the underlying mechanisms may need to be unveiled. In order to accomplish that, causal relationships may need to be detected which, in turn, requires exploration of the model space.

¹ Consequently, generator models are also often referred to as mechanistic models.

² According to Ripley, validation is not (as often stated) the quest for the best model since, especially in the case of generative or mechanistic models, multiple different models may explain a phenomenon equally well [8].