SafetySynthesis facets: Safety thinking and safety methods

The methods that are used to manage safety have always closely matched the thinking about safety. This is actually inevitable, since the methods represent the tacit assumptions about how things happen in the world around us, particularly about how things go wrong.

Safety thinking cannot be static, but must undergo continuous development and revision so that it corresponds to the ‘real’ world. One development in safety thinking has to do with how things happen in the world, especially the cases where things go wrong. In order for safety management to be effective, our theories and models about the ‘inner workings’ of technical systems, socio-technical systems, and organisations must have a reasonable resemblance to what actually happens. Although the correspondence can never be perfect, it must be good enough to enable us to control what we do. This control is needed both to prevent unwanted outcomes from occurring and to ensure that wanted outcomes occur.

Looking at the history of safety management – if not from the beginning of historical records then at least from the beginning of the industrialised era – the dominating trend has been that systems have become gradually more difficult to understand and control. This goes for technical systems, socio-technical systems and organisations alike. These developments have on the whole been matched by a corresponding development in methods, although usually with a considerable delay. That history is well-known and has been described in many publications. Briefly told, the thinking has gone from single factor models (such as ‘error proneness’), to simple linear models (such as the Domino model), to composite linear models (such as the Swiss cheese model), and to complicated multi-linear models (such as STAMP).

Common to all these methods is the unspoken assumption that outcomes can be understood as effects that follow from prior causes. This assumption (the causality credo) can be expressed as follows:

It is possible to find the causes of adverse outcomes provided enough evidence is collected. Once the causes have been found, they can be eliminated, encapsulated, or otherwise neutralised.

Since all adverse outcomes have causes, and since all causes can be found, it follows that all accidents can be prevented. (This is the vision of zero accidents or zero harm that many companies find attractive.)

The development in methods has been mirrored by the invention of new types of causes whenever the ‘usual suspects’ turned out to be insufficient – generally in the aftermath of a major accident or disaster. The genealogy of causes goes from ‘acts of god’, to technical malfunctions and failures, to ‘human errors’, to organisational failures, to safety culture, and to complex systems – which for the time being represent the pinnacle of safety thinking. But in each case the new causes have been introduced without challenging the underlying assumption of causality. We have therefore become so used to explaining accidents in terms of (linear) cause-effect relations that we no longer notice it.

The development of ever more detailed and intricate linear models to explain accidents can in principle continue forever – or at least until we reach the point when we must abandon the idea that explanations can be linear. The predilection for linear explanations is easy to understand: it enables us to decompose problems into smaller parts (steps, actions, components) that can be dealt with individually. And since each part only interacts with its immediate neighbours, there is no compelling need to consider the system as a whole. Or so we assume.

There is, however, a second development in safety thinking which is conceptually orthogonal to the first. The established approach to safety implies what one may call a hypothesis of different causes, namely that the causes or ‘mechanisms’ of adverse outcomes are different from those of successful outcomes. (If that were not the case, the elimination of such causes and the neutralisation of such ‘mechanisms’ would also reduce the likelihood that things could go right, and hence hinder the system from achieving its purpose.) But while this hypothesis is convenient, its theoretical and empirical basis is highly questionable. Fortunately, there is a simple – if not simpler – alternative, namely that things go right and go wrong in the same way. This idea is far from new and has been expressed in a number of ways. Abraham Lincoln, for instance, noted that “if the end brings me out all right what is said against me won’t amount to anything. If the end brings me out wrong, ten angels swearing I was right would make no difference.” Ernst Mach (1905) was more direct when he wrote that “knowledge and error flow from the same mental sources, only success can tell one from the other”.

The hypothesis that successes and failures are ‘two sides of the same coin’ has been one of the cornerstones of resilience engineering from the very beginning 10-15 years ago. But it was described at least as early as 1983, in the following way:

“Since errors are not intentional, and since we do not need a particular theory of errors, it is meaningless to talk about mechanisms that produce errors. Instead, we must be concerned with the mechanisms that are behind normal action. If we are going to use the term psychological mechanisms at all, we should refer to ‘faults’ in the functioning of psychological mechanisms rather than ‘error producing mechanisms’. We must not forget that in a theory of action, the very same mechanisms must also account for the correct performance which is the rule rather than the exception.” (Hollnagel, 1983).

While the purpose of the first (methodological) development is to make sure that models and methods are powerful enough to match the ever more complicated work environments (and accidents), the purpose of the second (conceptual or even ontological) development is to make sure that the assumptions underlying safety thinking are realistic and – if possible – correct. This second development has two important consequences. One is a rejection of the hypothesis of different causes. Because of that, safety studies and safety management should focus on what happens when things go well rather than when things go badly, corresponding to the change in thinking from Safety-I to Safety-II. The other is that models and methods are needed to understand how socio-technical systems and organisations work, and not just how they fail. The traditional approaches mentioned above clearly cannot do the job, both because they focus on failures and because they are linear.

The two developments are shown graphically in the following figure. The methodological development can be characterised by the ‘accident models’ that have been used to manage safety. The start was in single factor models, where characteristically the single factor was the human and the explanation was that some people were accident prone (Greenwood & Woods, 1919; Schulzinger, 1956). This was followed by the multi-factor linear model (Heinrich, 1931), the multi-factor composite (or multi-linear) model (Reason, 1990), and finally the complicated hierarchical model (Leveson, 2004; Svedung & Rasmussen, 2002). The conceptual development is basically a change from a Safety-I to a Safety-II perspective, hence a change from a focus on failures to a focus on everyday activities.

The methodological developments shadow the developments in technologies. But whereas technologies develop continuously, and seemingly with a constant acceleration, methodologies develop in steps or jumps. The precipitating event is usually one or more accidents that cannot be analysed and/or explained satisfactorily with the existing methods; this prompts a change in methods as well as a change in causes, as illustrated by the vertical dimension in the figure. The conceptual developments have so far only resulted in one major change (although a more fine-grained account could be given as well). The change was brought about by resilience engineering and the insistence that “... an unsafe state may arise because system adjustments are insufficient or inappropriate rather than because something fails. In this view failure is the flip side of success, and therefore a normal phenomenon” (Hollnagel, Woods & Leveson, 2006). Taking this view requires methods that can describe how everyday activity takes place. Since socio-technical systems, with the possible exception of the most trivial kind, must be able to adjust their performance to the conditions, linear (cause-consequence) models are ruled out. Non-linear models and methods are required to account for a non-linear reality. But since ‘failures are the flip side of successes’, the same model can be used to elucidate a Safety-I and a Safety-II perspective. The need for specialised and detailed accident models is therefore passé.

References

Greenwood, M. and Woods, H. M. (1919). A report on the incidence of industrial accidents with special reference to multiple accidents (Industrial Fatigue Research Board Report No. 4). London: HMSO.