Abstract

Abstract

High risk volcanic events are commonly preceded by long periods of unrest during which scientists are asked to provide near real-time forecasts. The rarity of such events, inaccessibility of the underground volcanic system, non-linear behaviors, and limited datasets constitute major sources of uncertainty. In order to provide reasoned guidance in the face of uncertainties, monitoring observations and conceptual/theoretical models must be incorporated into a formal and structured probabilistic scheme using evidence science principles. As uncertainty and subjectivity are inescapable components of volcanic hazard forecasts, they must be dealt with and clearly communicated to decision-makers and society. Here, we present the set-up of an automated near-real-time tool for short-term eruption forecasting for Campi Flegrei caldera (CFc), Italy. The tool, based on a Bayesian Event Tree scheme, takes account of all the available information, and subjectivity of choices is dealt through a 5-year-long elicitation experiment with a team of about 30 of the major experts of the geological history, dynamics and monitoring of CFc. The tool provides prompt probabilistic assessment in near real-time, making it particularly suitable for tracking a rapidly evolving crisis, and it is easily reviewable once new observations and/or models become available. The quantitative rules behind the tool, which represent the group view of the elicited community of experts, are defined during a period of quiescence, thus allowing prior scrutiny of any scientific input into the model, and minimizing the external stress on scientists during an actual emergency phase. Notably, the results also show that CFc may pose a higher threat to the city of Naples than the better-known Mount Vesuvius.

Keywords

Background

The Campi Flegrei caldera (CFc) directly threatens a population of several hundred thousands who lives inside the caldera, and the city of Naples itself (∼1 million inhabitants), just outside the caldera. The latest eruption occurred in 1538, ∼ 4,000 years after the previous one that closed a period of intense eruptive activity (Orsi et al.1996). The 1538 eruption was preceded by uplift of the caldera floor, seismic swarms, and visible variations in fumarolic output that lasted at least several decades (Guidoboni and Ciucciarelli 2011). The eruption was explosive and resulted in the construction of the new hill of Monte Nuovo in the western caldera sector (di Vito et al. 1987). After about 4 centuries of caldera subsidence, the present unrest started in the 1950’s in the form of uprise of the caldera floor, seismic swarms, and changes in the flow, areal extent, and composition of fumaroles (Del Gaudio et al. 2010; Orsi et al. 1999). Periods of unrest concentrated at discrete periods of time, with major crises occurred in 1969-71 and 1982-84, the latter culminating in the evacuation of about 40,000 people from the city of Pozzuoli. Several other minor uplift periods have followed and continue, requiring the development of plans for scientific and civil protection operations. Such plans depend on the capability to interpret in real-time the observed dynamics and anticipate at least several days in advance the occurrence of a new eruption.

The extreme complexity of volcanic processes, nonlinearities, limited knowledge, and large number of degrees of freedom make deterministic predictions of volcanic system evolution extremely difficult, if not impossible (Mader et al. 2006; Newhall and Dzurisin 1988). The additional complexity of decision-making and civil protection operations, especially in highly inhabited areas like CFc, requires evaluations to be made on time windows up to weeks, further amplifying the influence of uncertainties. As a consequence, a probabilistic approach is needed in order to manage the uncertainties and build a quantitative reference frame for managing scientific evidence within a rational decision-support process (Marzocchi and Woo 2009). However, past pre-eruptive data at CFc are not available, with the exception of the descriptive, macroscopic observations reported in the chronicles related to the 1538 eruption (Guidoboni and Ciucciarelli 2011). This is unfortunately a common situation at volcanoes globally. On the other hand, volcanologists have developed sophisticated conceptual and theoretical models and deployed advanced monitoring systems that provide relevant information about the status of the volcano. The problem is therefore to integrate such heterogeneous information into a formal probabilistic scheme for eruption forecast.

With such a purpose, we have set up a real-time tool for short-term eruption forecasting at CFc (BETEF_CF; see Figure 1A). The statistical model adopted is BET_EF (Bayesian Event Tree for Eruption Forecasting, Marzocchi et al. 2008). BET_EF is based on an Event Tree logic (Newhall and Hoblitt 2002), in which branches are logical steps from a general starting event (the onset of unrest, node 1), through specific subsequent events (the presence of magma driving the unrest, node 2), to the final outcome (the onset of an eruption, node 3), as reported in Figure 1A. BET_EF assesses probabilities at all nodes through Bayesian inference, including any possible source of information (theoretical beliefs, models, past data, and volcano monitoring), accounting for both aleatory and epistemic uncertainty. Then, the probability of eruption is calculated by multiplying the probabilities at each node. Using a simplified formalism the probability of eruption is given by

P(eruption)=P(unrest)×P(magma — unrest)×P(eruption — magma, unrest)

(1)

Figure 1

Schematic representation of the BET_EF model’s settings (Marzocchi et al.2008). In panel A, the three nodes of the Event Tree are represented. At each node, a Bayesian inference scheme is performed assuming a Beta distribution for the probability, both for the analysis of anomalies (panel B) and for the background analysis (panel C). BET_EF automatically switches between these two regimes, based on the observed state of unrest (Punrestin panel B). During unrest episodes, the model is based on the analysis of monitoring anomalies (panel B), and it is set through the parameters T 1, T 2and wi, thresholds and weight of each monitoring measure, respectively. On the left, an example of fuzzy threshold is reported, where in x-axis the possible values for the parameter are reported, while in the y-axis is represented the degree of truth of the statement ’the parameter is anomalous’, given a measurement equal to x. On the right, the basic principles of the transformation from anomalies to probabilities are reported; Bayesian inference is performed on the parameters a and b. The background assessment is based on Bayesian inference on probabilities (panel C), where theoretical models set prior distributions (through the average Λ and the equivalent number of data Θ), updated with the available past data (through the number of successes y and of trials n). More details can be found in the text and in Marzocchi et al. (2008).

The method is described in details in (Marzocchi et al. 2008), and a free generic software tool is available online (Marzocchi et al. 2009), whose input could be defined by users in order to be applied to different volcanoes. A key feature of BET_EF is that it automatically updates the forecast procedure depending on the occurrence of relevant anomalies in the volcanic activity. Whenever anomalies occur, BET_EF bases its forecast on the interpretation of the evolving monitoring measures (see Figure 1B). When only background activity is registered, the eruption forecast addresses only the expected long-term activity (see Figure 1C). The definition of what is background vs anomaly and the interpretation of anomalies represent the core of the analysis, i.e., the selection of the parameters of interest, and the quantitative definition of anomalies. For CFc, the lack of previous pre-eruptive observations makes this analysis a necessary but rather subjective step that can be treated formally through expert opinion.

Expert opinion analysis is well established in many fields, including global political trends and economics, whenever decisions are made under limited knowledge and a high level of subjectivity (e.g., Cornish 1977), and in volcanology (Aspinall 2006; Neri et al. 2008). Weighting of experts varies, but it is a fundamental part of the elicitation process (e.g., Cook 1991), even though often equal-weighted procedures are still considered. Here we adopt a consensus-based expert scoring scheme and an innovative expert elicitation method that uses a structured and iterative process for developing consensus. This scheme resembles in its basic principles the Delphi method (e.g., Linstone and Turoff 1975), but it is targeted to overcome its main critics (vague questionnaire items, not equal treatment of respondents, significant number of dropouts, see Cook 1991and references therein). In this process, expert opinion is weighted on the basis of mutual recognition among experts expressed through a regularly repeated blind procedure. Consensus, in our opinion, is indeed critical for the effective applicability of the results. Differing from most of expert elicitations (e.g., Neri et al. 2008), experts have been asked to select monitoring parameters and relative critical thresholds at each node of the event tree, instead of being asked directly for probabilities. In this way the individual and collective specialist knowledge is more effectively exploited, since the experts are asked to discuss and express themselves directly in their field of expertise: their knowledge enters the statistical model without the filter of personal sensitivity to probabilities, a subject that is unfamiliar to many volcanologists.

In the following, we report the set-up of BETEF_CF, based on (i) the results of a 5-year-long elicitation experiment and (ii) the analysis of CFc “background” activity. The applicability BETEF_CF is then demonstrated with a retrospective analysis of the observed unrest dynamics at CFc in the period 1981-2010.

Development of the model BETEF_CF

The goal of this paper is the set up of the model BET_EF for CFc (hereinafter, BETEF_CF). This model estimates, in near-real time, the probability of occurrence in the time window (t0,t0 + τ)of episodes of “unrest′′(node 1), “magmatic unrest′′(node 2) and “eruption” (node 3). Note that, in this formulation, we concentrate on the magmatic activity only. Of course, one of future developments will be the parallel treatment of non-magmatic phenomena, such as phreatic eruptions. For practical reasons, τ is set to 1 month, as for for Mt. Vesuvius (Marzocchi et al. 2004; Marzocchi et al. 2008). BETEF_CF switches between two distinct regimes, hereinafter referred to as short-term and long-term analyses, that is: In both cases, the choice of the parameters and relative thresholds, at all the nodes, is the core of BETEF_CF, since it controls all probabilities assessments in the short-term regime, and it defines the reference background status of CFc (no unrest) for the long-term assessments. The subjectivity of this choice is herein dealt with expert elicitation.

When the state of unrest is detected at t=t0by BETEF_CF, all further probabilistic assessments are based on the analysis of changes in the volcanic system in rather short time frames (days to weeks). This situation is hereinafter referred to as short-term assessment. More precisely, monitoring anomalies are transformed (using a simple transfer function, see Figure 1B) into subjective probability distributions relative to the occurrence of “magmatic unrest” and “eruption”, respectively. Here, the basic input for BETEF_CF is to define the anomalies to be accounted for at each node. This goal is achieved defining (i) a list of parameters of interest at each node and (ii) thresholds (in a fuzzy perspective) to identify anomalies for each of these parameters.

When anomalies are not observed at time t=t0, BETEF_CF considers the so-called background probabilities, hereinafter also referred to as long-term probabilities. Such long-term probabilities are based on theoretical models and the analysis of past data since 1980 (date after which anomalies can be reasonably defined with the available recordings from the monitoring system of CFc), considering (i) the definition of unrest used for short-term assessments, (ii) the fact that no anomalies are recorded at t=t0, and (iii) the fact that CFc has been experiencing a long-term uplift since the 1970s.

Result 1: expert elicitations

We invited experts to multiple panels. The goal of each panel was to define the input for the model BETEF_CF. At each panel, each expert defined a list of parameters, their relevant thresholds to define the occurrence of anomalies, and their weights indicating the perceived importance of each parameter. The parameters are relative to different nodes of the event tree in Figure 1A. At each panel, the opinion of each expert was weighted by their peers.

Five elicitation sessions were organized, preceded by seminars, analysis of previous elicitation results, and debate, and followed by public discussion, during approximately 5 years covering two sequential projects funded by the Italian Dipartimento della Protezione Civile (INGV-DPC 2005,2007). A complete list of the elicited experts can be found in Endnotes and in Selva et al. (2009).

During the five expert meetings, the quantitative definition of monitoring parameters, and their availability in real-time from the monitoring network at CFc, were carefully considered. In addition, for each parameter, the concept of an anomaly’s inertia has been developed, which defines how long a given change remains significant for forecasting purposes: for example, if a new fracture opens today, for how many days will this count as an “anomaly” before it is no longer significant?

During the first elicitations (I and II), each expert was free to define both parameters and inertia. After these elicitation sessions, each proposed definition was collectively discussed. After elicitation II, a committee (subset of the group of experts) was put in charge of preparing a list of parameters complete with their operative definition and inertia, based on the first two elicitations proposals and subsequent discussions. This list was collectively reviewed before elicitation III, and adopted from there on. Of course, these definitions are reviewable in future, and indeed few minor changes were discussed and implemented before elicitation IV and V. Note that, according to the operative definition of parameters’ inertia, it decreases from node 1 to 3, consistent with the view that changes are expected to become progressively more rapid when approaching the eruption. This is of course an assumption that reflects the group’s view, and it may significantly affect the model’s forecasts if only few anomalies, with effective inertia much greater than the defined one, are recorded before an event. This was considered an acceptable assumption.

In each elicitation session, experts were individually elicited in a blind procedure, with the following objectives: (i) An individual weight wewas anonymously assigned to each expert by the other members of the panel. To do this, each expert voted up to 5 other experts with a weight of 1 or 2 (self-voting was not permitted). This vote was about each expert’s understanding of CFc. The expert’s weight weis then computed as the sum of all votes received. (ii) Each expert identified the monitoring parameters and thresholds that are relevant at each node of the event tree. One expert could also select a parameter, without defining thresholds, if he/she judged this out of his/her own expertise. In addition, at nodes 2 and 3 (magmatic unrest and eruption, respectively), for each parameter, the expert selected a weight wip (equal to 1 or 2, where i runs over all parameters) to indicate how informative an anomaly of that parameter is at that node. Each parameter received a score s that is the sum of the weights of the experts wethat indicated that parameter, as computed in step (i). Two score thresholds (sMand sm) were defined after each elicitation session in order to classify the parameters according to high, intermediate and low score. The parameters with high score (s≥sM) were selected, whereas the parameters with intermediate score (sM>s>sm) were still selected, but assigning them a probability of acceptance paequal to s−smsM−sm. The parameters with low score (s≤sm) were rejected. (iii) Thresholds values and weight wip for each parameter were identified from the estimates provided by the experts through a weighted procedure, with weights we. Lower and upper thresholds were selected as the 50-th percentile of the corresponding distribution. The weight of each parameter for forecasting purposes (wiin Figure 1) is assigned as the product of the 50-th percentile of the wip distribution multiplied by the probability of acceptance pa.

The results of each elicitation session are reported in Tables 1, 2 and 3 for seismologic, geodetic, and geochemical parameters, respectively. Note that parameters are of two types: fuzzy or boolean. For fuzzy parameters, two thresholds are reported, between which measures progressively evolve from “normal” to “anomalous”. In particular, they should be interpreted as in Figure 1B. For example, considering the results of Elicitation V, the number of VT per day (M > 0.8) is considered surely anomalous if greater than 15, possibly anomalous if between 5 and 15, and not anomalous if less than 5. Boolean parameters, reported as YES/NO, represents single observations which alone represent anomaly.

Table 1

Elicitation results for seismological parameters

Parameter

inertia

units

ELICIT. I

ELICIT. II

ELICIT. III

ELICIT. IV

ELICIT. V

(on line)

(on line)

Node 1

# seismic events

(∗)

ev/day

> 1-85

-

-

> 10-20

-

# VT (M >0.8)

(∗)

ev/day

-

> 2-15[mag]

> 15-85[mag 3]

> 2-5[mag 13]

> 5-15

Largest Magnitude

last month

-

> 1.7-3.3

> 2-3[co]

> 2-3

> 3-4[p 20]

-

# LP

(∗)

ev/month

> 0

> 2-10

> 1-10

canceled

canceled

# LP/VLP/ULP

(∗)

ev/month

-

-

-

> 1-10[p 20]

> 2-10

Node 2

# seismic events

(∗)

ev/day

> 1-100 (1)

-

> 20-110 (1)

-

-

# VT (M >1.3)

(∗)

ev/day

-

> 15-70 (1)[mag]

-

> 20 (1)[p 27]

-

# deep VT (>3.5 km, M >0.8)

(∗)

ev/day

-

-

-

> 1-4 (1)[mag 3]

> 2-20 (1)[p 90]

Largest Magnitude

last month

-

> 2.6-4.0 (1)

> 3.6-4.5 (1)

> 3.3-4.3 (1)

> 4 (1)[p 20]

-

# LP

(∗)

ev/month

> 0-10 (1)

> 1-30 (1)

> 20-40 (1)

-

-

# deep LP (>2.0 km)

(∗)

ev/month

-

-

> 1-7 (2)[dep 30]

> 2-8 (2)[dep 35]

> 3-20 (1)[p 50]

# VLP/ULP

(∗)

ev/month

-

> 0 (2)[ulp]

> 2-6 (2)

> 1-2 (2)[p 07]

> 1-5 (1)

# deep VLP/ULP (>3.5 km)

(∗)

ev/month

-

-

-

> 1-5 (2)[p 73]

-

Presence of tremor

last month

-

-

-

-

YES/NO (1)[red]

YES/NO (1)

Presence of deep tremor (>3.5 km)

last month

-

-

-

-

YES/NO (1)

YES/NO (1)

Maximum depth (>6 events)

-

km

> 6 (1)[min]

> 4-8 (1)[min]

> 4-6 (1)

canceled

canceled

Presence of CLVD

-

-

YES/NO (1)

YES/NO (1)

canceled

canceled

canceled

Node 3

# seismic events

(∗)

ev/day

> 50-240 (1)

-

-

-

-

# VT (M >1.3)

(∗)

ev/day

-

-

-

> 50-200 (2)[p 87]

-

Acceleration in # seismic events

last week

-

YES/NO (1)

YES/NO (2)

YES/NO (1)

YES/NO (1)

YES/NO (1)

Acceleration in RSAM

last week

-

-

-

-

YES/NO (1)[p 93]

YES/NO (1)[p 70]

Presence of tremor

last month

-

YES/NO (1)

YES/NO (1)

YES/NO (2)

YES/NO (2)[p 53]

YES/NO (1)

Upward migration

-

-

YES/NO (1)

canceled

canceled

canceled

canceled

Hypocenter dispersion (depth range)

last week

km

-

YES/NO (1)[qual]

YES/NO (1)[qual]

> 1-3 (1)[p 53]

> 1-3 (1)[p 30]

(10th - 90th perc.)

NOTES:

(∗): number of observed events divided by the number of days from the observation. This choice makes the inertia proportional to the registered number of events (and the total energy emitted), that is, the higher the number of events, the longer the inertia.

[co]: parameter added for coherence among nodes.

[depxx]: the threshold distinguishing between deep and shallow events is x.x Km.

Legend: The analytic elicitation results for seismological parameters are reported. For fuzzy parameters, lower (T1) and upper (T2) threshold are reported, with the symbolism “T1-T2 (weight)”. For Boolean parameters YES/NO is reported, with the symbolism “YES/NO (weight)”. At node 1, weights are not present, since they are not required by the model. The symbol # stands for “The number of”, while other symbols are defined in section List of Abbreviation. Further specifications, when necessary, are reported in NOTES.

Legend: The analytic elicitation results for geochemistry and thermal parameters are reported. The same symbols of Table 1 are used.

The tables illustrate the progressive convergence of the expert group decisions from highly scattered initial views toward a shared and stable group opinion. The results of subsequent elicitation sessions also show convergence of opinions towards a few stable parameters at each node (Figure 2). Some initial inconsistent definitions of parameters and thresholds were removed through time (e.g., Presence of CLVD, since the present resources at CFc do not allow its real-time assessment). Through the years we observed a progressively more willingness of single experts to openly illustrate the limits as well as the success of their models, and to become more open to modifying previously preferred quantifications in favor of others that emerged collectively from the expert group decision process (e.g., minimum magnitudes, thermal anomalies, seismic event counting).

Figure 2

Convergence in the number of selected parameters through elicitation sessions. The number of selected parameters (here we only show those with pα=1) decreases significantly through elicitation sessions, showing the convergence process of experts. The vertical dashed line highlights the significant change occurred after elicitation III, when the number of the selected parameters fell. The results of the last elicitation do not differ substantially from those of the previous session, and results show that both the number (in figure) and the definition of parameters (in Tables 1, 2 and 3) are stable. Thus, those results represent the outcome of the experiment. More details can be found in the text.

In Table 4, we report the results of the last elicitation. Noteworthy, the trends evident in the table do not reflect any choice of one single expert or subsets of experts, rather, emerge as an intrinsic group decision. It is therefore remarkable that the elicitation process produced a clear and consistent picture of the expected dynamics that might lead to a possible eruption at CFc. An example can be seen by inspecting the seismic parameters at the three nodes. The “Unrest” node turns out to be sensitive simply to the occurrence of earthquakes; at the “Magmatic” node, depth of hypocenters and waveforms become relevant; finally, acceleration of seismic activity is believed to be critical at “Eruption” node. Similar consistent trends also emerge from the parameters referring to geodetic and geochemical observations, overall providing a scientifically plausible and sound picture.

Table 4

results of the elicitation V

inertia

units

threshold

weight

Node 1

1) # VT (M > 0.8)

(∗)

ev/day

> 5-15

-

2) # LP/VLP/ULP

(∗)

ev/month

> 2-10

-

3) Uplift

cum. last 3 months

cm

> 2-6

-

4) Uplift rate

last 3 months

cm/month

> 0.7-1.3

-

5) New fractures

last 3 months

-

YES/NO

-

6) Degassing structure extension or increase in flux

last month

-

YES/NO

-

7) Presence of acid gases (HF, HCl, SO2)

last week

-

YES/NO

-

8) Temperature at fumarole “Pisciarelli”

last month

oC

> 100-110

-

Node 2

1) # deep VT (> 3.5 km, M > 0.8)

(∗)

ev/day

> 2-20

0.90

2) # deep LP (> 2.0 km)

(∗)

ev/month

> 3-20

0.50

3) # VLP/ULP

(∗)

ev/month

> 1-5

1

4) Presence of tremor

last month

-

YES/NO

1

5) Presence of deep tremor (> 3.5 km)

last month

-

YES/NO

1

6) Uplift

cum. last 3 months

cm

> 5-15

1

7) New fractures

last 3 months

-

YES/NO

0.20

8) Macroscopic variation on the

last 3 months

-

YES/NO

1

deformation pattern (tens of m)

9) Presence of acid gases (HF, HCl, SO2)

last week

-

YES/NO

1

10) Variation in magmatic component

last month

-

YES/NO

0.10

Node 3

1) Acceleration in # seismic events

last week

-

YES/NO

1

2) Acceleration in RSAM

last week

-

YES/NO

0.70

3) Presence of tremor

last month

-

YES/NO

1

4) Hypocenter dispersion (depth range) (10th - 90th perc.)

last week

km

> 1-3

0.30

5) Macroscopic variation on the

last week

-

YES/NO

1

deformation pattern (tens of m)

6) Migration of incremental maximum (m)

last week

-

YES/NO

0.70

7) New fractures

last 3 months

-

YES/NO

0.40

8) Presence of acid gases (HF, HCl, SO2)

last week

-

YES/NO

1

9) Phreatic activity

last week

-

YES/NO

1

NOTES:

(∗): number of observed events divided by the number of days from the observation. This choice makes the inertia proportional to the registered number of events (and the total energy emitted), that is, the higher the number of events, the longer the inertia.

Legend: The results of the elicitation V are here reported. The same symbols of Table 1 are used.

It is also worth noting that the relevance of fuzzy parameters progressively decreases, when moving from node 1 to 3, while the relevance of Boolean parameters increases. This reflects (i) the decrease in confidence of experts (there is previous instrumental experience of unrest episodes at CFc, while that experience is missing for pre-eruptive phases), and (ii) the global experience suggesting that an eruption at a long-dormant volcano is usually preceded by macroscopic (easily visible) escalation of phenomena.

In order to check the stability of the results and the possible existence of systematic divergence of opinions between high-weight and low-weight (or zero-weight) experts, the entire procedure was repeated by assigning to each elicited expert the same weight (we=1). The results show that the individual-weighting and equal-weighting produce similar probability distributions, but that individual weighting yielded narrower distributions or more unanimity than equal weighting, especially around parameters judged to be critical. In other words, individual-weighting results are less dispersed than equal weighting results, and hence provide more informative distributions, even though 50th percentile values are similar. For sake of example, in Figure 3, we report the comparison between the individual-weighted and equal-weighted results for node 1, in selecting the parameters (upper panel) and in assessing lower thresholds (bottom panel). Analytical results for all nodes and all parameters are available at Selva et al. (2009).

Figure 3

Sensitivity test on expert’s weighting scheme. Top panel, comparison of the parameter’s scores s, as assessed through individual-weighted (assessed from expert’s weights we) and equal-weighted (assessed imposing we=1) procedures, relative to node 1 of elicitation V. The results from the two methods appear well correlated, showing that the selection of parameters (s>sm) is rather stable with respect to we. In the bottom panel, we report the statistics on the lower threshold for selected parameters at node 1 of elicitation V, evaluated in the individual-weighted (red) and equal-weighted (blue) procedures. Bars indicate confidence interval (80%) and stars represent the median. The results show that the two procedures results in equivalent medians, but the equal-unweighted procedure generally provides larger confidence intervals. Equivalent results for all nodes and parameters can be found in Selva et al. (2009).

Result 2: BETEF_CF settings

The BETEF_CF code yields long- and short-term eruption forecasting in the form of a probability distributions of expected frequencies of each node’s event (“unrest per month” for node 1, “magmatic unrest” given unrest for node 2, “eruption” given magmatic unrest for node 3), see Figure 1A. The parameters of each distribution are set through a Bayesian inference according to the logic described in Figure 1, panels B and C, and in the text. Here, we report in details how the elicitation results, together with other relevant models/past data, are used to parameterize BETEF_CF at each node.

Node 1: defining background state and unrest phase at CFc

Node 1 of the Event Tree considers whether there is either (i) unrest, or (ii) no unrest, in the time interval (t0,t0 + τ), where t0 is the present time, and τ is the time window considered (1 month in this application). The definitions of background and unrest are necessarily subjective, since they have to reflect the specific aim of the forecast. Slow secular subsidence over preceding centuries was interrupted by caldera floor uplift beginning about 60 years ago. However, classifying all of the past 60 years as “unrest” is useless for short-term forecast and decision makers. Instead, unrest is pragmatically defined as a state of the volcano that forces us to face the question at node 2: is what is being observed due to magma movements? The corresponding definition of background state is therefore that of a “normal” state in the frame of the present long-lasting unrest at Campi Flegrei.

In this respect BETEF_CF code requires as input a list of monitoring parameters and their thresholds that identify anomalies with respect to the background activity, i.e., a phase of unrest. The output of the expert elicitation sessions for node 1 is reported in Table 4. When at least one anomaly is detected by BETEF_CF, the probability of unrest is set to the degree of unrest, that is, the largest degree of anomaly detected all over the parameters (see Marzocchi et al. 2008and Figure 1B).

When no anomalies are detected, the BETEF_CF estimates the long-term probability of unrest. To do so, we set the prior information to a uniform distribution (maximum ignorance, i.e., Λ1=0. 5 and Θ1=1, see Figure 1B and Marzocchi et al. 2008). In order to define the likelihood distribution in the Bayesian inference scheme, we divide the period 1981-2009 into subsequent non-overlapping time windows of length τ=1 month. Then, we count the number of months that started with no unrest up to that time window (number of ’trials’) and the number of times that a new unrest episode starts within the time window, the latter corresponding to the number of observed unrest episodes up to then (number of ’successes’) (Marzocchi et al. 2008; Sandri et al. 2009). In this count, we also include partial unrest episodes with a fractional value equal to the measured Degree of Unrestη (see Marzocchi et al. 2008, ESM). With these parameters, the probability distribution relative to the occurrence of an unrest (node 1) in the next τ is completely defined for each of the time windows, and it changes with time as new information is acquired in a sort of learning procedure. At the end of the examined period (Jan. 1st 1981 - Dec. 31st 2009), the number of trials at node 1 was n1=306, while the successes were y1=7. 4. The posterior distribution is therefore a Beta distribution with parameters α=8. 4 and β=299. 6.

Node 2: magmatic unrest

In case of unrest, we must focus on quantifying whether the unrest is due (i) to new magma, or (ii) to other causes (e.g., hydrothermal, tectonics, etc.). A hydrothermal eruption could threaten areas within a few kilometers of a vent. In CFc this is serious and deserves attention in future work. Here, though, we focus only on magmatic unrest that can lead to magmatic eruptions. The distinction between magmatic and non-magmatic unrest involves some subjective considerations because the presence of magma in a volcanic system is obvious. Pragmatically, we identify magmatic unrest when magma is in motion (e.g., significant reactivation of convection in a magma chamber, of dyke intrusion).

If BETEF_CF detects unrest at node 1, the short-term analysis is based on the anomalies recorded to the parameters relative to node 2. The output of the expert elicitation sessions for node 2 is again reported in Table 4. These anomalies are then transformed into probability distributions.

When BETEF_CF does not detect unrest at node 1, the probabilistic analysis at node 2 is based on the long-term assessment. In this case, the prior information is given by a uniform distribution (Λ2=0. 5 and Θ2=1, see Figure 1B and Marzocchi et al. 2008) since specific knowledge on the relative frequency of occurrence of magmatic unrest episodes, with respect to other unrest types, is not available. Similarly, no past data are used since the magmatic vs. hydrothermal origin of the unrest episodes since 1981 is still debated (e.g., Bonafede 1991; De Siena 2010, and references therein). For this reason, we have chosen not to consider past data at node 2. The posterior distribution is therefore a Beta distribution with parameters α=β=1.

Node 3: magmatic eruption

In case of unrest with a magmatic origin, at node 3 we consider whether (i) the magma will reach the surface (i.e., it will erupt), or (ii) it will not, in the time interval (t0,t0 + τ).

If BETEF_CF detects unrest at node 1, the short-term analysis is based on the anomalies recorded to the parameters relative to node 3. The output of the expert elicitation sessions for node 3 is again reported in Table 4. These anomalies are then transformed into probability distributions.

When BETEF_CF does not detect unrest at node 1, the probabilistic analysis at node 3 is based on the long-term assessment. In this case, prior information was derived from the worldwide database of unrest at calderas similar to CFc (Newhall and Dzurisin 1988). Such database shows a frequency of unrest culminating in an eruption at silicic calderas (with unrest in the past 100 years and repose of more than 100 years, as in CFc case) of about 1 out of 6. Here, allowing for our ignorance on the nature of unrest (see above for Node 2), we estimate that 50%of unrest might be magmatic and so the prior best guess at Node 3 (Λ3, see Figure 1B and Marzocchi et al. 2008) is set at (1/6)/0.5= 0.33. For this prior model, we also set the maximum allowed epistemic uncertainty (the equivalent number of data Θ3=1; see Figure 1B and Marzocchi et al. 2008). Note that an informative (even if weak) prior model is introduced only at node 3, and not at the previous nodes. This reflects the effective lack of credible models about volcanic unrest episodes. Indeed, future improvement on this issue will allow us to use more informative prior distributions at the different nodes of BET_EF. At node 3, past data are represented by the number of eruptions (’successes’) compared to the number of observed magmatic unrest episodes (node 2) which we do not know (see above). Thus, allowing again for an expectancy of 50%of magmatic unrest out of all unrest episodes, the background assessment at node 3 of BETEF_CF accounts for no observed eruption (y3=0) out of n3=0. 5∗y1=3. 7supposedly magmatic unrest episodes since 1981. With these parameters, the background probability distribution relative to node 3 (eruption) is completely defined by a beta distribution with parameters α=0. 67 and β=5. 03.

Result 3: Current and retrospective application

The main result of this paper is the set up of the BETEF_CF model, as reported in Tables 4 and 5. This model, based on the group opinion of experts, is able to analyze the continuous flux of information coming from the CFc monitoring system, estimating the monthly probability of eruption in almost real-time through Eq. 1.

Table 5

Background settings of BETEF_CF

Node

Prior parameters

Likelihood

parameters

Node 1: UNREST

No info - uniform

n1=306

(Λ1=0. 5 & Θ1=1)

y1=7. 4

Node 2: MAGMATIC-UNREST

No info - uniform

n2=0

(Λ2=0. 5 & Θ2=1)

y2=0

Node 3: MAGMATIC-UNREST

(Λ3=0. 33 & Θ3=1)

n3=3. 7

Θ3=1

y3=0

Legend: Background settings of BETEF_CF, updated to the end of 2009. See text for more details.

If no anomalies are detected, BETEF_CF provides the background monthly probability of eruption at CFc (Figure 1C). As shown above, this background analysis is based on theoretical models and data since 1981 and it accounts only for the long-term ongoing uplift dynamics of CFc. The long-term (background) expected (mean) eruption probability, updated to the 31st December 2009, in the following 1 month is 1. 6·10−3, with a 80%confidence interval [4·10−5−4·10−3]defined by the 10-th and 90-th percentiles of the distribution. Such confidence interval reflects the epistemic uncertainty on the expected probability estimate, as it propagates through nodes 1 to 3. Noteworthy, this estimated background monthly probability of eruption is of the same order of magnitude of the better-known Mt. Vesuvius, as estimated through an analogous procedure in Marzocchi et al. (2008). This implies that the hazard exposure of Naples due to CFc, even in quiet periods, is higher than for Vesuvius, given that expected eruption sizes are comparable (Marzocchi et al. 2004; Orsi et al. 2009), but the city center is closer to the eruptive vents of CFc, and more directly downwind (Selva et al. 2010, 2012).

Whenever anomalies are detected, monitoring measures start being informative about the short-term behavior of the system, and the forecasts provided by BETEF_CF account for their fast evolution in near real-time. Such a strategy has been shown to provide results in agreement with more classical processes of experts’ decision during crises (e.g., Sandri et al. 2009; Lindsay et al. 2010), usually involving the set-up of a team receiving real-time data and discussing them collectively to achieve consensus. BETEF_CF can speed delivery of analysis to decision-makers.

In Figure 4 we show a retrospective application of the BETEF_CF code to track the unrest evolution at CFc in the period 1981-2010. At the beginning of this time interval, the monitoring capability was not comparable to the present one; this inhomogeneity poses some constraints to the resolution of the probability variations through time. Nonetheless, this example highlights the main features of the BETEF_CF code applied to a real case. In Figure 4 we also report the eruption probability distribution at three different times; each distribution displays the estimated probability (central value) and the associated epistemic uncertainty (dispersion around the central value) (Marzocchi et al. 2008).

Figure 4

Retrospective application of BETEF_CF from 1981 on. At each time t0, BETEF_CF is calibrated with the data for t<t0. In panel A, we report the average (best estimate) probability of unrest (blue), magmatic unrest (green) and eruption (red) for the following 1 month. In panels B, C and D, at three different time, we report a snapshot of the cumulative distribution (percentiles) of the probability of eruption, highlighting the epistemic uncertainty on the estimated probability. Spikes in the probability values (main figure) represent unrest episodes, during which monthly probabilities are much greater than the background ones. The major unrest period 1982-84 (Barberi et al. 1984), as well as each one of the minor uplift phases that followed, are correctly identified as anomalous. In particular, BETEF_CF shows that starting from mid-1982 the volcano was certainly in an anomalous state (probability 100%at node 1), the average probability that the unrest was due to active magma movements was about 70%, and the probability of an eruption on a time window of one month was about 20%, with a peak of nearly 40%in the period June-September 1983 (in October the evacuation took place). Such a high value is in agreement with the perception of some volcanologists at the time (Civetta and Gasparini 2012), even if explicit quantifications of probabilities were not available. In late 1984 the eruption probability returned to lower values around 10%, and the crisis was definitely over at the beginning of the following year. The so-called mini-uplift phases that punctuated the activity of CFc from year 2000 are similar to each other in terms of probabilities, with the eruption probability always less than 10%.

Conclusions

The BETEF_CF represents a valuable tool that can be used in real-time during an episode of caldera unrest. However, it is not intended to replace advisory groups, expert panels, or other means of evaluation that are commonly set up during major crises; rather, it represents an additional powerful tool that can help by focusing discussion and by saving time in exploring the changing eruption parameter-uncertainty space (Lindsay et al. 2010). Some of the characteristics of BETEF_CF make this procedure unique and highly desirable: i) the estimates from BETEF_CF are quantitative and reproducible, allowing therefore a fully transparent process of scientific evaluation during the crisis; ii) they are not the product of one single expert or restricted to a limited sub-group of experts, but instead represent a decision distilled from a large community, thus giving more robustness to the forecasts; iii) the forecasts are unaffected by temporal, political or sociological demands during the crisis, but are objective since based on new data and previously quantified consensus views; iv) the calibration through expert elicitation can be updated with the most recent and robust scientific results. The expert community decides through a blind process if new results should be included in BETEF_CF, and the weight to assign to them, so that scientific robustness is the only driver of new updates; v) BETEF_CF provides a clear aid to volcano scientists during a crisis, represented by the interpretation of observations and provision of forecasts, helping distinguish in a clear and unambiguous way the role of volcanologists from that of decision-makers.

Whilst we report here the specific case of CFc, the approach can be generalized to other volcanoes where little or no pre-eruptive data are available.

Endnotes

We report a complete list of the participants to all the elicitation sessions, in strict alphabetical order. In parenthesis, we report the elicitation at which single researchers participated. This list can be found also at the elicitations’ website (Selva et al. 2009).

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

JS coordinated the writing of the paper, prepared all relative materials, implemented software codes and performed the analysis. WM and JS conceived and co-led the elicitation experiment. PP coordinated the elicitation meetings with WM. LS participated in code development and data analysis. All authors read and approved the final manuscript.

Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License(http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.