Expert expectations on the accuracy of forecasts from different methods

Scott Armstrong and Kesten Green surveyed diverse experts on their expectations of the accuracy of conflict forecasting methods when used by experts and by novices (see questionnaire). They asked the experts for their expectations of the proportion of correct forecasts from various combinations of method and expertise assuming 28% accuracy could be achieved by choosing decisions at random. On average, the respondents expected experts to correctly pick the actual outcome 45% of the time if they used unaided judgement and half of the time if they used game theory, structured analogies, or simulated interaction. They expected 30% of novices' unaided-judgement forecasts and 40% of novices' simulated-interaction forecasts to be accurate.
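These expectations can also be read as error reduction relative to chance. The sketch below assumes a common definition, the proportional fall in the error rate compared with choosing at random; the function name and the labels are illustrative and do not come from the survey itself.

```python
def error_reduction_vs_chance(accuracy, chance_accuracy):
    """Proportional reduction in error rate relative to random choice.

    Assumed definition: (chance error - method error) / chance error.
    """
    chance_error = 1.0 - chance_accuracy
    method_error = 1.0 - accuracy
    return (chance_error - method_error) / chance_error


# Chance accuracy and expected accuracies as reported in the survey summary.
CHANCE = 0.28
expected_accuracy = {
    "experts, unaided judgement": 0.45,
    "experts, game theory / structured analogies / simulated interaction": 0.50,
    "novices, unaided judgement": 0.30,
    "novices, simulated interaction": 0.40,
}

for label, accuracy in expected_accuracy.items():
    reduction = error_reduction_vs_chance(accuracy, CHANCE)
    print(f"{label}: {reduction:.1%} expected error reduction versus chance")
```

On this measure a method that is no better than chance scores zero and a method that is always correct scores one, which makes expectations comparable across methods.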

Descriptions of methods

Outlines of how to implement conflict forecasting methods are provided on these pages and in the Forecasting Dictionary. More comprehensive descriptions are provided in:

Theory and Commentary

Research findings by Kesten Green on the accuracy of game theorists' forecasts relative to that of simulated-interaction forecasts, published in the International Journal of Forecasting (18:3), were accompanied by six commentaries by nine authors:

Abstracts of Relevant Forthcoming Papers

Nine papers and a panel at the International Symposium on Forecasting 2005, in San Antonio, Texas, USA

"Terrorist Attack Prediction using Discrete Choice Models," This email address is being protected from spambots. You need JavaScript enabled to view it. and Michael Smith, Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA 22904, USA

Terrorists employ a range of attack modes, including suicide bombings, improvised explosive devices (IEDs), mortar and rocket firings, and portable air defense missiles. The range of attack modes and the rareness of these events make effective defensive measures difficult, with the result that defensive actions typically impose greater restrictions on the larger population. Predicting the locations and times of terrorist events can enable more directed defensive efforts. While a number of predictive technologies might be used for this problem, very few are capable of dealing explicitly with the inherent decision making used by the terrorists in their attack planning. This paper describes an approach to terrorist incident prediction that uses discrete spatial choice models to predict the behavior of the terrorist. This work builds on our previous work using point process models and spatial choice analysis to forecast criminal behavior and suicide bombings. We give examples of the use of this approach and an evaluation of its performance. These evaluations show that the discrete spatial choice models are more effective at predicting future attack locations than the more commonly used methods that employ kernel density estimates.

The criminal threat environment is ever-changing. As a consequence, law enforcement agencies have had to develop forecasting capabilities based on intelligence gathering and analysis that will improve targeting and resource allocation. Criminal Intelligence Service Canada (CISC), in partnership with Carleton University, has spearheaded an effort to develop a strategic early warning methodology and intelligence network in order to forecast threats. The methodology draws on existing capabilities developed by the Country Indicators for Foreign Policy at Carleton University. Both the methodology and the outputs derived from it have been adapted to meet the specific needs of law enforcement personnel and decision makers. We first examine the rationale and purpose for developing a strategic early warning capability. Second, we outline the methodology for identifying potential risks and their relationship to criminal activity. Third, we describe current research using the existing framework. Finally, we specify directions for future work and implications for both policy and strategy development. Early results from this project have been positive, as evidenced by the community's feedback to our warning product (SENTINEL). Early warning methodologies from other fields (including public health, military and national security) have proven useful to law enforcement's mission.

Part one of this project outlined the research problem: explanation and prediction of joint combat operations. It proposed a theory that contained testable hypotheses, and suggested methods by which the hypotheses could be examined. Part two initiated the testing of the theory. The testing revealed weaknesses in the theory and method (systematic judgment). It also suggested means by which the theory and accompanying hypotheses could be improved.

Part three will demonstrate the efficacy of the improvements, and explore the reliability and validity of the measures that bring form to the theory. The reliability of the variable matrix is 0.75, very acceptable considering the small size of the population. The prima facie and content validity as assessed by the judges is consistent and strong. Thus, the basic empirical elements of the theory seem sound.

The expansion of the variability and the inclusion of an operational leadership variable, recommended by the last round of judges, improve the explanatory power of the theory, but only marginally and at the cost of statistical significance. Finally, the judges in this round strongly recommend that the theory be limited to the descriptive and explanatory, and not be expanded to the predictive.

"Forecasting the Unforecastable: The Impact of 9/11 on Las Vegas Gaming Revenues," This email address is being protected from spambots. You need JavaScript enabled to view it., Virginia Commonwealth University, 702-526-8154 and This email address is being protected from spambots. You need JavaScript enabled to view it., Virginia Commonwealth University, 804-798-3041.

When a major disruption occurs, a time series can no longer be predicted reliably from the historical data. The issue most important to the forecaster's client, the decision maker, becomes identifying which of several alternative scenarios will characterize the future. For example, the events of 9/11 caused a major loss of gaming revenue in Las Vegas. Would revenues return to the previous trend line, return to the previous trend but at a lower level, remain flat indefinitely, or continue to deteriorate? Decisions to build/expand casinos, lay off employees, cancel contracts, and temporarily close hotel wings depended on which scenario would prevail. The challenge to the forecaster is to provide reliable answers rapidly as new data become available.

In this paper, we develop a method for forecasting time series after significant disruptions. We develop simple models of the response to a disruption that blend easily with the pre-disruption time series model. Using Bayesian methods to adjust business judgments as new data arrive, we are able to identify the nature of the response and to develop a reliable forecast as rapidly as possible. We illustrate the method using gaming revenue for Clark County (Las Vegas) before and immediately after 9/11.
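The scenario-identification step can be illustrated with a small Bayesian update. This is a minimal sketch and not the authors' model: the scenario paths, the observations, the noise level, and the equal priors are all hypothetical index values chosen for illustration (pre-disruption trend level taken as 100), and a normal likelihood is assumed.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, used here as the assumed likelihood."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def update_scenario_probs(priors, predictions, observed, sd):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    unnormalised = {
        name: priors[name] * normal_pdf(observed, predictions[name], sd)
        for name in priors
    }
    total = sum(unnormalised.values())
    return {name: weight / total for name, weight in unnormalised.items()}


# Hypothetical monthly revenue index predicted under each scenario.
scenario_paths = {
    "return to trend": [90, 95, 100],
    "return to trend at lower level": [90, 92, 94],
    "remain flat": [88, 88, 88],
    "continue to deteriorate": [86, 83, 80],
}

# Equal prior weights stand in for the initial business judgement (assumption).
probs = {name: 0.25 for name in scenario_paths}

# Hypothetical observations arriving one month at a time.
observations = [89, 91, 94]

for month, observed in enumerate(observations):
    predictions = {name: path[month] for name, path in scenario_paths.items()}
    probs = update_scenario_probs(probs, predictions, observed, sd=2.0)
    rounded = {name: round(p, 3) for name, p in probs.items()}
    print(f"after month {month + 1}: {rounded}")

most_likely = max(probs, key=probs.get)
print("most probable scenario:", most_likely)
```

Each new observation shifts weight towards the scenarios whose predictions lie closest to it, so the ranking of scenarios can change as data arrive.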

"Forecasting Domestic Conflict," This email address is being protected from spambots. You need JavaScript enabled to view it., Florida International University, and This email address is being protected from spambots. You need JavaScript enabled to view it., University of Peloponnese

We take domestic conflicts across the world, measured and classified by number of deaths, and forecast the likelihood of their future occurrence using the following models: (1) Poisson Autoregressive model, (2) Markov Switching model, (3) Artificial Neural Networks model and (4) Smooth Transition Autoregressive model. The first two models account for underlying conditionalities, if any, present in the original data. Because the data-generating process is unknown a priori, we use the neural network framework to investigate whether the conflict process itself is state-independent. Additionally, as the sample ranges from 1950 to 2003, we choose the smooth transition model to explore the potential nonlinear pattern. We have also evaluated the first two models and the last model in the presence of economic, institutional and political control variables identified in the literature. The various model forecasts are then combined and compared with the individual model forecasts to generate improvements in forecasting performance. However, the final results show ambiguous improvement in out-of-sample predictions and call for a more general approach to correctly classify the data pattern.

"CASCON and MIT Research on Conflict," This email address is being protected from spambots. You need JavaScript enabled to view it., MIT Sloan School of Management and Center for International Studies

The focus of this discussion will be on methods for better understanding the process by which disputes either do or do not escalate to threats of violence or to outright hostilities. One research approach currently underway is to use system dynamics to operationalize theories linking post-conflict conditions of one conflict to the precursors of subsequent conflict. This work attempts to quantify measures where possible, identify data sources, and test theories against reality. In another vein of research, CASCON (Computerized System for Analyzing Conflict) supports decision analysis by historical analogy using a conceptual map of 571 factors influencing the dynamics of the conflict process. Factors were developed by generalizing case-specific events and circumstances identified as significant by case experts. Using the factor map, expert case knowledge is captured in an extensible database, and software enables the analyst to explore patterns of similarities and dissimilarities between a current case and historical cases. In contrast to other approaches that use either simplified theories or collections of unconnected facts, CASCON offers a method for applying a multivariate abstract framework to assist in organizing research on a conflict situation, whether incipient or ongoing, in identifying comparable historical situations, and in analyzing potential future courses.

"What we know about forecasting methods for conflicts?" This email address is being protected from spambots. You need JavaScript enabled to view it., International Graduate School of Business, University of South Australia and This email address is being protected from spambots. You need JavaScript enabled to view it., The Wharton School, University of Pennsylvania

We have learned much about forecasting for conflicts over the past 28 years. On the one hand, we have evidence that the accuracy of judgmental forecasts from domain experts and from game theorists is no better than chance. On the other hand, we know that forecasts from two methods, structured analogies and simulated interaction (a form of role playing), are substantially more accurate, especially when forecasts are combined. The error reduction compared to chance of combined forecasts for the eight situations we used in our research was 13% for experts' unaided judgement and game theory experts, versus 31% for structured analogies and 83% for simulated interaction. Our findings are contrary to people's expectations. This situation presents opportunities for those who are first to adopt the new methods. Improvements in decision-making will occur even if only one party adopts the improved forecasting methods. There is still much to learn. For example, might some game-theoretic analysis aid forecasting under some conditions? Can the Delphi technique or prediction markets provide useful forecasts for conflict situations? Are there conditions under which simulated interaction fails to provide accurate forecasts?

"Red Teaming Approaches for Homeland Security: A Review of Current and innovative Methodologies," This email address is being protected from spambots. You need JavaScript enabled to view it., Ms. Shelley Asher and Ms. Catherine Bott, Homeland Security Institute

From a U.S. homeland security perspective, the need for understanding and anticipating the adversary's adaptive nature in a dynamic environment is greater now than ever before. Existing approaches used in the homeland security community for assessing the adversary's perspective will be discussed, including wargaming, vulnerability assessment, table top exercises, and red cell approaches. To enhance the ability to address current and future adversaries, innovative red team approaches must be identified. We looked outside and inside the defense and intelligence communities to identify innovative methodologies that could be adapted for use in red teaming. The adversary was not always defined in terms of terrorists or terrorist groups in the methodologies, but rather as competitors, unions, suppliers, or customers. Five innovative methodologies were identified: (1) Competitive Intelligence Wargaming, (2) Simulated Interaction, (3) Structured Analogies, (4) Structured Idea Generation, and (5) Rapid Ethnography. The innovative methodologies and issues for potentially adapting them to a homeland security context will be discussed. Additionally, implications for sharing and adopting methodologies across broad sectors and fields will be discussed.

"Role thinking: Does standing in the other guy's shoes improve forecast accuracy?" This email address is being protected from spambots. You need JavaScript enabled to view it., International Graduate School of Business, University of South Australia

Two methods have been shown to provide accurate forecasts of the decisions that people will make in conflict situations. The first is simulated interaction, a kind of role playing, using novices; the second is structured analogies, a formal analysis of similar situations by experts. In contrast, when they use their unaided judgement, experts and novices alike provide forecasts that are no more accurate than chance. The success of the simulated interaction method suggests that realistic modelling of role and interaction between parties is important, while the success of the structured analogies method reinforces findings, from other research, that forecasts derived from expert judgements using a structured process are more accurate than those that derive from unaided judgement. Is it possible to obtain forecasts that are more accurate than unaided-judgement forecasts by encouraging participants to think about roles and interactions in a formal way? I will provide a tentative answer to this question using findings from research on the relative accuracy of forecasts from novices' role thinking.

[name obscured]* International Graduate School of Business, University of South Australia, GPO Box 2471, Adelaide SA 5001, Australia

[name obscured], Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA 22904, USA

[name obscured], MIT Sloan School of Management, and Center for International Studies

[name obscured], Kent Center for Analytic Tradecraft

Empirical research has led to the identification of methods for forecasting in conflicts that are superior to the current practice of using unaided judgement. Accurate forecasts of how people will behave in conflicts offer the prospect of better decisions. Adoption of superior practices can be slow, however. For example, it was 264 years after the discovery that lemons could be used to prevent scurvy that the British Merchant Marine changed their practices to take advantage of this knowledge. We think that improving conflict forecasting practice can and should be more rapid. Panellists will describe briefly their experiences in facilitating the adoption of superior methods and their thoughts on how best to achieve wider adoption. We will invite questions and suggestions from the audience.

Five papers presented at the International Symposium on Forecasting 2004, in Sydney, Australia

"How to Use Experts to Forecast in the War on Terrorism," J. Scott Armstrong, The Wharton School, University of Pennsylvania, and Kesten C. Green, Victoria Management School, Victoria University of Wellington

In 2003, the Pentagon proposed the use of a market on terrorism as a way to assess dangers. While it was politically unpopular, one might ask whether it was a good idea. In many situations, combining the predictions of large groups of unbiased people can provide accurate forecasts. As we will show, given the conditions involved in terrorism forecasting, markets are unlikely to increase accuracy substantially. They might also have negative consequences for behavior. We examine alternative approaches to using experts to forecast acts of terrorism: these include the Delphi technique and structured analogies. Information on these approaches to forecasting conflicts with terrorist organizations is available at conflictforecasting.com.

"An Oracle of Battle: Forecasting Results of Joint Military Operations," Jonathan E. Czarnecki, Naval War College of the United States

War, campaigns, operations, and combat appear to be chaotic in their application of violence. However, within that chaos, there are common processes and behaviors that seem to transcend history and culture in the conduct of such chaos. Are there common elements or variables critical to all joint military operations? If there are common elements, can one begin to develop a theory that describes and explains this class of societal behavior? Finally, can one use such a theory to forecast the success or failure of joint operations, and thus obtain insight into forecasting the future results of the war in which such operations occur?

This paper argues that there are common variables critical to joint military operations. It develops a theory that can concisely explain and describe these operations through four independent variables. These variables are: training; integrated combat fires; decision space; and information processing. Using selected historical data from the post-1975 United States experience with joint military operations and applying psychometric judgmental scaling methods, the paper tests the theory's ability to explain the results of past joint military operations. It concludes that the theory has merit, and recommends further refinement through the continuation of research and production of forecasts.

Important decisions in the war on terrorism are based on predictions of the decisions that allies, adversaries, and terrorist leaders will make. Decision makers typically resort to unaided judgment but other approaches such as game theory and acting out the interactions between the parties (a procedure we call simulated interactions) have been proposed. Forecasts from simulated interactions using novice role players have been found to be more accurate than forecasts from both experts using their unaided judgment and game theorists. We review the evidence on these forecasting methods for conflicts and make suggestions on how simulated interaction would be useful for assessing alternative strategies and tactics, for example, the reactions of Iraqi groups to different constitutions, or the reactions of hijackers to different types of armed response. A description of the simulated interaction method is available at conflictforecasting.com.

Although interactive role-playing is widely used in business, the legal profession, and the military, studies demonstrating its predictive value in conflict forecasting are both sparse and suspect. More recent studies have explored the comparative accuracy of role playing in forecasting a single decision or outcome. We hypothesize that using simulated interactions to forecast the ostensible set of plausible decisions and/or outcomes is of greater utility to decision makers in conflict environments than a single decision and/or outcome forecast. The U.S. Military employs an interactive war-game to forecast the "success" or "failure" of a prospective battle plan when played against a single enemy course of action. A deficiency in the current doctrine is the inability to account for the uncertainty in the threat reactions. Our methodology remedies this shortcoming by allowing simultaneous play of multiple enemy courses of action. We believe the resulting risk assessment will facilitate the identification and development of more robust courses of action. (We define robustness as a quality which describes how well a course of action is expected to perform, taking into account the ostensible set of possible adversary reactions.) We present our findings from experiments conducted within the U.S. Military.

The empirical literature on domestic conflict shows an inverted U-shaped relationship between democracy, development, and onset of civil war. Our study examines this aspect to predict conflict intensity for seventeen Latin American countries using two different modeling perspectives.

First, we build an economic model using explanatory variables from existing theoretical work. The conflict intensity is then analyzed from this model using ordinal regression and multinomial logit techniques. Using data from the International Peace Research Institute, the World Bank, and Statistical Abstracts of Latin America, we find that overdependence on agricultural exports, along with lack of public and private investment in an economy characterized by poor socio-political performance, could lead to higher intensity of conflict.

Second, we explore how our results change and possibly improve by using a variety of potentially more powerful models. We examine whether the Artificial Neural Networks framework, the Cox Proportional Hazard model, and the Markov Switching Model can improve the accuracy of classification prediction for the intensity of conflict. Our results indicate that, for predictive purposes, there may be advantages in combining prior knowledge in the form of explanatory economic variables with a nonlinear classification model, rather than relying exclusively on the performance of the traditional ordinal regression approach.

Mass Media Coverage

To date, the list of mass media coverage is a short one. If you know of a mass media report that we have not listed, please contact Kesten Green.

December 2004

Unusually, a journalist for The Atlantic Monthly sought to anticipate the news. "Will Iran be next?" (PDF version) describes a war game devised to predict US plans to deal with an Iran intent on arming itself with nuclear weapons.

November/December 2004

Which terrorist threats should we be concerned about and which are the product of feverish imaginations? "Rethinking doomsday" (PDF version) in the Bulletin of the Atomic Scientists attempts to answer this question.

November 2004

The National Geographic ran an article on "The World of Terror" in their November 2004 issue. Highlights of the article, a public forum, a poll, links, and bibliography are available here.

October 2004

Is it possible to accurately forecast the behavior of terrorists? An article by Kesten Green in The Oracle addresses this question on page 8.

Terrorism forecasting using structured analogies and simulated interaction was the subject of an interview with Scott Armstrong on the Australian Broadcasting Corporation's "The World Today" program (also available in PDF format).

Play acting in order to increase big-ticket sales: One software company CEO describes using simulated interactions to improve the performance of his sales force in a CRN article, "Not Just Play Acting," by Chris Penttila.

March 2003

Scott Armstrong was interviewed about the war on terrorism by BBC radio in Manchester, England on March 14, 2003. Here is a summary of that interview.

September 2002

Entrepreneur - Implications of findings on the relative accuracy of conflict forecasting methods for small businesses.

March 26, 2002

"Games or serious business?"Financial Times - Discussion of findings on the relative accuracy of game theorists' forecasts including interviews with academics and practitioners.

The material for this Special Interest Group is organized and submitted by [name obscured]. Please contact him for further information, corrections, additions, or suggestions for these pages.