Coup Forecasts for 2014

This year, I’ll start with the forecasts, then describe the process. First, though, a couple of things to bear in mind as you look at the map and dot plot:

Coup attempts rarely occur, so the predicted probabilities are all on the low side, and most are approximately zero. The fact that a country shows up in dark red on the map or ranks high on an ordered list does not mean that we should anticipate a coup occurring there. It just means that country is at relatively high risk compared to the rest of the world. Statistically speaking, the safest bet for any country almost any year is that a coup attempt won’t occur. The point of this exercise is to try to get a better handle on where the few coup attempts we can expect to see this year are most likely to happen.

These forecasts are estimates based on noisy data, so they are highly imprecise, and small differences are not terribly meaningful. The fact that one country lands a few notches higher or lower than another on an ordered list does not imply a significant difference in risk.

Okay, now the forecasts. First, the heat-map version, which sorts the world into fifths. From cross-validation in the historical data, we can expect nearly 80 percent of the countries with coup attempts this year to be somewhere in that top fifth. So, if there are four countries with coup attempts in 2014, three of them are probably in dark red on that map, and the other one is probably dark orange.
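For readers who want to see how a map like this can be built from a list of probabilities, here is a minimal sketch of sorting countries into fifths by rank. The function name `assign_fifths` and the ten made-up countries and probabilities are assumptions for illustration only; they are not taken from the replication script.

```python
def assign_fifths(forecasts):
    """Map each country to a tier from 1 to 5, where 1 is the highest-risk fifth.

    `forecasts` is a dict of country -> predicted probability (hypothetical input).
    """
    ranked = sorted(forecasts, key=forecasts.get, reverse=True)
    n = len(ranked)
    # The country at 0-based rank i falls in tier floor(5 * i / n) + 1.
    return {country: (5 * i) // n + 1 for i, country in enumerate(ranked)}


# Made-up probabilities for ten made-up countries.
example = {"A": 0.12, "B": 0.09, "C": 0.05, "D": 0.04, "E": 0.02,
           "F": 0.01, "G": 0.008, "H": 0.005, "I": 0.002, "J": 0.001}
tiers = assign_fifths(example)
```

With ten countries, each tier holds two of them, so "A" and "B" land in the top fifth and "I" and "J" in the bottom one. The "three out of four" statement above is just 0.8 × 4 = 3.2 rounded down.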

Now, a dot plot of the Top 40, which is a slightly larger set than the top fifth in the heat map. Here, the gray dots show the forecasts from the two component models (see below), while the red dots are the unweighted average of those two—what I consider the single-best forecast.

A lot of food for thought in there, but I’m going to leave interpretation of these results to future posts and to you.

Now, on the process: As statistical forecasters are wont to do, I have tinkered with the models again this year. As I said in a blogged research note a couple of weeks ago, this year’s tinkering was driven by a combination of data practicalities and the usual sense of, “Hey, let’s see if we can do a little better this time.” Predictably, though, I also ended up doing things a little differently than I’d expected in December. Specifically:

I trained and validated the models on an amalgamation of two coup data sets—as described in a November post that showed an animated map of coup attempts worldwide since 1946—instead of just using the Powell and Thyne list. So that map and the bar plots with it should give you a clearer sense of what these forecasts are (and aren’t) trying to anticipate.

After waiting for Freedom House to update its Freedom in the World data, which it did a few days ago, I decided to go back to using Polity after all because the forecasts based on it were noticeably more accurate in cross-validation. The models include a categorical measure of regime type based on the Polity scale and a “clock” counting years since the last significant change in that score. I hard-coded updates to those measures, which are much coarser (and therefore easier to update) than the Polity scale or its component variables.

As with coup events, I used an amalgamation of GDP growth data from the World Bank and IMF instead of picking one. I also went back to summarizing this feature in the models with a binary indicator for slow growth of less than 2 percent (annual, per capita).

Finally, I did not include GDELT summaries in the models because they only slightly improved forecast accuracy, and they did not cover a country of great interest to me (South Sudan). The latter is surely a fixable glitch, but it’s not fixed now, and I really wanted to have a forecast for that particular country in this year’s list for reasons that should now be evident from the results. On the accuracy part, I should note that I’ve only done a little bit of checking, and there are still plenty of ways to try to squeeze more forecasting power out of those data, not the least of them being to build more dynamic models that use monthly instead of annual summaries.

The forecasts are an unweighted average of predicted probabilities from a logistic regression model and a Random Forest that use more or less the same inputs. Both models were trained on data covering the period 1960-2010; applied to data from 2011 to 2013 to assess their predictive performance; and then applied to the newest data to generate forecasts for 2014. Variable selection was based mostly on my prior experience working this problem. As noted above, I did a little bit of model checking—using stratified 10-fold cross-validation—to make sure the process worked reasonably well, and to help choose between some different measures for the same concept. In that cross-validation, the unweighted average got good but not great accuracy scores, with an area under the ROC curve in the low 0.80s. Here are the variables used in the models:
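To make the averaging and the accuracy score concrete, here is a minimal sketch in plain Python. It assumes you already have two lists of predicted probabilities, one per model, plus the observed outcomes; the function names and all of the numbers are hypothetical, and this is not the code from the replication script, which is written in R.

```python
def unweighted_average(p_logit, p_forest):
    """Blend two lists of predicted probabilities with equal weights."""
    return [(a + b) / 2 for a, b in zip(p_logit, p_forest)]


def auc(labels, scores):
    """Area under the ROC curve, computed as the share of (positive, negative)
    pairs in which the positive case gets the higher score; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up outcomes (1 = coup attempt) and made-up model outputs.
labels = [1, 0, 0, 1, 0, 0]
p_logit = [0.03, 0.01, 0.05, 0.10, 0.02, 0.12]
p_forest = [0.16, 0.02, 0.03, 0.08, 0.01, 0.10]

blended = unweighted_average(p_logit, p_forest)
score = auc(labels, blended)
```

In this toy case one non-event outranks both events, so six of the eight event/non-event pairs are ordered correctly and the AUC is 0.75. An AUC in the low 0.80s means a bit more than four in five such pairs are ordered correctly.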

Geographic Region. Per the U.S. Department of State (and only in the Random Forest).

Last Colonizer. Indicators for former French, British, and Spanish colonies.

Infant Mortality Rate. Relative to the annual global median, logged, and courtesy of the U.S. Census Bureau. The latest version ends in 2012, so I’ve simply pulled those values forward a year here.

Political Regime Type. Four-way categorization based on the Polity scale into autocracies, “anocracies,” democracies, and transitional, collapsed, or occupied cases.

Political Stability. Count of years since a significant change in the Polity scale, logged.

Political Salience of Elite Ethnicity. Yes or no, per a data set on elite characteristics produced by the Center for Systemic Peace (CSP) for the Political Instability Task Force (PITF), with hard-coded updates for 2013 (no changes). This one is not posted on CSP’s data page and was obtained from PITF and shared with their permission.

Violent Civil Conflict. Yes or no, per CSP’s Major Episodes of Political Violence data set (here), with hard-coded updates for 2013 (a few changes).

Election Year. Yes-or-no indicator for any national elections—executive, legislative, or constituent assembly—courtesy of the NELDA project, with hard-coded updates for 2012 through 2014 (scheduled).

Slow Economic Growth. Yes-or-no indicator for less than 2 percent, as described above.

Domestic Coup Activity. Yes-or-no indicator for countries with any attempts in the past 5 years, successful or failed.

Regional Coup Activity. A count of other countries in the same region with any coup attempts the previous year, logged.

Global Coup Activity. Same as the previous tic, but for the whole world.

All of the predictors are lagged one year except for region, last colonizer, country age, post-Cold War period, and the election-year indicator. The fact that a variable appears on this list does not necessarily mean that it has a significant effect on the risk of any coup attempts. As I said earlier, I drew up a roster of variables to include based on a sense of what might matter (a.k.a., theory) and past experience and did not try to do much winnowing.
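Several of the predictors above are simple transformations of raw data. A minimal sketch of four of them follows, under stated assumptions: the function names are hypothetical, `log(1 + x)` is my assumption for logging counts that can be zero, and the five-year window is read as the five years before the forecast year. None of this is lifted from the posted script.

```python
import math
from statistics import median


def slow_growth(gdp_pc_growth_pct, threshold=2.0):
    """1 if annual per-capita GDP growth is below the threshold, else 0."""
    return 1 if gdp_pc_growth_pct < threshold else 0


def relative_infant_mortality(rate, all_rates_that_year):
    """Log of a country's infant mortality rate relative to the global median."""
    return math.log(rate / median(all_rates_that_year))


def logged_count(count):
    """Log transform for counts that may be zero (log(1 + x) is an assumption)."""
    return math.log1p(count)


def recent_coup_activity(attempt_years, forecast_year, window=5):
    """1 if any coup attempt occurred in the `window` years before forecast_year."""
    return 1 if any(forecast_year - window <= y < forecast_year
                    for y in attempt_years) else 0
```

For example, `slow_growth(1.5)` is 1 while `slow_growth(2.0)` is 0, and `recent_coup_activity([2008, 2012], 2014)` is 1 because 2012 falls inside the 2009 to 2013 window. Lagging a predictor by one year just means feeding the 2013 value into the 2014 forecast.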

If you are interested in exploring the results in more detail or just trying to do this better, you can replicate my analysis using code I’ve put on GitHub (here). The posted script includes a Google Drive link with the requisite data. If you tinker and find something useful, I only ask that you return the favor and let me know. [N.B. As its name implies, the generation of a Random Forest is partially stochastic, so the results will vary slightly each time the process is repeated. If you run the posted script on the posted data, you can expect to see some small differences in the final estimates. I think these small differences are actually a nice representation of the forecasts’ inherent uncertainty, so I have not attempted to eliminate it by, for example, setting the random number seed within the R script.]

UPDATE: In response to a comment, I tried to produce another version of the heat map that more clearly differentiates the quantiles and better reflects the fact that the predicted probabilities for cases outside the top two fifths are all pretty close to zero. The result is shown below. Here, the differences in the shades of gray represent differences in the average predicted probabilities across the five tiers. You can decide if it’s clearer or not.

86 Comments

antoniomandre

You should seriously consider calling this exercise something else rather than “coup forecast”. From the “coup” vantage point, it looks a great nonsense. On the other hand, in the map of South East Asia you have a dark red spot without having any mention of the redded country in the “plot”. Really, all these “forecasts” of coups that are not going to happen discredit the blog, you know.

A forecast is just a probabilistic statement about the likelihood that a certain event will occur during a certain time period. That’s what these numbers are, so I stand behind the title. The fact that all of the probabilities are low doesn’t change that. I realize it can be confusing, though, so I took pains to call out that fact at the start of the post. I have also explained the method used to derive them in detail and even posted the data and code so anyone who cares to can replicate my analysis, find errors if there are any, and possibly improve on it.

La Palisse

Dear Jay
I believe I am aware of what probabilities mean as I assume you are aware of what political innuendo means. As opposed to appearances, the “coup forecast” posts are not about statistics and probabilities. They are about political messages. As it happens, in my view intelligently, the messages are being conveyed through a colourful festival of disregard of the intelligence of blog users.

I don’t think I’m going to persuade you that these forecasts are useful, but I am interested in trying to understand better why you object so vehemently to them. When you refer to “political innuendo” and say that the forecasts “are about political messages,” what do you mean?

Grant

I can’t think of a better word in the English language to use than forecast. Despite what people have come to expect after years of improving results on predicting the weather, forecasts are not absolute predictions of any kind. They are simply stating that, based on all available data, there is a certain possibility that a certain event will happen. Indeed, Ulfelder outright says that these are not intended to predict coups that won’t be happening (at least not in the way you mean). It’s the same as saying that there is a 20% chance of rain.

To quote Ulfelder’s very first note:

“Coup attempts rarely occur, so the predicted probabilities are all on the low side, and most are approximately zero. The fact that a country shows up in dark red on the map or ranks high on an ordered list does not mean that we should anticipate a coup occurring there. It just means that country is at relatively high risk compared to the rest of the world. Statistically speaking, the safest bet for any country almost any year is that a coup attempt won’t occur. The point of this exercise is to try to get a better handle on where the few coup attempts we can expect to see this year are most likely to happen.”

It would be pretty reckless of a reader to assume, based purely on this site’s posts, that a coup was about to happen this year. A coup might very well happen. As the years go by, if the circumstances don’t change then you could assume that a coup would happen at some point or other. But it outright says here that for any single year the odds are low.

antoniomandre

Actually a more precise rod-hot-dot-plot correction is necessary if this post is to claim any seriousness at all. In fact, while in the SEA map you redded Myanmar, Thailand, Malaysia, the Philippines and Timor-Leste, in the “plot” (which the post says is “a slightly larger set than the top fifth in the heat map”) only Myanmar and Thailand are mentioned. Did you actually expect anyone at all to pay attention to the post?

There are two darkish shades of red/orange on the map that are admittedly hard to tell apart. Malaysia, the Philippines, and Timor-Leste are in the second fifth of the rank-ordered list. If you want to see exactly what their forecasts are and how they were derived, please use the replication data and code. Meanwhile, I’ll try to find some time later today to produce a version of the map that shows more contrast between those tiers.

RAMBLINGS:
The updated map certainly helps! I enjoyed this exercise in thought for sure, very comprehensive. Perhaps it is a contribution that is not “needed” due to the lowness of likeliness, but IMO coups have taken an unwarranted backseat in recent years so it is nice to see a study of this kind.

The folks over at PITF are doing some really interesting things and I’d like to see continued expansions in this direction. The predictability of future IR events is always a game that nobody wants to play due to the statistical issues and impossibility of ever being truly conclusive. However, (again, IMO) forecast-minded studies still hold value as indicators of where to focus. The book “Bottom Billion”‘s chapter on coups has some great statistical gems regarding their probability. I lost my copy before I was able to jot down its sources for those numbers…

Anyways with all that, I’d say it’s safe to say that Madagascar will be in the cross-hairs for several years–maybe even until the next election, one without de facto backers of prior regimes…

The grey scale map is easier. Thanks for that. Concerns me that Ecuador was on your map of countries at risk of mass killings and this map. I have several friends who are from Ecuador as well as colleagues who travel there frequently.

Rafael

Trust me, there is virtually no chance for a coup or mass killings in Ecuador. We have the second most popular president in America by domestic approval rating after the Dominican Republic, 85% I believe (study carried out by Mitofsky, a Mexican company). Free education, free healthcare, improving infrastructure, growing equality, and average growing economy (not the best but certainly nothing shabby). Hopefully I don’t eat my words, but I’m fairly confident with the establishment of this government the days of coup-ridden banana republic are over at last. They are making real structural changes so that the same people that controlled the country before can’t do so again even if they manage to win an election (got beat by 36% of the popular vote in the last election). Of course the model represented here probably doesn’t capture all that, but that’s what the disclaimer at the beginning was for.

Greetings. I am anxious to hear more on the interpretations of the data for the coup forecasts for 2014, especially about why Guinea came out number 1.

I’m writing now with a question about the use of Freedom House data, which you answered partly in your post. In a previous post, you stated you were waiting on Freedom House’s Freedom in the World Report to incorporate in your coup forecasts. I was going to write you then to question your use of this organization’s data, but ran out of time. When I read in your post today that you tossed the Freedom in the World report because it was less accurate in cross-validation than Polity, I became even more curious.

My experience with Freedom House over the last few decades is that it performs a similar function for the United States government as the National Endowment for Democracy’s associated appendages, which is to produce comprehensive propaganda about a particular region or country and, frequently, assemble teams for in-country destabilization functions. When a guy by the name of Frank Calzon, a Cuban exile, became Freedom House’s Washington representative in 1986, you could add dirty tricks to the resume as well, especially as it pertained to Cuba. A Cuban exile who worked for Calzon was indicted for embezzling a couple million dollars and when Calzon left in 1997, Freedom House was discredited entirely.

In many respects, the data from Freedom House may be an accurate predictor for identifying countries the U. S. would like to take down in a coup. Given that the U. S. usually “gets its man,” Freedom House might offer a good window.

Any thoughts you might have on Freedom House and its data would be appreciated.

The simple but unsatisfying answer to your question about why Guinea has such a (relatively) high risk this year is that it exhibits virtually all of the major risk factors. It’s a relatively poor country in a coup-prone region with a mixed political regime in which elites’ ethnicity is politically salient; it has a recent history of coup activity; and right now it’s experiencing slow economic growth. Again, I know that’s not terribly satisfying, but I think it does establish a useful baseline for thinking about how susceptible it might be and what to make of certain political developments over the course of the year.

As for Freedom House, my view of them is less jaundiced than yours, but I agree that their work blends social science with advocacy and activism in a way that we should always keep in mind when using and interpreting their data. I wrote something about that in a post a couple of years ago.

I’d like to know if perhaps there’s been an error regarding Germany, as it seems to be in the ‘2nd fifth from the bottom’, so to say – still very low, but higher than other consolidated western democracies. By looking at the variables you used, Germany would not seem to fulfil any condition to appear anywhere else other than the bottom fifth, along with nearly all other EU countries.

Replication data and code can be found through a link at the tail end of the post, so please feel free to rerun and check. Meanwhile, let me just emphasize that down there in the bottom couple of quintiles, the differences in estimates are tiny and not statistically or substantively meaningful.

Nicholas Creel

No, I didn’t. It’d be interesting to rerun the analysis with that adjustment and see how much the results change. Of course, it wouldn’t change the rank order of the countries; it would just nudge the probabilities up a bit. That should be pretty easy to implement in the replication data I posted.

I should say, too, that my decision not to use rare-events logit was partly a matter of habit, but that habit has stuck, in part, because the couple of times I have compared out-of-sample forecasts from RE logit to ones from plain-old logit, the latter have proved more accurate as measured by AUC score. I haven’t undertaken a systematic comparison, but it’d be interesting to see if this pattern holds broadly and to think about why.

As noted in the post, I tinkered with the underlying models. Using this year’s algorithm, Pakistan would have landed in the top 20 each of the past few years and probably many before that. It hits on a bunch of the risk factors: high infant mortality, a relatively young set of political institutions, ongoing civil conflict, salient elite ethnicity, and slow economic growth. The Random Forest is the much higher of the two forecasts this year (0.16 vs. 0.03), and I can’t say exactly what’s driving that, but from this year’s version of the logit model, Pakistan still looks a lot like other countries that have had coups in the past 50 years.

Texas Cowman

Hi Jay,
I have been following the Middle East, and do not feel that your parameters cover the real factors not so much for a coup as for degradation of the government. It is obvious that Government degradation in the 21st century is related to oil or other resources available and not in control of a major western power, various pipeline and communications routes, and the present government’s support of a more equitable division of Palestine/Israel. Sudan, Iraq, Afghanistan, and Libya have all been damaged by these influences. Yemen, Syria, Egypt and Iran are in the crosshairs for alteration.

It amazes me that resource-rich countries seem to naturally develop their own guerrilla movement. Look at Colombia, Yemen, Nigeria, etc. Somebody has to invest money into these movements. Then others use religion to stoke up the fires as in the Central African Republic and the original South Sudan partition. But underneath the flags and guns there always seems to be a resource that somebody wants.

Abel

what a forecast!? it seems like you have been coloring the map with eyes closed. but let me give you information about a country called Eritrea, may be it would help you to reconsider:
– has no constitution
– has never experienced national election
– is illegal to organize a political party
– a self-appointed president ruling the country since 1991
– the government led the country and its people to military confrontation with all its neighbors i.e. Yemen, Sudan, Ethiopia and Djibouti
– the people is allowed to have one of the four religions listed by the government
– every youth is forced to undertake a military training at the age of 18
– had seen a coup attempt in 2013
surprisingly, this country is less likely to have a coup in 2014

It’s not clear to me how or why you think most of these things would affect the risk of a coup attempt. The one that is directly relevant to the statistical models I’m using here is the possibility that there was a coup attempt in 2013. Neither of the coup data sets I use considered those reports credible enough to record a failed coup in Eritrea last year, so my forecast is predicated on the assumption that there was no such event. If you believe there was and changed that data point accordingly, the estimated risk for Eritrea would be substantially higher.

Abel

I don’t actually agree with the model you are using. There is no direct correlation between some of your ”variables” and the possibility for a coup to occur. Yet, even using your model many countries are given inappropriate forecast. As I have tried to show in the above, one is Eritrea.
I am not also clear why many of the data I gave you are irrelevant to your model. As you have listed in the above ”Political regime type” and ”Election Year” are few examples of the variables used in your forecast.
Another point that is not clear is regarding the coup attempt in Eritrea i.e. your ”assumption that there was no such event”. You are right, only if your sole source is Isayas Afewerki or other Eritrean officials.
On the other hand, a country that you mistakenly (if not deliberately) forecast to have high possibility of facing a coup is Ethiopia, a country in the same region with Eritrea. Eritrea and Ethiopia, today, are at different situations. But ironically, your forecast shows that Eritrea is at a much better conditions.
Let me do the comparison using your model, if only it could help you to reconsider:
GEOGRAPHIC REGION: Both are located in the Horn of Africa
LAST COLONIZER: Eritrea had been colonized by Italy and British (last colonizer). Ethiopia has never been colonized
COUNTRY AGE (YEARS SINCE INDEPENDENCE): Ethiopia has been independent in its’ entire history. Eritrea seceded from Ethiopia in 1991.
POST-COLD WAR PERIOD: Governments of both Ethiopia and Eritrea came to power in 1991.
INFANT MORTALITY RATE: Ethiopia is credited for substantial decline in infant mortality rate. One of the few countries expected to achieve the Millennium Development Goals.
POLITICAL REGIME TYPE: Ethiopia- democracy, Eritrea- one man rule
ELECTION YEAR- Ethiopia conducts national elections every 5 years. Eritrea has never seen election in the past two decades.
ECONOMIC GROWTH: Ethiopia has achieved steady double digit economic growth for the last 10 years.
POLITICAL STABILITY: Ethiopians witnessed peaceful transition of power in 2012. Eritreans, on the other hand, had seen a coup attempt in 2013.
…………………. please check the rest for yourself!
i hope you would correct the mistake, if it has not been done on purpose.

If you’re actually interested in unpacking the details of specific forecasts and seeing how they get combined by the statistical models, I’d suggest you work through the replication data and code I’ve posted. If you bothered to do that instead of proceeding on the basis of assumption alone, I think you’d see that much of what you’re saying here is already taken into account. Also, please note that I didn’t determine the values of the inputs or the weights each variable gets in the models. The inputs all come from public data sources, and the weights come from statistical comparisons of historical cases. Last but not least, I don’t appreciate your insinuation that I am tweaking the data to get to certain results. If you want to talk about the methods and results, fine. If you want to hurl innuendo at me because you don’t like the results of this exercise, please do it somewhere else.

These models put Ukraine 47th in the world for 2014, with an estimated risk of about 3 percent. That was the highest forecast for any European country but, obviously, a low forecast in any wider sense. I don’t think U.S. intervention was the decisive factor in Yanukovich’s ouster, but I agree that it would be very useful to be able to include foreign interventions in the models. Alas, the data aren’t there to do that.

Texas Cowman

I believe there are a few things that can be considered in the Ukraine:
Russia has a base in Syria and treaties. The mercenaries and religious fanatics have failed to oust Assad, and the NATO bombing campaign never got off the ground. To turn Russia’s attention off Syria, the Europeans and others with interests in Syria supported the disturbance in Ukraine.
On top of that the U.S. Senate is setting up plans to export natural gas to Europe. The fastest way to do that is by making Russian gas and the pipelines through Ukraine less dependable.
If nothing else similarity of the pipelines running through Syria and a non-western leaning government and the pipelines running through Ukraine and a non-western leaning government should have been a clue.

Do you have code that can take all the independent nation-level coup probabilities to compute the globe-level probability of “one coup attempt, somewhere? two? three? what about ‘at least one?'” etc. It would be interesting to see how these independent and relatively small probabilities accumulate.

No, I don’t. What you can do to get a simple estimate of global-level risk, though, is just sum the case-level probabilities and treat that sum as your estimated event count (and, presumably, do the same with the upper and lower bounds of CIs to get a sense of the variance). With logistic regression models, it’s going to be unusual to find years where that sum deviates greatly from the base rate, but it’s still an interesting number to track.
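Here is a minimal sketch of both ideas in plain Python: summing the country-level probabilities to get an expected count, and, if you are willing to treat the country-level events as independent (a strong assumption), working out the chance of at least one attempt or of exactly k attempts. The function names and the three toy probabilities are hypothetical.

```python
def expected_count(probs):
    """Expected number of events: the sum of the case-level probabilities."""
    return sum(probs)


def prob_at_least_one(probs):
    """P(at least one event), ASSUMING the events are independent."""
    none = 1.0
    for p in probs:
        none *= (1.0 - p)
    return 1.0 - none


def count_distribution(probs):
    """dist[k] = P(exactly k events), ASSUMING independence.

    Builds the distribution one case at a time: each new case either
    leaves the running count unchanged or raises it by one.
    """
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        dist = nxt
    return dist


toy = [0.1, 0.2, 0.5]            # made-up probabilities, not real forecasts
total = expected_count(toy)       # 0.8
any_event = prob_at_least_one(toy)  # 1 - 0.9 * 0.8 * 0.5 = 0.64
dist = count_distribution(toy)
```

With the toy inputs, the chance of zero events is 0.36, one event 0.49, two events 0.14, and three events 0.01. Run on a full list of country forecasts, `dist` would answer the "one? two? three?" question directly, subject to the independence caveat.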

I have a suggestion for potentially improving the model. Wouldn’t it make sense to look at GDP growth RELATIVE to potential growth. And maybe look at a two-year period. Defining 2% as “low growth” is in my view way too low for Emerging Markets like China or India. If Chinese growth hit for example 4% – maybe half of trend-growth – then we likely would see a sharp increase in social unrest.

Concluding, most of the Emerging Markets where we have seen social unrest over the past year have grown faster than 2%, but significantly slower than these countries’ trend-growth (potential growth).

Well, the simplest way would probably be to assume that the average real GDP growth rate over the past 10 years or so is the same as potential growth. An idea might be to look at the average RGDP growth rate in the past 2 years and compare that to the growth rate over 10 years. If it is one or two standard deviations below the long-term average then you could say it was low growth.

Thanks. Next time around, I think I’ll try that standardized score ((recent growth – mean of prior growth) / s.e. of prior growth) and see if it improves predictive power. If I do, I will report back on what I find.
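A minimal sketch of that standardized score, following the suggestion above of comparing the last two years to the prior ten-year record. Two assumptions to flag: it scales by the sample standard deviation of prior growth, as the suggestion describes, and the function name and the growth figures are made up for illustration.

```python
from statistics import mean, stdev


def growth_z_score(recent_growth, prior_growth):
    """(mean of recent growth - mean of prior growth) / s.d. of prior growth.

    `recent_growth` might hold the last 2 annual rates and `prior_growth`
    roughly the last 10; both window lengths are assumptions.
    """
    spread = stdev(prior_growth)  # sample s.d.; needs at least 2 values
    if spread == 0:
        raise ValueError("prior growth has no variation to standardize against")
    return (mean(recent_growth) - mean(prior_growth)) / spread


# Made-up example: a fast grower that slows to 4-5 percent.
prior = [8, 10, 12, 10, 9, 11, 10, 10, 9, 11]
recent = [4, 5]
z = growth_z_score(recent, prior)
slow = z <= -1.0  # flag "slow growth" at one s.d. below trend
```

Here growth of 4 to 5 percent sits several standard deviations below this country's own trend, so it gets flagged as slow even though it is well above the fixed 2 percent cutoff.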

mapearlson

Jay,
Thanks for the post. I added a linear discriminant analysis model and the AUC increased substantially, into the low 90s, for the mean across the three models (which include the logit and random forest). Any reason to be suspicious of the high score? Thanks.

Huh. Well, I’ll admit that I’m reflexively suspicious of any situation where the addition of a single method to an ensemble that already includes a couple of well-suited methods produces an improvement of more than 10 percent in out-of-sample accuracy. If it is really providing that much of a boost, that’d be some “secret sauce” kind of stuff. If you’ll share the code with me (ulfelder at gmail), I’d really like to take a look.

mapearlson

Yeah, my mistake. I corrected an error in the code, but it was exciting for a bit. The AUC went back down to low-80s so nothing really out of the ordinary. I used the same indicators as the logit so it seems to perform on par with that. I guess, as you’ve said before, that with rare events and sparse data feature selection is paramount and the choice of model can only provide so much power.

“I don’t think U.S. intervention was the decisive factor in Yanukovich’s ouster,” – Well we have it from Obama’s mouth that they were behind it. US interfering in another state’s business against the wishes of its populace must be a consideration if you’re going to make any headway in this crank approach to politics mate.

I agree the US is the driving force behind the Ukraine coup and the installation of this Kafkaesque oligarchal-neo-Nazi cabal. Only the US could have pulled this off. The coup had little to do with internal Ukrainian politics and more with US’ need to smuggle NATO into the saga followed by ferocious chewing on Putin’s ankle. All the US needed to do was ignore the fascist thugs as they attacked civilians and killed cops and bill these criminals as a freedom-loving uprising. In spite of incredulous questions posed to the Jen-Marie duo at the State department, the obliteration of Eastern Ukraine continues. So, if there were data on the possibility of coups in countries in which another country conceives of the coup, manufactures the dissenters and pays for it, that would seem like a pertinent factor to add. On the horizon, a similar USA venture –Venezuela.