Reducing Risks of Astronomical Suffering: A Neglected Priority

Summary

Will we go extinct, or will we succeed in building a flourishing utopia? Discussions about the future trajectory of humanity often center around these two possibilities, a framing which ignores that survival does not always imply utopian outcomes, and that outcomes in which humans go extinct could differ tremendously in how much suffering they contain. One major risk is that space colonization, whether carried out by humans or by (misaligned) artificial intelligence, may end up producing astronomical quantities of suffering. For a variety of reasons, such scenarios are rarely discussed and often underestimated. The neglectedness of risks of astronomical suffering (“suffering risks” or “s-risks”) makes their reduction a plausible priority from the perspective of many value systems. Rather than focusing exclusively on ensuring that there will be a future, we recommend interventions that improve the future’s overall quality.

I. Introduction

Among actors and organizations concerned with shaping the “long-term future,” the discourse has so far been centered around the concept of existential risks. “Existential risk” (or “x-risk”) was defined by Bostrom (2002) as “[...] an adverse outcome [which] would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.”

This definition is unfortunate in that it lumps together events that lead to vast amounts of suffering and events that lead to the extinction (or failure to reach potential) of humanity. However, many value systems would agree that extinction is not the worst possible outcome, and that avoiding large quantities of suffering is of utmost moral importance.

We should differentiate between existential risks (i.e., risks of “mere” extinction or failed potential) and risks of astronomical suffering [1] (“suffering risks” or “s-risks”). S-risks are events that would bring about suffering on an astronomical scale, vastly exceeding all suffering that has existed on Earth so far.

The above distinctions are all the more important because the term “existential risk” has often been used interchangeably with “risks of extinction”, omitting any reference to the future’s quality. [2] Finally, some futures may contain both vast amounts of happiness and vast amounts of suffering, which constitutes an s-risk but not necessarily a (severe) x-risk. For instance, an event leading to a future containing 10^35 happy individuals and 10^25 unhappy ones would constitute an s-risk, but not an “x-risk”. [3]

The Case for Suffering-Focused Ethics outlined several reasons for considering suffering reduction one’s primary moral priority. From this perspective in particular, s-risks should be addressed before addressing extinction risks. Reducing extinction risks makes it more likely that there will be a future, possibly one involving space colonization and the astronomical stakes that come with it. But it often does not affect the quality of the future, i.e. how much suffering or happiness it will likely contain. [4] A future with space colonization may contain vastly more sentient minds than have existed so far. If something goes wrong, or even if things do not go “right enough”, this would multiply the total amount of suffering (in our part of the universe) by a huge factor.

Reducing extinction risks essentially comes down to buying lottery tickets over the distribution of possible futures: a tiny portion of the very best futures will be worthy of the term “utopia” (almost) regardless of one’s moral outlook; the better futures will contain vast amounts of happiness but possibly also some serious suffering (somewhat analogous to the situation in Omelas); and the bad or very bad futures will contain suffering at unprecedented scales. The more one cares about reducing suffering in comparison to creating happiness, the less attractive such a lottery becomes. In other words, efforts to reduce extinction risks are positive according to one’s values only if one’s expected ratio of future happiness to suffering is greater than one’s normative exchange rate. [5] Instead of spending all our resources on buying as many lottery tickets as possible, those with suffering-focused values should try to ensure that as few tickets as possible contain (astronomically) unpleasant surprises.
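The condition in the last paragraph can be made concrete with a toy calculation. The numbers and the function below are illustrative assumptions, not estimates from the text: a future yields expected happiness H and expected suffering S (in arbitrary hedonic units), and a normative exchange rate k says that k units of happiness are needed to counterbalance one unit of suffering.

```python
def net_value(H: float, S: float, k: float) -> float:
    """Value of a future given exchange rate k: positive iff H / S > k."""
    return H - k * S

# Hypothetical expectation: 100 units of happiness per 1 unit of suffering.
H, S = 100.0, 1.0

print(net_value(H, S, k=10))    # classical-leaning weighting -> 90.0 (positive)
print(net_value(H, S, k=1000))  # strongly suffering-focused -> -900.0 (negative)
```

The same empirical expectation about the future thus flips from "worth making more likely" to "not worth it" purely as a function of one's normative exchange rate, which is the point the paragraph makes.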

The following sections will present reasons why s-risks are both neglected and tractable, and why actors concerned about the long-term future should consider investing (more) resources into addressing them.

II. The future contains less happiness and more suffering than is commonly assumed

Within certain future-oriented movements, notably Effective Altruism and transhumanism, there is a tendency for people to expect the (far) future to contain more happiness than suffering. Many of these people, in turn, expect future happiness to outweigh future suffering by many orders of magnitude. [6] Arguments put forward for this position include that the vast majority of humans – maybe excluding a small percentage of sadists – value increasing happiness and decreasing suffering, and that technological progress so far has led to many welfare improvements.

While it seems correct to assume that the ratio of expected future happiness to suffering is greater than one, [7] the case is not open-and-shut. Good values alone are not sufficient for ensuring good outcomes, and at least insofar as the suffering humans inflict on nonhuman animals is concerned (e.g. with factory farming), technology’s track record is actually negative rather than positive. Moreover, it seems that a lot of people overestimate how good the future will be due to psychological factors, ignorance about some of the potential causes of astronomical future suffering, and insufficient concern for model uncertainty and unknown unknowns.

II.I Psychological factors

It is human nature to (subconsciously) flinch away from contemplating horrific realities and possibilities; the world almost certainly contains more misery than most want to admit or can imagine. Our tendency to underestimate the expected amount of future (as compared to present-day) suffering might be even more pronounced. While it would be unfair to apply this characterization to all people who display great optimism towards the future, these considerations certainly play a large role in the epistemic processes of some future “optimists.”

One contributing factor is optimism bias (e.g. Sharot, Riccardi, Raio, & Phelps, 2007), which refers to the tendency to overestimate the likelihood of positive future events while underestimating the probability and severity of negative events – even in the absence of evidence to support such expectations. Another, related factor is wishful thinking, where people are prone to judging scenarios which are in line with their desires as being more probable than what is epistemically justified, while assigning lower credence to scenarios they dislike.

Striving to avert future dystopias inevitably requires one to contemplate vast amounts of suffering on a regular basis, which is often demotivating and may result in depression. By contrast, while the prospect of an apocalypse may also be depressing, working towards a utopian future is more inspiring, and could therefore (subconsciously) bias people towards paying less attention to s-risks.

Similarly, working towards the reduction of extinction risks or the creation of a posthuman utopia is also favored by many people’s instinctual, self-oriented desires, notably one’s own survival and that of family members or other loved ones. As it is easier to motivate oneself (and others) towards a project that appeals to altruistic as well as more self-oriented desires, efforts to reduce risks of astronomical suffering – risks that lie in the distant future and often involve the suffering of unusual or small minds less likely to evoke empathy – will be comparatively neglected. This does not mean that the above motivations are misguided or unimportant; rather, it means that if one also, upon reflection, cares a great deal about reducing suffering, then it might take deliberate effort to give this concern due justice.

Lastly, psychological inhibitions against contemplating s-risks and unawareness of such considerations are interrelated and tend to reinforce each other.

II.II Unawareness of possible sources of astronomical suffering

In discussions about the risks from smarter-than-human artificial intelligence, it is often assumed that the sole reason to consider AI safety an important focus area is because it decides between utopia or human extinction. The possibility that misaligned or suboptimally aligned AI might instantiate suffering in astronomical quantities is, however, rarely brought up.

Misaligned AI, as a powerful but morally indifferent optimization process, might transform galactic resources into highly optimized structures, some of which might very well include suffering. The structures a superintelligent AI or an AI-based economy decoupled from human interests would build in the pursuit of its goals may for instance include a fleet of “worker bots,” factories, supercomputers to simulate ancestral Earths for scientific purposes, and space colonization machinery, to name a few. In the absence of explicit concern for suffering reflected in the goals of such an optimization process, AI systems would be willing to instantiate suffering minds (or “subroutines”) for even the slightest benefit to their objectives. This is especially worrying because the stakes involved could literally turn out to be astronomical: space colonization is an attractive subgoal for almost any powerful optimization process, as it leads to control over the largest amount of resources. Even if only a small portion of these resources is used for purposes that involve suffering, the resulting disvalue would tragically be enormous. [8] Finally, beyond bad states of affairs being brought about for instrumental reasons, out of indifference to suffering, there is also the risk that they could be brought about for strategic reasons: in competition or conflict between different factions, if one side is compassionate and the other is not, threatening to bring about bad states of affairs could be used as an extortion tactic.

For an overview on ways the future could contain vast amounts of suffering – including as a result of suboptimally aligned AI or human-controlled futures where dangerous ideologies win – see here and here.

II.III Astronomical suffering as a likely outcome

One might argue that the scenarios just mentioned tend to be speculative, maybe extremely speculative, and should thus be discounted or even ignored altogether. However, the claim that creating extremely powerful agents with alien values and no compassion might lead to vast amounts of suffering – in some way or another – is a disjunctive prediction. A single available action that increases the AI’s total utility, even if it involves vast quantities of suffering, would suffice for the AI to pursue that path without reservation. Worries of this sort are weakly supported by the universe’s historical track record, where the “morally indifferent optimization process” of Darwinian evolution instantiated vast amounts of misery in the form of wild-animal suffering.

Even if the probability of any one specific scenario involving astronomical amounts of suffering (like the ones above, or other scenarios not yet mentioned or thought of) is small, the probability that at least one such scenario will occur may be fairly high. In this context, we should beware the disjunction fallacy (Bar-Hillel & Neter, 1993): most people not only underestimate the probability of disjunctions of events, but actually judge a disjunction as less likely than a single event comprising it. [9]
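A minimal numerical sketch of why individually unlikely scenarios can jointly be quite probable. The probabilities here are made up for illustration, and the scenarios are treated as independent, which is a simplifying assumption:

```python
from math import prod

def p_at_least_one(probs):
    """P(at least one event occurs) = 1 - P(no event occurs)."""
    return 1 - prod(1 - p for p in probs)

# Ten distinct s-risk scenarios, each judged only 2% likely:
scenarios = [0.02] * 10
print(round(p_at_least_one(scenarios), 3))  # -> 0.183, i.e. ~18%
```

Someone committing the disjunction fallacy would anchor on the 2% figure and judge the overall risk as negligible, even though the disjunction is roughly nine times as likely as any single scenario.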

II.IV Unknown unknowns and model uncertainty

Lastly, taking seriously the possibility of unknown unknowns, black swans, or model uncertainty in general seems incompatible with predicting a very large (say, 1,000,000 to 1) ratio of expected future happiness to suffering. Factoring in such model uncertainty brings matters back toward a more symmetrical prior. Predicting an extreme ratio, on the other hand, would require enormous amounts of evidence, and is thus suggestive of overconfidence or wishful thinking – especially in light of historical data on the distribution of suffering and happiness.
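To see how little residual uncertainty it takes to collapse an extreme ratio, consider a hedged sketch with made-up numbers: a best-guess model predicts a 1,000,000 : 1 happiness-to-suffering ratio, but is held with only 90% credence, the remaining 10% going to a symmetric fallback of comparable overall scale.

```python
def mixed_expectation(p_model, H_model, S_model, H_prior, S_prior):
    """Ratio of expected happiness to expected suffering under a
    mixture of a confident model and a symmetric fallback prior."""
    H = p_model * H_model + (1 - p_model) * H_prior
    S = p_model * S_model + (1 - p_model) * S_prior
    return H / S

# 90% credence in a 1,000,000 : 1 model, 10% in a symmetric
# 1,000,000 : 1,000,000 prior of the same magnitude:
ratio = mixed_expectation(0.9, 1_000_000, 1, 1_000_000, 1_000_000)
print(round(ratio, 1))  # -> 10.0
```

Even a 10% chance that suffering is comparable in magnitude to happiness dominates the expected-suffering term, shrinking the effective ratio from a million to roughly ten; this is why an extreme predicted ratio sits uneasily with any serious allowance for model error.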

In conclusion, there are several reasons why the probability of risks of astronomical suffering – although difficult to assess – is significant; we should be careful not to underestimate them. [10]

III. Reducing s-risks is both tractable and neglected

If the future will be controlled by actors that care about both creating happiness and reducing suffering, the amount of happiness that could be created may exceed the amount of expected suffering that could be reduced. Versions of this argument have been brought up to justify a focus on utopia creation and the reduction of extinction risks, even if one gives somewhat more (but not a lot more) weight to suffering reduction than to happiness creation. This is a valid concern, but it does not yet factor in tractability and neglectedness, which may point the other way.

III.I Tractability

Creating vast amounts of happiness in the future without also causing vast amounts of suffering in the process requires a great deal of control over future outcomes; it essentially requires hitting a bull’s-eye in the space of all possible outcomes. This is especially true on views according to which value is fragile, but it holds for other views as well. By contrast, steering our future trajectory away from broad classes of bad or very bad outcomes could arguably be easier.

Applied to AI risks in particular, this means that instead of betting on the unlikely case that we will get everything right, worst-case AI safety – approaches informed specifically by the goal of minimizing the probability of the worst failure modes – seems promising, especially as a first step or first priority.

III.II Neglectedness

Partly due to the reasons described above (psychological factors, unawareness of some potential sources of astronomical future suffering, and undue disregard of model uncertainty or the possibility of black swans), efforts to reduce risks of astronomical suffering appear neglected. So far, the Foundational Research Institute is the only organization that has made this its main focus area. [11]

Because reducing s-risks is a neglected priority, we should expect there to be several low-hanging fruit in terms of possible interventions. This suggests that researching effective and cooperative strategies to avert risks of astronomical suffering may have a particularly high expected value on the margin.

IV. Concluding thoughts

Risks of astronomical suffering are important to address for many value systems, including ones where suffering reduction is only one (strong) concern out of many. Similarly, if one has uncertainty over different moral views, following a parliamentary model for dealing with moral uncertainty would suggest that suffering reduction should get many votes in one’s “moral parliament,” and correspondingly, that significant resources should be directed towards reducing s-risks.

Moral uncertainty can simultaneously be a reason why even people who care strongly about reducing suffering should welcome some efforts to preserve humanity’s option value. More importantly, considerations about positive-sum cooperation suggest that all s-risk reducers should aim to compromise with those who want to ensure that humanity has a cosmic future. While the thought of prioritizing anything above s-risks may feel unacceptable to some of us, it must be acknowledged that other people may (and will) disagree. Values differ, and alternative views on the matter often have their own legitimacy. Rather than fighting other people’s efforts to ensure humanity’s survival and the chance to develop into an intergalactic, long-lasting, and flourishing civilization, we should complement these efforts by taking care of the things that could go wrong. Cooperation between future optimists and future pessimists will be best for everyone, as it improves the odds that those who care about the long-term future at all can prevail against future apathy, and against the general tendency for things to slide toward the abyss when actors follow only their own incentives.

V. Footnotes

1. See Tomasik (2013), who essentially coined the term. Bostrom has used the term “hyper-existential catastrophe” here.

2. Below are two out of many examples:
The Centre for Study of Existential Risk (CSER) describes its mission as follows (emphasis added; accessed 09.09.2016): “The Centre for Study of Existential Risk is an interdisciplinary research centre focused on the study of human extinction-level risks that may emerge from technological advances.”
In his article “We're Underestimating the Risk of Human Extinction,” Ross Andersen writes (our emphasis): “Bostrom, who directs Oxford's Future of Humanity Institute, has argued over the course of several papers that human extinction risks are poorly understood and, worse still, severely underestimated by society. Some of these existential risks are fairly well known, especially the natural ones.”

3. Except perhaps in the rather weak sense in which even more positive potential could be actualized if those 10^25 individuals, instead of being unhappy and subtracting from the overall value, were happy too.

4. Some interventions could affect both the likelihood of the future and its quality: For instance, interventions that increase global stability plausibly improve both humanity’s chances of survival as well as the chances of a good outcome conditional on survival. Similarly, some interventions in the space of AI safety may reduce both the chance of human extinction and the chance of misaligned AI bringing about a lot of suffering. Finally, building disaster shelters may make it likelier that humanity survives, but it may also lead to worse futures in expectation, assuming that “rebuilding” leads to worse values and worse arms race dynamics than the current trajectory we are on.

5. One's normative exchange rate specifies how much happiness can counterbalance a given amount of suffering according to one’s values.

6. Notable effective altruists (in personal communication) have estimated the ratio to be 1,000,000 : 1, which unfortunately seems far too high given the considerations below. It should also be noted that assessments of the ratio between happiness and suffering, and whether/how the two can be compared, involve subjective judgment calls. Ratio estimates seem meaningless if they do not refer to a specified exchange rate. For the discussion here, we are going with an exchange rate where, contrary to suffering-focused ethics, extreme happiness and extreme suffering are valued as roughly equally good/bad.

7. The median future, however, is less likely to be positive, given that most of the expected happiness comes from a few unlikely scenarios with vast amounts of happiness. And again, “greater than one” is here meant conditional on an exchange rate that treats extreme suffering and extreme happiness symmetrically.

8. And the stakes could theoretically become even bigger: If physics allows for it, the drive for maximizing goal achievement could also result in the creation of new suffering through the formation of baby universes or the enlargement or temporal expansion of the universe (which could e.g. increase suffering in Boltzmann brains).

9. Dawes and Hastie (2001) offer the following explanation for the disjunctive fallacy: “First, our judgments tend to be made on the basis of the probabilities of individual components, but, [...], even though those probabilities may be quite low, the probability of the disjunction may be quite high. We attribute this error primarily to the anchor-and-(under)adjust heuristic estimation process. Second, any irrational factors that lead us to underestimate the probabilities of the component events – such as lack of imaginability – may lead us to underestimate the probability of the disjunction as a whole.” Arguably, both of these factors prompt us to underestimate risks of astronomical suffering.

10. Admittedly, many of the sources of suffering mentioned above could symmetrically also lead to the instantiation of unintended pleasure – e.g. there could be “happy subroutines” or “happy Boltzmann brains.” However, if one prioritizes suffering prevention over the creation of happiness at least somewhat, then the expected risks are more likely to outweigh the expected benefits for futures shaped by morally indifferent processes. Furthermore, human intuitions about what is valuable are often complex and fragile (Yudkowsky, 2011), taking up only a small area in the space of all possible values. In other words, the number of possible configurations of matter constituting anything we would value highly (under reflection) is arguably smaller than the number of possible configurations that constitute some sort of strong suffering or disvalue, making the incidental creation of the latter ceteris paribus more likely. To give an example, a lot of people would not strongly care, hypothetically, for there being a Boltzmann brain experiencing a few seconds of intense happiness, because this experience would be unconnected to any actual life(-history) or actual preferences about the state of things. By contrast, a Boltzmann brain in agony might still evoke a lot of concern.

11. A few other organizations, especially the Machine Intelligence Research Institute (MIRI) and the Future of Humanity Institute (FHI), put significant effort into reducing s-risks and have made valuable contributions to identifying them. Most notably, we are particularly indebted to Carl Shulman and Nick Bostrom. However, out of the total effort by actors trying to affect the long-term future, only a very small portion of it is spent on s-risk reduction. FRI occupies an important niche with a lot of room for others to join.
