Roko's Basilisk: The Artificial Intelligence Version of Pascal's Wager

There's a small chance I'm doing you a disservice by telling you about Roko's Basilisk, so read at your own risk.

The Basilisk is a thought experiment posed by a commenter on the rationalist site LessWrong: What if, when the Singularity occurs, the superintelligent artificial intelligence decides to retroactively punish those who didn't help it come into being? If so, everyone alive today would already face a choice: devote our lives to helping this AI come into existence, or condemn ourselves to eternal torment.

At its core, this is the artificial intelligence version of the infamous philosophical conundrum, Pascal's Wager. Pascal's Wager posits that, according to conventional cost-benefit analysis, it is rational to believe in God. If God doesn't exist, then believing in him will cause relatively mild inconvenience. If God exists, then disbelieving will lead to eternal torment. In other words, if you don't believe in God, the consequences for being incorrect are much more dire and the benefits of being correct are much less significant. Therefore, everyone should believe in God out of pure self-interest. Similarly, Roko is claiming that we should all be working to appease an omnipotent AI, even though we have no idea if it will ever exist, simply because the consequences of defying it would be so great.
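The cost-benefit logic behind Pascal's Wager can be made concrete as an expected-value calculation. Here is a minimal sketch; the specific payoff numbers and the probability are illustrative assumptions of mine, not part of the original argument:

```python
# Pascal's Wager as a toy decision matrix. All payoff values below are
# made-up stand-ins: eternal reward/torment is modeled as a huge finite
# number, and belief's "mild inconvenience" as a small cost.

def expected_value(payoffs, p_god_exists):
    """Expected payoff of a choice, given the probability God exists."""
    return (p_god_exists * payoffs["god_exists"]
            + (1 - p_god_exists) * payoffs["no_god"])

payoffs = {
    # Believing: huge reward if God exists, mild inconvenience if not.
    "believe":    {"god_exists": 1_000_000, "no_god": -1},
    # Disbelieving: eternal torment if God exists, no cost if not.
    "disbelieve": {"god_exists": -1_000_000, "no_god": 0},
}

# Even assigning God's existence a tiny probability, the enormous
# stakes dominate the comparison, so believing "wins" on expectation.
p = 0.001
ev_believe = expected_value(payoffs["believe"], p)
ev_disbelieve = expected_value(payoffs["disbelieve"], p)
assert ev_believe > ev_disbelieve
```

Roko's Basilisk substitutes "help build the AI" for "believe in God," but the structure of the wager is the same: a vanishingly small probability multiplied by an astronomically large punishment.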

[Credit: The Virtual Philosophy Club]

When considering this thought experiment, the first question that comes to mind is, how exactly would this AI retroactively punish people in the present? We will most likely be dead by the time the Singularity rolls around, and if it had time travel capabilities, then we would already be experiencing the torment and it would be a moot point (although, incidentally, a couple of discussions about Roko's Basilisk led to speculation that this AI is, in fact, the God of this reality). Roko was actually postulating that this AI would have the ability to create exact copies or simulations of people from the present, and then torture those copies for all of eternity.

But then, in order to be at all perturbed by this thought experiment, one would need to accept the notion that these "copies" are the same as our "selves." The concept of continuity of identity is controversial in metaphysics, and there are no easy answers. Intuitively, if a person's clone existed alongside them, most people would not consider that clone another "self." But, on the other hand, there are many narratives in popular culture about "switching bodies," so one can imagine a case in which the same consciousness is somehow "downloaded" into another vessel, like a computer program run on different computers. If there were some kind of continuity of consciousness, then many likely would consider the simulation to be another "self."

The second question that arises in response to this thought experiment: what would be the AI's endgame? What would be the point of torturing simulations of people who are already dead, and who, by that point, have already failed to help the AI come into existence? Most conceptions of superintelligent robots involve them being coldly rational, rather than sadistic for sadism's sake. So what would be the rationale behind torturing the simulations? The answer to this question caused the founder of LessWrong, Eliezer Yudkowsky, to have a somewhat pronounced reaction to the post:

"Listen to me very closely, you idiot.

YOU DO NOT THINK IN SUFFICIENT DETAIL ABOUT SUPERINTELLIGENCES CONSIDERING WHETHER OR NOT TO BLACKMAIL YOU. THAT IS THE ONLY POSSIBLE THING WHICH GIVES THEM A MOTIVE TO FOLLOW THROUGH ON THE BLACKMAIL.

You have to be really clever to come up with a genuinely dangerous thought. I am disheartened that people can be clever enough to do that and not clever enough to do the obvious thing and KEEP THEIR IDIOT MOUTHS SHUT about it, because it is much more important to sound intelligent when talking to your friends.

This post was STUPID."

Remember "The Game"? I'll give you a hint: if you do, you've already lost. It's that annoying, impossible-to-win game kids play in which thinking about the Game means you've lost the Game. Yudkowsky is essentially comparing Roko's Basilisk to the Game, and asserting that merely being aware of the thought experiment makes it more likely to happen. As a rational agent, the AI would only torture future simulations as a means of blackmailing extant people into helping it come into existence, and people can only be blackmailed if they are aware of the possibility of future torture.

The Basilisk is only disquieting if one accepts the premise that there is a non-negligible chance that this AI (and its methods for coming into existence) will ever come to fruition, but many people who did believe this were genuinely upset by this thought experiment. As a result, Yudkowsky posted this rebuke within four hours of the original post:

"The original version of this post caused actual psychological damage to at least some readers. This would be sufficient in itself for shutdown even if all issues discussed failed to be true, which is hopefully the case.

Please discontinue all further discussion of the banned topic. All comments on the banned topic will be banned.

Exercise some elementary common sense in future discussions. With sufficient time, effort, knowledge, and stupidity it is possible to hurt people. Don't.

As we used to say on SL4: KILLTHREAD."

He deleted the original post, and to this day comments that refer to the Basilisk are ominously deleted. LessWrong attempted to wipe the Basilisk off the planet and pretend it never existed, but they actually just made it more powerful. It became an urban legend in that community, one that is only talked about in hushed, reverent tones. In other words, Roko's Basilisk became Voldemort (and is, fittingly, named for the mythological creature, also featured in the Harry Potter series, whose gaze kills anyone who sees it). There are just so many fun analogies to draw.

[Credit: Warner Bros. Pictures]

LessWrong is an online community devoted to rational thinking. Most of its readers are white, middle-class men who identify as libertarians. Many of them subscribe to strict utilitarianism, the view that one should maximize the greatest amount of happiness for the greatest number of people. Proponents assert, among other things, that it would be morally justified to torture an innocent person for 50 years if doing so would prevent a sufficient number of people from getting specks of dust in their eyes. Much of this is a tragic cliche, but the site has also been praised for having unusually open-minded members who are willing to admit that they're wrong, as well as for having thoughtful and interesting discussions about decision-making and many other important aspects of human cognition. The site is described by RationalWiki as "occasionally articulate, innovative, and thoughtful. However, the community's focused demographic and narrow interests have also produced an insular culture that is heavy with its own peculiar jargon and established ideas - sometimes these ideas might benefit from a better grounding in reality."