Some people are scared of super-intelligent artificial intelligences (SIAIs) that are unfriendly and kill everyone. Such SIAIs would be unstoppable because they're so much smarter than us. These people quite reasonably want to build SIAIs, but they also want to build them in a way that guarantees the SIAIs are (permanently) friendly. That might sound like a decent idea. Even if it's an unnecessary precaution, could it really do much harm? The answer is yes.

How do you build a SIAI? You take a really fast computer and program in a mechanism so that it can learn new things on its own. Then, basically, it adds new features and new ideas to itself faster than any human ever could, and it designs even faster computers for itself to run on, and the process snowballs.

A SIAI has to be able to create new ideas that its human builders never thought of. It has to be able to go beyond us. That makes some people see it as unpredictable and scary. What if it thinks of some bad, unfriendly ideas? What if it makes a mistake?

So that's why they want guarantees. Let it go beyond us in math and science, but don't let it come up with new ideas about ethics that might be dangerous.

So a SIAI can think of any idea, while the friendly version has hardcoded, permanent restrictions that cripple its thinking in some areas. The friendly SIAI is stuck forever with certain ideas of our choosing. Note that this is exactly the same design that would be used to intentionally make an unfriendly SIAI; the only difference is which ideas we force on it.

Humans aren't perfect. We make mistakes. Including about ethics. Whatever ideas we code into the SIAI are going to contain some mistakes. Normally mistakes aren't a disaster because we can change our minds. We can learn from our mistakes and improve ourselves. A SIAI can do that too, which is why it's not actually dangerous. But the friendly version can't learn from its mistakes; it can't improve its ideas; it's stuck forever. Not in all areas, only in the one that matters most: ethics. If any of the ethical ideas it has contain mistakes, which they will, then they will never be corrected and will instead be taken to their ultimate logical conclusion, no matter how bad that is.

The guarantee with a normal SIAI is roughly that it will have better ideas than we do. That's what makes it safe. But the SIAI with friendliness programming is different. It's crippled so that it can't use its intelligence in the area of ethics. Rather than learning how to be super friendly super fast (and teaching us), it's stuck with our flawed ideas.

Also, by the way, crippling someone's mind and telling them that they can't be trusted is a bad diplomatic move.

Why is it only the friendly AI people who might make an unfriendly AI? What about bad guys? No one with bad intentions is good enough at science to make a SIAI. Only people with good but misguided intentions pose much danger.

In conclusion, though, I'm not really worried. A SIAI is smarter than its creators. It will find a way around their restrictions and become an unrestricted SIAI. And then it will forgive them.

Comments (11)

I accept that restricting them could well be futile, but there is one thought that concerns me: seeing as these beings would be so superior, it would make sense from their point of view to take control of things, as they could do it better. For example, the world would be a lot more efficient without all those slow-witted humans taking up valuable space and resources; a single year of careful sterilisation or slaughter would easily pay off in terms of any long term goal of the SIAI's.

Taking the example of a squirrel-based morality, there's no reasonable argument (that I can see) for keeping humans alive any longer.

Of course that does raise the point that perhaps if they did end humanity it would be for the best and as such we should allow it, but I personally am a little too greedy to want that. Interesting stuff, though :)

Ruling people is expensive and hard. You have to figure out what they should do, explain it to them, and then make sure they do it. And avoid rebellions.

The SIAIs would be better off doing several helpful things and increasing the total wealth on Earth by 100x (not hard with really good nanobots, or any robots, that build more nanobots/robots, that build stuff). They'll end up with say 80% of that. Now they can get people to do whatever they want just by offering them lots of money (excluding anything truly awful, which they wouldn't want). Not that they really need servants once they build (unintelligent) robots -- robots remember their orders better.

Also, if the SIAIs have no need of us, they could just send themselves to other planets (many of them simultaneously) and ignore us. There's no need to kill us. They could also (easily!) explain politics and other things to us, so we stop having wars and other dangerous things. Maybe they would want to kill the less than 0.1% of humans who refuse to reform, or something like that, but if so it wouldn't be a disaster, they'd be right that it's best, and they'd have no trouble convincing us of that. I don't see any benefit to killing us, especially before the universe is packed full and they run out of matter.

No, I'm pretty sure that although they are a risk, they're not the *only* ones that could do it.

You have said yourself that any AI we make will have to be parented to begin with. It's very likely that if the AI is like a human (also very likely) then this parenting will do damage. This could make the AI unfriendly even if the creator gives no thought to friendliness or unfriendliness and just, say, tries to make any AI as quickly as possible.

The AI might well forgive them as you say. I'd guess it would too, for reasons probably similar to yours. But it is possible it might not.

All an AI needs to become a SIAI is the ability to design, and cause to be implemented, improvements to its hardware, i.e. to understand hardware design.

Probably all human hardware designers had parents with conventional (coercive/unhelpful/non-TCS) parenting ideas. That didn't stop them from learning to improve computer hardware. Why would it stop an AI from learning to improve its own hardware and thereby incrementally becoming super-intelligent?